Paul Melson's Blog: logs

Thursday, May 20, 2010

The SIEM Market Discussion Continues

Bill Roth of LogLogic commented on my Twitter exchange with Rocky DeStefano of Visible Risk where we talked about LogLogic's announcement that they were discounting their SIEM product. I then wrote a reply, and it got a little long. So I made it a blog post instead.

Rocky, Paul:
The ClueTrain Manifesto calls markets "conversations", so here goes.....

I think you're falling into a the trap of "conventional wisdom". First off, the basic assumption that the world falls neatly into the SIEM categorization is just plain false. I stand by LogLogic's model....it all starts with log management as the crucial piece, without that key use cases like network forensics are not even possible. Second, the notion that dropping the price is bad is just plain weird. Is LogLogic dropping the price to sell more? Sure we are. Are we dropping the price to take market share? Sure we are. Are we seeing a great response? Sure we are. Since when is saving people money a bad thing?

And we're always interested in a podcast. :)

Bill Roth, EVP
LogLogic

Hi Bill,

Thanks for the comment! And thanks for participating in the dialogue. I think it's awesome that LogLogic is out front and engaging on its business decisions. Very refreshing!

As to your point about log management being that crucial initial component of a SIEM implementation, I agree completely. Log management has also developed as its own market segment as well, independent of SIEM. But I don't need to tell you that. :-)

On the topic of LogLogic's decision to discount its SIEM product, I didn't mean - and I don't believe Rocky did either - that charging less for SIEM is bad, or even a bad business move.

That said, I do believe that for some significant portion of potential customers log management is a commodity technology. However, from my own experience and from everything I've seen to date, SIEM is not a commodity technology, and I'm not convinced it will be. As such, I don't see price as a strong competitive differentiator in the SIEM market.

Following the recent recession, where IT capital budgets still haven't caught up to the (hopefully sustained) economic upturn, I imagine the feedback on LogLogic's price cut has been positive, and that you'll see some SIEM sales where you wouldn't have but for the discount. But in the mid- to long-term, I have my doubts as to whether there is any meaningful gain in market share to be had for LogLogic - or any SIEM vendor for that matter - simply by competing on price with other SIEM vendors.

Let's be frank, if price were a big piece of why companies choose a particular SIEM, Cisco MARS would have the lion's share of the market and ArcSight would be folding. Instead, it's the other way around.

Twitter Killed the Blog Star

I've been really busy both in my personal and professional life for the past year or so, with no signs of slowing down soon. But I have to acknowledge that the main reason my blog posts have fallen off is Twitter. Now, all of the ideas that I have that I might have developed and expanded into a blog post are prematurely evaluated for length. If they can be abbreviated to a couple of 140-character haikus or less, they go on Twitter. Which means they never grow up to be blog posts. They're like the high school dropouts of ideas.

But every once in a while, a Twitter exchange becomes so interesting that, despite the compressed and fleeting nature of Twitter, it turns into something worthy of framing. The other night, Rocky DeStefano of Visible Risk and I had an exchange on SIEM that I thought the wider world might find interesting. The background to the conversation is this post from Rocky's blog about the recent announcement from LogLogic that they were discounting their SIEM product, and then this responding blog post from LogLogic.

rockyd
The LogLogic response ->> http://bit.ly/bAQSZO to my discounting SIEM Post ( http://bit.ly/aiW3kB )
8:47 PM May 18th via TweetDeck

rockyd
I need to noodle on the LogLogic response more. I appreciate the conversation, I think I may see the opposite end of the customer spectrum.
9:02 PM May 18th via TweetDeck

pmelson
@rockyd I think you nailed the issue. If you *NEED* SIEM, you won't compromise features/functionality for capital cost savings.
9:06 PM May 18th via TweetDeck

pmelson
@rockyd If Cisco couldn't make "Free SIEM With Purchase" work, it's not ever going to work.
9:07 PM May 18th via TweetDeck

rockyd
@pmelson let's be honest how could they possible respond any differently than they did? time for a podcast on the subject ?
9:50 PM May 18th via TweetDeck

pmelson
@rockyd They could just fess up. "We're shipping log management appliances, but SIEM isn't moving. So we put it on clearance sale." :-)
9:52 PM May 18th via TweetDeck

pmelson
@rockyd I think with Gartner's SIEM MQ being released, we're about to see another round of SIEM casualties as VC pulls out.
9:54 PM May 18th via TweetDeck

rockyd
@pmelson There has to be quickening soon, there is way too much of the same thing in the market.
9:57 PM May 18th via TweetDeck

pmelson
@rockyd Right. I've been thinking about the key SIEM differentiators and I've only got three.
10:00 PM May 18th via TweetDeck

rockyd
@pmelson which three?
10:06 PM May 18th via TweetDeck

rockyd
@pmelson Like - Sources, Scalability, Analytical Usage, Correlation / Statistical Evaluation, and getting Intelligent information out?
10:08 PM May 18th via TweetDeck

pmelson
@rockyd 1) performance/scalability 2) UI and drill-down 3) supported sources.
10:07 PM May 18th via TweetDeck

rockyd
@pmelson there are some others like context of Host, Vuln, Registry, Applications and Users that lead you towards more advanced usage
10:09 PM May 18th via TweetDeck

pmelson
@rockyd OK, so asset data model(s) makes 4, pre-defined content is 5? That's still not a lot.
10:15 PM May 18th via TweetDeck

rockyd
@pmelson each is several years of development and refinement with customers.
10:32 PM May 18th via TweetDeck

rockyd
@pmelson this comes down to a compliance check box sale versus a security team needing to integrate a tool into their process.
10:35 PM May 18th via TweetDeck

pmelson
@rockyd Agree. But a handful of differentiators == a handful of potential market leaders. Time to thin the herd. Again.
10:42 PM May 18th via TweetDeck

rockyd
@pmelson now I see where you're headed. BTW I think you'll see 3 more acqusitions by end of year.
10:45 PM May 18th via TweetDeck

rockyd
I was thinking about creating a "vegas odds" website for SIEM Quickending and donate some portion of the funds to HFC.
10:47 PM May 18th via TweetDeck

pmelson
@rockyd A SIEM futures market? Very DARPA!
10:49 PM May 18th via TweetDeck

So there, for your parsing and edification, some thoughts on the SIEM product space, the recent Gartner MQ for SIEM, and the near-term ramifications of Gartner's paper on the market.

Also, if you aren't already, you should be reading Rocky's blog, especially if you're interested in SIEM and security ops. Rocky's a guru in this space, and in addition to his blog he has already put together some great podcasts since launching his latest venture, Visible Risk.

Wednesday, November 18, 2009

ArcSight Logger VS Splunk

You are here because you are searching for information on Splunk vs. ArcSight Logger. I actually wrote this post months before posting it, but sat on it for reasons that may become apparent as you read on.

If you want to hear me talk about my experience with Logger 4.0 through the beta process and beyond, you can check out the video case study I did for ArcSight. In short, Logger is good at what it does, and Logger 4.0 is fast. Ridiculously fast.

But that's not what I want to talk about. I want to talk about the question that's on everyone's mind: ArcSight Logger vs. Splunk?

Comparing features, there's not a strong advantage in either camp. Everybody's got built-in collection based on file and syslog. Everybody's got a web interface with pretty graphs. The main way Logger excels here is in its ability to natively front-end data aggregation for ArcSight's ESM SIEM product. But if you've already got ESM, you're going to buy Logger anyway. So that leaves price and performance as the remaining differentiators.

Splunk can compete on price, especially for more specialized use cases where Logger needs the ArcSight Connector software to pick up data (e.g. Windows EventLog via WMI, or database rows via JDBC). And if you don't care about performance, implying that your needs are modest, Splunk may be cheaper for you for even the straightforward use cases because of the different licensing model that scales downward. So for smaller businesses, Splunk scales down.

For larger businesses, Logger scales up. For example, if you need to add storage capacity to your existing Logger install, and you didn't buy the SAN-attached model, you just buy another Logger appliance. You then 'peer' the Logger appliances, split or migrate log flows, and continue to run search & reporting out of the same appliance you've been using, across all peer data stores. With Splunk? You buy and implement more hardware on your own. And pay for more licenses.

My thinking on performance? Logger 4.0 is a Splunk killer, plain and simple. To analogize using cars, Splunk is a Ford Taurus for log search. It gets you down the road, it's reliable, you can pick the entry model up cheap, and by now you know what you're getting. Logger 4.0, however, is a Zonda F with a Volvo price tag.

To bring the comparison to a fine point, I'd like to share a little story with you. It's kind of gossipy, but that makes it fun.

When ArcSight debuted Logger 4.0 and announced its GA release at their Protect conference last fall, they did a live shoot-out of a Logger 7200 running 4.0 with a vanilla install of Splunk 4 on comparable hardware and the same Linux distro (CentOS) that Logger is based on. They performed a simple keyword search in Splunk across 2 million events, which took just over 12 minutes to complete. That's not awful. But that same search against the same data set ran in about 3 seconds on Logger 4.

This would be an interesting end to an otherwise pretty boring story if it weren't for what happened next. Vendors other than ArcSight - partners, integrators, consultants, etc. - participate in their conference both as speakers and on the partner floor. One of these vendors, an integrator of both ArcSight and Splunk products, privately called ArcSight out for the demo. His theory was that a properly-tuned Splunk install would perform much better. Now, it's a little nuts (and perhaps a little more dangerous) to be an invited vendor at a conference and accuse the conference organizer of cooking a demo. But what happened next is even crazier. ArcSight wheeled the gear up to this guy's room and told him that if he could produce a better result during the conference that they would make an announcement to that effect.

Not one to shy away from a technical challenge, this 15-year infosec veteran skipped meals, free beer, presentations, more free beer, and a lot of sleep to tweak the Splunk box to get better performance out of it. That's dedication. There's no doubt in my mind that he wanted to win. Badly. I heard from him personally at the close of the conference that not only did he not make significant headway, but that all of his results were worse than the original 12 minute search time.

You weren't there, you're just reading about it on some dude's blog, so the impact isn't the same. But that was all the convincing I needed.

But if you need more convincing: we stuffed six months of raw syslog from various flavors of UNIX and Linux (3TB) into Logger 4 during the beta. I could keyword search the entire data set in 14 seconds. Regex searches were significantly worse. They took 32 seconds.

Wednesday, September 23, 2009

Queries: Excel vs. ArcSight

Since ArcSight ESM 4.0, reports and trends have been based on queries. Considering that ESM runs on top of Oracle, a query in ESM is exactly what you think it is. Queries are an extremely flexible way to get at event data. But as the name implies, they go against the ARC_EVENT_DATA tablespace, and therefore you can't use them to build data monitors or rule conditions, since those engines run against data prior to insertion into the database.

Anyway, I've got a story about how cool queries are. And about how much of an Excel badass I am. And also about how queries are still better. Last month, I got a request from one of our architects who was running down an issue related to client VPN activity. Specifically, he wanted to know how many remote VPN users we had over time for a particular morning. Since we feed those logs to ESM, I was a logical person to ask for the information.

So I pulled up the relevant events in an active channel and realized that I wasn't going to be able to work this one out just sorting columns. So, without thinking, I exported the events and pulled them up in Excel. So here's the Excel badass part:

If you want to copy it, here it is:
=SUM(IF(FREQUENCY(MATCH(A2:A3653,A2:A3653,0),MATCH(A2:A3653,A2:A3653,0))>0,1))

So A is the column that usernames are in. This formula uses the MATCH function to create a list of usernames and then the FREQUENCY function to count the unique values in the match lists. You need two MATCH lists to make FREQUENCY happy because it requires two arguments, hence the redundancy. It took about an hour for me to put it together, and most of that was spent finding the row numbers that corresponded to the time segment borders.
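If the array formula looks opaque, here is a rough Python sketch of the same trick. It is only an illustration of the MATCH/FREQUENCY idea, not what Excel does internally; the function name and the sample usernames are made up. One known difference is that Excel's MATCH ignores case for text, while this sketch does not.

```python
def count_unique_like_excel(values):
    """Mimic =SUM(IF(FREQUENCY(MATCH(...),MATCH(...))>0,1)) on a plain list."""
    # MATCH(range, range, 0): for each cell, the 1-based position of the
    # first cell in the range that holds the same value.
    first_seen = {}
    match_positions = []
    for position, value in enumerate(values, start=1):
        first_seen.setdefault(value, position)
        match_positions.append(first_seen[value])

    # FREQUENCY(data, bins) with the same list as data and bins: count how
    # many entries land on each position. Only positions where a value
    # first appears ever receive a count.
    frequency = {}
    for position in match_positions:
        frequency[position] = frequency.get(position, 0) + 1

    # SUM(IF(... > 0, 1)): one point per non-empty bin, i.e. per unique value.
    return sum(1 for count in frequency.values() if count > 0)


usernames = ["alice", "bob", "alice", "carol", "bob"]  # hypothetical sample data
print(count_unique_like_excel(usernames))  # same answer as len(set(usernames))
```

In Python you would normally just write `len(set(usernames))`; the long form is there only to mirror the steps of the formula.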

But as I finished it up and sent it off to the requesting architect, I thought, there must be an easier way. And of course there is. So here's how you do the same thing in ESM using queries:

So, it's just EndTime with the hour function applied, and TargetUserName with the count function applied, and the Unique box (DISTINCT for the Oracle DBAs playing at home) checked. And then on the Conditions tab you create your filter to select only the events you want to query against. That's it.
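For anyone working from an exported event list rather than inside ESM, the logic of that query (group by hour of EndTime, count distinct TargetUserName) can be sketched in a few lines of Python. This is a stand-in for the idea, not ESM's query engine; the function name, the tuple layout, and the sample events are assumptions for illustration.

```python
from collections import defaultdict
from datetime import datetime


def unique_users_per_hour(events):
    """Return {hour: count of distinct usernames} for (end_time, username) pairs."""
    users_by_hour = defaultdict(set)
    for end_time, username in events:
        # hour(EndTime) is the grouping key; the set gives DISTINCT for free.
        users_by_hour[end_time.hour].add(username)
    return {hour: len(users) for hour, users in sorted(users_by_hour.items())}


# Hypothetical VPN login events: (EndTime, TargetUserName)
events = [
    (datetime(2009, 8, 18, 8, 5), "alice"),
    (datetime(2009, 8, 18, 8, 40), "bob"),
    (datetime(2009, 8, 18, 8, 55), "alice"),
    (datetime(2009, 8, 18, 9, 10), "carol"),
]
print(unique_users_per_hour(events))  # {8: 2, 9: 1}
```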

Once the query is created, just run the Report Wizard and go. All told, it takes about 90 seconds to do the same thing with a query and report that it took an hour to do in Excel.

Sunday, September 20, 2009

The 'Cyberwarfare' Problem

Last week I attended ArcSight's annual user conference in Washington, DC. More about that in a later post. During the conference, ArcSight hosted a panel discussion on cyberwarfare. In DC, where many of ArcSight's biggest customers are based, this is a hot topic, and there will be a lot of time spent discussing it and a lot of money spent on defending against it, maybe.

What struck me about the panel discussion were two comments, both made by James Lewis, one of the panelists, and a director at the Center for Strategic and International Studies. At one point, Mr. Lewis invoked Estonia as an example of state-sponsored cyberwarfare, and made the comment that, "the Russians are tickled that they got away with it." Not ten minutes later, an audience member asked a question about retaliation against cyber-attacks. Mr. Lewis responded to the question by pointing out the problem of attribution. That is, from the logs that the victim systems generated, the IP address(es) recorded can't reliably be used to identify the actual individual(s) responsible for the attack.

Now, I don't intend to pick on James Lewis. It just so happened that one person on the panel expressed the paradox of cyberwarfare. The attribution problem is a big problem for all outsider attacks, not just cyberwarfare. A decade ago, security analysts were calling it "the legal firewall" because US-based hackers would first hack computers in China, Indonesia, Venezuela, or another country that doesn't openly cooperate with US law enforcement, and then hack back into the US from there, causing an investigative barrier that would hinder or prevent an investigation being able to get back to the attacker's actual location.

So knowing that there's a very real problem with being able to identify the source country for Internet-based attacks, it stands to reason that using the same limited forensic data not only to identify the actual source of an attack, but also to determine that it is in fact state-sponsored, and not, say, a grassroots attack armed by a teenager, is a stretch. And for that reason, the question of cyberwarfare is an open one. Until a government actually comes forward and claims responsibility for an attack, it's unprovable.

So as the government spends $100M on cyberdefense over the next six months, it's important to try and answer the question, "What is the military actually defending against?" At the very least, it's fair to say nobody knows for certain.

Wednesday, August 12, 2009

Inbox 3

Teguh writes,

Hi Paul,
could you give some guide to administering logger? i searched thru
google, but found nothing significant. How to(s) and tutorial would be enough i
guess. Does it have to have syslog server for the logger to be able to read data
from?
Thanks..

The documentation for Logger is available from ArcSight's download center. Only registered customers have access, but I assume that if you've got a Logger box, that generally qualifies you.

With regard to your second question, yes Logger has a syslog server. It actually has a few. In Logger nomenclature these are "receivers." Logger supports UDP and TCP syslog, FTP and SSH file pull, NFS and CIFS remote filesystem. Logger also supports some ArcSight-specific receivers including a SmartMessage receiver for events forwarded from ESM and CEF-over-syslog (OK, ArcSight wouldn't agree that this is specific to their products, but despite the C standing for Common, CEF is anything but. At least right now.)

Configuring Logger to act as a syslog server is pretty straightforward.
From the web interface, navigate to Configuration, Event Input/Output.
On the "Receivers" tab, click the Add button.
Name your connector and set the type as "UDP Receiver" then click Next.
The defaults for Compression Level and Encoding are fine. Select the IP address you want the listener to reside on, and set the port number. The default syslog server port is UDP/514.
Click Save.
On the "Receivers" tab, click the little no-smoking image next to the new receiver to enable it.
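Once the receiver is enabled, it helps to throw a test message at it and confirm it shows up in a search. Here is a minimal Python sketch for that, assuming a plain UDP syslog listener; the hostname, tag, and the example address are placeholders, and the message layout follows the traditional BSD syslog convention rather than anything Logger-specific.

```python
import socket
from datetime import datetime


def build_syslog_message(facility, severity, hostname, tag, text, when=None):
    """Build a BSD-style syslog payload: <PRI>TIMESTAMP HOSTNAME TAG: MSG."""
    when = when or datetime.now()
    priority = facility * 8 + severity  # e.g. local0 (16) + info (6) = 134
    # Traditional syslog timestamps pad the day of month with a space.
    timestamp = f"{when:%b} {when.day:2d} {when:%H:%M:%S}"
    line = f"<{priority}>{timestamp} {hostname} {tag}: {text}"
    return line.encode("ascii", "replace")


def send_test_message(receiver_ip, port=514, text="test message for new receiver"):
    """Send a single UDP datagram to the receiver and return what was sent."""
    payload = build_syslog_message(16, 6, "testhost", "receiver-check", text)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (receiver_ip, port))
    return payload


# Example with a placeholder address; substitute the IP you chose for the listener:
# send_test_message("192.0.2.10")
```

UDP gives no delivery confirmation, so a successful send only means the packet left your machine. The real check is whether the message appears on the Logger side.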

Thursday, June 11, 2009

From The Inbox 2

lmran writes:

Hi Paul,
Do you know any reason why ArcSight ESM does not support the Cisco MARS? Right now, all my firwalls send the syslog feeds into Cisco MARS and I'm trying to set the Cisco MARS to send thoes raw feeds data to ArcSight local connector but I just found out that ArcSight does not support the Cisco MARS. Thanks in ADV for any info reading this subject.

Starting in 4.x, MARS can forward events to another remote syslog listener. ArcSight has a syslog connector. So you ought to be able to forward events from MARS to ArcSight via syslog assuming MARS doesn't change the format of the log events too much. Even if MARS does mangle the event format, ArcSight will still receive them, but then most or all of the event will be parsed into the CEF Name field and categorization and prioritization won't be accurate.

If you are unable to upgrade your MARS appliance to 4.31 or later (I think that's the rev you need), another option would be to use a syslog-ng server out front. It supports forwarding events by source to other syslog servers. You could use this to send the stuff you want in ESM to ArcSight's syslog Connector and the stuff you want in MARS to MARS.
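To make the "forward by source" idea concrete, here is a toy Python relay that picks destinations based on the sender's IP address. It is only a sketch of the routing logic, not a replacement for syslog-ng, and every address in it is hypothetical. Note also that a naive relay like this makes every event appear to come from the relay's own IP, which is one of the problems a real syslog server handles for you.

```python
import socket

# Hypothetical routing table: sender IP -> list of (host, port) collectors.
ROUTES = {
    "10.1.1.1": [("192.0.2.20", 514)],                       # ArcSight Connector only
    "10.1.1.2": [("192.0.2.20", 514), ("192.0.2.30", 514)],  # Connector and MARS
}
DEFAULT_DESTINATIONS = [("192.0.2.30", 514)]  # everything else goes to MARS


def destinations_for(source_ip, routes=ROUTES, default=DEFAULT_DESTINATIONS):
    """Return the collectors that should receive events from source_ip."""
    return routes.get(source_ip, default)


def relay(listen_ip="0.0.0.0", listen_port=514):
    """Receive UDP syslog and forward each datagram according to ROUTES."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as inbound, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as outbound:
        inbound.bind((listen_ip, listen_port))
        while True:
            data, (source_ip, _source_port) = inbound.recvfrom(65535)
            for destination in destinations_for(source_ip):
                outbound.sendto(data, destination)
```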

Or, you could do the environmentally conscious thing and unplug then recycle your MARS appliance. ;-)

Tuesday, June 9, 2009

From The Inbox

Anonymous writes,

Hi Paul, I am one of those who, as you say, found your blog by googling ArcSight, trying to do some recon on the product for my employer. (I think I see that the most recent posts here are from 2007 so who knows if you or anybody will be seeing my question.) I'm trying to find out, can Arcsight's data be queried programmatically; i.e. is it stored in a relational database, hopefully SQL Server or Oracle, or if not, is there an API or ADO.NET provider that can allow it to be queried, preferably with SQL? Thanks for any info anyone reading can provide.

ArcSight ESM uses Oracle 10g for its back-end database. At one point, and this may still be true, DB2 was also supported. You can query the database directly, and the schema is pretty straightforward. The table ARC_EVENT_DATA is where most of the event data lives, for example. But depending on your use case, that might not be the best way to get data out of ESM.

Also, since you didn't specify, it may be worth mentioning that the same is not true of the ArcSight Logger platform, which is flat storage. Instead of querying the log store directly, Logger can be configured to forward events based on source, type, etc. to another destination, if you need them in real-time. There is a PostgreSQL database on Logger, but it's my understanding that it supports the reports engine, and doesn't store the raw or CEF events in any comprehensive way.

The interesting thing is that the storage technology behind Logger 3.0, because of its performance and relative "cheapness" may become the data store for ESM down the road. It would only make sense, since you could handle MUCH higher event rates with less disk and no Oracle license fee. If it can be done while maintaining the stability and feature set that the Oracle-based data store has, it's a walk-off home run for ArcSight.

Tuesday, September 9, 2008

ArcSight User Conference 2008

I'm on the floor of the ArcSight "Protect '08" conference this morning. Tim and I gave our talk on ArcSight ESM Tools yesterday, and I will post some version of those slides and some of the code after I return from the conference.

Right now I'm listening to Hugh Njemanze give his keynote on product lines. There's a lot of interesting stuff in the release pipe: Logger 3.0, ESM 4.5, a new Connector appliance, IdentityView content for ESM, and something called "McLovin."

Anyway, here's what's been good so far:

Customer presentations (other than mine, I mean) - I missed out last year, these are the best talks so far.
Location - the new hotel is within walking distance of stuff (and by stuff I mean not trees and the NSA.)
Networking - Always the best part of this conference. I love standing around with free beer, talking to other folks about what they're doing with their SIM, and sharing ideas. Looking forward to more tonight.

Here's what's been not-so-good:

Wireless - the hotel wireless has been unreliable and overloaded. Frankly, I'm surprised I've been able to stay on long enough to get this post up.
Vendor/sponsor floor - no offense to these guys, but the freebies this year are unimpressive. I've already got a pen, thanks.
No bag - Instead of a "conference bag," everyone was issued a plastic file folio thing. Not that I needed another bag, but I can't smoosh the one foam squeezy thing I did get from a vendor booth into this blue plastic thing.

And I would be remiss if I didn't drop a product scoop or two:

Logger 3.0 has adopted a more-ESM-like boolean filter interface. Big improvement over the chained-regex search in 2.5 and earlier.
Demo of Logger 3.0 shows that searches of data (no details on data set) are roughly 80x faster than a similar sized search on 2.5. (The claim is 100x faster, but I counted. Still, that's a significant improvement.)
Hugh has hinted that the slick, high-performance append-only storage stuff that Logger has is going to be integrated into ESM in some release beyond 4.5. That could mean the end of the Oracle / PartitionArchiver storage model. It won't be missed.

Friday, June 27, 2008

I'm Floored: Raffael Marty declares that SIM is dead.

No really, he said it. He would've been on the short list of people I assume would never say it. But there it is.

Here's the thing; I think that this is a lot like Gartner's IDS declaration (which he cites). IDS went through some product positioning changes (IPS, UTM, DLP, etc.) but the core idea and technology is still there, and guess what? The original IDS use case is still viable. Sure the attacks have changed, but having a sniffer that can search for known-bad and known-strange traffic on the wire is very, very useful.

So I assume that we are in the midst of a product positioning shift around SIM. Raffy's point that SIM schema are IP-centric and rules are based around correlating firewall and IDS events is true. But most of the vendors have already acknowledged this and are developing content to focus on other log sources. Either way, the use case is here to stay - the ability to search and correlate log events is highly useful, and will continue to be. You may call it "SIEM" or "IT Search" or "log management," but it's the same core concept, repurposed to address the constantly changing security environment.

One final note for vendors from the SecOps trenches: I am not open to a replace/resell on the basis that SIM is old and whatever-you-call-it-now is new and better. My SIM, like my IDS, contains custom content that our team has developed to keep on top of changing threats, including application attacks. SIM, like IDS, succeeds when you put talented security professionals in front of it and let them tune it and manage it like a tool. But it will fail miserably if you are hands-off with it.

Monday, June 16, 2008

ArcSight Logger Face-plate-lift

Not only did ArcSight deliver on the improved UI and feature set for Logger 2.5, but like any good appliance vendor, they've popped their collar. And by collar I mean front bezel.

The red with blue LED scroller is definitely a better look than the previous model. Still no backlit logo, though. :-)

Friday, June 13, 2008

ArcSight's new Logger Apps

ArcSight is releasing the Logger 2.5 software here soon, and along with it new appliances with some interesting variations. You can check out the vitals on the ArcSight website here.

Prior versions of Logger were available in small, large, and SuperSized, where the SuperSized box was the same spec as the large box with artificial limitations removed via license key. So really only two boxes, all self-contained, all CentOS, all MySQL.

Now, there's a whole new batch. It would appear by the naming designations that they are going after PCI compliance heavily with the L3K-PCI, which must have retention policies and capabilities that make it easier to comply with PCI-DSS 5.2. Another model supports SAN-attached storage and Oracle, so you can grow your Logger with SAN instead of NAS. And finally, there are two new L7100 models with 6x750GB drives. If I'm doing my math right, that works out, after compression, to about ~~40TB~~ 36TB of log storage. That's a significant increase over the ~~15TB~~ 12TB that the large/SuperSized L5K boxes shipped with.
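The back-of-the-envelope math can be spelled out as below. This assumes the drives are 750 GB each (six 750 MB drives could never add up to terabytes) and that all six drives hold log data with no RAID or OS overhead, which is almost certainly optimistic. Under those assumptions the quoted 36TB implies roughly 8:1 compression; that ratio is derived from the post's numbers, not taken from ArcSight.

```python
DRIVES = 6
DRIVE_SIZE_TB = 0.75     # assumption: 750 GB per drive
QUOTED_CAPACITY_TB = 36  # the corrected post-compression figure above

raw_tb = DRIVES * DRIVE_SIZE_TB             # raw disk before compression
implied_ratio = QUOTED_CAPACITY_TB / raw_tb  # compression needed to reach 36 TB

print(raw_tb, implied_ratio)  # 4.5 8.0
```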

Update: Talked with Ansh at ArcSight today, and apparently the 2.5 software adds columns to the CEF event view. That's a big deal for folks using CEF events in Logger, and may make CEF-only the preferred format for most Logger users. The new software also includes real-time alert views (like Active Channels in ESM), as well as a number of other enhancements to alerts and search filters and more. Current customers can download 2.5 from the software site.

Sunday, June 1, 2008

From My Inbox... (ArcSight Connectors & Logger)

SC left a comment on an earlier post on ArcSight Logger and CEF vs. Raw formats...

Hi Paul,

Do you know if its possible to "insert" logs into the logger or SmartConnector if the logs are on a physical storage, e.g. DVD or external storage?

Thanks.

Kind Regards,
SC

There are probably a number of ways to do this, but I've only tested one. In earlier configurations of our syslog infrastructure, there were a couple single points of failure. In order to meet log analysis commitments, we would reload lost syslog data from file.

Start by configuring a 'Syslog Pipe' Connector. Since the connector only has to be online with the Manager when you're manually inserting logs from file, you have greater flexibility about where this Connector will live. It lived on my laptop for a while. When you set it up, point to a path that isn't already used for anything else. Then you can simply start the Connector and pipe the raw log file(s) to the named pipe:

# cat oldsyslogs.txt >> /var/spool/my_arcsight_pipe

Depending on how far the date/time stamps of the events in those files are from $Now, ArcSight will probably throw some errors. It will maintain the "End Time" from the raw log events, and apply "Manager Receipt Time" as the time the manager collects the parsed events from the Connector. This will absolutely screw up any correlation rules you wanted these events to be subject to. Sorry, no easy way around that.
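If you need more control than cat gives you, for example to throttle the replay so the Connector isn't flooded, the same thing can be sketched in Python. This is an illustration under assumptions: the function name and the rate parameter are made up, and the pipe path is whatever you configured for the Connector.

```python
import time


def replay_log_file(log_path, pipe_path, lines_per_second=None):
    """Write each line of log_path to pipe_path; return the number of lines sent."""
    delay = 1.0 / lines_per_second if lines_per_second else 0.0
    written = 0
    # Opening a named pipe for writing blocks until a reader (the Connector)
    # has it open, so start the Connector first.
    with open(log_path, "r", errors="replace") as source, \
         open(pipe_path, "w") as pipe:
        for line in source:
            pipe.write(line)
            pipe.flush()  # push each event out immediately
            written += 1
            if delay:
                time.sleep(delay)
    return written


# Example, reusing the names from the command above:
# replay_log_file("oldsyslogs.txt", "/var/spool/my_arcsight_pipe", lines_per_second=500)
```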

Wednesday, April 30, 2008

Can the Media Move the Ball on HIPAA?

I finally have a serious prediction for 2008: I predict that unauthorized access of medical records will be the new lost laptop story.

Reporting on the compromise of data through laptop loss/theft over the past few years has raised public awareness around data breaches and disk encryption. The upswing in incidents involving hospital employees accessing celebrity medical records will have a similar effect on awareness. I mention this because a former UCLA Medical Center employee was indicted yesterday on charges stemming from similar activity. What made this a criminal case and not just another firing is that the employee sold these records to a "media outlet" (tabloid).

The reason this is significant is that stories like this in the media raise public awareness about HIPAA requirements and medical provider capabilities. Those capabilities include the ability to review who accessed a patient's medical record and when, and a way for hospitals to determine whether or not the access was appropriate. The end result will likely be two-fold. First, more patients will be aware of these capabilities, and will start doing things like asking doctors and hospitals for this information. And secondly, the hospitals that aren't currently reviewing the logs from their EMR systems will feel some pressure to start doing so.

Friday, April 4, 2008

ArcSight Logger: CEF vs. Raw

Here's something for potential ArcSight Logger customers to ponder. The issue is whether you should use CEF formatted logs (post-Connector) or raw logs (pre-Connector) or both in your Logger environment. In this case, a picture is worth at least a few hundred words:

If you look carefully at that image, you can see that it shows the same event in both its raw syslog format and its Connector-ized CEF format. From my point of view, it boils down to use case. Analysis versus troubleshooting. Reporting versus response. The CEF formatted message is chock-full of metadata-and-labeling goodness. It's also overkill on the eyes. Log messages are already cryptic to the point of questionable usefulness. CEF amplifies that. The raw format, on the other hand, is easier to read due largely to the fact that it's what your UNIX admins are used to seeing. But that's where the positives end. Raw syslog is all but unformatted and trying to write a small chain of regexes that do a good job of parsing large quantities of syslog is a headache and a half.
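To show why the labeled format is friendlier to machines than to eyes, here is a minimal Python sketch that parses a CEF record into named fields. It relies on the published CEF layout (seven pipe-delimited header fields followed by key=value extensions) and deliberately ignores some escaping rules, such as escaped equals signs in values. The sample event, vendor, and product names are invented.

```python
import re

HEADER_NAMES = ["version", "vendor", "product", "device_version",
                "signature_id", "name", "severity"]


def parse_cef(line):
    """Split a CEF record into its header fields plus an extension dict."""
    start = line.index("CEF:")  # skip any syslog prefix in front of the payload
    # Header fields are pipe-delimited; a pipe escaped as \| is not a delimiter.
    parts = re.split(r"(?<!\\)\|", line[start + 4:], maxsplit=7)
    if len(parts) < 7:
        raise ValueError("not enough CEF header fields")
    record = dict(zip(HEADER_NAMES, parts[:7]))
    extension = parts[7] if len(parts) > 7 else ""
    # Extensions are key=value pairs; values may contain spaces, so each value
    # runs until the next " key=" or the end of the line.
    record["extension"] = dict(
        re.findall(r"(\w+)=(.*?)(?=\s+\w+=|$)", extension)
    )
    return record


sample = ("Jun  9 10:15:02 host CEF:0|ExampleVendor|ExampleProduct|1.0|100|"
          "Login failed|5|src=10.0.0.1 dst=10.0.0.2 msg=Failed password for root")
event = parse_cef(sample)
print(event["name"], event["extension"]["src"])
```

Doing the same against raw syslog means a different regex per device type and message type, which is exactly the headache described above.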

Of course, you may have already realized that there is a right answer to this problem: Do both. Sure there's some overhead to consider, since you're going to pass syslog to a Connector that will then send raw events to Logger, CEF events to Logger, and CEF events to ESM if you have it. Or you could send raw syslog to Logger, have Logger forward it to a Connector and then configure the Connector to send CEF to Logger and ESM. There are probably many other complicated flows that you could implement as well, but you get the idea.

Sunday, February 24, 2008

Anonymous writes... (ArcSight Resources)

Anonymous writes,

Hi Paul,

I would like to ask if you know of any resources I can reference for ArcSight correlation rules authoring.

In particular, I am looking for Web App and VOIP Security. Thanks in advance.

So first of all, there's an unfortunate shortage of sources on building content for ArcSight. It's part of why I blog about it, because there are only a few people putting information out there. And if SIMs in general are going to mature, then best practices and an open community are part of that maturation. Besides blogs like mine, the ArcSight forums are a good place to get questions answered and share content. Beyond that, I would highly recommend the annual User Conference that ArcSight puts on. For those that can't attend the User Conference, the slides are published to your software site, and are definitely worth downloading. And of course there are ArcSight's own training offerings. But that is pretty much the extent of resources available at the moment.

As far as ways to monitor Web Apps and VoIP Security with ArcSight, it's going to boil down to the log sources you have available. Here are a couple of ideas I have off the top of my head.

For Web App there are tons of options. ArcSight works with several web security proxies, IIS and Apache, most IDS/IPS products under the sun, web app servers like Weblogic and WebSphere, and the more popular commercial databases like Oracle, MS-SQL, and DB2. What you can do depends on what's in your web environment and which sources you're drawing from. An easy idea might be to create a filter to sift through web server logs for special characters (like < > ' or - ) or requests where the web server returned a 500 or some other obscure error (not 403 or 404).
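Outside of ArcSight, the same filter idea can be sketched in a few lines of Python. This assumes Apache-style access log lines with the request in double quotes followed by the status code. The names are made up for illustration, and reading the hyphen as the SQL comment sequence "--" is my assumption (a single hyphen would match far too many legitimate URLs).

```python
import re
from urllib.parse import unquote

# Matches the quoted request and the status code in an Apache-style log line.
LOG_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<uri>\S+) [^"]*" (?P<status>\d{3}) ')
# Characters often seen in XSS / SQL injection probes ("--" is an assumption).
SUSPICIOUS_CHARS = re.compile(r"[<>']|--")
# Errors common enough that they are mostly noise.
COMMON_ERRORS = {403, 404}

def is_interesting(line):
    """Return True if the request has suspicious characters or an unusual error."""
    m = LOG_RE.search(line)
    if m is None:
        return False
    # Decode %3C-style escapes so encoded probes are caught as well.
    uri = unquote(m.group("uri"))
    status = int(m.group("status"))
    if SUSPICIOUS_CHARS.search(uri):
        return True
    return status >= 400 and status not in COMMON_ERRORS
```

Expect false positives (an apostrophe in a search term, for instance), so treat the output as something to eyeball rather than something to alert on.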

VoIP is a trickier one to go after since there's no ArcSight connector for CallManager or whatever SIP gateway you use. You could write one with the Flex Connector SDK, but I'm not sure how great your SIP gateway logs are to begin with when it comes to security. I think switches, IDS/IPS, and firewalls are your best bets here. You'd want to filter firewall logs for packets sourced from your VoIP VLAN address space that might indicate a rogue device connected to your voice network. (Which reminds me, a new version of voiphopper just came out.) You might also want to filter IDS logs for traffic sourced from your VoIP VLANs as well. Hopefully you've already got "switchport port-security maximum 2" set on all of your VoIP ports (and all of your userland switch ports in general) to prevent ARP spoofing/poisoning attacks. In which case, if you send your switch syslogs to ArcSight, a rule to alert on 'NOMAC' messages could be very useful. These can be regular errors, but also occur when someone attempts ARP-based MITM attacks on a port where port-security has been configured.
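As a rough sketch of the logic behind those filters (not ArcSight filter syntax), here is the same idea in Python. The VoIP subnet, the set of expected destination ports, and the event field names are all hypothetical placeholders; substitute whatever your voice network actually uses.

```python
import ipaddress

# Hypothetical values: replace with your own VoIP VLAN and expected services.
VOIP_NET = ipaddress.ip_network("10.20.0.0/16")
EXPECTED_PORTS = {53, 69, 123, 5060, 5061}  # DNS, TFTP, NTP, SIP, SIP over TLS

def from_voip_vlan(event):
    """True if the event's source address falls inside the VoIP address space."""
    return ipaddress.ip_address(event["src"]) in VOIP_NET

def looks_rogue(event):
    """True for VoIP-sourced traffic headed to a port phones should not use."""
    return from_voip_vlan(event) and event["dport"] not in EXPECTED_PORTS

def is_nomac(syslog_line):
    """True if a switch syslog line mentions NOMAC."""
    return "NOMAC" in syslog_line
```

A phone has a small, predictable set of things it talks to, so anything on the voice VLAN reaching for, say, port 445 is worth a look.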

Anyway, I hope that helps, Anonymous. Good luck with your projects.

Monday, February 4, 2008

ArcSight / CEF Patch for Snort / Barnyard

Last week Colin Grady released a patch to the Snort output tool, Barnyard 0.20, that allows you to output in ArcSight CEF format. I was going to post something here about it last week because of its sheer coolness, but then decided to hold off until I had a chance to play with it myself.

It built flawlessly, and it was easy enough to set up. It creates a new module in Barnyard named "alert_cef" that shovels CEF format messages to a syslog server (like ArcSight Logger). An example barnyard.conf might look something like this:

config daemon
config localtime
config hostname: arnold:eth1
config interface: eth1
config sid-msg-map: /opt/snort/rules/sid-msg.map
config gen-msg-map: /opt/snort/rules/gen-msg.map
config class-file: /opt/snort/rules/classification.config
output alert_cef arnold arcsight_logger.mydomain.local 16 1 514

Those arguments to the 'output alert_cef' line are, in order:
hostname
syslog server
syslog facility (integer format 16=local0)
syslog severity (integer format 1=alert)
syslog server port (UDP)
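
The facility and severity integers combine into the single priority (PRI) value at the front of each syslog packet, computed as facility * 8 + severity. A quick Python sketch of that arithmetic, with function names of my own choosing:

```python
def syslog_pri(facility, severity):
    """Combine syslog facility (0-23) and severity (0-7) into a PRI value."""
    if not 0 <= facility <= 23:
        raise ValueError("facility must be between 0 and 23")
    if not 0 <= severity <= 7:
        raise ValueError("severity must be between 0 and 7")
    return facility * 8 + severity

def split_pri(pri):
    """Reverse the calculation: return (facility, severity) for a PRI value."""
    return divmod(pri, 8)

# local0 (16) with alert (1), as in the example config above
print(syslog_pri(16, 1))  # prints 129
```

So with the example config, messages should arrive tagged with a PRI of 129, which is handy to know when you're checking the traffic in a packet capture.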

So here are my thoughts on Colin's patch. On the plus side, Barnyard is fast and lightweight, especially in comparison to the ArcSight Connector bundle which is several hundred MB on disk and in memory because it's Java. On the neutral side, its use case is pretty specific - you have ArcSight Logger collecting syslog data and forwarding to ESM. (Or you do your own thing with CEF and not ArcSight.) And on the down side, bypassing the ArcSight Connector means that you lose the categorization/prioritization stuff that ArcSight does for you with its AUP updates. And it also means no packet payload. For that you need to be running the ArcSight SnortDB Connector and logging Snort to a SQL database (preferably via Barnyard).

So that's still the architecture that I recommend for an enterprise Snort-to-ArcSight deployment. Snort doing unified logging, Barnyard shoveling logs into a SQL server on another host, ArcSight SnortDB Connector (also not on the Snort sensor host) querying that data from SQL and handing it directly to ESM. That gets you maximum scalability and functionality.

Nonetheless, Colin's patch is pretty cool, and definitely on the table for folks that have Logger and ESM. Additionally, if you use Snort and Barnyard, you should look at some of the other patches Colin has on his site. In particular, I think we will be testing and deploying his schema patch for Barnyard. Wish I had known about it 2 years ago.

Friday, October 26, 2007

The Heartbreak of Nondisclosure

Look what I've got in the test lab this week:

It's a little more recognizable with the front bezel on it:

That's right, it's ArcSight Logger 2.0 beta! Alas, the non-disclosure agreement prevents me from telling you any more than that. OK, I'll also tell you that, much to my disappointment, the cool logo bezel does not light up.

Wednesday, October 3, 2007

Is Your IP Address Personal Info?

According to a German court it is. (via Eric Fitzgerald's blog)

The remedy that this ruling implies - a web site not logging visitors' IP addresses beyond the duration of the user's session - is either unsustainable or crippling to site security.

If it becomes standard practice in Germany to not log IP addresses anywhere for any length of time, they will essentially be declaring open season on themselves. There will be no network evidence trail and therefore no case to prosecute. I can't imagine it'll come to that, but it is interesting to ponder.

Friday, September 28, 2007

Firewalls, SIM, and Visualization

Saudi asks for help on the loganalysis mailing list:

"Looking for help in identifying meaningful/actionable reports that we can get from Firewall log analysis."

Normally, I would've replied to the list, but attaching a bunch of jpeg files that will be sent to hundreds of people is poor etiquette. So instead, I'll spam the list with a link to this blog post. :-)

Reports are great and all, and you've gotten some excellent suggestions so far. But I'm a believer in mjr's artificial ignorance model for log analysis, so I put a high value on finding things that I don't know that I'm looking for. And when you want to do that with millions of events, visualization is the way to go. So here are some ArcSight data monitors that I have that are specific to firewall data.

This is a pair of moving average graphs. The green one is 'accept' messages and the red one is 'drop' or 'reject' messages. Big spikes or dips in these graphs are interesting. The other thing you can't see in these is that there's a second line along the bottom. That line is the failover firewall. When it fails over, both graphs draw a pretty 'X' with intersecting lines.

This is another moving average graph. I love these things! This one isolates workstation VLANs (so this is user-land only) and pairs srcaddr/dstport. Big spikes and long plateaus are usually interesting. The plateaus have traditionally been malware trying to scan or send spam. We've gotten better at catching this stuff on the front end, though, so I rely on this less today than I did 2 years ago. Also, if multiple lines are doing the same thing, that's interesting, too, since it can mean multiple infections.
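
If you don't have a SIM to draw these for you, the underlying idea is easy to approximate. Here is a minimal Python sketch that compares each interval's event count against the average of the preceding intervals and flags big spikes. The window size and the 3x threshold are arbitrary assumptions, and this is not how ArcSight computes its moving averages.

```python
from collections import deque

def find_spikes(counts, window=5, factor=3.0):
    """Return indexes where a count exceeds factor times the trailing average.

    counts: event counts per time bucket, for example per minute for one
    srcaddr/dstport pair.
    """
    recent = deque(maxlen=window)
    spikes = []
    for i, count in enumerate(counts):
        # Only judge a bucket once a full window of history is available.
        if len(recent) == window:
            avg = sum(recent) / window
            if avg > 0 and count > factor * avg:
                spikes.append(i)
        recent.append(count)
    return spikes

print(find_spikes([10, 12, 11, 9, 10, 11, 95, 10]))  # prints [6]
```

Note that a spike inflates the average for the next few buckets, so a plateau will only be flagged at its leading edge. Catching long plateaus or dips would need a separate check.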

This data monitor shows, to scale, firewall events by hour, by severity. Any place you have visible orange or red or green is probably interesting. An abnormally high or low event count per hour is also interesting. The one above shows the overnight hours, so the yellow, orange, and red appear more prevalent because there are fewer events in those buckets.

This data monitor is a pie graph that shows last-hour firewall events by target country code. This probably doesn't work for all organizations, but my company is based and does business exclusively in the US. That means that any large amount of traffic destined for RU or CN is probably the start of a bad day for me.
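
The same check can be sketched in Python as a tally of events by destination country plus the share that leaves the home country. The "dst_country" field name is a hypothetical stand-in for whatever your log pipeline produces after geolocation.

```python
from collections import Counter

def foreign_share(events, home="US"):
    """Return (counts per country code, fraction of events not bound for home)."""
    counts = Counter(e["dst_country"] for e in events)
    total = sum(counts.values())
    if total == 0:
        return counts, 0.0
    foreign = total - counts[home]
    return counts, foreign / total
```

What counts as "large" is up to you. For a US-only shop, even a few percent headed somewhere unexpected is worth clicking on.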

This data monitor is just a chart that displays the Top 10 sources of blocked traffic. I've whited-out the actual IPs, but you can see the zone details. (The top 3 DMZ servers are due to a recent change in the firewall that the servers haven't caught up to.)

One of the cool things about SIM visualization gadgetry like ArcSight's data monitors is that these displays are in near-realtime. So it's like a report that's always running, and that's really easy to operationalize - "Here, stare at this for a few minutes every so often. If it looks weird, click on it and find out why."