Paul Melson's Blog: June 2009

Tuesday, June 23, 2009

Nobody Sells Laptops for The Price of Silver

If you haven't already, I recommend that you take 20 minutes and read "Nobody Sells Gold for the Price of Silver" by Cormac Herley and Dinei Florencio. (PDF Link) This is an excellent analysis of the research into and press coverage of the underground economy. It's a fascinating read, and they make a cogent argument that the underground economy is more myth than reality. I don't want to say more because it will ruin it for you.

Now I have an exercise for you. First, read the Herley/Florencio article. Then, read Bruce Schneier's experiences with trying to sell a laptop on eBay. Now think about the implications of the "Ripper Tax" on eBay. Now ask yourself why you haven't already sold any stock you own in eBay.

Thursday, June 18, 2009

PCI-DSS and Encrypting Card Numbers

OK, I'm about to do something dumb and talk about cryptography and cryptanalysis. I'm an expert in neither of these things. But despite the fact that somebody smarter than me should be telling you this, you're stuck with me, and I think I have a point. So here goes.

I had a bit of an "A-ha!" moment earlier today around PCI-DSS, specifically requirement 3.4 from v1.2 of the standard. Here's the relevant language from that requirement:

3.4 Render PAN, at minimum, unreadable anywhere it is stored (including on portable digital media, backup media, in logs) by using any of the following approaches:
One-way hashes based on strong cryptography
Truncation
Index tokens and pads (pads must be securely stored)
Strong cryptography with associated key-management processes and procedures

The bottom line is that this requirement fails to provide adequate protection to card numbers. Here's why.

Truncation and tokenized strings with pads have limited use cases. In the case of truncating card numbers, PCI-DSS recommends only storing the last 4 digits of the card number. You wouldn't choose truncation for a program that validates a card number because there would be too great a potential for false matches. It would only be helpful for including in receipts, billing statements, and for use in validating a customer identity in conjunction with other demographic information. Database tokens only provide adequate protection in environments where there is a multi-user or multi-app security model, and if there are flaws in the applications that have access to the pads, then your data is pwned.

So for the sake of maximum versatility and security, you're likely (or your software vendor is likely) to opt for hashing or encryption. But you still have a serious problem. While one-way hashes like SHA and block ciphers like AES can provide good protection to many forms of plaintext, credit cards aren't one of them. That's right, the problem isn't actually in the way you encrypt credit card numbers, it's that credit card numbers make for lousy plaintext to begin with.

Take for example the following row of data from my hypothetical e-commerce application's cardholder table:

LNAME,FNAME,CTYPE,EXP,HASH,LASTFOUR
Melson,Paul,DISCOVER,06/2009,e4b769607856a2f30b57fd26079dfefb,1111

In this case, we have what we need to use the card, except the card number is hashed with MD5. (Ignore what you know about MD5 collisions for a moment, since this problem also exists for SHA or any other method of hashing or encrypting the card number.) If we calculate the number of possible values that could be on the other side of that hash, it would be 10^16, or 10,000 trillion, for the 16-digit card number. That's only about 1.4 times as many possibilities as an 8-character complex password (96^8, or roughly 7,200 trillion), which is an acceptable keyspace size, but also completely doable for a tool like John the Ripper.
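As a quick check of that arithmetic, here is a minimal Python sketch comparing the two keyspaces. The 96-character alphabet is the figure used above; the variable names are only for illustration.

```python
# Compare the keyspace of a 16-digit card number with that of an
# 8-character password drawn from a 96-character alphabet.
card_keyspace = 10 ** 16
password_keyspace = 96 ** 8

ratio = card_keyspace / password_keyspace

print(f"16-digit card number : {card_keyspace:,}")
print(f"8-char password      : {password_keyspace:,}")
print(f"ratio                : {ratio:.2f}")
```

The two numbers land in the same order of magnitude, which is the point: a full card number is no harder to guess than a decent password.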

But if you know credit card numbers, then you've already realized that it's even worse than that. The first 4-6 digits of the card number are misleading when calculating keyspace. They identify the issuer, so there aren't 1 million actual possible values. Since that row from my e-commerce app's database told me the card type, I know within 4-5 guesses the first two to four digits of the card number, and the last four are right there as well for inclusion on statements, etc. In this case, since it's a Discover card, we already know that the card number is 6011XXXXXXXX1111. Now we've cut the number of digits we must guess in half, from 16 down to 8, which shrinks the keyspace from 10^16 to 10^8, a mere 100 million possibilities. There are other clever things we can do if it's encrypted with a stream cipher like RC4 or FISH, because we know the beginning and end values of the plaintext. But guess what? It's cheaper and easier to brute-force it even if lousy crypto is used. Even on the scale of millions of records. Even with salting, it's still worth it to brute-force the middle digits.
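To make that concrete, here is a rough Python sketch of the brute-force loop. It assumes the stored value is an unsalted MD5 of the bare digit string, which the sample row above does not actually specify. The function name `recover_pan` and its parameters are hypothetical, and the demo runs against a made-up number with only four unknown digits so it finishes instantly.

```python
import hashlib
from itertools import product
from typing import Optional


def recover_pan(target_md5: str, prefix: str, last_four: str,
                length: int = 16) -> Optional[str]:
    """Try every value of the unknown middle digits until the MD5 matches.

    Assumption: the stored hash is md5(card number as ASCII digits),
    with no salt and no separators.
    """
    unknown = length - len(prefix) - len(last_four)
    if unknown < 0:
        raise ValueError("prefix and last_four are longer than the card number")
    target = target_md5.lower()
    for middle in product("0123456789", repeat=unknown):
        candidate = prefix + "".join(middle) + last_four
        if hashlib.md5(candidate.encode("ascii")).hexdigest() == target:
            return candidate
    return None


# Demo with a made-up number: 8 known leading digits, 4 unknown, 4 known trailing.
demo_pan = "6011000012341111"
demo_hash = hashlib.md5(demo_pan.encode("ascii")).hexdigest()
print(recover_pan(demo_hash, prefix="60110000", last_four="1111"))
```

With only the 4-digit issuer prefix known, the same loop has at most 10^8 candidates to try per record.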

But wait, there's more! As if publicly known prefix values weren't enough, credit card numbers are also designed to be self-checking. That is to say, the last digit is a check digit, computed with a known algorithm (the Luhn algorithm) over the rest of the number, and we already have that digit in our last-four field. This was designed as an anti-fraud mechanism that would allow cards to be checked without a need to communicate with a clearinghouse. But it also means we only need to generate valid card numbers, which, combined with partially known prefixes, cuts the keyspace by another factor of ten. And since this is a known algorithm I can (and someone already has) very easily write a tool that combines a brute-force password cracker with a credit card generator.
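Here is a minimal sketch of that filter in Python, using the Luhn check that payment card numbers follow. The helper names `luhn_valid` and `luhn_candidates` are made up for this example, and the demo uses a hypothetical 8-digit prefix so that only four digits are unknown.

```python
from itertools import product


def luhn_valid(number: str) -> bool:
    """Return True if the digit string passes the Luhn check."""
    total = 0
    # Walk right to left, doubling every second digit.
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def luhn_candidates(prefix: str, last_four: str, length: int = 16):
    """Yield only the Luhn-valid numbers that match the known prefix and suffix."""
    unknown = length - len(prefix) - len(last_four)
    for middle in product("0123456789", repeat=unknown):
        candidate = prefix + "".join(middle) + last_four
        if luhn_valid(candidate):
            yield candidate


# Small demo: four unknown digits between a hypothetical prefix and the last four.
survivors = sum(1 for _ in luhn_candidates("60110000", "1111"))
print(survivors, "of", 10 ** 4, "candidates pass the check")
```

Exactly one in ten candidates survives, so a cracker that hashes only the survivors does a tenth of the work.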

The bottom line is that, because of the already-partially-known nature of credit card numbers, simply encrypting card numbers inside a database or extract file is insufficient protection. The PCI Security Standards Council should revisit this requirement and modify it to, at the very least, require symmetric-key block ciphers and disallow stream ciphers and one-way hashes. But even then, I suspect, encrypted card numbers will be at risk. Certainly row-level encryption of card numbers should not qualify for "safe harbor" when it comes to breach notification laws.

PS - Extra credit if you crack the full card number from the hash above and post it below.

Thursday, June 11, 2009

From The Inbox 2

lmran writes:

Hi Paul,
Do you know any reason why ArcSight ESM does not support the Cisco MARS? Right now, all my firewalls send the syslog feeds into Cisco MARS and I'm trying to set the Cisco MARS to send those raw feeds data to ArcSight local connector but I just found out that ArcSight does not support the Cisco MARS. Thanks in ADV for any info reading this subject.

Starting in 4.x, MARS can forward events to another remote syslog listener. ArcSight has a syslog connector. So you ought to be able to forward events from MARS to ArcSight via syslog assuming MARS doesn't change the format of the log events too much. Even if MARS does mangle the event format, ArcSight will still receive them, but then most or all of the event will be parsed into the CEF Name field and categorization and prioritization won't be accurate.

If you are unable to upgrade your MARS appliance to 4.31 or later (I think that's the rev you need), another option would be to use a syslog-ng server out front. It supports forwarding events by source to other syslog servers. You could use this to send the stuff you want in ESM to ArcSight's syslog Connector and the stuff you want in MARS to MARS.

Or, you could do the environmentally conscious thing and unplug then recycle your MARS appliance. ;-)

Tuesday, June 9, 2009

From The Inbox

Anonymous writes,

Hi Paul, I am one of those who, as you say, found your blog by googling ArcSight, trying to do some recon on the product for my employer. (I think I see that the most recent posts here are from 2007 so who knows if you or anybody will be seeing my question.) I'm trying to find out, can Arcsight's data be queried programmatically; i.e. is it stored in a relational database, hopefully SQL Server or Oracle, or if not, is there an API or ADO.NET provider that can allow it to be queried, preferably with SQL? Thanks for any info anyone reading can provide.

ArcSight ESM uses Oracle 10g for its back-end database. At one point, and this may still be true, DB2 was also supported. You can query the database directly, and the schema is pretty straightforward. The table ARC_EVENT_DATA is where most of the event data lives, for example. But depending on your use case, that might not be the best way to get data out of ESM.

Also, since you didn't specify, it may be worth mentioning that the same is not true of the ArcSight Logger platform, which is flat storage. Instead of querying the log store directly, Logger can be configured to forward events based on source, type, etc. to another destination, if you need them in real-time. There is a PostgreSQL database on Logger, but it's my understanding that it supports the reports engine, and doesn't store the raw or CEF events in any comprehensive way.

The interesting thing is that the storage technology behind Logger 3.0, because of its performance and relative "cheapness," may become the data store for ESM down the road. It would only make sense, since you could handle MUCH higher event rates with less disk and no Oracle license fee. If it can be done while maintaining the stability and feature set that the Oracle-based data store has, it's a walk-off home run for ArcSight.

Monday, June 1, 2009

New Rules

After many months off, I'm jumping back in to the blog with both feet. Mostly in a Howard Beale sort of way. Didja miss me? Anyway, stealing a meme from Bill Maher, I've got something to say to security vendors. Without further ado, New Rules.

If you are a vendor, especially a vendor of security products or services, these are the rules I expect your product to follow. These are common sense, and I feel a little condescending telling them to you. But if recent experience is any indicator, you need to hear them. And you deserve the condescension.

1. Do not store credentials in clear text! Seriously, you can get free libraries to hash credentials or store them in a secure container file that requires a secret key. There's no reason for a password to be in a text file or HKLM Registry key. None.
2. Do not hardcode passwords! If I can't change every single password associated with your product simply and easily, then there should be a law that strips all of your developers of any degree they hold and forces them to go back to college and learn file IO methods.
3. Do not use HTTP/Telnet/FTP/LDAP for authentication! Seriously, more than enough free libraries for SSH, TLS, IPSec exist. Use one. Or buy the one you really like. It beats having to issue a "patch" to sell to government and regulated industry.
4. Don't run as root/SYSTEM/sa/DBA! Your product is not so special that it actually needs administrative privileges to run on the server or database that hosts it. Unless by "special" you mean "coded by lazy fools that don't want to define even the most basic security model." OK, then it is special.
5. Don't use broken crypto algorithms! Sorry, but if you are shipping new product that uses 56-bit DES, RC4, or ROT13, please see rule #3.
6. Don't send passwords in e-mail! Remote password reset is easy enough to do properly, there's no reason to be lazy and just send me my password if I forget it. Also, it means you're breaking rule #1. Busted.
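
For the first rule, here is a minimal sketch of what "hash credentials" can look like using nothing but the Python standard library. The function names and the iteration count are assumptions made for this example, not a recommendation from any particular standard, and this only covers passwords your product verifies. Credentials it has to replay to another system belong in the keyed container file instead.

```python
import hashlib
import hmac
import os
from typing import Tuple

ITERATIONS = 200_000  # assumed work factor; tune it for your hardware


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest) for storage in place of the clear-text password."""
    salt = os.urandom(16)  # fresh random salt per credential
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest and compare it in constant time."""
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(digest, expected)


# Demo: store the salt and digest, never the password itself.
salt, stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, stored))
print(verify_password("guess", salt, stored))
```

Because only a salted digest is stored, there is nothing to e-mail back to a forgetful user, which takes care of the last rule as well.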

There are no excuses for any product to not follow these rules, but especially security/compliance products. Gee, thanks. I just spent six figures on a product to help me manage or achieve compliance, and the product itself can't comply with the regulation I'm trying to address.