
Thursday, June 18, 2009

PCI-DSS and Encrypting Card Numbers

OK, I'm about to do something dumb and talk about cryptography and cryptanalysis. I'm an expert in neither of these things. But despite the fact that somebody smarter than me should be telling you this, you're stuck with me, and I think I have a point. So here goes.

I had a bit of an "A-ha!" moment earlier today around PCI-DSS, specifically requirement 3.4 from v1.2 of the standard. Here's the relevant language from that requirement:

3.4 Render PAN, at minimum, unreadable anywhere it is stored (including on portable digital media, backup media, in logs) by using any of the following approaches:
  • One-way hashes based on strong cryptography
  • Truncation
  • Index tokens and pads (pads must be securely stored)
  • Strong cryptography with associated key-management processes and procedures
The bottom line is that this requirement fails to provide adequate protection to card numbers. Here's why.

Truncation and index tokens with pads have limited use cases. In the case of truncation, PCI-DSS recommends storing only the last 4 digits of the card number. You wouldn't choose truncation for a program that validates a card number because the potential for false matches would be too great. It's only helpful for printing on receipts and billing statements, and for validating a customer's identity in conjunction with other demographic information. Database tokens only provide adequate protection in environments with a multi-user or multi-app security model, and if there are flaws in the applications that have access to the pads, then your data is pwned.

So for the sake of maximum versatility and security, you're likely (or your software vendor is likely) to opt for hashing or encryption. But you still have a serious problem. While one-way hashes like SHA and block ciphers like AES can provide good protection to many forms of plaintext, credit card numbers aren't one of them. That's right, the problem isn't actually in the way you encrypt credit card numbers, it's that credit card numbers make for lousy plaintext to begin with.

Take for example the following row of data from my hypothetical e-commerce application's cardholder table:

LNAME,FNAME,CTYPE,EXP,HASH,LASTFOUR
Melson,Paul,DISCOVER,06/2009,e4b769607856a2f30b57fd26079dfefb,1111

In this case, we have what we need to use the card, except the card number is hashed with MD5. (Ignore what you know about MD5 collisions for a moment, since this problem also exists for SHA or any other method of encrypting the card number.) If we calculate the possible number of values that could be on the other side of that hash, it would be 10^16, or about 10,000 trillion, for the 16-digit card number. That's slightly more possibilities than an 8-character complex password (96^8, or about 7,200 trillion), which is an acceptable keyspace size, but also completely doable for a tool like John The Ripper.
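
For the arithmetic-inclined, here's that comparison worked out (a quick back-of-the-envelope sketch; the password keyspace assumes 96 printable characters):

```python
# Back-of-the-envelope keyspace comparison
pan_keyspace = 10 ** 16        # 16 unknown digits
password_keyspace = 96 ** 8    # 8-character password, 96 printable characters

print(f"{pan_keyspace:.2e}")                        # 1.00e+16
print(f"{password_keyspace:.2e}")                   # 7.21e+15
print(f"{pan_keyspace / password_keyspace:.2f}x")   # ~1.39x
```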

But if you know credit card numbers, then you've already realized that it's even worse than that. The first 4-6 digits of the card number shouldn't really count toward the keyspace at all. There aren't 1 million actual possible values. Since that row from my e-commerce app's database told me the card issuer, I know within 4-5 guesses the first two to four digits of the card number, and the last four are right there as well, stored for inclusion on statements, etc. In this case, since it's a Discover card, we already know that the card number is 6011XXXXXXXX1111. Now we've halved the number of digits we must guess, taking the keyspace from 10^16 down to 10^8, which is a mere 100 million possibilities. There are other clever things we can do if it's encrypted with a stream cipher like RC4 or FISH, because we know the beginning and end values of the plaintext. But guess what? It's cheaper and easier to brute-force it even if lousy crypto is used. Even on the scale of millions of records. Even with salting, it's still worth it to brute-force the middle digits.
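
To make the attack concrete, here's a minimal brute-force sketch against the hash from my example row above. Everything it needs comes straight from the table (the hash, the 6011 prefix, the last four); only the 8 middle digits get guessed. A real attacker would use John or a GPU cracker, but even naive Python chews through 10^8 MD5s in minutes:

```python
import hashlib

TARGET = "e4b769607856a2f30b57fd26079dfefb"  # HASH column from the example row
PREFIX, LASTFOUR = "6011", "1111"            # known issuer prefix + LASTFOUR column

def crack(target):
    # Only the 8 middle digits are unknown: 10^8 = 100 million candidates
    for middle in range(10 ** 8):
        candidate = f"{PREFIX}{middle:08d}{LASTFOUR}"
        if hashlib.md5(candidate.encode()).hexdigest() == target:
            return candidate
    return None

print(crack(TARGET))
```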

But wait, there's more! As if publicly known prefix values weren't enough, credit card numbers are also designed to be self-checking. The last digit is a check digit computed from the rest of the number by a known algorithm (the Luhn algorithm), originally designed as an anti-fraud mechanism so that cards could be validated without a need to communicate with a clearinghouse. And since the check digit falls within our known last-four field, the algorithm lets us generate only valid card numbers, which, combined with the partially-known prefix, reduces the keyspace by another factor of ten. And since this is a known algorithm, I can (and someone already has) very easily write a tool that combines a brute-force password cracker with a credit card number generator.
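
Luhn validation is a few lines of code and nearly free to compute, so a cracker can discard any candidate that fails it before ever hashing. A minimal sketch:

```python
def luhn_valid(number: str) -> bool:
    """Luhn check: double every second digit from the right,
    subtract 9 from any result over 9, total must be divisible by 10."""
    total = 0
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:      # every second digit from the right
            d = d * 2 - 9 if d > 4 else d * 2
        total += d
    return total % 10 == 0

# Since the check digit is inside our known last four, only ~1 in 10
# middle-digit candidates survives: 10^8 guesses drop to ~10^7.
print(luhn_valid("6011111111111117"))  # True -- the standard Discover test number
```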

The bottom line is that, because of the already-partially-known nature of credit card numbers, simply encrypting card numbers inside a database or extract file is insufficient protection. The PCI Security Standards Council should revisit this requirement and modify it to, at the very least, require symmetric-key block ciphers and disallow stream ciphers and one-way hashes. But even then, I suspect, encrypted card numbers will be at risk. Certainly row-level encryption of card numbers should not qualify for "safe harbor" when it comes to breach notification laws.

PS - Extra credit if you crack the full card number from the hash above and post it below.

Monday, February 4, 2008

AB1298

A colleague of mine sent me this article, which should be of interest to pretty much everyone in the health care or human resources fields. AB1298 is an assembly bill that updates SB1386, California's much-copied breach disclosure law. The bottom line is that now an individual's health insurance ID number (which is hopefully not also their SSN) is considered PII much the same way a credit card number is. And when that data along with the corresponding name is breached, you must notify the victim.

It makes perfect sense. That number, combined with proper billing information, is enough to receive health care services from any participating medical provider. And, while I have pretty decent credit, I don't have a platinum card with a six-figure limit. But, if it were medically necessary, my insurer could be charged that kind of bill. And I would be responsible for the deductible. And, unlike my credit card's maximum personal loss, my deductible is not $50. So as an individual I stand to suffer greater financial loss if my medical identity is stolen versus my credit card.

In an America where health coverage is a problem for 47M people and the rising cost of health care is a problem for the rest, it doesn't seem at all far-fetched that trading in stolen health insurance information could become a lucrative criminal enterprise. And that would make health care data a real target.

Thursday, December 13, 2007

Deloitte Data Disclosure Study

So, I can't decide what this study really means. The short version is that Deloitte did a survey of security & privacy staff from the US about data breaches and disclosures, and 85% of respondents had at least one incident, and 63% of respondents had six or more in the past 12 months.

But I don't know if this is the sky falling, or just the entropic nature of data. Clearly 85% of companies are not having TJX-sized breaches. But that 85% apparently counts only incidents where notification occurred. Unfortunately, the report doesn't expand on what constitutes notification and whether that means specifically that individuals were notified.

Either way, this study raises a good point around incident response. Specifically, due to the ubiquitous nature of mandatory disclosure laws, it's time to revisit your incident response procedures and include language for determining if notification is necessary, and then coordinating and documenting notification efforts so that you can prove that you followed applicable laws.

Wednesday, November 7, 2007

And They Were All Yellow

Symantec bought Vontu. Never heard of Vontu? They are an established player in the data-leakage security niche. Primarily deployed on networks that fall under the purview of the Gramm-Leach-Bliley Act, Vontu's flagship product works like an IPS, but instead of loading it up with vulnerability signatures, you load it up with keywords and snippets of your confidential data.

For $350M, this is a gamble for Symantec for a couple of reasons. First, the expansion of the data-leakage market is very much a question mark. Sure, Vontu's poised to dominate if it does blow up, especially with Symantec's Panama Canal of a channel. But Symantec is a desktop client company. They've killed every network device they've ever acquired, and some that they built themselves. Sure, Vontu has a desktop client as well, but it's not their lead product.

What I find most interesting about this acquisition is that Symantec is known for paying pennies for secondary niche players and trying to pump them on their brand recognition against primary niche players. Their whole product strategy can be summed up as "one brand, one vendor." In this case, they bought one of the best-of-breed players in the niche, if not the top dog. And they paid good money for them, too. Recent acquisitions like Altiris and Revivio were more of the old Symantec trying to find a bargain buy into a new market. So the Vontu purchase leaves me confused. I would've expected Symantec to buy somebody like Tizor and stay away from Vontu and PortAuthority.

By the way, there's an excellent Forrester paper on Symantec's ongoing shopping spree. If you work for a Forrester subscriber, or own a lot of Symantec stock, it's worth reading. (I am the former and not the latter, for what that's worth.) If you're keeping track, Symantec has acquired no fewer than 31 companies since 2000.

Also, Vontu co-founder (and recent multimillionaire!) Joseph Ansanelli testified before a House subcommittee about combating identity fraud. (PDF Link) Another interesting read, but when you contrast this with the recent ID theft study that Bruce Schneier blogged about today, you have to wonder if there's a decent sales line for these products beyond GLBA compliance.

Wednesday, October 10, 2007

On George Clooney and HIPAA

Palisades hospital in New Jersey has suspended 27 employees for accessing actor George Clooney's medical record after he was treated there following a motorcycle crash. I don't disagree with the employees' suspension, but the hospital spokesperson told reporters, "What these individuals did was violate a HIPAA regulation. We can not say that they actually released any of this information to the media."

It's clear that someone did leak information from his medical record to the media, but the hospital doesn't know who. Additionally, these employees had access to patient EMR data as employees of a covered entity (the hospital). So I'm picking a nit here, but I do believe the hospital has admitted that it doesn't know which of the 27 suspended employees, if any, actually violated HIPAA. As far as I can tell they were, under the law, authorized to view Clooney's medical record. Of course, what they did was still inappropriate, unprofessional, unethical, and probably a violation of hospital policy.

But perhaps the best-slash-worst part of this whole situation is that a union rep defending some of the suspended employees has been quoted as saying, "There are hospital obligations to have security systems so that a breach can't occur -- obviously that failed."

Wednesday, October 3, 2007

Is Your IP Address Personal Info?

According to a German court it is. (via Eric Fitzgerald's blog)

The remedy that this ruling implies - not retaining logs of visitors' IP addresses beyond the duration of the user's session - is either unsustainable or crippling to site security.

If it becomes standard practice in Germany to not log IP addresses anywhere for any length of time, they will essentially be declaring open season on themselves. There will be no network evidence trail and therefore no case to prosecute. I can't imagine it'll come to that, but it is interesting to ponder.

Thursday, July 19, 2007

Play-By-Play: I Get Into It w/ Richard Bejtlich Over Metrics

So I commented yesterday about a post Richard made about outcome-based security metrics.

In short, Richard likes outcome-based security metrics because they "mean something." I like them, too, but they can be hard to define and even harder to gather good data for. So I guess I don't like them that much.

He replied in the form of a new blog post. And I just had to comment.

This time, Richard takes issue with my point that it's possible to have bad security and outcome-based metrics that don't realistically represent the poor state of your security. He's probably right that if breaches are really bad, or even moderately bad very frequently, you can't help but detect them. Eventually. But in my opinion, metrics don't help you here. And that was my point.

And then he rags on compliance metrics. And this is where I draw the line. OK, not really. Compliance metrics suck, but we do them because they have value. Actual business value. Contrived, soulless, perhaps even pointless value. But I can tie dollars to them, so they have value. But Richard doesn't believe in ROI for security, either, so... :-)

Anyway, I respect Richard and enjoy his books and his blog. This dialog is healthy for infosectarians to have. If by some freak accident you read my blog but not his, definitely check it out.

Good HIPAA Resource

HIPAA isn't new, but - and maybe because I work in an environment where it's the primary regulatory standard - I regularly have conversations with colleagues and vendors about how we adhere to HIPAA standards and specifically the nuances of how we believe it translates into actual best practices on the ground. Like anything that is both legal and technical, HIPAA is riddled with self-referencing jargon, and defining these terms is useful to any serious conversation about HIPAA compliance. To that end, I stumbled on a really nice encyclopedia of HIPAA terms at U of Miami's med school. Too useful not to share.

Wednesday, April 4, 2007

Guilty Pleasures, Social Networks, and Event ID's

So, one of my guilty pleasures is that I like to read and answer the Information Security questions people ask on LinkedIn. It's like infosec Jeopardy without having to go to Vegas. Sometimes I even know the answer and have time to post it. The other day there was one such question, and I'll share it here in a little more detail.

Venkatesh asks: "What are you monitoring on Active Directory/SQL Server as part of IT compliance?"

The cool thing about Microsoft EventLog format is the Event ID field, which for the most part tells you what is happening, and the details are things like who or what is doing that thing to who- or what-else. An example is Event ID Security:628. Any time you see that code, you know that A changed the password of B, and it is possible that A == B or A != B.

So get your left pinky finger ready for Ctrl-C & Ctrl-V action. Here's my big list of Security EventLog ID's that you should monitor as part of your log review processes.

Here's how it looks in the ArcSight filter editor:

[Screenshot: the Event ID list built as a filter in the ArcSight filter editor]

In our environment (200+ Windows servers, another 80-100 UNIX servers that authenticate against AD, and 1200+ Windows workstations), this represents about 70-100 events per day out of the roughly half million EventLog entries we collect daily. That's so totally manageable. The rest of it you can subject to trending, thresholds, and so on to find weirdness worth investigating.
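
If you're not an ArcSight shop, the same idea works anywhere you can get the log into a parseable form. Here's a minimal sketch in Python against a hypothetical CSV export of the Security log (the IDs below are common pre-Vista examples, not my exact list; the file and column names are assumptions):

```python
import csv

# Example Windows 2000/2003 Security Event IDs worth alerting on.
# Substitute your own list.
WATCH_IDS = {
    "529",  # logon failure: unknown user name or bad password
    "624",  # user account created
    "628",  # user account password set
    "630",  # user account deleted
    "632",  # member added to security-enabled global group
    "644",  # user account locked out
}

with open("seclog_export.csv", newline="") as f:  # hypothetical export file
    for row in csv.DictReader(f):                 # assumes an EventID column
        if row["EventID"] in WATCH_IDS:
            print(row["TimeGenerated"], row["EventID"], row.get("Message", "")[:80])
```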

It's also a good idea to go through your EventLog data every couple of months and look for new Event ID's that you haven't seen before. I use a filter that matches all of the Event ID's that I've already identified and excludes them. Then it's just a matter of researching the new Event ID's and determining their cause and relevance.
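
That review is easy to script, too. A sketch using the same hypothetical export, with the already-researched IDs kept in a one-per-line text file:

```python
import csv

with open("known_event_ids.txt") as f:            # IDs already researched, one per line
    known = {line.strip() for line in f if line.strip()}

# Collect every Event ID present in this export.
seen = set()
with open("seclog_export.csv", newline="") as f:  # same hypothetical export
    for row in csv.DictReader(f):
        seen.add(row["EventID"])

# Anything not already in the known list is worth a look.
for event_id in sorted(seen - known, key=int):
    print("New Event ID to research:", event_id)
```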

If you've got other ideas of good EventLog content that you focus on, post it up here. I'd love to hear about it!

Wednesday, March 21, 2007

Health & Human Services Is Teething

Lots and lots of people have declared HIPAA irrelevant and ineffective because of the lack of direct federal oversight and the perception that the penalties it could potentially level at an organization were weak in comparison to things like SOX.

But, OH SNAP!! The Health and Human Services Inspector General is auditing providers. I've got in my inbox a copy of the FAX sent to _____ Hospital in Georgia about their audit. No mention of a complaint or prior incident. Just a friendly, "Hi, we're coming to audit you," letter complete with data collection document.

It has always been my stance that if what's lacking in compliance is enforcement, then it's important to comply, because enforcement is only a budget line item away. So I guess I'm saying, "I told you so!" to everyone who has greatly exaggerated the rumors of HIPAA's death.

So, uh... think warm thoughts doc, cuz that thermometer is mighty cold.

Tuesday, March 20, 2007

Review: Chuvakin's Database Log Analysis Paper

Dr. Anton Chuvakin was kind enough to share with me a copy of his 2-part paper on database log analysis, which was published in CSI Alert Newsletter, Feb 2007. I enjoyed his paper(s), but I have to acknowledge that this is a topic near and dear to my current work. In Part I of his paper, Dr. Chuvakin gets right to the heart of database log analysis: You should do database logging better than your database does by default because it's probably a compliance issue for you.

Chuvakin also explores some of the pitfalls of database logging:
  • Performance hits from turning on auditing
  • A database stores its different log data in multiple formats (in a table, in a redo file, in a text file, etc.)
  • Finding data in database logs that is meaningful to security ops and incident response

"Introduction to Database Log Analysis, Part II" lays out some practical considerations for any organization preparing to implement database logging. It's hard to keep this topic general, since the differences in audit logging between Oracle and Microsoft SQL Server, for example, are significant enough that the strategies for tackling log gathering, storage, and analysis are very different.

As you might expect from a guy in his line of work, Chuvakin goes on to describe why you might want to use a SIM or similar tool to aggregate and read database logs. I happen to agree with him that you need something more than LogMiner if you want to use your logs for anything more than troubleshooting. But Chuvakin gives a list of example reasons for using a SIM for compliance and operations purposes around database logging, not all of which wash with me:

Change management: modern RDBMS is a complicated piece of software with plenty of configuration files and other things to change. And being aware of all the changes constitutes one of the control objectives for IT governance as well as essential for protecting the environment.

True that COBIT and ITIL both rely on change management and that it is a common compliance requirement for things like SOX and SAS. However, I don't know that a SIM is a good candidate for this. Operationally, I use our change control documentation to explain, for instance, why a bunch of files were changed or new users created. I don't use the SIM's ability to detect these events as the impetus for documenting the change - the cart's way out ahead of the horse by the time the SIM gets to it, and if you do it right, change docs always precede the change. So, at best, this is a secondary control that the SIM is providing. Not that it's a bad idea, it's just not a business driver for log analysis.

Authentication, Authorization and Access: logging access control decisions such as login failures and successes, as well as access to data and various database objects, is of interest to auditors as well as important for security, such as detecting insider privilege abuse. While logging all access to data is less common, it is one of the emerging trends in logging.

Good idea. Do this with your directory and server OS, too. And RADIUS. And client VPN. And prox cards. And voicemail. And... oh, you get the idea.

Threat Detection: while both unauthorized changes and access control decisions are essential for security, database logs may be used to discover and analyze direct exploitation attempts as well, at least in some cases. Even accessing the database system at an unusual time will likely be of interest to a security team.

Log analysis is probably your best option here, unless you stick an IDS sensor between your database servers and everything else on your network. Which if your database supports a web app, you should definitely do.

Performance: DBAs are tasked with monitoring database performance, and logs provide one of the avenues for doing so. This is especially important to those orgs with strict service level agreements (SLAs) for database performance.

While a SIM may be able to create averages and other statistics in near-real time, this is not the tool for database performance monitoring. There's lots of stuff out there that does this already and much better than your SIM can.

Business Continuity: knowing about database software starts and stops is essential, since businesses depend on databases for their revenue streams, and thus a downed database directly leads to losing money.

Again, lots of stuff can do this better than a SIM. Moreover, SIMs use data analysis instead of active checks. If your database shuts down, a SIM can tell you about it once it sees a log entry for it. If it hangs, then maybe it will detect a drop in activity and determine that the events-per-second delta is abnormal. Of course, that can also occur during normal use. IMHO, SIMs suck for tracking service up/down states.

Overall, I enjoyed reading this paper. It's clear that Dr. Chuvakin has some good ideas on this topic, and I would recommend that anyone who is considering database log auditing read this paper prior to the initial design meeting. It's not a comprehensive guide, but it will get you thinking in the right direction about the issues your organization will likely face.