A Possible Solution to the Issue of Test Data

Posted by Keith McMillan

August 10, 2016

Larger enterprises usually have several environments. There’s obviously the production environment, and typically a testing and QA environment. Many will also have a stress testing/staging environment, a close facsimile of production used to characterize the performance of the solution being built or maintained.

A common problem is testing data. As a matter of good hygiene, it’s a good idea to use testing data in environments other than production, and there may be strong regulatory or other motivations to do that (think HIPAA rules for Protected Health Information (PHI), Payment Card Industry (PCI) requirements for cardholder data, and the various protections for Personally Identifiable Information (PII)).

Opposing this desire for scrubbed, faked or otherwise testing-only data is the idea that the best data to test with is production data, because of its volume and diversity. How then do you reconcile the desire for consistent, production-volume data in lower environments with preventing access to sensitive data by people who really have no need to see it? Enter Format Preserving Encryption, or FPE.

Format preserving encryption can be used to encrypt sensitive pieces of data, such as dates of birth, social security or credit card numbers, in a way that preserves the format of the data. A typical Visa or MasterCard credit card number consists of 16 digits (4 groups of 4 digits), and the result of traditional encryption would be a byte stream that is almost guaranteed not to consist of 16 digits. With format preserving encryption, the output looks like correctly formatted data but is not the original data: you put 16 digits in, you get 16 digits out. Depending on how it’s deployed, it even allows partial access. For example, you can encrypt the last four digits of the credit card separately from the middle 8, so that the help desk can decrypt only those last four when a customer calls.
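To make "16 digits in, 16 digits out" concrete, here is a minimal sketch of the idea. It is a toy Feistel network over decimal strings that uses HMAC-SHA256 as the round function. It is not the NIST FF1 or FF3 algorithm and should not be used to protect real data; the function names, the keys and the round count are all my own assumptions for illustration.

```python
import hashlib
import hmac

ROUNDS = 10  # assumed value; even, so both halves end in their original positions


def _round_value(key: bytes, round_no: int, half: str, modulus: int) -> int:
    """Keyed pseudo-random integer in [0, modulus) derived from one half."""
    message = bytes([round_no]) + half.encode("ascii")
    digest = hmac.new(key, message, hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % modulus


def _check(digits: str) -> None:
    if len(digits) < 2 or not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected a string of at least two decimal digits")


def toy_fpe_encrypt(key: bytes, digits: str) -> str:
    """Map a digit string to another digit string of the same length."""
    _check(digits)
    u = len(digits) // 2
    v = len(digits) - u
    a, b = digits[:u], digits[u:]
    for i in range(ROUNDS):
        m = u if i % 2 == 0 else v  # length of the half being replaced
        c = (int(a) + _round_value(key, i, b, 10 ** m)) % 10 ** m
        a, b = b, str(c).zfill(m)
    return a + b


def toy_fpe_decrypt(key: bytes, digits: str) -> str:
    """Undo toy_fpe_encrypt by running the rounds in reverse."""
    _check(digits)
    u = len(digits) // 2
    v = len(digits) - u
    a, b = digits[:u], digits[u:]
    for i in reversed(range(ROUNDS)):
        m = u if i % 2 == 0 else v
        c = (int(b) - _round_value(key, i, a, 10 ** m)) % 10 ** m
        a, b = str(c).zfill(m), a
    return a + b


# Partial access: the middle 8 and the last 4 digits get separate keys.
# Leaving the first four digits in the clear is an assumption of this sketch.
card = "4111111111111111"  # a commonly used test card number, not real data
masked = (
    card[:4]
    + toy_fpe_encrypt(b"vault-key", card[4:12])
    + toy_fpe_encrypt(b"help-desk-key", card[12:])
)
last_four = toy_fpe_decrypt(b"help-desk-key", masked[12:])
```

Someone holding only the help desk key can recover `last_four` but not the middle 8 digits. Note that a four digit field has only 10,000 possible values, so a real deployment would need to think hard about how much protection encrypting such a small field actually buys.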

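I caught on to FPE recently in a talk on BrightTalk, and it seems to me that it would be a very valuable tool for creating test data from production data while preserving the referential integrity a system typically needs. The reason is that the encryption is deterministic: under the same key, the same input always produces the same output, so an encrypted value used as a key in one table still matches the same encrypted value in another. Best of all, NIST has published FPE methods (FF1 and FF3, in Special Publication 800-38G), which should reduce the risk of vendor lock-in.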
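The referential integrity point can be sketched in a few lines. In the example below, `placeholder_mask` is a hypothetical stand-in for a real FPE call: it is a simple keyed shift that happens to be deterministic, reversible and length preserving, which is all the demonstration needs, but it is not secure and it is not FF1 or FF3. The tables, column names and key are made up.

```python
import hashlib
import hmac


def placeholder_mask(key: bytes, digits: str) -> str:
    """Hypothetical stand-in for FPE: a keyed shift modulo 10**len(digits)."""
    modulus = 10 ** len(digits)
    digest = hmac.new(key, b"offset", hashlib.sha256).digest()
    shifted = (int(digits) + int.from_bytes(digest, "big")) % modulus
    return str(shifted).zfill(len(digits))


def scrub(rows, column, key):
    """Return copies of the rows with one column masked."""
    return [{**row, column: placeholder_mask(key, row[column])} for row in rows]


def totals_by_name(customers, claims):
    """Join claims to customers on ssn and total the amounts per customer."""
    return {
        c["name"]: sum(cl["amount"] for cl in claims if cl["ssn"] == c["ssn"])
        for c in customers
    }


# Made-up "production" rows that share the ssn column as a join key.
customers = [
    {"ssn": "123456789", "name": "A"},
    {"ssn": "987654321", "name": "B"},
]
claims = [
    {"ssn": "987654321", "amount": 40},
    {"ssn": "123456789", "amount": 15},
    {"ssn": "123456789", "amount": 7},
]

key = b"test-data-key"  # assumed key; both tables must be masked with the same one
test_customers = scrub(customers, "ssn", key)
test_claims = scrub(claims, "ssn", key)

before = totals_by_name(customers, claims)
after = totals_by_name(test_customers, test_claims)
```

Because both tables are masked with the same key, `before` and `after` hold the same totals even though none of the original identifiers appear in the test rows. Masking the two tables with different keys would break the join, which is worth remembering when key management for lower environments is designed.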

