June 8, 2022
Need Data Masking? Here’s What to Look for in a Solution
One of the primary tools companies can use to address the challenges involved in analyzing data, while still protecting user privacy, is data masking. To gain the most from it, however, it’s important to know what to look for in a data masking solution.
What is Data Masking?
In its simplest form, data masking involves taking sensitive information and replacing it with other data. This ensures that data patterns can still be analyzed, and applications can still be built around the data, but in a way that protects sensitive information and meets regulatory requirements.
There are two primary methods of data masking: static and dynamic.
Static masking involves changing the information in the source database. Once the information is changed, the database is then used, or copied and used, in the various applications in which it is needed.
Dynamic masking, on the other hand, involves masking the data on demand. The original data remains unchanged, but sensitive information is altered and masked on the fly. This allows those who need to work with the data to do so without seeing the sensitive information they are not cleared to access.
For example, when working with medical records, a doctor or nurse needs to access a patient’s health information. That need, however, does not mean the doctor or nurse has to see the patient’s Social Security number. Data masking can hide the sensitive information while still providing the pertinent fields the doctor or nurse needs to see.
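To make the medical-records example concrete, here is a minimal Python sketch of dynamic masking. It is an illustration under stated assumptions, not how any particular product works: the record, the role names, and the `CLEARED_FIELDS` policy are all hypothetical.

```python
# Hypothetical patient record; the field names are illustrative only.
PATIENT = {
    "name": "Julie Example",
    "ssn": "123-45-6789",
    "diagnosis": "seasonal allergies",
}

# Assumed policy for this sketch: fields each role may see unmasked.
CLEARED_FIELDS = {
    "clinician": {"name", "diagnosis"},
    "billing": {"name", "ssn"},
}


def mask_value(value: str) -> str:
    """Replace every letter or digit with 'X', keeping separators such as '-'."""
    return "".join("X" if ch.isalnum() else ch for ch in value)


def dynamic_view(record: dict, role: str) -> dict:
    """Build a masked copy for the given role; the stored record is not modified."""
    cleared = CLEARED_FIELDS.get(role, set())  # unknown roles see nothing unmasked
    return {
        field: value if field in cleared else mask_value(value)
        for field, value in record.items()
    }


clinician_view = dynamic_view(PATIENT, "clinician")
print(clinician_view["diagnosis"])  # seasonal allergies
print(clinician_view["ssn"])        # XXX-XX-XXXX
print(PATIENT["ssn"])               # 123-45-6789 (original is unchanged)
```

Static masking would instead overwrite the `ssn` value in the stored copy itself, so every later reader would see only the masked value.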
Similarly, a company may have an agreement with a third party to provide data for analysis. Data masking would allow the company to anonymize private data, but still keep the data relationships intact so meaningful analysis can be conducted.
There are several things to look for in a data masking solution.
What to Look for in Data Masking Solutions
Not all data masking solutions are created equal. Some masking tools are very basic, while others are quite sophisticated. And while some come included with certain platforms, others are third-party solutions created by specialists.
While considering your options, be on the lookout for these features:
Low Risk of Re-Identification
One of the most important things an organization should look for in any data masking solution is evidence that the masking has a very low chance of being reversed. Once the data is masked, a bad actor should not be able to “unmask” it to reveal sensitive information. Re-identification can happen, for example, when an attacker cross-references the masked data with other publicly available data in order to single out individuals.
Legislation such as HIPAA, the EU’s GDPR, and California’s CPRA requires methods that minimize the risk of data exposure.
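One common way to put a number on re-identification risk is k-anonymity, which asks how many records share the same combination of indirectly identifying fields (quasi-identifiers). The sketch below is only one possible check, not a complete risk assessment. The field names, the choice of quasi-identifiers, and the sample records are assumptions made up for illustration.

```python
from collections import Counter

# Assumed quasi-identifiers: fields an attacker might match against public data.
QUASI_IDENTIFIERS = ("zip", "birth_year", "gender")


def group_sizes(records, quasi_identifiers=QUASI_IDENTIFIERS):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def smallest_group_size(records, quasi_identifiers=QUASI_IDENTIFIERS):
    """Return the k for which the dataset is k-anonymous (0 if there are no records)."""
    sizes = group_sizes(records, quasi_identifiers)
    return min(sizes.values()) if sizes else 0


def risky_records(records, k, quasi_identifiers=QUASI_IDENTIFIERS):
    """Return the records whose quasi-identifier combination appears fewer than k times."""
    sizes = group_sizes(records, quasi_identifiers)
    return [r for r in records if sizes[tuple(r[q] for q in quasi_identifiers)] < k]


# Made-up masked records: names are already gone, but quasi-identifiers remain.
masked = [
    {"zip": "30301", "birth_year": 1978, "gender": "F"},
    {"zip": "30301", "birth_year": 1978, "gender": "F"},
    {"zip": "30301", "birth_year": 1978, "gender": "F"},
    {"zip": "30302", "birth_year": 1985, "gender": "M"},
]

print(smallest_group_size(masked))   # 1
print(risky_records(masked, k=2))    # the single 30302 record stands out
```

A record that is unique on its quasi-identifiers is the kind an attacker could match against a public dataset, even though its name has been masked.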
NIST-Approved FIPS 140-2 Certification
NIST (the National Institute of Standards and Technology) publishes the Federal Information Processing Standards Publication (FIPS) 140-2, Security Requirements for Cryptographic Modules, which specifies the security requirements that cryptographic modules protecting sensitive information must meet. Any security solution, including a masking solution, should be FIPS 140-2 certified in order to prove that it complies with these rigorous standards.
Robust Access Rules
Another important factor in any data masking solution, especially for dynamic masking, is the ability to set robust rules about who can see what data. In particular, a solution should allow the creation of rules for role-based, user-based, program-based, and location-based access.
For example, a company’s accounts payable department would need access to bank account numbers and other sensitive information in order to pay employees and vendors. Accounts receivable, on the other hand, would not need that information. A solution that provides role-based rules would make it easy to set who can and cannot access those sensitive fields.
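A role-based rule set like the one described above could be sketched as follows. This is a simplified illustration, not a real product’s rule engine: the `Rule` class, the role and column names, and the extra `auditor` role with partial masking are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    role: str
    column: str
    action: str  # "show", "partial", or "hide"


# Hypothetical rules; "auditor" is added only to illustrate partial masking.
RULES = [
    Rule("accounts_payable", "bank_account", "show"),
    Rule("accounts_receivable", "bank_account", "hide"),
    Rule("auditor", "bank_account", "partial"),
]


def apply_action(value: str, action: str) -> str:
    """Return the value as the caller is allowed to see it."""
    if action == "show":
        return value
    if action == "partial" and len(value) > 4:
        return "*" * (len(value) - 4) + value[-4:]  # keep only the last four characters
    return "*" * len(value)  # "hide", and "partial" on very short values


def read_column(role: str, column: str, value: str, rules=RULES) -> str:
    """Look up the first matching rule; with no matching rule, hide the value."""
    for rule in rules:
        if rule.role == role and rule.column == column:
            return apply_action(value, rule.action)
    return apply_action(value, "hide")


account = "000123456789"
print(read_column("accounts_payable", "bank_account", account))     # 000123456789
print(read_column("accounts_receivable", "bank_account", account))  # ************
print(read_column("auditor", "bank_account", account))              # ********6789
```

Hiding the value when no rule matches is a deliberate choice in this sketch: a role that nobody has configured sees nothing sensitive by default. User-based, program-based, and location-based rules could follow the same pattern with additional fields on `Rule`.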
Preservation of Data Relationships
The whole point of masking data is to provide a dataset that can be used for analysis or application design without exposing sensitive information. This works only if the masked data is representative of the original data, maintaining the same structure and format. The data’s geographic, gender, and numeric distributions, along with its readability and general structure, should be preserved while sensitive information is masked.
For example, a record containing information for Julie, born 3/5/78, could be changed to Terri, who was born 3/7/78. Similarly, while Julie’s ZIP Code remains the same, her street address can be changed. As a result, with her name changed, birthday altered, and street address substituted, Julie’s information is sufficiently masked so as not to be linked back to her—but that data is still representative of her information, and so can still be used for analysis.
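The Julie example can be sketched in Python as shown below. This is a toy illustration under stated assumptions rather than a production technique: the replacement lists, the `salt` value, and the three-day shift window are invented, and a real tool would use far larger substitution sets and a properly managed secret.

```python
import hashlib
from datetime import date, timedelta

# Hypothetical substitution lists; real tools draw on much larger dictionaries.
REPLACEMENT_NAMES = ["Terri", "Dana", "Robin", "Casey"]
REPLACEMENT_STREETS = ["12 Oak St", "48 Elm Ave", "7 Pine Rd"]
DAY_SHIFTS = (-3, -2, -1, 1, 2, 3)  # never 0, so the birth date always changes


def _stable_int(value: str, salt: str) -> int:
    """Derive a repeatable integer from the input so masking is deterministic."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def mask_person(record: dict, salt: str = "demo-salt") -> dict:
    """Substitute name and street, nudge the birth date, and keep the ZIP Code."""
    n = _stable_int(record["name"] + record["birth_date"].isoformat(), salt)
    return {
        "name": REPLACEMENT_NAMES[n % len(REPLACEMENT_NAMES)],
        "birth_date": record["birth_date"] + timedelta(days=DAY_SHIFTS[n % len(DAY_SHIFTS)]),
        "street": REPLACEMENT_STREETS[(n // 6) % len(REPLACEMENT_STREETS)],
        "zip": record["zip"],  # preserved so geographic analysis still works
    }


julie = {
    "name": "Julie",
    "birth_date": date(1978, 3, 5),
    "street": "221 Maple Dr",
    "zip": "30301",
}
print(mask_person(julie))
```

The masked record keeps the same shape as the original: a plausible first name, a real calendar date within a few days of the true one, and the true ZIP Code. Age and regional statistics therefore stay close to those of the source data.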
Similarly, any masking technique must maintain referential integrity. For example, if a data field is used as a primary key in a database, that field must be masked exactly the same way every time it appears, so that records in related tables still match and functionality is not impacted.
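One way to mask a key “exactly the same way every time” is to derive the masked value from a keyed hash. The sketch below uses HMAC-SHA-256 from Python’s standard library. It is one possible approach, not the only one: the `SECRET_KEY` value, the `CUST-` prefix, and the table layouts are assumptions for illustration.

```python
import hashlib
import hmac

# Assumption: in practice this key would come from a managed secret store.
SECRET_KEY = b"replace-with-a-managed-secret"


def mask_key(value: str, key: bytes = SECRET_KEY) -> str:
    """Map a key value to a stable token; the same input always yields the same token."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    # Truncated for readability; a shorter token raises the chance of collisions.
    return "CUST-" + digest[:12]


# Hypothetical tables that share customer_id as a primary/foreign key.
customers = [{"customer_id": "C1001", "name": "Julie"}]
orders = [
    {"order_id": 1, "customer_id": "C1001"},
    {"order_id": 2, "customer_id": "C1001"},
]

masked_customers = [{**c, "customer_id": mask_key(c["customer_id"])} for c in customers]
masked_orders = [{**o, "customer_id": mask_key(o["customer_id"])} for o in orders]

# The join between the masked tables still works.
masked_id = masked_customers[0]["customer_id"]
matching = [o["order_id"] for o in masked_orders if o["customer_id"] == masked_id]
print(matching)  # [1, 2]
```

Because the token depends only on the original value and the secret key, both tables can be masked independently, even at different times, and still line up.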
Mage: A Trusted Solution
Given the stakes involved in data masking, it’s important to use a trusted solution.
Mage has a long history of helping organizations deal with these challenges, providing an innovative solution for both static and dynamic data masking that complies with regulatory requirements.