Mage Data

Author: Alex Ramaiah

  • Why Data Breaches Are So Costly…And So Difficult to Prevent


    No one in a large organization wants to hear the news that there has been a data breach, and that the organization’s data has been compromised. But many are reluctant to spend a significant portion of their budget on appropriate preventative measures. Why?

    The reason usually comes down to two misconceptions. Either the organization’s leadership assumes that a data breach is unlikely, or it assumes that, if a breach were to happen, the risk exposure would be minimal and the problem easily fixed.

    The truth is that, today, data breaches are inevitable…and far more costly than most organizations assume. Companies are often much more exposed than they know, which means that the potential costs of data compromise are much higher than assumed—and so is the ROI of preventive measures.

    Data Breaches Are Inevitable

    In 2022, there were over 1,800 data compromises of U.S. companies alone, impacting some 422 million individuals. This is four times the number of compromises reported just a decade ago.

    Think about this risk as you would a similar risk, such as a fire at a building or a plant. As the saying goes, companies don’t carry insurance because they think something bad might happen—they get insurance because bad things do happen. On a long enough timeline, it’s a virtual guarantee that something bad will strike your business. Yes, fires are rare, but they happen, and they are devastating. The same goes for data breaches.

    But here is one important way in which a fire is different from a data breach. The risk of a fire scales linearly with the number of locations you have; the risk that insecure data poses to your business scales exponentially, even if you have a small number of total records. As a result, many companies’ data management practices may create millions or hundreds of millions of dollars of risk. Most are not even aware of it.

    Systems Are Complex, and There is More Risk Than You Imagine

    Gone are the days when a company had a server or two in a closet housing its data. Today’s companies run multiple connected systems, many of which spin up cloud environments and transfer data on a daily basis.

    In these scenarios, data duplication creates a huge risk for companies should their systems be compromised. For example, a single company might have both client records and employee records, all of which are duplicated in a live “production” environment, a testing environment, and a data lake for analytics purposes. A single breach could potentially expose all of this data, multiplying the risk.

    (For a fuller accounting of the math here, see our whitepaper on the ROI of Risk Reduction, now available for download.)

    What is the Actual Cost of Exposed Data?

    So data compromise is inevitable, and companies hold richer stores of data these days. The real question is this: Does the cost associated with a data breach exceed the budget needed to prevent one?

    One of the very best resources for understanding what drives the cost behind a data breach is IBM’s annual Cost of a Data Breach report. The worldwide average cost of a breach in 2021 was $4.24 million, the highest average total cost in the history of the report. That works out to about $180 per record for customer information, and $176 per record for employee data.

    Importantly, it wasn’t just direct remediation costs that contributed to this total. Thirty-eight percent of the total cost was attributable to “customer turnover, lost revenue due to system downtime, and the increasing cost of acquiring new business due to diminished reputation,” which suggests that the pain caused by a breach lasts for years beyond the initial incident.

    Again, having duplicate records drives up costs here. A single customer, for example, might be tied to data that “lives” in several systems, both production environments and non-production environments. That means a single customer represents not just $180 worth of risk, but potentially 4-5 times that amount.

    Prevention Needs to be Modern, Too

    In short, data breaches are much larger and more complex than they were even a decade or two ago, which also makes them much more costly. It follows that the methods for preventing breaches and reducing risk need to be similarly modern and sophisticated.

    For example, data discovery needs to be part of any security effort. Discovering all databases and all instances of records across a working organization can be a massive challenge; AI-based tools are now necessary to both find and identify all the data in play.
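    To make the idea concrete, here is a minimal, pattern-based sketch of sensitive data discovery. It only samples rows and applies regular expressions for a few common PII formats; real discovery tools (including AI-based ones) go much further, and the table name, thresholds, and patterns below are illustrative assumptions.

    ```python
    import re
    import sqlite3

    # Illustrative regex patterns for common PII formats. Real discovery tools
    # combine pattern matching with ML-based classification and metadata scans.
    PII_PATTERNS = {
        "ssn": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
        "email": re.compile(r"^[^@\s]+@[^@\s]+\.[A-Za-z]{2,}$"),
        "credit_card": re.compile(r"^\d{4}[- ]?\d{4}[- ]?\d{4}[- ]?\d{4}$"),
    }

    def scan_table(conn, table, sample_size=100):
        """Sample rows from a table and flag columns whose values look like PII."""
        cur = conn.execute(f"SELECT * FROM {table} LIMIT {sample_size}")
        columns = [d[0] for d in cur.description]
        rows = cur.fetchall()
        findings = {}
        for idx, col in enumerate(columns):
            values = [str(r[idx]) for r in rows if r[idx] is not None]
            for label, pattern in PII_PATTERNS.items():
                matches = sum(1 for v in values if pattern.match(v))
                if values and matches / len(values) > 0.5:  # most sampled values match
                    findings[col] = label
        return findings

    # Usage (assumes a hypothetical "customers" table in test.db):
    # conn = sqlite3.connect("test.db")
    # print(scan_table(conn, "customers"))
    ```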

    Once data is discovered, there are various tools that can be used to protect it, including encryption, masking, and access controls. Which tools are appropriate for which data sets depends on factors such as how often the data needs to be accessed, who will need to access it, and system performance requirements.

    That said, there is a set procedure that should be followed to reduce the risk of exposure. Here at Mage Data, we’ve honed that procedure over the years; in some cases, we can reduce the dollar-amount risk by more than 90%.

    To see what this procedure is, and to see the math behind this reduction of risk, download our white paper, The ROI of Risk Reduction for Data Breaches.

  • Seven Key Test Data Management Metrics to Understand and Track


    If your organization performs software testing, then you probably use test data in some form. But is your process for producing, updating, and implementing that test data adding net value for the organization? Or is it a hidden roadblock to producing quality software?

    Unfortunately, most organizations assume that simply having some sort of Test Data Management (TDM) process in place is sufficient. True, doing some Test Data Management is better than doing none at all. (And there are telltale signs that an organization needs Test Data Management to move forward.) But even with a Test Data Management program in place, it’s important to set up appropriate metrics and KPIs to ensure that your Test Data Management efforts are actually producing the kind of test data needed for optimal quality control.

    Why Are Metrics Needed for Test Data Management?

    Managing test data can be challenging, especially in large and complex software systems. Many TDM projects fail because the processes involved work at first, but erode over time. For example, the test data could lose its “freshness,” or the request process is not robust enough to create effective data sets.

    This is why it is important to gain insight into the TDM process by collecting various metrics. Some of these can be captured using Test Data Management tools, while others will require some customized reporting. But the more complete your picture of the Test Data Management process is, the better your organization will be able to keep its testing process on-track and delivering according to schedule.

    7 Key Test Data Management Metrics

    Here, then, are seven key metrics to consider for tracking your Test Data Management capabilities. These can be split into two categories: Metrics that measure the test data itself and its quality, and metrics for the testing process.

    Metrics for Test Data Quality

    Data completeness. Data completeness is a measure of how well test data covers scenarios from production domains. This can especially be a concern if test data is created via subsetting or by generating synthetic data. Special cases exist in all data sets and workflows, and those cases need to be represented in test data. Appropriate boundary cases, null cases, and negative-path cases need to be included as well. Otherwise, testing will not be sufficient.

    Data quality. While data completeness is a measure of which cases are covered, data quality is a measure of how well the test data respects the rules and constraints of the database schema, as well as the application being tested. In other words, it is a measure of how well the test data “matches” production data, which in turn affects how well the data will turn up bugs with consistency and reliability.

    Data freshness (data age). Aspects of data sets change over time; using test data that accurately represents the freshest production data is thus crucial. For example, the demographics of clients in a client database might shift over time, or account activity might change as new interfaces and new products are introduced. The freshness of one’s test data can be measured in terms of the age of the data itself, or in terms of the rate at which new test data is generated (the refresh rate).
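    As a rough illustration, data age can be tracked directly from record timestamps. The sketch below assumes each test record carries a timezone-aware `created_at` datetime; the field name and the refresh-rate convention are assumptions for illustration only.

    ```python
    from datetime import datetime, timezone

    def average_data_age_days(records, timestamp_field="created_at"):
        """Average age, in days, of a test dataset.

        `records` is a list of dicts; `timestamp_field` is the assumed name of
        the field holding each record's creation time as a timezone-aware datetime.
        """
        now = datetime.now(timezone.utc)
        ages = [(now - r[timestamp_field]).days for r in records]
        return sum(ages) / len(ages) if ages else 0.0

    # A refresh rate can then be tracked alongside age, for example as
    # "datasets regenerated per month":
    # refresh_rate = regenerations_last_90_days / 3
    ```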

    Metrics for Test Data Management Processes

    Data security. To what degree do the processes for generating and using test data ensure the security of the original production data? How is test data itself kept secure and compliant? Data security metrics should give numeric proof that datasets are handled in such a way as to keep sensitive information secure and its use compliant with local laws and regulations.

    Test cycle time. Test cycle time is the total time for a testing cycle to complete, from request to test data creation to actual testing and validation. The goal is to reduce test cycle time as much as possible without sacrificing quality—by introducing automation, for example.

    Data request % completion. Are all requests for reliable test data being met? Data request % completion is the other side of the coin from test cycle time; while cycle time measures the average speed of provisioning, data request % completion measures how many requests are actually being met in a timely manner.

    Test effectiveness. If all of the above metrics were to improve within an organization, then overall test effectiveness should improve as well. So, even though test effectiveness is a lagging indicator of the quality of test data and Test Data Management, it is important to track as effectiveness is what will ultimately affect the bottom line.

    Here, test effectiveness is simply a count of the number of bugs found during a period of testing, divided by the total bugs found (that is, both bugs found during testing and bugs found after shipping/deployment). For example, if all bugs are found during testing and none in production, testing effectiveness is 100%. If testing only reveals half of the bugs, with the other half discovered after deployment, testing effectiveness is 50%. The higher test effectiveness is, the better: Catching bugs in testing often makes remediation orders of magnitude cheaper than if caught in production.
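    The calculation itself is simple enough to express directly; the function below just restates the definition above.

    ```python
    def test_effectiveness(bugs_found_in_testing, bugs_found_in_production):
        """Test effectiveness as a percentage: bugs caught during testing
        divided by total bugs found (testing plus production)."""
        total = bugs_found_in_testing + bugs_found_in_production
        if total == 0:
            return None  # no bugs observed yet; the metric is undefined
        return 100.0 * bugs_found_in_testing / total

    # Example: 45 bugs caught in testing, 5 escaped to production -> 90%
    print(test_effectiveness(45, 5))  # 90.0
    ```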

    How Mage Data Helps with Test Data Management

    If you have a Test Data Management strategy in place, you’ve already taken the first step in the right direction. Now it is important to start collecting the right metrics, measuring your efforts, and bringing on board the right tools to improve the process.

    Mage Data’s own Test Data Management solution ensures the security and efficiency of your test data processes with secure provisioning of sensitive data across teams, wherever required and whenever needed. This tool allows organizations to rapidly provision high-quality test data, allowing your TDM efforts to scale while staying secure.

    To learn more about how Mage Data can help solve your Test Data Management challenges, contact us today to schedule a free demo.

  • Four Best Practices for Test Data Management in the Banking Sector


    While managing test data may not be as exciting as what financial institutions can do with analytics, poorly managed test data poses an existential risk. Imagine a rare bug that testing failed to uncover—something that reported a balance of zero for a small set of users. The consequences would be dire. Or imagine testing done with live data that was not properly masked. The potential for a data breach would be astronomical.

    The good news is that properly managed test data can help reduce or eliminate many of these issues before applications are rolled out to users, and do so securely. Getting this process right is key to keeping development costs low and avoiding the pain of small bugs with massive consequences.

    Best Practice #1: Understand Your Data Landscape.
    A core principle of data privacy and security is that businesses should minimize their use of data to what is necessary. One of the primary uses for test data is likely to test user interfaces on public-facing applications. In this scenario, there is a lot of information that banks hold, like social security numbers, spending patterns, and credit histories, that won’t ever show up on the front end—and consequently won’t be relevant. While your live data is a key starting point for creating a high-quality test dataset, copying it wholesale will be a waste of resources and create more risk in the event of a leak or breach. By thoughtfully exploring the minimum amount of data necessary for a test, banks can reduce their risk and speed up testing time while reducing resource consumption.

    On the other side of the coin, banks often face issues gathering enough of the right information for testing. Banks are especially at risk for holding data in profoundly non-centralized manners. Legacy systems, some decades old, might handle core business operations like payment processing, credit card rewards, and even the bank accounts themselves. When data is fragmented, collecting the necessary types and amounts of data for testing while respecting data privacy and security laws can be a significant challenge. Banks need a “census” of their data, cataloging what kinds of data they hold and where it is so that they can ensure their test data is as complete as possible and assembled in a legal manner.

    Best Practice #2: Proactively Refresh Your Data
    Like any business, banks are always developing new products and services. Test data management provides a means of stress testing these new offerings so that they perform as expected on launch. However, using test data that isn’t up-to-date can mean that products could pass testing, but fail in actual use. For example, a bank could be expanding into a new country. That would mean that its applications must support a new language and currency symbol. While those may seem like minor adjustments to make, if poorly implemented, they could create a terrible experience for your users in the new area. Without the right test data, the issue may not be identified beforehand.
    Consequently, refreshing test data isn’t just about having the best picture of where your business currently is, though that is important. Instead, refreshing test data can help propel your business to where it’s going and help future-proof it against upcoming developments. Of course, it’s also important to update your data so that it remains current. If test data grows stale, it may become a poor reflection of your business, causing testing to miss critically important issues before changes are rolled out to the public.

    Best Practice #3: Anonymize Your Data
    One of the riskiest things a business could do is take its most sensitive data and hand it to its most junior employees. Yet developers need access to test data to validate their software solutions. Giving employees broad access to customer information is not just a risk; in a heavily regulated industry like banking, it is likely illegal. Companies can use various techniques to maintain compliance, such as static or dynamic data masking, anonymization, or pseudonymization.
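    As an illustration of the kinds of techniques mentioned above, here is a minimal sketch of static masking and hash-based pseudonymization applied to a single customer record before it reaches a test environment. The field names and the salt are assumptions, and a production tool would drive this from a discovered schema rather than hard-coded logic.

    ```python
    import hashlib

    def mask_account_number(account_number: str) -> str:
        """Static masking: keep only the last four digits and blank out the rest."""
        return "*" * (len(account_number) - 4) + account_number[-4:]

    def pseudonymize(value: str, salt: str = "per-project-secret") -> str:
        """Pseudonymization: replace a value with a stable surrogate token
        (salted hash) so joins across tables still line up."""
        return hashlib.sha256((salt + value).encode()).hexdigest()[:12]

    customer = {"name": "Jane Doe", "account_number": "4520013377659010", "balance": 1523.40}
    test_record = {
        "name": pseudonymize(customer["name"]),
        "account_number": mask_account_number(customer["account_number"]),
        "balance": customer["balance"],  # non-identifying fields can pass through
    }
    print(test_record)
    ```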

    Best Practice #4: Automate Test Data Management
    Every day, banks and their customers generate billions of new data points. In an environment where massive change can occur relatively quickly, it can become impossible for humans to keep up with the necessary changes. This is true for test data management. One of the best ways to ensure that your test data is fresh enough for testing is to automate the creation and refreshing of test datasets. That can be a massive challenge at scale, but with the right tools, banks and other financial institutions can keep up with the pace of change.

    How Mage Helps Banks with Test Data Management
    Ultimately, Test Data Management is not just one process but many often-complex interlocking processes. Banks have to get all of these processes right to have effective test data and perform them securely under the watchful gaze of any number of regulators. In an environment where regulatory intervention can be devastating, not just any Test Data Management solution will do. With a deep data privacy and security background, Mage Data’s Test Data Management platform provides businesses of any size with the tools they need to tackle their thorniest data management issues. Mage Data’s platform can handle data at any scale and already supports multiple businesses with multibillion-dollar revenues. To learn more about what Mage Data’s Test Data Management solution can do for your business, contact Mage Data today for a free consultation.

  • Why Do Test Data Management (TDM) Projects Fail?


    Test data, and Test Data Management, remain a huge challenge for tech-driven companies around the globe. Having good test data is a boon to the organization: It helps support rigorous testing to ensure that software is stable and reliable, while also mitigating security risks. But getting good test data is exactly the issue. The way in which test data is created and subsequently managed has a huge effect on testers’ ability to do their jobs well.

    This is why many Test Data Management projects fail. Teams manage to create test data that is usable once, or perhaps a handful of times. But over time, problems accumulate. The data loses its freshness, for example, or the request process is not robust enough to create appropriate data sets.

    While we cannot diagnose every unique issue that organizations face when it comes to Test Data Management, there are some common challenges that we’ve seen—challenges that routinely sink Test Data Management projects, making them less effective, more costly, and more likely to fail outright.

    Here are the top six:

    1: Lack of Buy-In

    Test Data Management isn’t likely to be high on the list of priorities for any company. Too often, it’s an afterthought. There isn’t always an internal champion making the project a priority, let alone making the argument for investing in better tools.

    When this lack of buy-in exists, especially among company leadership, a TDM project is liable to fail before it ever takes off. Thus, it is important to get stakeholders on board; they should understand why Test Data Management is important, and have some say in new projects from the start. This includes both company leadership and those who will be responsible for implementing the plan.

    2: Lack of Standardization

    One sign or outcome of a lack of buy-in is a lack of standardization. Answer these questions: Does your organization have a well-developed data dictionary? Where can that be found? Who created it? What does the data model look like? If the answers to these questions are not readily known by you or the team responsible for Test Data Management, chances are you don’t have robust standards in place.

    Another manifestation of this problem is an absence of standard data request forms. This leads to data requests arriving in inconsistent formats, and to inconsistent tests, both of which ultimately lengthen testing cycles.

    3: Older Data and/or Merged Data

    Test data should cover a range of relevant test cases and legitimately resemble the data in your production environments. But data is not static; it tends to change over time. For example, the demographics of clients in a client database might shift over time, or account activity might change as new interfaces and new products are introduced. This means there is an expiration date on datasets, and yet outdated test data is routinely used.

    This especially becomes a problem after an acquisition or merger. Each party will bring their own data to the new entity, and merging the datasets is a challenge in its own right. If the conversion is done badly, this can throw off the data sample.

    4: Privacy and Safety Standards

    Many applications traffic in sensitive data—banking apps, tax preparation software, HR software, shopping apps…the list goes on. These applications can reveal crucially important insights, but they also risk revealing sensitive personal and financial information.

    This creates a trade-off between safety and relevance when it comes to test data. Using real-time data from production environments ensures relevance, but often at the cost of privacy. Using synthetic data ensures privacy but risks not having the proper relationships and, hence, not being relevant.

    Using automated test data generation software can help teams deal with this problem. Such software creates data with the relevant relationships intact, but without revealing sensitive information. Masking and/or encryption should be part of this process.
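    One way to keep relationships intact while hiding real values is consistent substitution: the same real value always maps to the same replacement value, so joins across tables still work. The sketch below uses the third-party Faker library for realistic fake values; the caching scheme and the `kind` labels are illustrative assumptions.

    ```python
    from faker import Faker  # third-party library for realistic fake values

    fake = Faker()
    _replacements = {}  # cache so the same real value always maps to the same fake one

    def consistent_fake(real_value: str, kind: str) -> str:
        """Replace a real value with a fake one, consistently within a run,
        so joins and relationships in the test data stay intact."""
        key = (kind, real_value)
        if key not in _replacements:
            fake.seed_instance(hash(key))  # vary output per source value
            generator = {"name": fake.name, "email": fake.email}[kind]
            _replacements[key] = generator()
        return _replacements[key]

    # The same customer appearing in "accounts" and "transactions" gets the same
    # fake name in both, preserving the relationship without exposing the original.
    print(consistent_fake("Jane Doe", "name"))
    print(consistent_fake("Jane Doe", "name"))  # identical output
    ```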

    5: Problems with Referential Integrity

    Ideally, any set of test data should contain a representative cross-section of the data, maintaining a high degree of referential integrity. Again, this is easier to achieve with real-time data but much harder with synthetic test data.

    So, when creating synthetic testing data, it is important to have a data model that accurately defines the relationships between key pieces of data. Primary keys must be properly linked, and data relationships should be based on well-defined business rules.
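    For example, a generator that draws every foreign key from the set of parent keys it has already created cannot produce orphaned child rows. A minimal sketch, with made-up table shapes:

    ```python
    import random

    def generate_customers(n):
        return [{"customer_id": i, "segment": random.choice(["retail", "business"])}
                for i in range(1, n + 1)]

    def generate_orders(customers, n):
        # Every order's foreign key references an existing customer_id, so the
        # synthetic set respects the parent-child relationship in the schema.
        return [{"order_id": i,
                 "customer_id": random.choice(customers)["customer_id"],
                 "amount": round(random.uniform(5, 500), 2)}
                for i in range(1, n + 1)]

    customers = generate_customers(100)
    orders = generate_orders(customers, 1000)
    assert all(o["customer_id"] <= 100 for o in orders)  # referential integrity holds
    ```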

    Sometimes a TDM project will fail because the test data does not have this kind of referential integrity, and the entire testing process becomes an exercise in demonstrating the adage: “garbage in, garbage out.”

    6: Waterfall TDM in an Agile Development Environment

    There are plenty of data management tools out there today…and many of them assume a more-or-less waterfall approach to development. In a waterfall approach, a “subset, mask, and copy” methodology usually ensures that test data is representative of live data, is small enough for efficient testing, and meets all data privacy requirements. With the testing cycle lasting weeks or months and known well in advance, it’s relatively easy to schedule data refreshes as needed to keep test data fresh.

    Things are not so straightforward in an agile development environment. Besides integration challenges, there are commonly timing challenges as well. Agile sprints tend to be much shorter than the traditional waterfall process, so the prep time for test data is dramatically shortened. The approach outlined above tends to impede operations by forcing a team to wait for test data and can create a backlog of completed, but untested, features waiting for deployment. Again, automation can help create test data on-the-fly and as needed.

    Preventing Failure

    Given that the above list represents the majority of reasons why TDM projects fail, we can reverse-engineer those reasons to put together a plan for TDM success:

    1. Get buy-in early from both teams and leadership,
    2. Start with appropriate standardization (including a data dictionary and current data model with appropriate referential integrity),
    3. Make a plan to clean older, irrelevant data and convert any data from other sources,
    4. Choose methods that will yield data with the appropriate trade-off between privacy/security and relevance (including using masking and encryption), and
    5. Automate the process wherever possible.

      Mage Data can help with many of these steps. Our Test Data Management solution ensures the security and efficiency of your test data processes, allowing for the secure provisioning of sensitive data across teams, wherever required and whenever needed. Your teams will have a better experience with our ready-to-go, plug-and-play platform for TDM, irrespective of data store type. You can ask for a demo today.

  • What is Data Provisioning in Test Data Management?


    If your company has taken the time to master test data generation—including steps to ensure that your test data is free from personally identifiable information, is suitable for different tests, and is representative of your data as a whole—data provisioning might feel like an unimportant step. But like a runner who trips a few feet before the finish line, companies who struggle with data provisioning will face delays and other issues at one of the last steps in the Test Data Management process, wasting much of their hard work. The good news is that getting data provisioning right is a straightforward process, though it will require businesses to have a strong inventory of their data management needs.

    What is Data Provisioning?

    Data provisioning is the process of taking prepared datasets and delivering them to the teams responsible for software testing. That might sound simple, but data provisioning faces challenges similar to last-mile logistics in package delivery. Moving packages in bulk from San Francisco to Dallas on time and at a low cost is relatively easy. It’s much more challenging to achieve a low price and on-time delivery when taking those same packages and delivering them to thousands of homes across the DFW metro area.

    In the same way, creating one or more high-quality datasets that help testers identify issues before launch is not that complicated, relatively speaking. But doing it when multiple teams may be testing different parts of an app, or even testing across multiple apps, can be a big lift. And if your company is using an agile software development approach, there could be dozens of different teams doing sprints, potentially starting and stopping at different times, each with its own unique testing needs. Those teams may start on an entirely new project in as little as two weeks, which means those managing your test data could receive dozens of requests a month for very different datasets.

    Why Does Data Provisioning Matter?

    Failing to deliver test data on time can have severe consequences. For example, a lack of test data could mean that the launch of a critical new feature is delayed, despite being essentially complete. Data that’s even a day or two late could lead to developers being pulled off their new sprints to resolve bugs revealed in testing. When that happens, other teams are potentially disrupted as personnel are moved around to keep things on track, or else the issue can potentially lead to cascading delays.

    In other scenarios, the consequences could be smaller. The test data could exist, but not be stored in a way that testers can easily access. That could mean your test data managers end up spending time in a “customer service” role, making sure testers have what they need. If the friction of this process grows too large, testers might start reusing old datasets to save time, which can lead to bugs and other issues going undetected. The data provisioning challenge for businesses is ensuring that testers always have what they need, when they need it, so that testing catches bugs before they go live and become much more expensive to fix.

    Strategies for Effective Data Provisioning

    Does that mean that an IT-style approach is right for data provisioning? For the typical IT department, as long as there is enough capacity to support all needs on the busiest days, there won’t be any significant problems. However, data provisioning is significantly different. IT demand is unpredictable, with some days bringing heavy loads and others producing very few requests. Data provisioning needs are tied to the development process and are nearly 100 percent predictable. Because of that predictability, companies can be efficient in their resource usage for data provisioning, aiming for a “just-in-time” style process rather than maintaining excess capacity or risking a shortfall.

    Self Service

    Of course, achieving a just-in-time process is easier said than done. One of the most effective steps companies can take to streamline their data provisioning process is to adopt a self-service portal. While it will vary from company to company, a significant portion of test data generally needs to be reused in multiple tests. This could be for features in continuous development or applications where the data structure remains unchanged, even as the front end undergoes transformations. Enabling developers and testers to grab commonly needed datasets on their own through a portal frees up your data managers to spend more time on the strategic decision-making needed to create great “custom” datasets for more challenging use cases.

    Automation

    Test data sets, whether in a self-service portal or used on a longer project, need to be regularly refreshed to ensure the data they contain is up-to-date and reflective of the business. Maintaining these portals can be a very time-consuming task for your data managers. Automating the process so that data is regularly refreshed, whether through a request in a self-service portal or through scheduled updates on the backend based on rules set by the test data managers, can help ensure that data is always available and up to date.
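    One simple way to implement rule-based refreshing is to record when each dataset was last regenerated and compare that against a maximum allowed age. The dataset names, age limits, and regeneration hook below are assumptions; the check itself would typically run from a nightly job or CI pipeline.

    ```python
    from datetime import datetime, timedelta

    # Hypothetical refresh rules set by the test data managers: dataset -> max age.
    REFRESH_RULES = {
        "customer_core": timedelta(days=7),
        "transactions_sample": timedelta(days=1),
    }

    def datasets_needing_refresh(last_refreshed):
        """Return the datasets whose last refresh is older than their rule allows.

        `last_refreshed` maps dataset name -> datetime of last regeneration.
        """
        now = datetime.utcnow()
        return [name for name, max_age in REFRESH_RULES.items()
                if now - last_refreshed.get(name, datetime.min) > max_age]

    # Example run inside a nightly job:
    # stale = datasets_needing_refresh(load_refresh_log())   # hypothetical loader
    # for name in stale:
    #     regenerate_dataset(name)                            # hypothetical hook
    ```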

    How Mage Data Helps with Data Provisioning

    The reality of data provisioning is that your process may not look anything like anyone else’s, and that’s a good thing, as it means that you’ve customized it to your specific needs. However, getting to that point by building your own tools could be a long and expensive process. At the same time, off-the-shelf solutions may not meet all your needs. With Mage Data, companies can have the best of both worlds. With its suite of powerful tools, Mage Data gives companies just about everything they need for data provisioning and Test Data Management as a whole right out of the box. However, everything is customizable to a company’s specific needs, allowing you to obtain the benefits of customized software without the price tag. To learn more about what Mage Data can do for you, contact us today to schedule a free trial.

  • How to Create a Secure Test Data Management Strategy


    Proper Test Data Management helps businesses create better products that perform more reliably on deployment. But creating test data, in the right amount and with the right kinds of relationships, can be a much more challenging process than one would think. Getting the most out of test data requires more than simply having a tool for generating or subsetting data; it requires having a clear Test Data Management strategy.

    Test Data Management might not be the first area that comes to mind when thinking about corporate strategy. But testing generally holds just as much potential as any other area to damage your business if handled incorrectly—or to propel you to further success if handled well.

    An upshot of having a clear strategy is that it can also help you find the best Test Data Management tools. After all, if the creator of a tool understands what is involved in a Test Data Management strategy, you can rest assured their tools will actually be designed to make those strategic goals a reality.

    Here, then, are the elements for a successful and secure Test Data Management strategy.

    The Core Elements of Test Data Management Strategies

    Creating a secure Test Data Management strategy starts with having a plan that makes your goals explicit, as well as the steps for getting there. After all, it doesn’t matter how secure your strategy is, if you don’t achieve the outcomes you’re looking for. All effective Test Data Management strategies rely on the following four pillars.

    Understanding Your Data

    First, it’s essential that you understand your data. Good testing data is typically composed of data points of radically different types sourced from different databases. Understanding what that data is and where it comes from is necessary to determine whether it will produce a test result that reflects what your live service offers. Companies must also consider the specific test they’re running and alter the data they choose to produce the most accurate results possible.

    De-Identifying Data

    Second, producing realistic test results requires using realistic data. However, companies that are cavalier with their use of customer data in their tests put themselves at greater risk of leaks and breaches and may also run afoul of data privacy laws.

    There are many different methods for de-identifying data. Masking permanently replaces live data with dummy data with a similar structure. Tokenization replaces live data with values of a similar structure that appear real and can be reversed later. Encryption uses an algorithm to scramble information so it can’t be read without a decryption key. Whichever approach you use, ensure your personally identifiable information is protected and used per your privacy policy.
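    To make the masking/tokenization distinction concrete, here is a minimal in-memory sketch of tokenization: the substituted value is random, but a vault retains the mapping so authorized systems can reverse it later (unlike masking, which discards the original). The `tok_` prefix and vault structure are illustrative assumptions; a real tokenization service persists and access-controls this mapping.

    ```python
    import secrets

    # A minimal, in-memory token vault. A production tokenization service would
    # persist this mapping securely and control who may detokenize.
    _vault = {}

    def tokenize(value: str) -> str:
        """Replace a sensitive value with a random token."""
        token = "tok_" + secrets.token_hex(8)
        _vault[token] = value
        return token

    def detokenize(token: str) -> str:
        """Reverse the substitution; only possible with access to the vault."""
        return _vault[token]

    card = "4520 0133 7765 9010"
    t = tokenize(card)
    print(t)               # e.g. tok_9f83a1c2b4d5e6f7, safe to use in test data
    print(detokenize(t))   # original value, recoverable only via the vault
    ```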

    Aggregating and Subsetting Data

    Third, your company may hold millions or even billions of data points. Trying to use all of them for testing would be extremely inefficient. Subsetting, or creating a sample of your data that reflects the whole, is one proven method for efficient testing. Generally, data must also be aggregated from multiple different sources to provide all the types of data that your tests require.
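    A common way to do this is to sample the parent table (stratified so the subset keeps the same mix as the whole) and then pull only the child rows that reference the sampled keys. The sketch below uses pandas, with made-up table shapes and a 5% sampling fraction as assumptions.

    ```python
    import pandas as pd

    # Hypothetical source tables (already de-identified) with made-up shapes.
    customers = pd.DataFrame({
        "customer_id": range(1, 10001),
        "region": (["NA", "EU", "APAC"] * 3334)[:10000],
    })
    orders = pd.DataFrame({
        "order_id": range(1, 50001),
        "customer_id": [i % 10000 + 1 for i in range(50000)],
        "amount": 25.0,
    })

    # Sample 5% of customers, stratified by region so the subset keeps roughly
    # the same regional mix as the full table.
    subset_customers = customers.groupby("region").sample(frac=0.05, random_state=42)

    # Keep only orders whose foreign key points at a sampled customer, so
    # referential integrity holds in the subset.
    subset_orders = orders[orders["customer_id"].isin(subset_customers["customer_id"])]

    print(len(subset_customers), len(subset_orders))
    ```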

    Refreshing and Automating Test Data in Real Time

    Finally, your company is not static. It changes and grows, and as it does, the data you hold can shift dramatically. If your test data is static, it will quickly become a poor representation of your company’s live environment and cause tests to miss critical errors. Consequently, test data must be regularly refreshed to ensure it reflects your company in the present moment. The best way to accomplish that task is to leverage automation to refresh your data regularly.

    What Makes a Test Data Management Strategy Secure?

    The reality of using test data is that, if improperly handled, it multiplies the preexisting security issues that your company already has. For example, if you take one insecure dataset and create five testing datasets, you end up with at least six times as much risk.

    When your data isn’t secure to begin with, securing your test data won’t make a meaningful impact on your overall security posture. At the same time, creating test data comes with its own risks. Data will be stored in new locations, accessed by more people than usual, and used in ways that it might not be during the normal course of business. That means you need to pay special attention to your test data to keep it secure.

    The following framework provides a way to think through the new risks that test data creates.

    Who?

    First is the who. In addition to the people assembling the test datasets, other people (such as back- and front-end developers, or data analysts) will come in contact with the test data. While it’s tempting to provide all of them with the same data, the reality is that the data they need to do their job will vary from role to role. Your experienced lead developer will need a higher level of insight for troubleshooting than a junior developer on their first day on the job. To maximize your security around this data, you need a tool that can help you make these kinds of nuanced decisions about access.

    What?

    Knowing what data you’re using matters. With an ever-growing number of data privacy laws around the world, businesses must be able to detail how they’re using data in their operations. Using data that’s not covered by your privacy policy or using data in a manner that isn’t covered could result in serious regulatory action, possibly in multiple countries at once. Companies increasingly need to be able to prove they’re in compliance, which is most easily accomplished with robust audit logging.

    Where?

    An increasing number of countries are penalizing companies for offshoring data, especially if it isn’t declared in the enterprise’s privacy policy. Even with that in mind, running your data analysis in other countries may still make financial sense. In that situation, companies should evaluate whether masked or entirely synthetic datasets would suffice to reduce the risk of regulatory action or leaks that come with moving data across borders.

    How?

    The growing complexity of securing your Test Data Management process means that it’s no longer possible for humans to oversee every part of the process. A good policy starts with your human workers setting the rules, but then a technological solution is needed to handle the process at the scale required for modern business applications.

    Overall, Test Data Management strategies will vary from company to company. However, by following the principles in this article, companies can develop an approach that meets their testing needs while ensuring that data is kept secure.

    How Mage Data Can Help with Test Data Management

    While it would be dramatic to suggest that a poor Test Data Management strategy could doom a business, it’s not an exaggeration to suggest that a poor strategy drives up costs in a measurable way. Poor testing can easily lead to a buggier product that takes more time and costs more money to fix. And a worse product could lose customers, even as the expanded fixes hurt your bottom line. The good news is that companies don’t have to develop their Test Data Management strategy on their own. Mage Data’s suite of Test Data Management tools provides everything businesses need to build their test data pipeline while having the customization they need to make it their own. Schedule a demo today to see Mage Data in action.

  • Best Practices for Test Data Management in Agile


    Agile is a growing part of nearly every business’ software development process. Agile can better align teams with the most pressing customer issues, speed up development, and cut costs. However, like just-in-time manufacturing, Agile’s unique approach to development means that a delay in any part of the process can lead to a screeching halt across all of it. Testing software solutions, as the last step before deployment, is critical to ensuring that companies ship working software, as well as catching and resolving edge cases and bugs before the code goes live (and becomes far more expensive to fix). If Test Data Management is handled poorly in an Agile environment, the entire process is at risk of breaking down.

    Why Test Data Management is a Bigger Challenge in Agile

    As companies produce and consume more and more data, managing your test data is an increasing challenge. The key to success in leveraging test data is the realization that, the more your test data represents your live data, the better it will be at helping you uncover bugs and edge cases before deployment. While using your live production data in your tests would resolve this issue, that approach has serious data privacy and security concerns (and may not be legal in some jurisdictions). At the same time, the larger your dataset, the slower your tests.

    In a traditional waterfall approach to development, a “subset, mask, and copy” approach generally ensures that data is representative of your live data, small enough for efficient testing, and meets all data privacy requirements. With the testing cycle lasting weeks or months and known well in advance, it’s relatively easy to schedule data refreshes as needed to keep test data fresh.

    Agile sprints tend to be much shorter than the traditional waterfall process, so the prep time for test data is dramatically shortened. A traditional subset, mask, copy approach could severely impede operations by forcing a team to wait on test data to start development. Even worse, it could create a backlog of completed but untested features waiting for deployment, which would require companies to keep teams from starting new stories or pull people off a project to fix bugs after testing is completed. Both hurt efficiency and prevent companies from fully implementing an Agile development process.

    Best Practices for Effective Test Data Management in Agile

    Unfortunately, there are no shortcuts to Test Data Management in an agile system. You have to do everything you would have done in a traditional approach, but significantly speed up the process to ensure it’s never the bottleneck. Implementing this system can require a change in institutional thinking. Success in this area means finding new ways to integrate your testers and data managers into the development process and providing them with the tools they need to succeed in an Agile environment.

    1. Integrate Data Managers into the Planning Process

    No matter how efficient your test data managers are, creating the right dataset for testing for a particular customer story takes at least some time. Waiting until after the planning phase is over to inform your data team of the needed data will lead to delays just from the time needed to create a dataset. However, if more esoteric data is required, the delay could be much longer than typical. By integrating your data team into the planning phase, they can leverage their expertise to help identify potential areas of concern before the start of the development phase. They can also begin working on the test datasets before the start of development, potentially providing everything needed for development and testing on the first day of development.

    2. Adopt Continuous Data Refreshing

    At most companies, data managers support multiple teams. With different customer stories requiring different amounts of time to complete, the data team must be flexible and efficient to meet sometimes unpredictable deadlines. However, that doesn’t excuse them from ensuring that data is up to date, that it’s free of personally identifiable information, or that it’s subset correctly for the test.

    The good news is that significant portions of this process can be automated with the right tools. Modern tools can identify PII in a dataset, enabling rapid, automated transformation of an insecure database into a secure one for testing. Plus, synthetic generation tools can help companies rapidly create great datasets for testing that include no reversible PII while maintaining important referential integrity. With these processes in place, testing teams will be better equipped to handle the pace of Agile while also spending more time on high-value planning operations rather than low-level data manipulation.

    3. Create a Self-Service Portal

    One thing that’s guaranteed to slow Agile teams down is a formal request process to access test data. While tracking who is accessing what data is important, access requests and tracking can largely be automated with today’s tools. This idea can be taken one step further by creating a self-service portal that includes basic datasets for common development scenarios. A self-service portal ensures that smaller teams or side projects can run meaningful tests without tying up your data manager’s resources. Just like with your primary testing datasets, these must be kept reasonably up to date, but automation can significantly help reduce this burden.

    How Mage Data Helps with Agile Test Data Management

    Agile is a process that can greatly speed up development and transform the delivery of new features to your customers. However, teams need more training and tooling to execute it effectively. Not all Test Data Management solutions are up to handling an Agile approach to development. But Mage Data’s Test Data Management solution is, providing just about everything a company could want right out of the box, while offering flexible customization options to enable companies to build the test data pipeline that works best for their needs. Contact Mage Data today for a free demo and learn more about how Mage Data can help streamline your Agile Test Data Management process.

  • What Are the Best Test Data Management Tools?


    Evaluating Test Data Management tools can be a challenge, especially when you may not be able to get your hands on a product before making a purchase. The good news is that prioritizing the right approach and features can help businesses maximize their ROI with Test Data Management tools. Whether you’re just starting your evaluation process or need to prepare for an imminent purchase, we’ve compiled the information you need to make the best choice for your business.

    What Are the Core Elements of Successful Test Data Management?

    Before examining the best Test Data Management tools in detail, we have to consider what outcomes organizations want to drive with these tools. An effective Test Data Management program helps companies identify bugs and other flaws as early in the development process as possible, allowing for quick remediation while the mistakes are small-scale and inexpensive to rectify.

    Application testing is almost guaranteed to fail if you manage your test data in a way that makes it unrepresentative of the data your live applications will use. Your testing will then fail to uncover critical flaws, and your company will face an ever-escalating series of expensive repairs.

    Each new application or part of an application being tested will be slightly different. As a consequence, Test Data Management must be a dynamic process. Testers need to alter datasets from test to test to ensure that each provides comprehensive testing for the feature or tool in play. They will also need to be able to customize the dataset to the specific test being performed. The way the test data is stored in a database can be different from its form when used in a front-end application, so this customization step ensures there won’t be errors related to data incompatibility.

    Testers also clean data before (or after) formatting it for the test. Data cleaning is the process of modifying data to remove incomplete data points and ensuring that data points that would skew the test results are removed or mitigated. Testers will also need to identify sensitive data and ensure the proper steps are taken to protect personal information during testing. Finally, most testers will take steps to automate and scale the process. Many tests are run repeatedly, so having pre-prepared datasets on hand can greatly speed up the process.

    What Features Do the Best Test Data Management Tools Have?

    Given the complexity of the tasks that testers need to perform for comprehensive and successful testing, getting the right Test Data Management tool is critically important. But that’s easier said than done. The tech world is constantly evolving, and while Test Data Management may seem like a mundane task, it needs to evolve to ensure that it continues to serve your testing teams’ needs. The best Test Data Management tools provide some or all of the following capabilities to ensure that teams are well-equipped for their testing projects.

    Connection to Production Data

    While you could, in theory, create a phony dataset for your testing, the reality is that the data in your production environment will always be the most representative of the challenges your live applications will face. The best Test Data Management tools make it easy for organizations to connect to their live databases and gather the information needed for their tests.

    Efficient Subsetting

    As we covered before, matching your data to the test is critical for success. Subsetting is the process of creating a dataset that matches the test parameters. Generally, this dataset is significantly smaller than your databases as a whole. As a result, testers need an efficient subsetting process that is fast, repeatable, and scalable, so they can spend more time running tests and less preparing.

    Personally Identifiable Information Detection

    An easy way to get your company into trouble with the growing number of data privacy laws online is to use data with Personally Identifiable Information (PII) in it without declaring the use explicitly in your terms of service and getting consent from your users. Consequently, using PII by accident in your testing could result in regulatory action. Test Data Management tools need to help your team avoid this all-too-common scenario by identifying PII, enabling your team to properly anonymize the dataset before it’s used.

    Synthetic Data Generation

    A synthetic dataset is one of the most effective ways to sidestep the PII problem. Synthetic data resembles your source information, but unlike anonymized datasets, it holds little to no chance of being reversed. Because it doesn’t contain PII, it’s not subject to data privacy laws. One risk of synthetic data is that its creation may lead to data points with distorted relationships, compromising testing or analysis. However, Mage Data’s synthetic data solution uses an advanced method that preserves the underlying statistical relationships, even as the data points are recreated. This approach ensures the dataset is representative of your data while guaranteeing that no personal information is put at risk.
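    As a generic illustration of the underlying idea (not a description of Mage Data’s actual algorithm), numeric columns can be synthesized so that their means and pairwise correlations approximate the source data, for example by fitting and sampling a multivariate normal distribution. The column names and parameters below are made up.

    ```python
    import numpy as np

    def synthesize_numeric(data, n_rows, seed=7):
        """Generate synthetic numeric rows whose means and pairwise correlations
        approximate the source data (a generic illustration only)."""
        mean = data.mean(axis=0)
        cov = np.cov(data, rowvar=False)  # captures pairwise relationships
        rng = np.random.default_rng(seed)
        return rng.multivariate_normal(mean, cov, size=n_rows)

    # Hypothetical source: spending strongly correlated with income.
    rng = np.random.default_rng(0)
    income = rng.normal(60_000, 15_000, 1_000)
    spending = income * 0.4 + rng.normal(0, 2_000, 1_000)
    source = np.column_stack([income, spending])

    synthetic = synthesize_numeric(source, 1_000)
    print(np.corrcoef(source, rowvar=False)[0, 1],
          np.corrcoef(synthetic, rowvar=False)[0, 1])  # correlations are close
    ```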

    How Do Current Test Data Management Tools Compare?

    Now that we’ve looked at what Test Data Management programs and tools need to do for success, let’s examine how Mage Data fits into the overall marketplace.
    Here we compare Mage Data’s Test Data Management capabilities against:
    • Informatica
    • Delphix
    • Oracle Enterprise Manager
    We consider not only raw capabilities, but other factors that actual users have reported as important.

    Informatica vs. Mage Data

    Informatica is a well-known Test Data Management solution that offers the core features needed for successful Test Data Management, including automated personal data discovery, data subsetting, and data masking. Users have raised issues with the platform, though. Common customer complaints are that the tool doesn’t do anything the competition doesn’t, has a steep learning curve, and is more expensive than competing tools. According to Gartner’s verified reviews, users rated Mage Data higher in every capability category, including dynamic and static data masking, subsetting, sensitive data discovery, and performance and scalability.

    Delphix vs. Mage Data

    Unlike Informatica, Delphix has generally kept up with modern developments in Test Data Management. It receives high ratings across the board for its functionality, and if that were the entire story, it would be hard to pick an alternative. But its user interface is lacking, which makes daily operation uncomfortable and setup more challenging than it needs to be. It also isn’t as interoperable as other systems, with some data sources unable to connect, limiting the tool’s usefulness. By contrast, Mage Data embraces APIs for connecting to data sources and provides an API setup that allows companies to integrate third-party or legacy solutions for Test Data Management, ensuring that companies aren’t locked out of the functionality they need.

    Oracle Enterprise Manager vs. Mage Data

    Oracle has long been one of the kings of database tools, so it’s unsurprising that Oracle Enterprise Manager’s Test Data Management solution is generally well-regarded. It’s especially praised for both its power and its user-friendliness, which is to be expected from a company of this size. What may surprise you is that, based on customer reviews, Mage Data outperforms Oracle Enterprise Manager in 6/6 capability categories, 2/2 for Evaluation & Contracting, 4/4 for Integration & Deployment, and 2/3 for Service & Support. Given Oracle Enterprise Manager’s generally high price, and Mage Data’s better performance across almost all categories, Mage Data clearly comes out on top.

    Why Choose Mage Data for Test Data Management

    For success in Test Data Management, you need a solution that provides comprehensive coverage for each use case you may face. Mage Data knows this, and that’s why its Test Data Management solution provides nearly everything you could need right out of the box, including database cloning and virtualization, data discovery and masking, data subsetting, and synthetic data generation. However, we know that not all use cases will be perfectly met with an off-the-shelf tool, so our system also allows for flexible customization for niche business needs that require a special touch.

    How Mage Data Helps with Test Data Management

    Mage Data is a best-in-class tool for Test Data Management, but that doesn’t mean the benefits stop there. Mage Data provides a suite of data protection tools, enabling businesses to protect every part of their stack from a single dashboard. And, unlike other tools, we’re happy to give you a peek under the hood before you sign on the dotted line. Schedule a demo today to learn more about what Mage Data can do for your business.

  • Ensuring Consumer Data Privacy in Financial Services


    It used to be—even just a few years ago—that consumer advocacy groups were complaining that legislation had not “caught up” with technology, and that huge data privacy gaps existed. But today, gaps in data privacy regulation are only half of the problem, as multiple legislative bodies have worked tirelessly to close the gaps. As financial services becomes an increasingly digital industry, finserv organizations now have to prepare for tomorrow’s regulatory landscape as well.

    In particular, finserv organizations are finding that consumers of financial services themselves are sensitive to the collection and processing of their information. Fortunately, financial services ties healthcare as the most-trusted sector for data privacy, according to research from McKinsey & Company. Less fortunately, the percentage of respondents who trust data privacy practices in the financial services industry is still only 44%.

    Financial institutions can keep pace with these simultaneous rises in digital banking activity and data privacy legislation by taking proactive measures:

    • Commit to learning what’s protected under all relevant consumer data privacy laws.
    • Consider how advances in digital banking and online payments will affect privacy concerns going forward.
    • Give customers transparency regarding data collection, protection, sharing, and use.
    • Implement data security and data privacy throughout the entire data lifecycle.
    • Appoint chief privacy officers and give them the resources required to develop thorough privacy practices.

      The nature of the financial services industry makes the stakes particularly high, but a commitment to best practices instills confidence and competence.

    What to Know About Data Privacy in Financial Services

    Individuals divulge a great deal of personal information during online transactions. This is necessary because financial services organizations must have identity proofing to tie every transaction to a valid entity. Failure to collect, confirm, and store the appropriate data increases the likelihood of fraud. The need to collect, handle, and store data creates tension, as doing so means that financial institutions absorb responsibility for data privacy and data security.

    The Current State of Financial Privacy Regulation

    The uncertainty around data privacy for financial services is highlighted by the sheer volume of the applicable legislation. Instead of one universal law, financial institutions are accountable to numerous regulations, especially when they operate internationally. The following are several of the most common data privacy regulations impacting the financial services industry:

    • Gramm-Leach-Bliley Act (GLBA)
    • Payment Card Industry Data Security Standard (PCI DSS)
    • Sarbanes-Oxley Act (SOX)
    • California Consumer Protection Act (CCPA)
    • Payment Services Directive (PSD2)
    • General Data Protection Regulation (GDPR)
    • Financial Privacy Rule from the Federal Trade Commission (FTC)
    • Regulations from the New York State Department of Financial Services (NYDFS)
    • Consumer Data Right (Australia)
• Guidelines and regulations from the Monetary Authority of Singapore (MAS)

    The list above is a lot to absorb, but it isn’t even exhaustive. Organizations like the NYDFS and FTC work continuously to regulate financial services and products as threats and best practices evolve.

    When Financial Privacy Regulations Collide

    At times, consumer privacy laws seem to contradict each other. For example, the CCPA gives individuals the right to delete (or request deletion of) some of their information, but there are exceptions. One such exception occurs when financial institutions need the information in question to operate. Reconciling consumer privacy regulations and finance-specific privacy regulations—especially when one regulation supersedes another—requires constant diligence and a comprehensive data protection strategy.

    Evolving Data Privacy Responsibilities in the Finance Industry

Data protection strategies must remain works in progress if they’re not to become outdated; the current state of data privacy for financial services organizations doesn’t stay current for long. Legislative bodies are constantly creating and modifying regulations, and that is far from the only concern: data breaches, technology adoption, and data privacy best practices are evolving every bit as quickly.

    4 Steps Toward Data Privacy for Financial Services Organizations

    Pressing data security challenges in the financial services industry include third-party risks, data transfers, and compliance issues. A comprehensive strategy allows financial institutions to prepare, implement, and maintain an integrated data privacy platform:

Prepare
• Sensitive Data Discovery and Classification
• Appoint a chief privacy officer or work with a third-party organization to keep up with changing regulations

Implement
• Static Data Masking
• Dynamic Data Masking
• Additional Privacy Enhancing Technologies

Maintain
• Database Activity Monitoring
• Data Subject Access Rights Requests
• Database Visualization
• Database Firewall
    1. Sensitive Data Discovery

    By definition, a privacy plan can only be comprehensive when it covers all information throughout the enterprise or institution. Data discovery is a vital first step. Diligent data discovery brings hidden or forgotten information into the light.

    Accurate data discovery and thoughtful data classification make privacy plans more intentional. The process answers some of the most pressing data protection questions:

    • Which data is necessary, and which is best disposed of?
    • Which individuals, applications, and third parties need access to data?
    • How sensitive is the information retained?
    • What do applications and analysts need to get from the data to operate as intended?

      Answers to these questions provide the visibility required to protect data as efficiently as possible.
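As a rough illustration of what discovery involves, the sketch below scans sampled column values for two common identifier formats. The patterns and function names are hypothetical; a production discovery tool combines pattern matching with metadata analysis and machine-learning classification rather than relying on regular expressions alone.

```python
# A minimal, pattern-based sketch of sensitive data discovery (illustration only).
import re

# Hypothetical detection patterns for two common identifier types.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def classify_column(sample_values: list) -> set:
    """Return the sensitive data types detected in a column's sampled values."""
    found = set()
    for value in sample_values:
        for label, pattern in PATTERNS.items():
            if pattern.search(value):
                found.add(label)
    return found

# Example: scan a sampled column and report what it appears to contain.
sample = ["jane.doe@example.com", "555-12-3456", "no sensitive data here"]
print(classify_column(sample))  # detects both "email" and "ssn"
```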
    2. Data Protection

It’s often necessary to retain sensitive information, and in those cases the financial institution absorbs responsibility for data privacy and security. Techniques like encryption, tokenization, and masking protect data in its various states and environments. Dynamic data masking, for example, protects data in use without slowing down the analytics team. Static data masking is a common choice for data at rest, or whenever it’s best to permanently replace sensitive data so there is no potential for re-identification (also known as de-anonymization).
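The sketch below contrasts the two approaches in miniature. The field names and masking rules are illustrative assumptions, not Mage Data’s implementation; real static and dynamic masking operate on entire databases and are driven by policy rather than hard-coded functions.

```python
# Illustrative contrast between static and dynamic masking (not a production implementation).
import hashlib

def static_mask(record: dict) -> dict:
    """Permanently replace sensitive fields; the original values are not kept."""
    masked = dict(record)
    masked["ssn"] = "XXX-XX-XXXX"
    # A one-way hash keeps the field distinct for test joins while hiding the real address.
    masked["email"] = hashlib.sha256(record["email"].encode()).hexdigest()[:12] + "@masked.example"
    return masked

def dynamic_mask(record: dict, role: str) -> dict:
    """Mask at read time based on the caller's role; the stored record never changes."""
    if role == "analyst":
        return {**record, "ssn": "***-**-****"}
    return record  # roles with an approved business need see the unmasked value

original = {"name": "Jane Doe", "ssn": "123-45-6789", "email": "jane@example.com"}
print(static_mask(original))               # safe copy for non-production environments
print(dynamic_mask(original, "analyst"))   # masked view for a low-privilege role
```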

To close the loop, privacy-enhancing technologies (PETs) facilitate analysis of the privacy plan itself through data sensitivity scorecards, incremental data scanning, and audit-ready reporting. Interconnected and interdependent networks of PETs allow financial services institutions to meet their various data privacy goals and expectations.

    After selecting PETs or other data protection techniques to meet all privacy requirements, it’s time to bring everything into a manageable hub.

    3. Integrating Data Security and Data Privacy

Beyond being well-integrated, a data security platform must be scalable enough to protect all data sources across both production and non-production environments. Bringing data privacy into a central hub makes it harder for anything to fall through the cracks. Such a platform ties preparation, implementation, and maintenance together.

    Increased visibility, manageable breach notifications, and adequate compliance reporting take the guesswork out of data privacy. Scalable solutions and excellent knowledge of the current state of an organization’s data privacy efforts make it easier to evolve. Adapting to changing regulations doesn’t have to mean starting over.

Finally, integrated data security and data privacy platforms keep the burden of database activity monitoring to a minimum. Data subject access rights requests, alerts, and notifications appear in one central location. From there, it’s easier to add database visualization tools that help teams spend less time identifying priorities and more time acting on them.

    4. Data Privacy and Consumer Consent

    Collecting and using personal data is generally prohibited unless the subject consents or the data processing is expressly allowed by regulation. Even when a financial organization has every right to collect information, it may be required to provide privacy notices. Under the GLBA, for example, organizations are required to provide notice of how they collect and use information. An organization must provide notice to the data subject even when the data processing does not require the subject’s consent.
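The sketch below captures that logic in a deliberately simplified form. It is a conceptual illustration only; the actual lawful bases, exceptions, and notice obligations depend on the specific regulation and jurisdiction.

```python
# Conceptual sketch of the consent-or-regulatory-basis rule described above (illustration only).
def may_process(has_consent: bool, regulatory_basis: bool) -> tuple:
    """Return (allowed, notice_required) for a proposed data processing activity."""
    allowed = has_consent or regulatory_basis
    # Even when consent is not required, a privacy notice may still be owed (e.g., under the GLBA).
    notice_required = allowed
    return allowed, notice_required

print(may_process(has_consent=False, regulatory_basis=True))   # (True, True)
print(may_process(has_consent=False, regulatory_basis=False))  # (False, False)
```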

    Data Privacy Technology for the Finance Industry

    Regulatory technology, or RegTech, is working behind the scenes to help financial services firms regain customer trust. Financial institutions that navigate the customer trust landscape gain competitive advantages by protecting their reputations.

    Societal factors like social distancing forced banks to accelerate digital transformation plans, and RegTech offers a much-needed boost. Consumer insights and regulatory compliance are twin differentiators. Legislative bodies and individual users are most satisfied when financial institutions go above and beyond minimum requirements and invest proactively in data protection solutions.

    Getting Started With Data Privacy for Financial Services

The rising legislative and market-driven demand for data privacy separates financial institutions with stringent data protection plans from those without. A demonstrable commitment to data privacy helps organizations win the trust needed to grow their user base and avoid the fines that erode profits. Comprehensive programs address privacy, security, and data risks as interconnected and interdependent issues. To see how data security and data privacy for financial services combine, contact Mage Data for a demo.

  • Data Privacy Regulatory Compliance: A Primer

    Data Privacy Regulatory Compliance: A Primer

Businesses can get into serious legal trouble if they don’t take care of customer data. But that’s not the only reason data privacy is important. Improper access to data and data breaches can have profoundly negative consequences for your employees and your users. Handling personal data correctly not only protects your business, but also shows that you’re treating your customers and employees ethically and ensures that their data is safe from prying eyes.

    To meet those lofty goals, businesses need a comprehensive approach to data privacy grounded in modern best practices.

    What is Data Privacy?

    Before diving into the mechanics of data privacy, it’s important to understand the field’s evolution and how we got to where we are today.

    A History of Data Privacy

The Privacy Act of 1974 is arguably the first data privacy act passed anywhere in the world. The rise of electronic databases across government agencies, and the ease of sharing data between them, had created new avenues for potential abuse of private information. In passing the law, the US government established four principles that have heavily influenced subsequent data privacy laws.

First, it required government agencies to show individuals a copy of any records kept about them, which is very similar to the “Right to Access” that appears in later laws. Second, it outlined “fair information practices” designed to better manage how government employees collected and used personal data. Third, it restricted how agencies could share personal data with other individuals and agencies, though law enforcement was generally exempted from this restriction. Fourth, it allowed people to sue the government for mishandling their data, establishing that data privacy was important and deserved a legal remedy when it was violated.

The next major data privacy law came in 1996, with the Health Insurance Portability and Accountability Act, or HIPAA. One of the critical innovations of this approach to data privacy was the acknowledgment that some kinds of personal data are more sensitive than others and require greater safeguards. In this instance, medical information was put in the spotlight. While HIPAA has transformed how the medical industry handles personal information, it was only a glimpse of things to come.

While the Privacy Act of 1974 and HIPAA focused on governmental and medical records, recent laws cover personal information of nearly every type and across all industries. The General Data Protection Regulation, which went into effect in the EU in 2018, heavily restricted how companies could use data, required a legal basis (such as consent) for data processing, and created a “right of deletion” for personal data held by companies. Likewise, the California Consumer Privacy Act, which went into effect in 2020, places a strong emphasis on consent when processing personal data and allows for legal remedies in the case of a data breach.

    Why Do Companies Retain Records, Anyway?

    Given the headache of managing data in the modern regulatory environment, it’s fair to wonder why companies do it. Here are the core reasons companies collect, store, and manage data as a part of their regular business.

    Provide Core Business Services

The first and most important reason companies collect and store information is to provide their core business services. Imagine that you’re a retail company trying to fulfill an online order without the customer’s name, address, or credit card information. It would be impossible! Nearly all businesses need some customer information just to operate. However, for many organizations, the complexity of the data collected is much higher.

For example, a mortgage company may need detailed information about employment and pay, as well as one or more credit reports. And it may need to hold onto that information for a while, or request updates, as the customer in question shops for a home in a changing economic environment. Or a medical company may hold extensive information about a patient’s health and healthcare, with records that date back years. Doctors need that information to provide the right care to their patients.

Your company may differ from those listed above. However, the fact remains that there are likely fundamental business processes that you would be unable to run without collecting and using customer data.

    Analyze and Project Business Performance

Customer data may also be needed to analyze and project business performance. Historical reporting often requires customer data, whether directly or indirectly. Without it, there would be holes in the reports and gaps in the business’s understanding of how it was doing. However, historical reporting is only part of the equation. Businesses need to innovate to remain competitive and want to find ways to increase their sales and revenue. Data, especially customer data, may hold the key to unlocking these advances.

    Meet Regulatory Requirements

    Sometimes, organizations must hold on to personal data for a certain period to meet regulatory requirements. This is more common when the organization or the information it keeps are related to finance, medicine, or education—but it can happen in any industry. It’s important to ensure that your business complies with these laws, as failing to do so can result in severe fines and a significant hit to your company’s public reputation.

    Learn More About Data Privacy Regulatory Compliance

Handling data is a core part of just about any business operation today. Given how central it is to what businesses do, it’s important that they both manage data as efficiently as possible and comply with the privacy laws worldwide that dictate what they must and must not do with that data.