December 20, 2023
The ROI of Test Data Management Tool
As software teams increasingly take a “shift left” approach to software testing, the need to reduce testing cycle times and improve the rigor of tests is growing in lock-step. This creates a conundrum: Testing coverage and completeness is deeply dependent on the quality of the test dataset used—but provisioning quality test data has to take less time, not more.
This is where test data management (TDM) tools come into play, giving DevOps teams the resources to provision exactly what they need to test early and often. But, as with anything else, quality TDM tool has a cost associated with it. How can decision makers measure the return on investment (ROI) for such tool?
To be clear, the issue is not how to do an ROI calculation; there is a well-defined formula for that. The challenge comes with knowing what to measure, and how to translate the functions of TDM tool into concrete cost savings. To get started, it helps to consider the downsides to traditional testing that make TDM attractive, proceeding from there to categorize the areas where TDM tool creates efficiencies as well as new opportunities.
Traditional Software Testing without TDM—Slow, Ineffective, and Insecure
The traditional method for generating test data is a largely manual process. A production database would be cloned for the purpose, and then an individual or team would be tasked with creating data subsets and performing other needed functions. This method is inefficient for several reasons:
- Storage costs. Cloning an entire production database increases storage costs. Although the cost of storage is rather low today, production databases can be large; storing an entire copy is an unnecessary cost.
- Cloning a database and manually preparing a subset can be a labor-intensive process. According to one survey of DevOps professionals, an average of 3.5 days and 3.8 people were needed to fulfill a request for test data that used production environment data; for 20% of the respondents, the timeframe was over a week.
- Completeness/edge cases. Missing or misleading edge cases can skew the results of testing. A proper test data subset will need to include important edge cases, but not so many that they overwhelm test results.
- Referential integrity. When creating a subset, that subset must be representative of the entire dataset. The data model underlying the test data must accurately define the relationships among key pieces of data. Primary keys must be properly linked, and data relationships should be based on well-defined business rules.
- Ensuring data privacy and compliance. With the increasing number of data security and privacy laws worldwide, it’s important to ensure that your test data generation methods comply with relevant legislation.
The goal in procuring a TDM tool is to overcome these challenges by automating large parts of the test data procurement process. Thus, the return on such an investment depends on the tool’s ability to guarantee speed, completeness, and referential integrity without consuming too many additional resources or creating compliance issues.
Efficiency Returns—Driving Down Costs Associated with Testing
When discussing saved costs, there are two main areas to consider: Internal costs and external ones. Internal costs reflect inefficiencies in process or resource allocation. External costs reflect missed opportunities or problems that arise when bringing a product to market. TDM can help organizations realize a return with both.
Internal Costs and Test Data Procurement Efficiency
There is no doubt that testing can happen faster, and sooner, when adequate data is provided more quickly with an automated process. Some industry experts report that, for most organizations, somewhere between 40% and 70% of all test data creation and provisioning can be automated.
Part of an automated workflow should involve either subsetting the data, or virtualizing it. These steps alleviate the need to store complete copies of production databases, driving down storage costs. Even for a medium-sized organization, this can mean terabytes of saved storage space, with 80% to 90% reductions in storage space being reported by some companies.
As for overall efficiency, team leaders say their developers are 20% to 25% more efficient when they have access to proper test data management tools.
External Costs and Competitiveness in the Market
Most organizations see TDM tools as a way to make testing more efficient, but just as important are the opportunity costs that accrue from slower and more error-prone manual testing. For example, the mean time to the detection of defects (MTTD) will be lower when test data is properly managed, which means software can be improved more quickly, preventing further bugs and client churn. The number of unnoticed defects is likely to decline as well. Catching an error early in development incurs only about one-tenth of the cost of fixing an error in production.
Time-to-market (TTM) is also a factor here. Traditionally, software projects might have a TTM from six months to several years—but that timeframe is rapidly shrinking. If provisioning of test data takes a week’s worth of time, and there are several testing cycles needed, the delay in TTM due only to data provisioning can be a full month or more. That is not only a month’s worth of lost revenue, but adequate space for a competitor to become more established.
The Balance
To review, the cost of any TDM tool and its implementation needs to be balanced against:
- The cost of storage space for test data
- The cost of personnel needs (3.8 employees, on average, over 3.5 days)
- The benefit of an increase in efficiency of your development teams
- Overall cost of a bug when found in production rather than in testing
- Lost opportunity due to a slower time-to-market
TDM Tools Achieve Positive ROI When They Solve These Challenges
Admittedly, every organization will look different when these factors are assessed. So, while there are general considerations when it comes to the ROI of TDM tools, specific examples will vary wildly. We encourage readers to derive their own estimates for the above numbers.
That said, the real question is not whether TDM tools provide an ROI. The question is which TDM tools are most likely to do so. Currently available tools differ in terms of their feature sets and ease of use. The better the tool, the higher the ROI will be.
A tool will achieve positive ROI insofar as it can solve these challenges:
- Ensuring referential integrity. This can be achieved through proper subsetting and pseudonymization capabilities. The proper number and kind of edge cases should be present, too.
- Automated provisioning with appropriate security. This means being able to rapidly provision test data across the organization while also staying compliant with all major security and privacy regulations.
- Scalability and flexibility. The more databases an organization has, the more it will need a tool that can work seamlessly across multiple data platforms. A good tool should have flexible deployment mechanisms to make scalability easy.
These are specifically the challenges our engineers had in mind when developing Mage’s TDM capabilities. Our TDM solution achieves that balance, providing an ROI by helping DevOps teams test more quickly and get to market faster. For more specific numbers and case studies, you can schedule a demo and speak with our team.