Test Data Management Benefits from a Data Product Approach

Tali Einhorn

TDM Product Manager, K2View

Before an app can be released, and throughout its development cycle, it needs to be tested, preferably with real-life, masked data. This article discusses the benefits of test data management and how a data product approach adds even more value.

Table of Contents

Test Data Management Background
Top Test Data Management Benefits
Test Data Management Tools
Advantages of Test Data Management Based on Data Products

Test Data Management Background

Modern Test Data Management (TDM) tools offer an automated means for provisioning realistic data subsets for data products into a test environment. Such data sets are generated from enterprise production systems and provide high-quality data to testing teams.

The right TDM platform should act as (1) a test data warehouse for the provisioned test data, and (2) an ETL layer for extracting data from production sources and loading it to the target environment.

One of the main challenges of provisioning test data is that data is often fragmented between different data sources. For example, a customer’s data may be stored in CRM, billing, ordering, ticketing, customer feedback, and collection systems. To run functional tests on customers in an agile development testing environment, their data must be extracted from all relevant source systems while maintaining referential integrity.
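As a rough illustration of the fragmentation problem, the sketch below gathers one customer's rows from several sources and checks that cross-references still resolve inside the extract. The source names, fields, and helper functions are hypothetical stand-ins, not part of any particular TDM product.

```python
# Hypothetical in-memory stand-ins for three source systems.
CRM = {101: {"name": "Ada Smith", "email": "ada@example.com"}}
BILLING = [
    {"invoice_id": 1, "customer_id": 101, "amount": 40.0},
    {"invoice_id": 2, "customer_id": 202, "amount": 15.5},
]
TICKETS = [
    {"ticket_id": 9, "customer_id": 101, "invoice_id": 1, "status": "open"},
]


def extract_customer(customer_id):
    """Collect one customer's rows from every source, keyed by the same ID."""
    if customer_id not in CRM:
        raise KeyError(f"customer {customer_id} not found in CRM")
    return {
        "customer_id": customer_id,
        "profile": CRM[customer_id],
        "invoices": [r for r in BILLING if r["customer_id"] == customer_id],
        "tickets": [r for r in TICKETS if r["customer_id"] == customer_id],
    }


def check_referential_integrity(entity):
    """Return True if every ticket refers to an invoice included in the extract."""
    invoice_ids = {inv["invoice_id"] for inv in entity["invoices"]}
    return all(t["invoice_id"] in invoice_ids for t in entity["tickets"])


entity = extract_customer(101)
is_consistent = check_referential_integrity(entity)
```

Only the rows that belong to customer 101 are pulled, so no full copy of any source is needed, and the integrity check confirms that the ticket's invoice travels with it.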

A data product for each business entity instance ensures smooth data provisioning based on the company’s business needs, and eliminates the need to extract a complete copy of each data source.

Top Test Data Management Benefits

Here are 7 key benefits traditionally associated with test data management:

  1. Enhanced test data coverage
    By tracing test data to test cases, and then to requirements, TDM delivers a 360-degree view of test data coverage, as well as of error patterns.

  2. Reduced cost due to error tracking
    As explained above, the combination of superior test data coverage and tracking provides exceptional clarity. With this broad picture, faults are detected earlier, reducing the cost of production fixes.

  3. Data provisioning by testing type
    TDM manages data in a test data warehouse. From there, the right data can be provisioned for different testing types (e.g., functional, integration, and performance), reducing redundant data copies and storage costs.

    In a test data warehouse, data subsets are provisioned on demand.

  4. Data compliance and security
    The growing body of data privacy regulations must be upheld by every enterprise, everywhere. That makes data masking a core piece of the TDM process, in which data compliance and security have the highest priority.

  5. Reusability of test data
    Reusability is perhaps the most important feature of TDM, because it reduces costs even further. Reusable test data is categorized and archived in the test data warehouse for future use by testers, whenever the need arises.

  6. No data copies necessary
    For any given project, different teams often make identical copies of the production data for their own individual use, resulting in redundancy and wasted storage space. When all teams share a single test data warehouse, referential integrity is maintained and storage is optimized.

  7. Customer trust
    Key advantages of TDM are the high quality and broad coverage of the data, which mean that errors are dealt with early, during the testing phase. The result is a stable, high-quality application with minimal defects, which in turn supports customer trust.

Test Data Management Tools

Modern TDM tools feature:

  • A self-service web application, where testers can request data to be provisioned on demand


    Self-service functionality is a must-have for data testers.

  • A test data warehouse of provisioned test data

  • Live transfer, or the ability to transfer data into live testing environments

  • Test data subset requests, including selection criteria from the source systems, re-deployment of test data sets, and test data appending into the test environment

  • Synthetic data generation, by cloning a given production entity into the target environment, while avoiding sequence duplication and ensuring referential integrity in the test environment

  • Dynamic data security and masking, applied in flight, as the data is retrieved from the source systems

  • A “time travel” mechanism, in which testers can:

    • Save specific versions of a test data subset

    • Load a selected version of test data to a specific target environment

    • Provision data on demand, or automatically according to a schedule (e.g., once a week)
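The “time travel” mechanism in the list above can be pictured as a small version store. The following is a minimal sketch under stated assumptions: `SubsetVersionStore`, its methods, and the dictionary used as a target environment are all hypothetical names chosen for illustration, not the API of any TDM tool.

```python
import copy


class SubsetVersionStore:
    """Hypothetical in-memory version store for a test data subset."""

    def __init__(self):
        # Each saved snapshot is appended; version numbers start at 1.
        self._versions = []

    def save(self, subset):
        """Store a deep copy so later edits do not alter the saved version."""
        self._versions.append(copy.deepcopy(subset))
        return len(self._versions)

    def load(self, version, target_env):
        """Replace the contents of target_env with a copy of the chosen version."""
        if not 1 <= version <= len(self._versions):
            raise ValueError(f"unknown version {version}")
        target_env.clear()
        target_env.update(copy.deepcopy(self._versions[version - 1]))
        return target_env


store = SubsetVersionStore()
subset = {"customers": [{"id": 101, "status": "active"}]}
v1 = store.save(subset)

# A test run changes the data, and the changed state is saved as a new version.
subset["customers"][0]["status"] = "suspended"
v2 = store.save(subset)

# Roll the QA environment back to the first saved version.
qa_env = {}
store.load(v1, qa_env)
```

Deep copies are used on both save and load so that a test which mutates the environment cannot silently corrupt a stored version.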

Advantages of Test Data Management Based on Data Products

A data product approach to test data offers many advantages. A data product corresponds to a business entity, such as a customer, order, product, or any other business object that’s essential to the tested application. All the data related to a data product instance is stored in its own encrypted Micro-Database™.

Test data is first collected from the source systems by the Data Product Platform, then unified and masked (as a Micro-Database), and finally provisioned to the target test systems. This approach simplifies test data management, assuring control of the TDM process, efficiency, and referential integrity.
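The collect, unify, mask, and provision flow described above can be sketched in a few lines. This is an illustrative outline only: the function names, the field list, and the keyed-hash masking are assumptions made for the example, and they do not describe how the Data Product Platform or a Micro-Database is actually implemented.

```python
import hashlib
import hmac

# Assumption: a secret masking key held outside the test environment.
MASKING_KEY = b"demo-key-not-for-production"
# Assumption: which fields count as sensitive is configured up front.
SENSITIVE_FIELDS = {"email", "phone"}


def collect_and_unify(customer_id, sources):
    """Pull one customer's record from each source into a single entity."""
    return {
        name: dict(records[customer_id])
        for name, records in sources.items()
        if customer_id in records
    }


def mask_value(value, key=MASKING_KEY):
    """Replace a value with a keyed hash so equal inputs yield equal tokens."""
    digest = hmac.new(key, str(value).encode("utf-8"), hashlib.sha256).hexdigest()
    return "masked_" + digest[:12]


def mask_entity(entity, sensitive=SENSITIVE_FIELDS):
    """Return a copy of the entity with sensitive fields masked in every record."""
    return {
        source: {k: mask_value(v) if k in sensitive else v for k, v in record.items()}
        for source, record in entity.items()
    }


def provision(customer_id, entity, target_env):
    """Load the masked entity into the target test environment (a dict here)."""
    target_env[customer_id] = entity


sources = {
    "crm": {101: {"email": "ada@example.com", "plan": "gold"}},
    "billing": {101: {"email": "ada@example.com", "balance": 40.0}},
}
test_env = {}
provision(101, mask_entity(collect_and_unify(101, sources)), test_env)
```

Because the masking is deterministic, the same email address maps to the same token in the CRM and billing records, so joins on that field still work in the test environment. Non-sensitive fields such as the balance pass through unchanged.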

When taking a data product approach to TDM, all Micro-Databases are ingested into a centralized test data warehouse, enabling testing teams to subset data by applying selection criteria, and then provision it accordingly. The test data warehouse supports data versioning to enable test data rollbacks, as well as the segregation of test data by tester.
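Subsetting by selection criteria might look like the sketch below, in which each criterion is a simple predicate over one entity. The warehouse contents, the field names, and the `select_subset` helper are hypothetical and serve only to illustrate the idea.

```python
from typing import Callable, Dict, Iterable, List

Entity = Dict[str, object]
Criterion = Callable[[Entity], bool]


def select_subset(entities: Iterable[Entity], criteria: Iterable[Criterion]) -> List[Entity]:
    """Keep only the entities that satisfy every selection criterion."""
    rules = list(criteria)
    return [e for e in entities if all(rule(e) for rule in rules)]


# Hypothetical warehouse holding one summary row per customer entity.
warehouse = [
    {"customer_id": 101, "region": "EU", "open_tickets": 2},
    {"customer_id": 202, "region": "US", "open_tickets": 0},
    {"customer_id": 303, "region": "EU", "open_tickets": 0},
]

# Selection criteria: customers in the EU region with at least one open ticket.
eu_with_tickets = select_subset(
    warehouse,
    [lambda e: e["region"] == "EU", lambda e: e["open_tickets"] > 0],
)
```

Expressing criteria as composable predicates keeps the subset definition separate from the provisioning step, so the same criteria can be reused against a different version of the warehouse.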

With TDM based on data products, you can:

  • Connect many sources and targets

  • Provision from any source, to any target, in any version

  • Build subsets of data on demand

  • Create test data for new applications

  • Discover and mask structured and unstructured data consistently

  • Pipeline test data for CI/CD

  • Secure access via role-based access control
