K2VIEW ebook

What is Data Migration?

The Strategic Guide

Data migration is the process of transferring data between data formats, data
stores, or computing environments, based on business or technology needs.



Data Migration – The Need to Modernize

Enterprises are migrating data at massive scale to keep up with the pace of business.

Data migration is the process of transferring data between computing environments, data formats, or storage systems. Data is moved from one place to another, or from one application to another, based on business and/or technology requirements.

Today’s enterprises migrate data for a variety of reasons, such as:

  • Retiring legacy applications and data warehouses, in favor of new ones
  • Moving to modern, cloud-based software systems
  • Enabling digital business transformation
  • Adopting a new version of a software system
  • Implementing Post-Merger Integration (PMI)
  • Adapting to data security regulatory changes

The most common example of data migration is storage migration, where companies move massive amounts of data from on-premise to cloud data stores, to increase performance and reduce costs.

Data migrations are complex, risky, and expensive, and are often plagued by unexpected challenges. These challenges lead to shortcuts in testing and quality assurance, resulting in a turbulent post-migration experience for users of the new environment. To address these issues, data teams can proactively improve data quality, ensure regulatory compliance, and deliver on time, by relying on the right kind of technology to support their data migration efforts.

This guide first defines the WHY and WHAT of data migration, and then goes on to discuss its various strategies, tools, considerations, requirements, and a promising new technological approach.

Chapter 01

Why Migrate Data?

The 3 main drivers for migrating data are: 

  1. Greater agility 
    Legacy applications are often the cause of obstacles and delays that negatively impact responsiveness. Enterprises need to be able to incrementally transform older systems in order to stay ahead of the game. Data migration enables the agility needed to sustain data-intensive enterprises. 

  2. Reduced costs 
    Legacy systems and technologies are often expensive to operate and maintain. Companies can save on both hardware and human resources, by moving data to the cloud, for example.  

  3. Increased collaboration 
    Breaking down legacy data silos enables business domains to gain cross-company visibility, and work better together.

Chapter 02

What is Data Migration?

Data migration starts with the identification, extraction, preparation, and transformation of data – and continues with its relocation from one data store to another. It concludes with validating the migrated data for completeness, and then decommissioning it from the legacy systems.
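
The flow described above can be sketched in a few lines of Python. This is only a minimal illustration, assuming two SQLite databases and hypothetical table names (`legacy_customers`, `customers`) that are not part of any real product. It extracts the data, transforms it, loads it in a single transaction, and then validates completeness by comparing row counts. Decommissioning the legacy data is deliberately left out.

```python
import sqlite3


def migrate_customers(source: sqlite3.Connection, target: sqlite3.Connection) -> int:
    """Extract, transform, load, and validate one table; return the rows migrated."""
    # Extract: read the rows identified for migration from the legacy store.
    rows = source.execute(
        "SELECT id, full_name, email FROM legacy_customers"
    ).fetchall()

    # Transform: trim names, and normalize emails to lower case.
    prepared = [(cid, name.strip(), email.strip().lower()) for cid, name, email in rows]

    # Load: one transaction, so a failure leaves the target unchanged.
    with target:
        target.executemany(
            "INSERT INTO customers (id, name, email) VALUES (?, ?, ?)", prepared
        )

    # Validate: compare row counts as a basic completeness check.
    source_count = source.execute("SELECT COUNT(*) FROM legacy_customers").fetchone()[0]
    target_count = target.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
    if source_count != target_count:
        raise RuntimeError(
            f"Incomplete migration: {source_count} source rows, {target_count} target rows"
        )
    return target_count
```

A row count is the weakest useful check. A real project would typically also compare the content of the records.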

Data migration tools are used in any system consolidation, implementation, or upgrade. Ideally they offer automation in order to free up data teams from performing tedious tasks.

Project teams often underestimate the complexity inherent in data migration, and the time and effort needed for a successful conclusion. Midway through the project, they try to retroactively implement data migration best practices, but are often forced to take shortcuts, and wind up delivering inadequate results.

Data migration success factors
To ensure the success of data migration projects, data teams should apply specific techniques in 3 domains, as outlined below.

  1. Planning and preparation
    In the planning phase, data teams select the data or applications to be migrated based on business, project, and technical needs. To do this, they must first:
    • Ensure data quality
    • Minimize the scope of the data to be migrated
    • Evaluate a phased migration approach
    • Build the right team
    • Analyze hardware and bandwidth requirements
    • Develop migration strategies, tests, automation scripts, mappings, and procedures
    • Determine data cleansing and transformation requirements for different data formats, to improve data quality and eliminate redundancy
    • Decide on a migration architecture
    • Begin change management processes

  2. Execution
    In the execution phase, the teams:
    • Validate the hardware and software requirements
    • Customize migration procedures as needed
    • Extract the data from the old system
    • Load the data to the new system
    • Verify that the migration plan is complete

  3. Governance and validation
    After data migration, the teams:
    • Ensure that the data is complete, accurate, and supports the processes in the new system
    • Determine governance roles and processes, and data masking tools
    • Document the migration project, and produce related reports
    • Validate successful termination, and decommission legacy systems
    • Conduct migration close-out meetings, and officially end the project

Chapter 03

Data Migration Types

The 6 most commonly used types of data migration are listed below. Note that they’re fluid, in the sense that a specific use case may include aspects of both cloud and database migration, or involve storage and application migration concurrently.

  1. Application migration
    Application migration moves data from one computing environment to another. It’s needed when an enterprise changes application vendors, or engages in application modernization – for example, exchanging one HR system for another, or deploying a new CRM. Applications can be migrated via:
    • APIs, to protect data integrity
    • Middleware, to plug up any technology holes
    • Scripts, to transfer data automatically
  2. Business process migration
    Business process migration is used when business applications, and data on business entities (customers, products, etc.), processes, and metrics, are moved to a new environment. Catalysts for this type of movement include:
    • Business optimization
    • Mergers and acquisitions
    • Reorgs aimed at penetrating new markets, or answering competitive challenges
  3. Cloud data migration
    Cloud data migration is the process of moving data storage and applications into the cloud. A migration initiative may involve cloud data integration from an on-premise DWH, or building new cloud-based data repositories. Cloud data migrations leverage all the built-in advantages of cloud computing: high-speed data provisioning, unlimited scalability, pay-as-you-go pricing, lower infrastructure costs, easy upgrades, and technological agility.
  4. Database migration
    A database is a data storage repository in which data is organized in a structured way. It’s managed through a database management system (DBMS), so database migration involves (1) moving from one DBMS to another, or (2) moving from the current version to an upgraded version of the same DBMS. The first case is more challenging, especially if the source and the target systems employ different data structures.
  5. Data center migration
    A data center, which contains infrastructure and core applications, includes computers, network routers, servers, storage devices, switches, and other equipment. Data center migration deals with the wholesale movement of the data center infrastructure to a different physical location – as well as the migration of data from old to new equipment in the same data center.
  6. Storage migration
    Storage migration is all about moving data from one data store to another. Enterprises typically migrate data to newer technologies for better performance and more affordable scaling while enabling the usual data management features, like backups, cloning, disaster recovery, and snapshots.

Chapter 04

Data Migration Strategies

There are 3 data migration strategies, as detailed below.

Big bang data migration
Using a big bang strategy, enterprises transfer all their data, all at once, in just a few days (like a long weekend). During the migration, all systems are down and all apps are inaccessible to users. This is because the data is in motion, and being transformed to meet the specifications of the new infrastructure. The migration is commonly scheduled for legal holidays, when customers aren’t likely to be using the application.

Theoretically, a big bang allows an enterprise to complete its data migration over the shortest period of time, and avoid working across old and new systems at the same time. It’s touted as being more cost-effective, simpler, and quicker – “bang”, and we’re done.

The disadvantages of this approach include the high risk of a failure that might abort the migration process, definitive downtime (that may have to be extended due to unforeseen circumstances), and the risk of affecting the business (e.g., customer loyalty when the apps can’t be reached).

A big bang strategy is better suited to small companies, and small amounts of data. It’s not recommended for business-critical enterprise applications that must always be accessible.

Phased data migration
A phased strategy, in contrast, migrates the data in predefined stages, for example, by customer segment (VIP customers, followed by residential customers, followed by enterprise customers) or by geography.

During implementation, the old and new systems operate in parallel, until the migration is complete. This may take several months. Operational processes must cope with data residing in 2 systems, and have the ability to refer to the right system at the right time. Running both systems in parallel reduces the risk of downtime, or operational interruptions.
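
As a rough illustration of "referring to the right system at the right time", the sketch below tracks which customer segments have already been migrated, and routes each request accordingly. The class name, the segment names, and the "legacy"/"new" labels are all hypothetical. A production setup would persist this state and handle requests that arrive mid-phase.

```python
class PhasedMigrationRouter:
    """Routes requests to the legacy or new system, based on migration progress."""

    def __init__(self, phases):
        self._pending = list(phases)  # segments still on the legacy system, in order
        self._migrated = set()        # segments already served by the new system

    def complete_next_phase(self):
        """Mark the next pending segment as migrated, and return its name."""
        if not self._pending:
            raise RuntimeError("All phases are already complete")
        segment = self._pending.pop(0)
        self._migrated.add(segment)
        return segment

    def system_for(self, segment):
        """Return which system currently holds the data for this segment."""
        if segment in self._migrated:
            return "new"
        if segment in self._pending:
            return "legacy"
        raise KeyError(f"Unknown segment: {segment}")

    @property
    def is_complete(self):
        return not self._pending


router = PhasedMigrationRouter(["vip", "residential", "enterprise"])
router.complete_next_phase()             # VIP customers move first
print(router.system_for("vip"))          # new
print(router.system_for("residential"))  # legacy
```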

This approach is much less likely to experience any unexpected failures, and is associated with zero downtime.

The disadvantages of this approach are that it is more expensive, time-consuming, and complex due to the need to have two systems running at once.

Phased migrations are better suited to enterprises that have zero tolerance for downtime, and have enough technical expertise to address the challenges that may arise. Examples of key industries required to provide 24/7 service include finance, retail, healthcare, and telecom.

On-demand data migration
An on-demand strategy migrates data, as the name suggests, on demand. This approach is used to move small amounts of data from one point to another, when needed.

The disadvantage of this approach is in ensuring the data integrity of a “micro” data migration. On-demand data migrations are typically implemented in conjunction with phased data migrations.

Chapter 05

Data Migration Considerations

Moving a massive dataset to the cloud, or anywhere else, is a highly complex undertaking that requires detailed coordination and planning to minimize disruptions.

Here are 10 data migration considerations:

  1. Business impact

    Is any data loss acceptable? If so, what kind, and how much? How would delays or unexpected challenges affect operations? How much time should a particular data migration take? For example, if a legacy system is being decommissioned, when will its license expire? What kind of data security is necessary throughout the migration process?

  2. Cost

    Is budget a priority? Using a cloud-based data migration tool saves on manpower and infrastructure costs, and frees up resources for other projects.

  3. Data consumers

    How will the data be used? For example, there may be different formatting and storage requirements for data used for regulatory compliance, compared to analytics. Who uses the data now, and who will use it in the future?

  4. Data model

    Does the data model have to change? Whether moving from an on-premise data lake to a cloud-based DWH, or from relational data to a mix of structured and unstructured data, cloud-based data migration tools tend to be more flexible than on-premise tools.

  5. Data privacy and security

    Is any of the data to be migrated of a sensitive nature? If so, it must comply with privacy regulations, which are often difficult to support during the migration process. Cloud-based tools are more likely to meet industry standards, while on-premise tools must rely on the security of the overall infrastructure.

  6. Data quality

    Does the source data need to be cleansed and enriched, or can it be loaded directly into the target? Which workflow is best at complying with data governance regulations, and improving data quality?

    "Data migration projects often exceed their budget by 25% to 100% or more, due to a lack of proactive attention to data quality issues (a problem that persists post-migration)."
    Gartner

  7. Data transformation

    Does the migrated data need to be transformed (cleansed, enriched, merged, etc.)? Although all data migration tools can transform data, cloud-based tools are often the most flexible, and support most data types.

  8. Data volume

    How much data needs to be migrated? To migrate terabytes (TB) of data, a client-provided storage device is usually the simplest and least expensive way to go. However, to migrate petabytes (PB) of data, a migration device supplied by your cloud provider might be the best option. Alternatively, a cloud data ingestion tool, or an online data migration tool, could also be used.

  9. Location

    Will the data be migrated prem-to-prem (within the same environment), from prem to cloud, or from cloud to cloud?

  10. Source/target environments

    What subsets of data need to be moved? Will the same operating system be running in both old and new environments? Are there any data quality issues, and do they need to be addressed prior to migration? Will data formatting or database schemas need to change?
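
To make the data volume consideration (item 8) more concrete, here is a back-of-the-envelope estimate of how long an online transfer might take. It is only a sketch: the function name is hypothetical, the 80% link utilization is an assumed default, and terabytes are treated as decimal (10^12 bytes). Estimates like this help explain why physical devices are often considered for very large volumes.

```python
def estimate_transfer_days(volume_tb: float, bandwidth_mbps: float,
                           utilization: float = 0.8) -> float:
    """Estimate the days needed to move volume_tb terabytes over a network link.

    bandwidth_mbps is the nominal link speed in megabits per second, and
    utilization is the assumed share of it available to the migration.
    """
    if volume_tb < 0 or bandwidth_mbps <= 0 or not 0 < utilization <= 1:
        raise ValueError("Volume must be >= 0, bandwidth > 0, and 0 < utilization <= 1")
    bits = volume_tb * 1e12 * 8                         # decimal terabytes to bits
    effective_bps = bandwidth_mbps * 1e6 * utilization  # usable bits per second
    return bits / effective_bps / 86_400                # seconds to days


print(round(estimate_transfer_days(10, 1000), 2))    # 10 TB over 1 Gbps: 1.16 days
print(round(estimate_transfer_days(1000, 1000), 2))  # 1 PB over 1 Gbps: 115.74 days
```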

Chapter 06

Data Migration Capabilities

It’s extremely challenging and time-consuming for an enterprise to develop its own data migration solution, especially when commercially available tools are so much more efficient and cost-effective. To make the right choice of tool, look at:

  1. Connectivity

    Does it only support currently used software and systems, or is it future-proof to account for evolving business requirements and use cases?

  2. Scalability

    What are its data volume limits, and is it possible that the organization’s data needs might exceed them in the future? Can it scale quickly?

  3. Security

    What security measures does it support? Is it responsive in adapting to, and complying with, new data privacy regulations, as they emerge?

  4. Speed

    How fast can the data be processed? Can it reach sub-second response times for data provisioning, to support operational workloads?

  5. Transformation

    Can it transform structured and unstructured data? Can it support data fabric, data mesh, and data hub architectures – on-prem and in the cloud?

Chapter 07

Data Migration Requirements

Before embarking on a data migration program, be sure to:

  1. Catalog your data

    Use an automated data catalog to locate and access specific data quickly and easily.

  2. Define your goals

    Set your business goals, and select your data migration tool accordingly.

  3. Mind your metadata

    Augment your metadata with business context, curate it, and reveal data lineage and relationships between entities.

  4. Govern your data

    Enact data governance policies that allow for no-code cleansing, enrichment, standardization, and data masking.

Chapter 08

Traditional Data Migration + Business Continuity = Mission Impossible

Migrating data is like moving offices. For example, take a company with many offices in a commercial building, moving to a new address.

In traditional data migration, data is moved one database table at a time. So in our example, only the computer equipment is moved, from each office in the old location, to each office in the new location.

The moving van then returns for the tables and chairs, and moves them in the same way. And it repeats this round trip process for the light fixtures – and, eventually, all the other office contents – until the move is complete.

The traditional way to move data is table by table, field by field.

The problem with this approach is that it requires a business to shut down, typically during national holidays or long weekends.

Today, any outage, for any length of time, is unacceptable to enterprises expected to provide 24/7 access to their applications. Think consumers withdrawing cash from bank ATMs, filling prescriptions at drugstores, or texting via their cell phones. Shutting down such services is like saying goodbye to customers.

Chapter 09

Entity-based Data Migration = Zero Disruption

Data migration, based on business entities, is an elegant way to simplify the most complex migration projects. Going back to our moving van analogy, the contents of each office are moved in their entirety – computer equipment, tables, chairs, light fixtures, and everything else it needs to operate – at the same time.

Most of us have moved, from one home to another, at some point, so we know the drill. We don’t move only the electrical appliances in the house, separately, and then come back for the furniture, etc. We move everything in the house at once.

Entity-based data migration moves all the data associated with a
particular business entity (1 office) – together, at the same time.

The beauty of this process is that even during the split-second it takes for one office to be moved, every other office is open for business. That’s exactly what a business entity approach does for data migration. While one business entity (customer, order, or payment) is being moved – along with all its related datasets – all the others remain fully accessible. That’s the formula for operational continuity.
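
The idea can be sketched as follows. This is not the implementation of any particular product, just a minimal illustration using SQLite and a hypothetical schema in which every table carries the `customer_id` of the entity it belongs to. All rows for one customer are copied across all tables in a single transaction, and only then removed from the legacy store, while every other customer's data stays untouched.

```python
import sqlite3

# Hypothetical schema: each table is keyed by the business entity (customer).
ENTITY_TABLES = ("customers", "orders", "payments")


def create_schema(conn: sqlite3.Connection) -> None:
    conn.execute("CREATE TABLE customers (customer_id INTEGER, name TEXT)")
    conn.execute("CREATE TABLE orders (customer_id INTEGER, item TEXT)")
    conn.execute("CREATE TABLE payments (customer_id INTEGER, amount REAL)")


def migrate_entity(source: sqlite3.Connection, target: sqlite3.Connection,
                   customer_id: int) -> int:
    """Move every row belonging to one customer, across all tables, as one unit."""
    moved = 0
    with target:  # one transaction per entity: it arrives whole, or not at all
        for table in ENTITY_TABLES:
            rows = source.execute(
                f"SELECT * FROM {table} WHERE customer_id = ?", (customer_id,)
            ).fetchall()
            if rows:
                placeholders = ", ".join("?" * len(rows[0]))
                target.executemany(f"INSERT INTO {table} VALUES ({placeholders})", rows)
                moved += len(rows)
    with source:  # remove from the legacy store only after the target commit
        for table in ENTITY_TABLES:
            source.execute(f"DELETE FROM {table} WHERE customer_id = ?", (customer_id,))
    return moved
```

Because each call touches only one customer's rows, the unit of work stays small, which is what keeps the other entities available during the move.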

Chapter 10

Why an Entity-based Approach Excels at High-Speed, High-Scale Data Migration

An entity-based approach enables the completion of complex enterprise data migration projects in weeks. The following is a summary of its advantages:

  1. One migration platform, zero business disruption

    • Built-in data integration, transformation, delivery, validation, and test data management (for ongoing changes to the new application)
    • Migration bridging enables business continuity, by being able to migrate data in phases
  2. Exceptional flexibility and time to value

    • Simple strategy switching (big bang/phased/on-demand), using a no-code, intuitive GUI
    • Full automation, from orchestration to reconciliation
    • Unmatched implementation speed, and ease of use
      — Transformations performed in-memory, via drag-and-drop tools
      — Fastest time to value on the market
      — Field-proven in the most complex enterprise data migrations
  3. Embedded validation

    • Cross-system, entity-level testing/cleansing of source and target data
    • Source-to-target reconciliation, with updates applied only as needed
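
Source-to-target reconciliation, with updates applied only as needed, might look something like the sketch below. It assumes that each entity is available as a plain dictionary keyed by an ID, and it compares SHA-256 fingerprints of the records. The function names are hypothetical, and this is one possible approach rather than a description of how any specific platform does it.

```python
import hashlib
import json


def fingerprint(record: dict) -> str:
    """Return a stable hash of a record, independent of key order."""
    canonical = json.dumps(record, sort_keys=True, default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def reconcile(source: dict, target: dict) -> dict:
    """Compare entities by ID, and report where the target differs from the source."""
    missing = sorted(k for k in source if k not in target)
    changed = sorted(
        k for k in source
        if k in target and fingerprint(source[k]) != fingerprint(target[k])
    )
    extra = sorted(k for k in target if k not in source)
    return {"missing": missing, "changed": changed, "extra": extra}


def apply_updates(source: dict, target: dict) -> dict:
    """Copy only the missing or changed entities to the target; return the report."""
    report = reconcile(source, target)
    for key in report["missing"] + report["changed"]:
        target[key] = dict(source[key])
    return report
```

Entities that already match are never rewritten, so repeated runs only touch what has changed since the last one.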

Chapter 11

Going Beyond Data Migration with Data Products

A data product refers to a reusable data asset, built to provide a trusted dataset for a particular purpose. It integrates and processes data from underlying source systems, assures that it’s compliant with privacy regulations, and delivers it instantly to authorized data consumers.

A data product generally corresponds to one or more business entities (customers, suppliers, devices, orders, etc.) and is made up of metadata and dataset instances.

From the moment it’s first deployed, a data product stores and manages each dataset instance in its own high-performance Micro-Database™, for enterprise-grade agility, resilience, and scale.

With entity-based data migration software, each business entity (say, customer) corresponds to a data product instance (say, John Doe), which can be reused for a variety of use cases, notably:
