Migrate from One TiDB Cluster to Another TiDB Cluster

This document describes how to migrate data from one TiDB cluster to another TiDB cluster. This function applies to the following scenarios:

- Split databases: Split databases when a TiDB cluster grows excessively large, or when you want to keep the services in one cluster from affecting each other.
- Relocate databases: Physically relocate databases, for example, when changing the data center.
- Upgrade to a newer version: Migrate data to a TiDB cluster of a newer version to satisfy data security and accuracy requirements.

This document exemplifies the whole migration process and contains the following steps:

Set up the environment.
Migrate full data.
Migrate incremental data.
Migrate services to the new TiDB cluster.

Step 1. Set up the environment

Deploy TiDB clusters.

Deploy two TiDB clusters, one upstream and one downstream, by using TiUP Playground. For more information, refer to Deploy and Maintain an Online TiDB Cluster Using TiUP.

# Create an upstream cluster
tiup --tag upstream playground --host 0.0.0.0 --db 1 --pd 1 --kv 1 --tiflash 0 --ticdc 1
# Create a downstream cluster
tiup --tag downstream playground --host 0.0.0.0 --db 1 --pd 1 --kv 1 --tiflash 0 --ticdc 1
# View cluster status
tiup status

Initialize data.

By default, test databases are created in the newly deployed clusters. Therefore, you can use sysbench to generate test data and simulate data in real scenarios.

sysbench oltp_write_only --config-file=./tidb-config --tables=10 --table-size=10000 prepare

In this document, we use sysbench to run the oltp_write_only script. This script generates 10 tables in the test database, each with 10,000 rows. The tidb-config is as follows:

mysql-host=172.16.6.122 # Replace the value with the IP address of your upstream cluster
mysql-port=4000
mysql-user=root
mysql-password=
db-driver=mysql         # Set database driver to MySQL
mysql-db=test           # Set the database as a test database
report-interval=10      # Set data collection period to 10s
threads=10              # Set the number of worker threads to 10
time=0                  # Set the time limit for executing the script. 0 means no time limit
rate=100                # Set average TPS to 100

Simulate service workload.
In real scenarios, service data is continuously written to the upstream cluster. In this document, we use sysbench to simulate this workload. Specifically, run the following command to enable 10 workers to continuously write data to three tables, sbtest1, sbtest2, and sbtest3, with a total TPS not exceeding 100.
```
sysbench oltp_write_only --config-file=./tidb-config --tables=3 run
```

Prepare external storage.

During full data backup and restore, both the upstream and downstream clusters need to access the backup files. It is recommended that you use external storage to store backup files. In this document, MinIO is used to simulate an S3-compatible storage service.

wget https://dl.min.io/server/minio/release/linux-amd64/minio
chmod +x minio
# Configure the access key and secret access key used to access MinIO
export HOST_IP='172.16.6.122' # Replace the value with the IP address of your upstream cluster
export MINIO_ROOT_USER='minio'
export MINIO_ROOT_PASSWORD='miniostorage'
# Create the data directory. backup is the bucket name.
mkdir -p data/backup
# Start minio at port 6060
./minio server ./data --address :6060 &

The preceding command starts a minio server on one node to simulate S3 services. Parameters in the command are configured as follows:

Endpoint: http://${HOST_IP}:6060/
Access-key: minio
Secret-access-key: miniostorage
Bucket: backup

The access link is as follows:

s3://backup?access-key=minio&secret-access-key=miniostorage&endpoint=http://${HOST_IP}:6060&force-path-style=true
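
The access link is an ordinary URL whose query string carries the parameters listed above. As a minimal sketch, the following Python snippet assembles the link from its parts. The helper name `build_s3_uri` is hypothetical and is not part of BR or TiDB. Keep in mind that `${HOST_IP}` is a shell variable: the MySQL client does not expand it, so substitute the actual IP address before pasting the link into the SQL statements in Step 2.

```python
from urllib.parse import urlencode


def build_s3_uri(bucket: str, access_key: str, secret_key: str, endpoint: str) -> str:
    """Assemble an S3-style storage link in the form shown above (hypothetical helper)."""
    query = {
        "access-key": access_key,
        "secret-access-key": secret_key,
        "endpoint": endpoint,
        "force-path-style": "true",
    }
    # Leave ":" and "/" unescaped so the endpoint reads the same as in this document.
    return f"s3://{bucket}?{urlencode(query, safe=':/')}"


if __name__ == "__main__":
    host_ip = "172.16.6.122"  # assumption: the example upstream IP used in this document
    print(build_s3_uri("backup", "minio", "miniostorage", f"http://{host_ip}:6060"))
```

With the example values, this prints the same link as above with the IP address filled in. Other special characters in a key, such as `&` or `+`, are percent-encoded by `urlencode`.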

Step 2. Migrate full data

After setting up the environment, you can use the backup and restore functions of BR to migrate full data. BR can be started in three ways. In this document, we use the SQL statements, BACKUP and RESTORE.

Note

In production clusters, performing a backup with GC disabled might affect cluster performance. It is recommended that you back up data in off-peak hours, and set RATE_LIMIT to a proper value to avoid performance degradation.

If the versions of the upstream and downstream clusters are different, you should check BR compatibility. In this document, we assume that the upstream and downstream clusters are the same version.

Disable GC.
To ensure that newly written data is not deleted during incremental migration, you should disable GC for the upstream cluster before backup. In this way, history data is not deleted.
Run the following command to disable GC:
```
MySQL [test]> SET GLOBAL tidb_gc_enable=FALSE;
```
```
Query OK, 0 rows affected (0.01 sec)
```
To verify that the change takes effect, query the value of tidb_gc_enable:
```
MySQL [test]> SELECT @@global.tidb_gc_enable;
```
```
+-------------------------+
| @@global.tidb_gc_enable |
+-------------------------+
|                       0 |
+-------------------------+
1 row in set (0.00 sec)
```

Back up data.

Run the BACKUP statement in the upstream cluster to back up data:

MySQL [(none)]> BACKUP DATABASE * TO 's3://backup?access-key=minio&secret-access-key=miniostorage&endpoint=http://${HOST_IP}:6060&force-path-style=true' RATE_LIMIT = 120 MB/SECOND;

+---------------+----------+--------------------+---------------------+---------------------+
| Destination   | Size     | BackupTS           | Queue Time          | Execution Time      |
+---------------+----------+--------------------+---------------------+---------------------+
| s3://backup   | 10315858 | 431434047157698561 | 2022-02-25 19:57:59 | 2022-02-25 19:57:59 |
+---------------+----------+--------------------+---------------------+---------------------+
1 row in set (2.11 sec)

After the BACKUP statement is executed, TiDB returns metadata about the backup data. Note the BackupTS value: all data written before this timestamp is included in the backup. In this document, BackupTS serves as the snapshot for the data check and as the start timestamp for incremental replication by TiCDC.
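
BackupTS is a TiDB timestamp (TSO), not a wall-clock time. A TSO stores a physical time in milliseconds since the Unix epoch in its high bits and an 18-bit logical counter in its low bits. As a minimal sketch, the following Python snippet splits the BackupTS from the output above so that you can compare it with Queue Time and Execution Time. The function name `decode_tso` is illustrative only. TiDB also provides the `TIDB_PARSE_TSO()` SQL function for the same purpose.

```python
from datetime import datetime, timezone
from typing import Tuple

LOGICAL_BITS = 18  # the low 18 bits of a TSO hold the logical counter


def decode_tso(tso: int) -> Tuple[datetime, int]:
    """Split a TSO into (physical time in UTC, logical counter)."""
    physical_ms = tso >> LOGICAL_BITS
    logical = tso & ((1 << LOGICAL_BITS) - 1)
    seconds, millis = divmod(physical_ms, 1000)
    physical = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return physical.replace(microsecond=millis * 1000), logical


if __name__ == "__main__":
    physical, logical = decode_tso(431434047157698561)
    print(physical.isoformat(), logical)
```

For the BackupTS above, the physical part decodes to 2022-02-25 11:57:59.990 UTC. That agrees with the 19:57:59 shown in the output if the cluster clock is set to UTC+8.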

Restore data.

Run the RESTORE command in the downstream cluster to restore data:

mysql> RESTORE DATABASE * FROM 's3://backup?access-key=minio&secret-access-key=miniostorage&endpoint=http://${HOST_IP}:6060&force-path-style=true';

+--------------+-----------+--------------------+---------------------+---------------------+
| Destination  | Size      | BackupTS           | Queue Time          | Execution Time      |
+--------------+-----------+--------------------+---------------------+---------------------+
| s3://backup  | 10315858  | 431434141450371074 | 2022-02-25 20:03:59 | 2022-02-25 20:03:59 |
+--------------+-----------+--------------------+---------------------+---------------------+
1 row in set (41.85 sec)

(Optional) Validate data.

You can use sync-diff-inspector to check data consistency between upstream and downstream at a certain time. The preceding BACKUP output shows that the upstream cluster finishes backup at 431434047157698561. The preceding RESTORE output shows that the downstream finishes restoration at 431434141450371074.

sync_diff_inspector -C ./config.yaml

For details about how to configure the sync-diff-inspector, see Configuration file description. In this document, the configuration is as follows:

# Diff Configuration.
######################### Datasource config #########################
[data-sources]
[data-sources.upstream]
    host = "172.16.6.122" # Replace the value with the IP address of your upstream cluster
    port = 4000
    user = "root"
    password = ""
    snapshot = "431434047157698561" # Set snapshot to the actual backup time (BackupTS in the "Back up data" section in Step 2. Migrate full data)
[data-sources.downstream]
    host = "172.16.6.125" # Replace the value with the IP address of your downstream cluster
    port = 4000
    user = "root"
    password = ""

######################### Task config #########################
[task]
    output-dir = "./output"
    source-instances = ["upstream"]
    target-instance = "downstream"
    target-check-tables = ["*.*"]

Step 3. Migrate incremental data

Deploy TiCDC.
After full data migration is complete, deploy and configure TiCDC to replicate incremental data. In production environments, deploy TiCDC as instructed in Deploy TiCDC. In this document, a TiCDC node was started when the test clusters were created, so you can skip the deployment step and proceed to create a changefeed.
Create a changefeed.
In the upstream cluster, run the following command to create a changefeed from the upstream cluster to the downstream cluster:
```
tiup cdc cli changefeed create --pd=http://172.16.6.122:2379 --sink-uri="mysql://root:@172.16.6.125:4000" --changefeed-id="upstream-to-downstream" --start-ts="431434047157698561"
```
In this command, the parameters are as follows:
- --pd: PD address of the upstream cluster
- --sink-uri: URI of the downstream cluster
- --changefeed-id: changefeed ID, which must match the regular expression ^[a-zA-Z0-9]+(-[a-zA-Z0-9]+)*$
- --start-ts: start timestamp of the changefeed, which must be the backup time (BackupTS in the "Back up data" section in Step 2. Migrate full data)
For more information about the changefeed configurations, see Task configuration file.
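
The pattern for --changefeed-id allows only letters and digits, joined by single hyphens. As a minimal sketch, the following Python snippet checks a candidate ID against that pattern before you pass it to the command line. The helper name `is_valid_changefeed_id` is hypothetical. The pattern itself is the one quoted in the parameter list above.

```python
import re

# Pattern quoted from the --changefeed-id description above.
CHANGEFEED_ID_PATTERN = re.compile(r"^[a-zA-Z0-9]+(-[a-zA-Z0-9]+)*$")


def is_valid_changefeed_id(changefeed_id: str) -> bool:
    """Return True if the whole ID matches the documented pattern."""
    return CHANGEFEED_ID_PATTERN.fullmatch(changefeed_id) is not None


if __name__ == "__main__":
    candidates = (
        "upstream-to-downstream",   # letters joined by single hyphens
        "upstream_to_downstream",   # underscores are not allowed
        "upstream--downstream",     # consecutive hyphens are not allowed
        "upstream to-downstream",   # spaces are not allowed
    )
    for candidate in candidates:
        print(f"{candidate!r}: {is_valid_changefeed_id(candidate)}")
```

Only the first candidate passes. The other three show common mistakes: an underscore, a doubled hyphen, and a stray space.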
Enable GC.
In incremental migration using TiCDC, GC only removes history data that has already been replicated. Therefore, after the changefeed is created, you can enable GC again. For details, see What is the complete behavior of TiCDC garbage collection (GC) safepoint?.
To enable GC, run the following command:
```
MySQL [test]> SET GLOBAL tidb_gc_enable=TRUE;
```
```
Query OK, 0 rows affected (0.01 sec)
```
To verify that the change takes effect, query the value of tidb_gc_enable:
```
MySQL [test]> SELECT @@global.tidb_gc_enable;
```
```
+-------------------------+
| @@global.tidb_gc_enable |
+-------------------------+
|                       1 |
+-------------------------+
1 row in set (0.00 sec)
```

Step 4. Migrate services to the new TiDB cluster

After the changefeed is created, data written to the upstream cluster is replicated to the downstream cluster with low latency. You can migrate read traffic to the downstream cluster gradually and observe it for a period. If the downstream cluster is stable, you can migrate write traffic to the downstream cluster by performing the following steps:

Stop write services in the upstream cluster. Make sure that all upstream data is replicated to the downstream cluster before you stop the changefeed.

# Stop the changefeed from the upstream cluster to the downstream cluster
tiup cdc cli changefeed pause -c "upstream-to-downstream" --pd=http://172.16.6.122:2379

# View the changefeed status
tiup cdc cli changefeed list

[
  {
    "id": "upstream-to-downstream",
    "summary": {
      "state": "stopped",  # Ensure that the state is stopped
      "tso": 431747241184329729,
      "checkpoint": "2022-03-11 15:50:20.387", # This time must be later than the time when writes were stopped
      "error": null
    }
  }
]
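
The two conditions in the output above can also be checked programmatically: the state must be stopped, and the checkpoint must be later than the time when writes were stopped. The following Python snippet is a minimal sketch of that check, not an official tool. The helper name `is_safe_to_switch` is hypothetical, and the field names are taken from the sample output. It assumes that the input is the plain JSON printed by the command, without the `#` annotations added in this document, and that the stop time uses the same time zone as the checkpoint.

```python
import json
from datetime import datetime

CHECKPOINT_FORMAT = "%Y-%m-%d %H:%M:%S.%f"  # format of "checkpoint" in the sample output


def is_safe_to_switch(list_output: str, changefeed_id: str, writes_stopped_at: datetime) -> bool:
    """Return True if the changefeed is stopped and its checkpoint is later than writes_stopped_at."""
    for changefeed in json.loads(list_output):
        if changefeed["id"] != changefeed_id:
            continue
        summary = changefeed["summary"]
        checkpoint = datetime.strptime(summary["checkpoint"], CHECKPOINT_FORMAT)
        return summary["state"] == "stopped" and checkpoint > writes_stopped_at
    return False  # the changefeed was not found in the output


if __name__ == "__main__":
    sample = """
    [{"id": "upstream-to-downstream",
      "summary": {"state": "stopped", "tso": 431747241184329729,
                  "checkpoint": "2022-03-11 15:50:20.387", "error": null}}]
    """
    # Hypothetical time at which the application stopped writing to the upstream cluster.
    stopped_at = datetime(2022, 3, 11, 15, 50, 0)
    print(is_safe_to_switch(sample, "upstream-to-downstream", stopped_at))
```

If the function returns False, either the changefeed is not in the stopped state or its checkpoint has not passed the stop time. In both cases, do not switch write traffic yet.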

Create a changefeed from the downstream cluster to the upstream cluster. You can leave start-ts unspecified and use the default setting, because the upstream and downstream data is consistent and no new data is being written to either cluster.
```
tiup cdc cli changefeed create --pd=http://172.16.6.125:2379 --sink-uri="mysql://root:@172.16.6.122:4000" --changefeed-id="downstream-to-upstream"
```
After migrating write services to the downstream cluster, observe it for a period. If the downstream cluster is stable, you can discard the upstream cluster.
