Docs Home
About TiDB
Quick Start
Develop
- Overview
- Quick Start
  - Build a TiDB Cluster in TiDB Cloud (Developer Tier)
  - CRUD SQL in TiDB
  - Build a Simple CRUD App with TiDB
    - Java
    - Golang
- Example Applications
  - Build a TiDB Application using Spring Boot
- Connect to TiDB
- Design Database Schema
- Write Data
- Read Data
- Transaction
- Optimize
  - Overview
  - SQL Performance Tuning
  - Best Practices for Performance Tuning
  - Best Practices for Indexing
  - Other Optimization Methods
    - Avoid Implicit Type Conversions
    - Unique Serial Number Generation
- Troubleshoot
- Reference
  - Bookshop Example Application
  - Guidelines
    - Object Naming Convention
    - SQL Development Specifications
  - Archived Docs
- Cloud Native Development Environment
  - Gitpod
- Third-party Support
  - Third-Party Libraries Support
  - Integrate with ProxySQL
Deploy
- Software and Hardware Requirements
- Environment Configuration Checklist
- Plan Cluster Topology
- Install and Start
  - Use TiUP (Recommended)
  - Deploy in Kubernetes
- Verify Cluster Status
- Test Cluster Performance
  - Test TiDB Using Sysbench
  - Test TiDB Using TPC-C
Migrate
Integrate
- Overview
- Integration Scenarios
  - Integrate with Confluent Cloud
  - Integrate with Apache Kafka and Apache Flink
Maintain
Monitor and Alert
Troubleshoot
Performance Tuning
- Tuning Guide
- Configuration Tuning
  - System Tuning
    - Operating System Tuning
  - Software Tuning
    - Configuration
    - Coprocessor Cache
- SQL Tuning
  - Overview
  - Understanding the Query Execution Plan
  - SQL Optimization Process
    - Overview
    - Logic Optimization
    - Physical Optimization
    - Prepare Execution Plan Cache
  - Control Execution Plans
Tutorials
TiDB Tools
- Overview
- Use Cases
- Download
- TiUP
- PingCAP Clinic Diagnostic Service
- TiDB Operator
- Dumpling
- TiDB Lightning
  - Overview
  - Prechecks and requirements
  - Key Features
  - Tutorial
  - Deploy
  - Configure
  - Monitor
  - FAQ
  - Glossary
- TiDB Data Migration
  - About TiDB Data Migration
  - Architecture
  - Quick Start
  - Deploy a DM cluster
  - Tutorials
    - Create a Data Source
    - Manage Data Sources
    - Configure Tasks
    - Table Routing
    - Block and Allow Lists
    - Binlog Event Filter
    - Filter DMLs Using SQL Expressions
    - Manage a Data Migration Task
  - Advanced Tutorials
    - Merge and Migrate Data from Sharded Tables
    - Migrate from MySQL Databases that Use GH-ost/PT-osc
    - Migrate Data to a Downstream TiDB Table with More Columns
    - Continuous Data Validation
  - Maintain
    - Cluster Upgrade
      - Maintain DM Clusters Using TiUP (Recommended)
      - Manually Upgrade from v1.0.x to v2.0+
    - Tools
      - Manage Using WebUI
      - Manage Using dmctl
    - Performance Tuning
    - Manage Data Sources
      - Switch the MySQL Instance to Be Migrated
    - Manage Tasks
      - Handle Failed DDL Statements
      - Manage Schemas of Tables to be Migrated
    - Export and Import Data Sources and Task Configurations of Clusters
    - Handle Alerts
    - Daily Check
  - Reference
    - Architecture
      - DM-worker
      - Relay Log
    - Command Line
      - DM-master & DM-worker
    - Configuration Files
    - OpenAPI
    - Compatibility Catalog
    - Secure
      - Enable TLS for DM Connections
      - Generate Self-signed Certificates
    - Monitoring and Alerts
      - Monitoring Metrics
      - Alert Rules
    - Error Codes
    - Glossary
  - Example
  - Troubleshoot
    - FAQ
    - Handle Errors
  - Release Notes
- Backup & Restore (BR)
- Point-in-Time Recovery
- TiDB Binlog
  - Overview
  - Quick Start
  - Deploy
  - Maintain
  - Configure
    - Pump
    - Drainer
  - Upgrade
  - Monitor
  - Reparo
  - binlogctl
  - Binlog Consumer Client
  - TiDB Binlog Relay Log
  - Bidirectional Replication Between TiDB Clusters
  - Glossary
  - Troubleshoot
    - Troubleshoot
    - Handle Errors
  - FAQ
- TiCDC
  - Overview
  - Deploy
  - Maintain
  - Monitor and Alert
    - Monitoring Metrics
    - Alert Rules
  - Troubleshoot
  - Reference
  - FAQs
  - Glossary
- Dumpling
- sync-diff-inspector
- TiSpark
  - User Guide
Reference
FAQs
Release Notes
- All Releases
- Release Timeline
- TiDB Versioning
- TiDB Installation Packages
- v6.2
  - 6.2.0-DMR
- v6.1
  - 6.1.0
- v6.0
  - 6.0.0-DMR
- v5.4
- v5.3
- v5.2
- v5.1
- v5.0
- v4.0
- v3.1
- v3.0
- v2.1
- v2.0
- v1.0
  - 1.0.8
  - 1.0.7
  - 1.0.6
  - 1.0.5
  - 1.0.4
  - 1.0.3
  - 1.0.2
  - 1.0.1
  - 1.0
  - Pre-GA
  - RC4
  - RC3
  - RC2
  - RC1
Glossary

Data Check in the Sharding Scenario

sync-diff-inspector supports data check in the sharding scenario. Assume that you use the TiDB Data Migration tool to replicate data from multiple MySQL instances into TiDB, you can use sync-diff-inspector to check upstream and downstream data.

For scenarios where the number of upstream sharded tables is small and the naming rules of sharded tables do not have a pattern as shown below, you can use Datasource config to configure table-0, set corresponding rules and configure the tables that have the mapping relationship between the upstream and downstream databases. This configuration method requires setting all sharded tables.

shard-table-replica-1

Below is a complete example of the sync-diff-inspector configuration.

# Diff Configuration.

######################### Global config #########################

# The number of goroutines created to check data. The number of connections between upstream and downstream databases are slightly greater than this value
check-thread-count = 4

# If enabled, SQL statements is exported to fix inconsistent tables
export-fix-sql = true

# Only compares the table structure instead of the data
check-struct-only = false

######################### Datasource config #########################
[data-sources.mysql1]
    host = "127.0.0.1"
    port = 3306
    user = "root"
    password = ""

    route-rules = ["rule1"]

[data-sources.mysql2]
    host = "127.0.0.1"
    port = 3306
    user = "root"
    password = ""

    route-rules = ["rule2"]

[data-sources.tidb0]
    host = "127.0.0.1"
    port = 4000
    user = "root"
    password = ""

########################### Routes ###########################
[routes.rule1]
schema-pattern = "test"        # Matches the schema name of the data source. Supports the wildcards "*" and "?"
table-pattern = "table-[1-2]"  # Matches the table name of the data source. Supports the wildcards "*" and "?"
target-schema = "test"         # The name of the schema in the target database
target-table = "table-0"       # The name of the target table

[routes.rule2]
schema-pattern = "test"      # Matches the schema name of the data source. Supports the wildcards "*" and "?"
table-pattern = "table-3"    # Matches the table name of the data source. Supports the wildcards "*" and "?"
target-schema = "test"       # The name of the schema in the target database
target-table = "table-0"     # The name of the target table

######################### Task config #########################
[task]
    output-dir = "./output"

    source-instances = ["mysql1", "mysql2"]

    target-instance = "tidb0"

    # The tables of downstream databases to be compared. Each table needs to contain the schema name and the table name, separated by '.'
    target-check-tables = ["test.table-0"]

You can use table-rules for configuration when there are a large number of upstream sharded tables and the naming rules of all sharded tables have a pattern, as shown below:

shard-table-replica-2

Below is a complete example of the sync-diff-inspector configuration.

# Diff Configuration.
######################### Global config #########################

# The number of goroutines created to check data. The number of connections between upstream and downstream databases are slightly greater than this value.
check-thread-count = 4

# If enabled, SQL statements is exported to fix inconsistent tables.
export-fix-sql = true

# Only compares the table structure instead of the data.
check-struct-only = false

######################### Datasource config #########################
[data-sources.mysql1]
    host = "127.0.0.1"
    port = 3306
    user = "root"
    password = ""

[data-sources.mysql2]
    host = "127.0.0.1"
    port = 3306
    user = "root"
    password = ""

[data-sources.tidb0]
    host = "127.0.0.1"
    port = 4000
    user = "root"
    password = ""

########################### Routes ###########################
[routes.rule1]
schema-pattern = "test"      # Matches the schema name of the data source. Supports the wildcards "*" and "?"
table-pattern = "table-*"    # Matches the table name of the data source. Supports the wildcards "*" and "?"
target-schema = "test"       # The name of the schema in the target database
target-table = "table-0"     # The name of the target table

######################### Task config #########################
[task]
    output-dir = "./output"
    source-instances = ["mysql1", "mysql2"]

    target-instance = "tidb0"

    # The tables of downstream databases to be compared. Each table needs to contain the schema name and the table name, separated by '.'
    target-check-tables = ["test.table-0"]

Note

If test.table-0 exists in the upstream database, the downstream database also compares this table.

Download PDF Request docs changes Edit this page

What’s on this page

Note

Was this page helpful?