TiDB Binlog Cluster Deployment
This document describes how to deploy TiDB Binlog using a binary package.
Hardware requirements
Pump and Drainer run on 64-bit general-purpose server hardware with the Intel x86-64 architecture.
In development, testing, and production environments, the server hardware requirements are as follows:
| Service | Number of Servers | CPU | Disk | Memory |
|---|---|---|---|---|
| Pump | 3 | 8 core+ | SSD, 200 GB+ | 16 GB |
| Drainer | 1 | 8 core+ | SAS, 100 GB+ (If binlogs are output as local files, the disk size depends on how long these files are retained.) | 16 GB |
Deploy TiDB Binlog using TiUP
It is recommended to deploy TiDB Binlog using TiUP. When you deploy TiDB using TiUP, add the node information of Pump and Drainer to the topology file, as described in TiDB Binlog Deployment Topology. For detailed deployment information, refer to Deploy a TiDB Cluster Using TiUP.
Deploy TiDB Binlog using a binary package
Download the official binary package
The binary package of TiDB Binlog is included in the TiDB Toolkit. To download the TiDB Toolkit, see Download TiDB Tools.
Usage example
Assume that you have three PD nodes, one TiDB node, two Pump nodes, and one Drainer node. The address of each node is as follows:
| Node | IP |
|---|---|
| TiDB | 192.168.0.10 |
| PD1 | 192.168.0.16 |
| PD2 | 192.168.0.15 |
| PD3 | 192.168.0.14 |
| Pump | 192.168.0.11 |
| Pump | 192.168.0.12 |
| Drainer | 192.168.0.13 |
The following steps show how to deploy Pump and Drainer on the nodes above.
Deploy Pump using the binary package.
To view the command line parameters of Pump, execute `./pump -help`:

    Usage of Pump:
    -L string
        the output information level of logs: debug, info, warn, error, fatal ("info" by default)
    -V
        print the version information
    -addr string
        the RPC address through which Pump provides the service (-addr="192.168.0.11:8250")
    -advertise-addr string
        the RPC address through which Pump provides the external service (-advertise-addr="192.168.0.11:8250")
    -config string
        the path of the configuration file. If you specify the configuration file, Pump reads the configuration in the configuration file first. If the corresponding configuration also exists in the command line parameters, Pump uses the configuration of the command line parameters to cover that of the configuration file.
    -data-dir string
        the path where the Pump data is stored
    -gc int
        the number of days to retain the data in Pump ("7" by default)
    -heartbeat-interval int
        the interval of the heartbeats Pump sends to PD (in seconds)
    -log-file string
        the file path of logs
    -log-rotate string
        the switch frequency of logs (hour/day)
    -metrics-addr string
        the Prometheus Pushgateway address. If not set, the monitoring metrics are not reported.
    -metrics-interval int
        the report frequency of the monitoring metrics ("15" by default, in seconds)
    -node-id string
        the unique ID of a Pump node. If you do not specify this ID, the system automatically generates an ID based on the host name and listening port.
    -pd-urls string
        the address of the PD cluster nodes (-pd-urls="http://192.168.0.16:2379,http://192.168.0.15:2379,http://192.168.0.14:2379")
    -fake-binlog-interval int
        the frequency at which a Pump node generates fake binlog ("3" by default, in seconds)

Taking the deployment of Pump on "192.168.0.11" as an example, the Pump configuration file is as follows:

    # Pump Configuration

    # the bound address of Pump
    addr = "192.168.0.11:8250"

    # the address through which Pump provides the service
    advertise-addr = "192.168.0.11:8250"

    # the number of days to retain the data in Pump ("7" by default)
    gc = 7

    # the directory where the Pump data is stored
    data-dir = "data.pump"

    # the interval of the heartbeats Pump sends to PD (in seconds)
    heartbeat-interval = 2

    # the address of the PD cluster nodes (each separated by a comma with no whitespace)
    pd-urls = "http://192.168.0.16:2379,http://192.168.0.15:2379,http://192.168.0.14:2379"

    # [security]
    # This section is generally commented out if no special security settings are required.
    # The file path containing a list of trusted SSL CAs connected to the cluster.
    # ssl-ca = "/path/to/ca.pem"
    # The path to the X509 certificate in PEM format that is connected to the cluster.
    # ssl-cert = "/path/to/drainer.pem"
    # The path to the X509 key in PEM format that is connected to the cluster.
    # ssl-key = "/path/to/drainer-key.pem"

    # [storage]
    # Set to true (by default) to guarantee reliability by ensuring binlog data is flushed to the disk
    # sync-log = true

    # When the available disk capacity is less than the set value, Pump stops writing data.
    # 42 MB -> 42000000, 42 mib -> 44040192
    # default: 10 gib
    # stop-write-at-available-space = "10 gib"

    # The LSM DB settings embedded in Pump. Unless you know this part well, it is usually commented out.
    # [storage.kv]
    # block-cache-capacity = 8388608
    # block-restart-interval = 16
    # block-size = 4096
    # compaction-L0-trigger = 8
    # compaction-table-size = 67108864
    # compaction-total-size = 536870912
    # compaction-total-size-multiplier = 8.0
    # write-buffer = 67108864
    # write-L0-pause-trigger = 24
    # write-L0-slowdown-trigger = 17

The example of starting Pump:

    ./pump -config pump.toml

If a parameter is set both on the command line and in the configuration file, the command line value is used.
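The precedence rule above can be pictured with a small Python sketch. It is not Pump's actual implementation: the function name `effective_config` and the convention that `None` means "flag not passed" are assumptions made for illustration, and the sample invocation in the comment is hypothetical. Only the parameter names (`addr`, `gc`, `data-dir`) come from the help output and configuration file shown earlier.

```python
def effective_config(file_config, cli_flags):
    """Merge settings so that explicitly passed command line flags win.

    file_config: values read from the configuration file.
    cli_flags:   values parsed from the command line; None means the
                 flag was not given (an assumption of this sketch).
    """
    merged = dict(file_config)
    for name, value in cli_flags.items():
        if value is not None:
            merged[name] = value
    return merged


# Values as they appear in the example pump.toml.
file_config = {"addr": "192.168.0.11:8250", "gc": 7, "data-dir": "data.pump"}

# Hypothetical invocation: ./pump -config pump.toml -gc 3
cli_flags = {"gc": 3, "data-dir": None}

print(effective_config(file_config, cli_flags))
```

In this sketch `gc` ends up as 3 because it was passed on the command line, while `addr` and `data-dir` keep the values from the file.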
Deploy Drainer using the binary package.
To view the command line parameters of Drainer, execute `./drainer -help`:

    Usage of Drainer:
    -L string
        the output information level of logs: debug, info, warn, error, fatal ("info" by default)
    -V
        print the version information
    -addr string
        the address through which Drainer provides the service (-addr="192.168.0.13:8249")
    -c int
        the number of the concurrency of the downstream for replication. The bigger the value, the better throughput performance of the concurrency ("1" by default).
    -cache-binlog-count int
        the limit on the number of binlog items in the cache ("8" by default)
        If a large single binlog item in the upstream causes OOM in Drainer, try to lower the value of this parameter to reduce memory usage.
    -config string
        the directory of the configuration file. Drainer reads the configuration file first.
        If the corresponding configuration exists in the command line parameters, Drainer uses the configuration of the command line parameters to cover that of the configuration file.
    -data-dir string
        the directory where the Drainer data is stored ("data.drainer" by default)
    -dest-db-type string
        the downstream service type of Drainer
        The value can be "mysql", "tidb", "kafka", and "file". ("mysql" by default)
    -detect-interval int
        the interval of checking the online Pump in PD ("10" by default, in seconds)
    -disable-detect
        whether to disable the conflict monitoring
    -disable-dispatch
        whether to disable the SQL feature of splitting a single binlog file. If it is set to "true", each binlog file is restored to a single transaction for replication based on the order of binlogs.
        It is set to "False" when the downstream is MySQL.
    -ignore-schemas string
        the db filter list ("INFORMATION_SCHEMA,PERFORMANCE_SCHEMA,mysql,test" by default)
        It does not support the Rename DDL operation on tables of `ignore schemas`.
    -initial-commit-ts
        If Drainer does not have the related breakpoint information, you can configure the related breakpoint information using this parameter. ("-1" by default)
        If the value of this parameter is `-1`, Drainer automatically obtains the latest timestamp from PD.
    -log-file string
        the path of the log file
    -log-rotate string
        the switch frequency of log files, hour/day
    -metrics-addr string
        the Prometheus Pushgateway address
        If it is not set, the monitoring metrics are not reported.
    -metrics-interval int
        the report frequency of the monitoring metrics ("15" by default, in seconds)
    -node-id string
        the unique ID of a Drainer node. If you do not specify this ID, the system automatically generates an ID based on the host name and listening port.
    -pd-urls string
        the address of the PD cluster nodes (-pd-urls="http://192.168.0.16:2379,http://192.168.0.15:2379,http://192.168.0.14:2379")
    -safe-mode
        Whether to enable safe mode so that data can be written into the downstream MySQL/TiDB repeatedly.
        This mode replaces the `INSERT` statement with the `REPLACE` statement and splits the `UPDATE` statement into `DELETE` plus `REPLACE`.
    -txn-batch int
        the number of SQL statements of a transaction which are output to the downstream database ("1" by default)

Taking the deployment of Drainer on "192.168.0.13" as an example, the Drainer configuration file is as follows:

    # Drainer Configuration.

    # the address through which Drainer provides the service ("192.168.0.13:8249")
    addr = "192.168.0.13:8249"

    # the address through which Drainer provides the external service
    advertise-addr = "192.168.0.13:8249"

    # the interval of checking the online Pump in PD ("10" by default, in seconds)
    detect-interval = 10

    # the directory where the Drainer data is stored ("data.drainer" by default)
    data-dir = "data.drainer"

    # the address of the PD cluster nodes (each separated by a comma with no whitespace)
    pd-urls = "http://192.168.0.16:2379,http://192.168.0.15:2379,http://192.168.0.14:2379"

    # the directory of the log file
    log-file = "drainer.log"

    # Drainer compresses the data when it gets the binlog from Pump. The value can be "gzip". If it is not configured, it will not be compressed
    # compressor = "gzip"

    # [security]
    # This section is generally commented out if no special security settings are required.
    # The file path containing a list of trusted SSL CAs connected to the cluster.
    # ssl-ca = "/path/to/ca.pem"
    # The path to the X509 certificate in PEM format that is connected to the cluster.
    # ssl-cert = "/path/to/pump.pem"
    # The path to the X509 key in PEM format that is connected to the cluster.
    # ssl-key = "/path/to/pump-key.pem"

    # Syncer Configuration
    [syncer]
    # If the item is set, the sql-mode will be used to parse the DDL statement.
    # If the downstream database is MySQL or TiDB, then the downstream sql-mode
    # is also set to this value.
    # sql-mode = "STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION"

    # the number of SQL statements of a transaction that are output to the downstream database ("20" by default)
    txn-batch = 20

    # the number of the concurrency of the downstream for replication. The bigger the value,
    # the better throughput performance of the concurrency ("16" by default)
    worker-count = 16

    # whether to disable the SQL feature of splitting a single binlog file. If it is set to "true",
    # each binlog file is restored to a single transaction for replication based on the order of binlogs.
    # If the downstream service is MySQL, set it to "False".
    disable-dispatch = false

    # In safe mode, data can be written into the downstream MySQL/TiDB repeatedly.
    # This mode replaces the `INSERT` statement with the `REPLACE` statement and replaces the `UPDATE` statement with `DELETE` plus `REPLACE` statements.
    safe-mode = false

    # the downstream service type of Drainer ("mysql" by default)
    # Valid value: "mysql", "tidb", "file", and "kafka".
    db-type = "mysql"

    # If `commit ts` of the transaction is in the list, the transaction is filtered and not replicated to the downstream.
    ignore-txn-commit-ts = []

    # the db filter list ("INFORMATION_SCHEMA,PERFORMANCE_SCHEMA,mysql,test" by default)
    # Does not support the Rename DDL operation on tables of `ignore schemas`.
    ignore-schemas = "INFORMATION_SCHEMA,PERFORMANCE_SCHEMA,mysql"

    # `replicate-do-db` has priority over `replicate-do-table`. When they have the same `db` name,
    # regular expressions are supported for configuration.
    # The regular expression should start with "~".

    # replicate-do-db = ["~^b.*","s1"]

    # [syncer.relay]
    # The directory where the relay log is saved. The relay log is not enabled if the value is empty.
    # The configuration only takes effect if the downstream is TiDB or MySQL.
    # log-dir = ""
    # the maximum size of each file
    # max-file-size = 10485760

    # [[syncer.replicate-do-table]]
    # db-name ="test"
    # tbl-name = "log"

    # [[syncer.replicate-do-table]]
    # db-name ="test"
    # tbl-name = "~^a.*"

    # Ignore the replication of some tables
    # [[syncer.ignore-table]]
    # db-name = "test"
    # tbl-name = "log"

    # the server parameters of the downstream database when `db-type` is set to "mysql"
    [syncer.to]
    host = "192.168.0.13"
    user = "root"
    # If you do not want to set a cleartext `password` in the configuration file, you can create `encrypted_password` using `./binlogctl -cmd encrypt -text string`.
    # When you have created an `encrypted_password` that is not empty, the `password` above will be ignored, because `encrypted_password` and `password` cannot take effect at the same time.
    password = ""
    encrypted_password = ""
    port = 3306

    [syncer.to.checkpoint]
    # When the checkpoint type is "mysql" or "tidb", this option can be
    # enabled to change the database that saves the checkpoint
    # schema = "tidb_binlog"
    # Currently only the "mysql" and "tidb" checkpoint types are supported
    # You can remove the comment tag to control where to save the checkpoint
    # The default method of saving the checkpoint for the downstream db-type:
    # mysql/tidb -> in the downstream MySQL or TiDB database
    # file/kafka -> file in `data-dir`
    # type = "mysql"
    # host = "127.0.0.1"
    # user = "root"
    # password = ""
    # `encrypted_password` is encrypted using `./binlogctl -cmd encrypt -text string`.
    # When `encrypted_password` is not empty, the `password` above will be ignored.
    # encrypted_password = ""
    # port = 3306

    # the directory where the binlog file is stored when `db-type` is set to `file`
    # [syncer.to]
    # dir = "data.drainer"

    # the Kafka configuration when `db-type` is set to "kafka"
    # [syncer.to]
    # only one of kafka-addrs and zookeeper-addrs is needed. If both are present, the program gives priority
    # to the kafka address in zookeeper
    # zookeeper-addrs = "127.0.0.1:2181"
    # kafka-addrs = "127.0.0.1:9092"
    # kafka-version = "0.8.2.0"
    # The maximum number of messages (number of binlogs) in a broker request. If it is left blank or a value smaller than 0 is configured, the default value 1024 is used.
    # kafka-max-messages = 1024
    # The maximum size of a broker request (unit: byte). The default value is 1 GiB and the maximum value is 2 GiB.
    # kafka-max-message-size = 1073741824

    # the topic name of the Kafka cluster that saves the binlog data. The default value is <cluster-id>_obinlog.
    # To run multiple Drainers to replicate data to the same Kafka cluster, you need to set different `topic-name`s for each Drainer.
    # topic-name = ""

Starting Drainer:
Note: If the downstream is MySQL or TiDB, to guarantee data integrity, you need to obtain the `initial-commit-ts` value, make a full backup of the data, and restore the data before the initial start of Drainer.
When Drainer is started for the first time, use the `initial-commit-ts` parameter.

    ./drainer -config drainer.toml -initial-commit-ts {initial-commit-ts}

If a parameter is set both on the command line and in the configuration file, the command line value is used.
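To make the `safe-mode` option described in the Drainer help and configuration more concrete, here is a minimal Python sketch of the rewrite it describes: `INSERT` becomes `REPLACE`, and `UPDATE` becomes `DELETE` plus `REPLACE`. This is an illustration only, not Drainer's code. The event dictionaries, the in-memory `table` keyed by an `id` column, and the function names are all hypothetical.

```python
def to_safe_mode(event):
    """Rewrite one row change the way safe mode is described.

    `event` is a hypothetical dict: {"type": ..., "old": row, "new": row}.
    Returns a list of (operation, row) pairs.
    """
    kind = event["type"]
    if kind == "insert":
        return [("REPLACE", event["new"])]
    if kind == "update":
        return [("DELETE", event["old"]), ("REPLACE", event["new"])]
    if kind == "delete":
        return [("DELETE", event["old"])]
    raise ValueError(f"unsupported event type: {kind}")


def apply(table, statements):
    """Apply rewritten statements to a dict that stands in for a table
    whose primary key is the `id` column (an assumption of this sketch)."""
    for op, row in statements:
        if op == "REPLACE":
            table[row["id"]] = row
        elif op == "DELETE":
            table.pop(row["id"], None)


def replay(table, events):
    for event in events:
        apply(table, to_safe_mode(event))


events = [
    {"type": "insert", "new": {"id": 1, "name": "a"}},
    {"type": "update", "old": {"id": 1, "name": "a"},
     "new": {"id": 1, "name": "b"}},
]

once = {}
replay(once, events)

twice = {}
replay(twice, events)
replay(twice, events)  # writing the same changes again

print(once == twice)
```

Replaying the same events a second time leaves the stand-in table unchanged, which is the sense in which safe mode lets data be written to the downstream repeatedly.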
Starting TiDB server:
After starting Pump and Drainer, start TiDB server with binlog enabled by adding the following section to the TiDB server configuration file:

    [binlog]
    enable=true

TiDB server obtains the addresses of registered Pumps from PD and streams data to all of them. If there are no registered Pump instances, TiDB server refuses to start or blocks starting until a Pump instance comes online.
- When TiDB is running, you need to guarantee that at least one Pump is running normally.
- To enable the TiDB Binlog service in TiDB server, use the `-enable-binlog` startup parameter in TiDB, or add `enable=true` to the `[binlog]` section of the TiDB server configuration file.
- Make sure that the TiDB Binlog service is enabled in all TiDB instances in the same cluster. Otherwise, upstream and downstream data inconsistency might occur during data replication. If you want to temporarily run a TiDB instance where the TiDB Binlog service is not enabled, set `run_ddl=false` in the TiDB configuration file.
- Drainer does not support the `rename` DDL operation on the tables of `ignore schemas` (the schemas in the filter list).
- If you want to start Drainer in an existing TiDB cluster, generally you need to make a full backup of the cluster data, obtain the snapshot timestamp, import the data to the target database, and then start Drainer to replicate the incremental data from the corresponding snapshot timestamp.
- When the downstream database is TiDB or MySQL, ensure that the `sql_mode` in the upstream and downstream databases is consistent. In other words, the `sql_mode` should be the same when each SQL statement is executed in the upstream and replicated to the downstream. You can execute the `select @@sql_mode;` statement in the upstream and downstream respectively to compare `sql_mode`.
- When a DDL statement is supported in the upstream but incompatible with the downstream, Drainer fails to replicate data. An example is to replicate the `CREATE TABLE t1(a INT) ROW_FORMAT=FIXED;` statement when the downstream database MySQL uses the InnoDB engine. In this case, you can configure skipping transactions in Drainer, and manually execute compatible statements in the downstream database.
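For the `sql_mode` check in the list above, the two values returned by `select @@sql_mode;` can be compared by eye, or with a small helper such as the Python sketch below. The function name `sql_mode_diff` is hypothetical, and the sketch assumes you have already copied both values out of the upstream and downstream databases as strings. It treats `sql_mode` as a comma-separated set of mode names, so ordering and letter case are ignored.

```python
def sql_mode_diff(upstream, downstream):
    """Return (modes only in upstream, modes only in downstream)."""
    def parse(value):
        return {mode.strip().upper() for mode in value.split(",") if mode.strip()}

    up, down = parse(upstream), parse(downstream)
    return sorted(up - down), sorted(down - up)


# Example values; paste the real output of `select @@sql_mode;` here.
upstream_mode = "STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION"
downstream_mode = "NO_ENGINE_SUBSTITUTION"

only_up, only_down = sql_mode_diff(upstream_mode, downstream_mode)
if only_up or only_down:
    print("sql_mode differs:", only_up, only_down)
else:
    print("sql_mode is consistent")
```

Two empty lists mean that the upstream and downstream modes match.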