Deploy and Maintain an Online TiDB Cluster Using TiUP
This document focuses on how to use the TiUP cluster component. For the complete steps of online deployment, refer to Deploy a TiDB Cluster Using TiUP.
Similar to the TiUP playground component used for a local test deployment, the TiUP cluster component quickly deploys TiDB for production environments. Compared with playground, the cluster component provides more powerful production cluster management features, including upgrading, scaling, and even operation and auditing.
For the help information of the cluster component, run the following command:
tiup cluster
Starting component `cluster`: /home/tidb/.tiup/components/cluster/v1.10.3/cluster
Deploy a TiDB cluster for production
Usage:
tiup cluster [command]
Available Commands:
check Precheck a cluster
deploy Deploy a cluster for production
start Start a TiDB cluster
stop Stop a TiDB cluster
restart Restart a TiDB cluster
scale-in Scale in a TiDB cluster
scale-out Scale out a TiDB cluster
destroy Destroy a specified cluster
clean (Experimental) Clean up a specified cluster
upgrade Upgrade a specified TiDB cluster
display Display information of a TiDB cluster
list List all clusters
audit Show audit log of cluster operation
import Import an existing TiDB cluster from TiDB-Ansible
edit-config Edit TiDB cluster config
reload Reload a TiDB cluster's config and restart if needed
patch Replace the remote package with a specified package and restart the service
help Help about any command
Flags:
-c, --concurrency int Maximum number of concurrent tasks allowed (defaults to `5`)
--format string (EXPERIMENTAL) The format of output, available values are [default, json] (default "default")
-h, --help help for tiup
--ssh string (Experimental) The executor type. Optional values are 'builtin', 'system', and 'none'.
--ssh-timeout uint Timeout in seconds to connect a host via SSH. Operations that don't need an SSH connection are ignored. (default 5)
-v, --version TiUP version
--wait-timeout uint Timeout in seconds to wait for an operation to complete. Inapplicable operations are ignored. (defaults to `120`)
-y, --yes Skip all confirmations and assumes 'yes'
Deploy the cluster
To deploy the cluster, run the tiup cluster deploy command. The usage of the command is as follows:
tiup cluster deploy <cluster-name> <version> <topology.yaml> [flags]
This command requires you to provide the cluster name, the TiDB cluster version, and a topology file of the cluster.
To write a topology file, refer to the example. The topology file used by the TiUP cluster component for deployment and scaling is written in YAML, so make sure that the indentation is correct. The following file is an example of the simplest topology:
---
pd_servers:
  - host: 172.16.5.134
    name: pd-134
  - host: 172.16.5.139
    name: pd-139
  - host: 172.16.5.140
    name: pd-140
tidb_servers:
  - host: 172.16.5.134
  - host: 172.16.5.139
  - host: 172.16.5.140
tikv_servers:
  - host: 172.16.5.134
  - host: 172.16.5.139
  - host: 172.16.5.140
grafana_servers:
  - host: 172.16.5.134
monitoring_servers:
  - host: 172.16.5.134
By default, TiUP deploys binary files built for the amd64 architecture. If a target machine uses the arm64 architecture, you can configure this in the topology file:
global:
  arch: "arm64"           # Configures all machines to use the binary files of the arm64 architecture by default
tidb_servers:
  - host: 172.16.5.134
    arch: "amd64"         # Configures this machine to use the binary files of the amd64 architecture
  - host: 172.16.5.139
    arch: "arm64"         # Configures this machine to use the binary files of the arm64 architecture
  - host: 172.16.5.140    # Machines that are not configured with the arch field use the default value in the global field, which is arm64 in this case.
...
Save the file as /tmp/topology.yaml. If you want to use TiDB v6.2.0 and your cluster name is prod-cluster, run the following command:
tiup cluster deploy -p prod-cluster v6.2.0 /tmp/topology.yaml
During the execution, TiUP asks you to confirm your topology again and requires the root password of the target machine (the -p flag tells TiUP to prompt for a password):
Please confirm your topology:
TiDB Cluster: prod-cluster
TiDB Version: v6.2.0
Type Host Ports Directories
---- ---- ----- -----------
pd 172.16.5.134 2379/2380 deploy/pd-2379,data/pd-2379
pd 172.16.5.139 2379/2380 deploy/pd-2379,data/pd-2379
pd 172.16.5.140 2379/2380 deploy/pd-2379,data/pd-2379
tikv 172.16.5.134 20160/20180 deploy/tikv-20160,data/tikv-20160
tikv 172.16.5.139 20160/20180 deploy/tikv-20160,data/tikv-20160
tikv 172.16.5.140 20160/20180 deploy/tikv-20160,data/tikv-20160
tidb 172.16.5.134 4000/10080 deploy/tidb-4000
tidb 172.16.5.139 4000/10080 deploy/tidb-4000
tidb 172.16.5.140 4000/10080 deploy/tidb-4000
prometheus 172.16.5.134 9090 deploy/prometheus-9090,data/prometheus-9090
grafana 172.16.5.134 3000 deploy/grafana-3000
Attention:
1. If the topology is not what you expected, check your yaml file.
2. Please confirm there is no port/directory conflicts in same host.
Do you want to continue? [y/N]:
After you enter the password, TiUP cluster downloads the required components and deploys them on the corresponding machines. When you see the following message, the deployment is successful:
Deployed cluster `prod-cluster` successfully
View the cluster list
After the cluster is successfully deployed, view the cluster list by running the following command:
tiup cluster list
Starting /root/.tiup/components/cluster/v1.10.3/cluster list
Name User Version Path PrivateKey
---- ---- ------- ---- ----------
prod-cluster tidb v6.2.0 /root/.tiup/storage/cluster/clusters/prod-cluster /root/.tiup/storage/cluster/clusters/prod-cluster/ssh/id_rsa
Start the cluster
After the cluster is successfully deployed, start the cluster by running the following command:
tiup cluster start prod-cluster
If you forget the name of your cluster, view the cluster list by running tiup cluster list.
TiUP uses systemd to start a daemon process. If the process terminates unexpectedly, it is restarted after 15 seconds.
Check the cluster status
TiUP provides the tiup cluster display command to view the status of each component in the cluster. With this command, you don't have to log in to each machine to see the component status. The usage of the command is as follows:
tiup cluster display prod-cluster
Starting /root/.tiup/components/cluster/v1.10.3/cluster display prod-cluster
TiDB Cluster: prod-cluster
TiDB Version: v6.2.0
ID Role Host Ports Status Data Dir Deploy Dir
-- ---- ---- ----- ------ -------- ----------
172.16.5.134:3000 grafana 172.16.5.134 3000 Up - deploy/grafana-3000
172.16.5.134:2379 pd 172.16.5.134 2379/2380 Up|L data/pd-2379 deploy/pd-2379
172.16.5.139:2379 pd 172.16.5.139 2379/2380 Up|UI data/pd-2379 deploy/pd-2379
172.16.5.140:2379 pd 172.16.5.140 2379/2380 Up data/pd-2379 deploy/pd-2379
172.16.5.134:9090 prometheus 172.16.5.134 9090 Up data/prometheus-9090 deploy/prometheus-9090
172.16.5.134:4000 tidb 172.16.5.134 4000/10080 Up - deploy/tidb-4000
172.16.5.139:4000 tidb 172.16.5.139 4000/10080 Up - deploy/tidb-4000
172.16.5.140:4000 tidb 172.16.5.140 4000/10080 Up - deploy/tidb-4000
172.16.5.134:20160 tikv 172.16.5.134 20160/20180 Up data/tikv-20160 deploy/tikv-20160
172.16.5.139:20160 tikv 172.16.5.139 20160/20180 Up data/tikv-20160 deploy/tikv-20160
172.16.5.140:20160 tikv 172.16.5.140 20160/20180 Up data/tikv-20160 deploy/tikv-20160
The Status column uses Up or Down to indicate whether the service is running normally.
For the PD component, |L or |UI might be appended to Up or Down. |L indicates that the PD node is a Leader, and |UI indicates that TiDB Dashboard is running on the PD node.
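As a rough illustration of how the Status column could be used in a script, the following sketch reads the display output and prints the ID of every instance whose status does not start with Up. The function name nodes_not_up is hypothetical and not part of TiUP. The parsing assumes the table layout shown above (a header row that contains the word Status, and IDs that end in a port number), which might differ between TiUP versions.

```shell
# nodes_not_up: read `tiup cluster display` output on stdin and print the ID
# of every instance whose Status column does not start with "Up".
# Assumption: the header row starts with "ID" and contains "Status", and all
# columns to the left of Status are single words, as in the sample output.
nodes_not_up() {
  awk '
    # Locate the Status column from the header row.
    $1 == "ID" { for (i = 1; i <= NF; i++) if ($i == "Status") col = i; next }
    # Data rows have an ID of the form <host>:<port>.
    col && $1 ~ /:[0-9]+$/ && $col !~ /^Up/ { print $1 }
  '
}
```

A possible use is `tiup cluster display prod-cluster | nodes_not_up`, which prints nothing when every instance reports Up, Up|L, or Up|UI.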
Scale in a cluster
This section describes only the syntax of the scale-in command. For detailed steps of online scaling, refer to Scale a TiDB Cluster Using TiUP.
Scaling in a cluster means taking one or more nodes offline. This operation removes the specified nodes from the cluster and deletes the files they leave behind.
Because the offline process of the TiKV and TiDB Binlog components is asynchronous (which requires removing the node through API), and the process takes a long time (which requires continuous observation on whether the node is successfully taken offline), special treatment is given to the TiKV and TiDB Binlog components.
For TiKV and Binlog:
TiUP cluster takes the node offline through API and directly exits without waiting for the process to be completed.
Afterwards, when a command related to the cluster operation is executed, TiUP cluster examines whether there is a TiKV/Binlog node that has been taken offline. If not, TiUP cluster continues with the specified operation. If there is, TiUP cluster takes the following steps:
- Stop the service of the node that has been taken offline.
- Clean up the data files related to the node.
- Remove the node from the cluster topology.
For other components:
- When taking the PD component down, TiUP cluster quickly deletes the specified node from the cluster through API, stops the service of the specified PD node, and deletes the related data files.
- When taking other components down, TiUP cluster directly stops the node service and deletes the related data files.
The basic usage of the scale-in command:
tiup cluster scale-in <cluster-name> -N <node-id>
To use this command, you need to specify at least two items: the cluster name and the node ID. The node ID can be obtained by using the tiup cluster display command in the previous section.
For example, to take the TiKV node on 172.16.5.140 offline, run the following command:
tiup cluster scale-in prod-cluster -N 172.16.5.140:20160
By running tiup cluster display, you can see that the TiKV node is marked Offline:
tiup cluster display prod-cluster
Starting /root/.tiup/components/cluster/v1.10.3/cluster display prod-cluster
TiDB Cluster: prod-cluster
TiDB Version: v6.2.0
ID Role Host Ports Status Data Dir Deploy Dir
-- ---- ---- ----- ------ -------- ----------
172.16.5.134:3000 grafana 172.16.5.134 3000 Up - deploy/grafana-3000
172.16.5.134:2379 pd 172.16.5.134 2379/2380 Up|L data/pd-2379 deploy/pd-2379
172.16.5.139:2379 pd 172.16.5.139 2379/2380 Up|UI data/pd-2379 deploy/pd-2379
172.16.5.140:2379 pd 172.16.5.140 2379/2380 Up data/pd-2379 deploy/pd-2379
172.16.5.134:9090 prometheus 172.16.5.134 9090 Up data/prometheus-9090 deploy/prometheus-9090
172.16.5.134:4000 tidb 172.16.5.134 4000/10080 Up - deploy/tidb-4000
172.16.5.139:4000 tidb 172.16.5.139 4000/10080 Up - deploy/tidb-4000
172.16.5.140:4000 tidb 172.16.5.140 4000/10080 Up - deploy/tidb-4000
172.16.5.134:20160 tikv 172.16.5.134 20160/20180 Up data/tikv-20160 deploy/tikv-20160
172.16.5.139:20160 tikv 172.16.5.139 20160/20180 Up data/tikv-20160 deploy/tikv-20160
172.16.5.140:20160 tikv 172.16.5.140 20160/20180 Offline data/tikv-20160 deploy/tikv-20160
After PD schedules the data on the node to other TiKV nodes, this node will be deleted automatically.
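Because the node is removed asynchronously, a script might want to wait until it is gone. The following is a minimal sketch under the assumption that the node eventually disappears from the display output, as described above. The function name wait_for_removal and the default 30-second polling interval are hypothetical choices, not TiUP features.

```shell
# wait_for_removal CLUSTER NODE_ID [INTERVAL_SECONDS]
# Poll `tiup cluster display` until NODE_ID no longer appears in column 1.
# Caveat: if `tiup` itself fails and prints nothing, the loop also ends,
# so check the cluster state again before relying on the result.
wait_for_removal() {
  local cluster=$1 node=$2 interval=${3:-30}
  while tiup cluster display "$cluster" |
    awk -v id="$node" '$1 == id { found = 1 } END { exit !found }'; do
    sleep "$interval"
  done
}
```

For example, `wait_for_removal prod-cluster 172.16.5.140:20160` returns once the TiKV node is no longer listed.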
Scale out a cluster
This section describes only the syntax of the scale-out command. For detailed steps of online scaling, refer to Scale a TiDB Cluster Using TiUP.
The scale-out operation has an inner logic similar to that of deployment: the TiUP cluster component first verifies the SSH connection to the node, creates the required directories on the target node, then executes the deployment operation, and starts the node service.
When you scale out PD, the node is added to the cluster by join, and the configurations of services associated with PD are updated. When you scale out other services, the service is started directly and added to the cluster.
All services conduct correctness validation when they are scaled out. The validation results show whether the scaling-out is successful.
To add a TiKV node and a PD node in the tidb-test cluster, take the following steps:
1. Create a scale.yaml file, and add IPs of the new TiKV and PD nodes:
Note: You need to create a topology file, which includes only the description of the new nodes, not the existing nodes.
---
pd_servers:
  - host: 172.16.5.140
tikv_servers:
  - host: 172.16.5.140
2. Perform the scale-out operation. TiUP cluster adds the corresponding nodes to the cluster according to the port, directory, and other information described in scale.yaml.
tiup cluster scale-out tidb-test scale.yaml
After the command is executed, you can check the status of the scaled-out cluster by running tiup cluster display tidb-test.
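When several hosts are added at once, the scale-out topology can be generated instead of written by hand. The following sketch prints one top-level section that lists only the new hosts. The function name make_scale_topology is hypothetical, and the sketch covers only the host field, so any other per-instance settings would still need to be added manually.

```shell
# make_scale_topology SECTION HOST...
# Print a YAML section (for example, tikv_servers or pd_servers) that lists
# only the new hosts, with the indentation a topology file expects.
make_scale_topology() {
  local key=$1
  shift
  printf '%s:\n' "$key"
  local host
  for host in "$@"; do
    printf '  - host: %s\n' "$host"
  done
}
```

For example, `{ echo '---'; make_scale_topology pd_servers 172.16.5.140; make_scale_topology tikv_servers 172.16.5.140; } > scale.yaml` produces a file equivalent to the one in the steps above.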
Rolling upgrade
This section describes only the syntax of the upgrade command. For detailed steps of online upgrade, refer to Upgrade TiDB Using TiUP.
The rolling upgrade feature leverages the distributed capabilities of TiDB. The upgrade process is made as transparent as possible to the application, and does not affect the business.
Before the upgrade, TiUP cluster checks whether the configuration file of each component is valid. If so, the components are upgraded node by node; if not, TiUP reports an error and exits. The operations vary with different nodes.
Operations for different nodes
Upgrade the PD node
- First, upgrade non-Leader nodes.
- After all the non-Leader nodes are upgraded, upgrade the Leader node.
- The upgrade tool sends a command to PD that migrates Leader to an already upgraded node.
- After the Leader role is switched to another node, upgrade the previous Leader node.
- During the upgrade, if any unhealthy node is detected, the tool stops this upgrade operation and exits. You need to manually analyze the cause, fix the issue and run the upgrade again.
Upgrade the TiKV node
- First, add a scheduling operation in PD that migrates the Region Leader of this TiKV node. This ensures that the upgrade process does not affect the business.
- After the Leader is migrated, upgrade this TiKV node.
- After the upgraded TiKV is started normally, remove the scheduling of the Leader.
Upgrade other services
- Stop the service normally and update the node.
Upgrade command
The flags for the upgrade command are as follows:
Usage:
cluster upgrade <cluster-name> <version> [flags]
Flags:
--force Force upgrade won't transfer leader
-h, --help help for upgrade
--transfer-timeout int Timeout in seconds when transferring PD and TiKV store leaders (default 300)
Global Flags:
--ssh string (Experimental) The executor type. Optional values are 'builtin', 'system', and 'none'.
--wait-timeout int Timeout of waiting the operation
--ssh-timeout int Timeout in seconds to connect host via SSH, ignored for operations that don't need an SSH connection. (default 5)
-y, --yes Skip all confirmations and assumes 'yes'
For example, the following command upgrades the cluster to v6.2.0:
tiup cluster upgrade tidb-test v6.2.0
Update configuration
The TiUP cluster component saves the current configuration of each cluster. To dynamically update the component configurations, edit this saved configuration by executing the tiup cluster edit-config <cluster-name> command. For example:
tiup cluster edit-config prod-cluster
TiUP cluster opens the configuration file in the vi editor. If you want to use other editors, use the EDITOR environment variable to customize the editor, such as export EDITOR=nano.
After editing the file, save the changes. To apply the new configuration to the cluster, execute the following command:
tiup cluster reload prod-cluster
The command sends the configuration to the target machine and restarts the cluster to make the configuration take effect.
For monitoring components, customize the configuration by executing the tiup cluster edit-config command to add a custom configuration path on the corresponding instance. For example:
---
grafana_servers:
  - host: 172.16.5.134
    dashboard_dir: /path/to/local/dashboards/dir
monitoring_servers:
  - host: 172.16.5.134
    rule_dir: /path/to/local/rules/dir
alertmanager_servers:
  - host: 172.16.5.134
    config_file: /path/to/local/alertmanager.yml
The content and format requirements for files under the specified path are as follows:
- The folder specified in the dashboard_dir field of grafana_servers must contain the full set of *.json files.
- The folder specified in the rule_dir field of monitoring_servers must contain the full set of *.rules.yml files.
- For the format of files specified in the config_file field of alertmanager_servers, refer to the Alertmanager configuration template.
When you execute tiup cluster reload, TiUP first deletes all old configuration files on the target machine and then uploads the corresponding configuration from the control machine to the corresponding configuration directory of the target machine. Therefore, if you want to modify a particular configuration file, make sure that all configuration files (including the unmodified ones) are in the same directory. For example, to modify Grafana's tidb.json file, you need to first copy all the *.json files from Grafana's dashboards directory to your local directory. Otherwise, other JSON files will be missing from the target machine.
If you have configured the dashboard_dir field of grafana_servers, after executing the tiup cluster rename command to rename the cluster, you need to complete the following operations:
- In the local dashboards directory, change the cluster name to the new cluster name.
- In the local dashboards directory, change datasource to the new cluster name, because datasource is named after the cluster name.
- Execute the tiup cluster reload -R grafana command.
Update component
For a normal upgrade, you can use the upgrade command. But in some scenarios, such as debugging, you might need to replace the currently running component with a temporary package. To achieve this, use the patch command:
tiup cluster patch --help
Replace the remote package with a specified package and restart the service
Usage:
cluster patch <cluster-name> <package-path> [flags]
Flags:
-h, --help help for patch
-N, --node strings Specify the nodes
--overwrite Use this package in the future scale-out operations
-R, --role strings Specify the role
--transfer-timeout int Timeout in seconds when transferring PD and TiKV store leaders (default 300)
Global Flags:
--ssh string (Experimental) The executor type. Optional values are 'builtin', 'system', and 'none'.
--wait-timeout int Timeout of waiting the operation
--ssh-timeout int Timeout in seconds to connect host via SSH, ignored for operations that don't need an SSH connection. (default 5)
-y, --yes Skip all confirmations and assumes 'yes'
If a TiDB hotfix package is in /tmp/tidb-hotfix.tar.gz and you want to replace all the TiDB packages in the cluster, run the following command:
tiup cluster patch test-cluster /tmp/tidb-hotfix.tar.gz -R tidb
You can also replace only one TiDB package in the cluster:
tiup cluster patch test-cluster /tmp/tidb-hotfix.tar.gz -N 172.16.4.5:4000
Import TiDB Ansible cluster
Currently, TiUP cluster's support for TiSpark is still experimental. It is not supported to import a TiDB cluster with TiSpark enabled.
Before TiUP was released, TiDB Ansible was often used to deploy TiDB clusters. To enable TiUP to take over a cluster deployed by TiDB Ansible, use the import command.
The usage of the import command is as follows:
tiup cluster import --help
Import an exist TiDB cluster from TiDB-Ansible
Usage:
cluster import [flags]
Flags:
-d, --dir string The path to TiDB-Ansible directory
-h, --help help for import
--inventory string The name of inventory file (default "inventory.ini")
--no-backup Don't backup ansible dir, useful when there're multiple inventory files
-r, --rename NAME Rename the imported cluster to NAME
Global Flags:
--ssh string (Experimental) The executor type. Optional values are 'builtin', 'system', and 'none'.
--wait-timeout int Timeout of waiting the operation
--ssh-timeout int Timeout in seconds to connect host via SSH, ignored for operations that don't need an SSH connection. (default 5)
-y, --yes Skip all confirmations and assumes 'yes'
You can use either of the following commands to import a TiDB Ansible cluster:
cd tidb-ansible
tiup cluster import
tiup cluster import --dir=/path/to/tidb-ansible
View the operation log
To view the operation log, use the audit command. The usage of the audit command is as follows:
Usage:
tiup cluster audit [audit-id] [flags]
Flags:
-h, --help help for audit
If the [audit-id] argument is not specified, the command shows a list of commands that have been executed. For example:
tiup cluster audit
Starting component `cluster`: /home/tidb/.tiup/components/cluster/v1.10.3/cluster audit
ID Time Command
-- ---- -------
4BLhr0 2022-08-10T13:25:09+08:00 /home/tidb/.tiup/components/cluster/v1.10.3/cluster deploy test v6.2.0 /tmp/topology.yaml
4BKWjF 2022-08-08T23:36:57+08:00 /home/tidb/.tiup/components/cluster/v1.10.3/cluster deploy test v6.2.0 /tmp/topology.yaml
4BKVwH 2022-08-08T23:02:08+08:00 /home/tidb/.tiup/components/cluster/v1.10.3/cluster deploy test v6.2.0 /tmp/topology.yaml
4BKKH1 2022-08-08T16:39:04+08:00 /home/tidb/.tiup/components/cluster/v1.10.3/cluster destroy test
4BKKDx 2022-08-08T16:36:57+08:00 /home/tidb/.tiup/components/cluster/v1.10.3/cluster deploy test v6.2.0 /tmp/topology.yaml
The first column is audit-id. To view the execution log of a certain command, pass the audit-id of that command as the argument:
tiup cluster audit 4BLhr0
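To jump straight to the log of the most recent operation, the audit list can be filtered in a script. The following sketch picks the row with the latest timestamp. The function name latest_audit_id is hypothetical, and the comparison assumes that all timestamps use the same UTC offset, as in the sample above, so that they sort correctly as plain strings.

```shell
# latest_audit_id: read `tiup cluster audit` output on stdin and print the
# audit ID of the row with the greatest timestamp.
# Assumption: column 1 is the ID and column 2 an ISO 8601 timestamp, and all
# timestamps share one UTC offset so that string comparison orders them.
latest_audit_id() {
  awk '
    $2 ~ /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]T/ && $2 > latest {
      latest = $2
      id = $1
    }
    END { if (id != "") print id }
  '
}
```

A possible use is `tiup cluster audit "$(tiup cluster audit | latest_audit_id)"`.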
Run commands on a host in the TiDB cluster
To run a command on a host in the TiDB cluster, use the exec command. The usage of the exec command is as follows:
Usage:
cluster exec <cluster-name> [flags]
Flags:
--command string the command run on cluster host (default "ls")
-h, --help help for exec
-N, --node strings Only exec on host with specified nodes
-R, --role strings Only exec on host with specified roles
--sudo use root permissions (default false)
Global Flags:
--ssh-timeout int Timeout in seconds to connect host via SSH, ignored for operations that don't need an SSH connection. (default 5)
-y, --yes Skip all confirmations and assumes 'yes'
For example, to execute ls /tmp on all hosts in the cluster (no -N or -R filter is specified), run the following command:
tiup cluster exec test-cluster --command='ls /tmp'
Cluster controllers
Before TiUP was released, you could control the cluster using tidb-ctl, tikv-ctl, pd-ctl, and other tools. To make the tools easier to download and use, TiUP integrates them into an all-in-one component, ctl.
Usage:
tiup ctl {tidb/pd/tikv/binlog/etcd} [flags]
Flags:
-h, --help help for tiup
The ctl subcommands correspond to the previous tools as follows:
tidb-ctl [args] = tiup ctl tidb [args]
pd-ctl [args] = tiup ctl pd [args]
tikv-ctl [args] = tiup ctl tikv [args]
binlogctl [args] = tiup ctl binlog [args]
etcdctl [args] = tiup ctl etcd [args]
For example, if you previously viewed the store by running pd-ctl -u http://127.0.0.1:2379 store, now you can run the following command in TiUP:
tiup ctl pd -u http://127.0.0.1:2379 store
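For existing scripts that still call the old tool names, the mapping above can be wrapped in a small helper. This is only a sketch: the function name legacy_to_tiup_ctl is hypothetical, and it forwards arguments unchanged in the tiup ctl form shown in this document.

```shell
# legacy_to_tiup_ctl TOOL [ARGS...]
# Translate a legacy tool name into the matching `tiup ctl` subcommand and
# run it with the remaining arguments unchanged.
legacy_to_tiup_ctl() {
  local tool=$1
  shift
  case "$tool" in
    tidb-ctl)  set -- tidb "$@" ;;
    pd-ctl)    set -- pd "$@" ;;
    tikv-ctl)  set -- tikv "$@" ;;
    binlogctl) set -- binlog "$@" ;;
    etcdctl)   set -- etcd "$@" ;;
    *)
      echo "unknown tool: $tool" >&2
      return 1
      ;;
  esac
  tiup ctl "$@"
}
```

For example, `legacy_to_tiup_ctl pd-ctl -u http://127.0.0.1:2379 store` runs the same command as the one shown above.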
Environment checks for target machines
You can use the check command to perform a series of checks on the environment of the target machine and output the check results. By executing the check command, you can find common misconfigurations or unsupported situations. The command flag list is as follows:
Usage:
tiup cluster check <topology.yml | cluster-name> [flags]
Flags:
--apply Try to fix failed checks
--cluster Check existing cluster, the input is a cluster name.
--enable-cpu Enable CPU thread count check
--enable-disk Enable disk IO (fio) check
--enable-mem Enable memory size check
-h, --help help for check
-i, --identity_file string The path of the SSH identity file. If specified, public key authentication will be used.
-p, --password Use password of target hosts. If specified, password authentication will be used.
--user string The user name to login via SSH. The user must has root (or sudo) privilege.
By default, this command checks the environment before deployment. By specifying the --cluster flag to switch the mode, you can also check the target machines of an existing cluster. For example:
# check the target servers before deployment
tiup cluster check topology.yml --user tidb -p
# check deployed servers of an existing cluster
tiup cluster check <cluster-name> --cluster
The CPU thread count check, memory size check, and disk performance check are disabled by default. For the production environment, it is recommended that you enable the three checks and make sure they pass to obtain the best performance.
- CPU: If the number of threads is greater than or equal to 16, the check is passed.
- Memory: If the total size of physical memory is greater than or equal to 32 GB, the check is passed.
- Disk: Execute the fio test on the partitions of data_dir and record the results.
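The CPU and memory thresholds in the list above can be sketched as two shell functions. This is an illustration of the pass criteria only, not how tiup cluster check is implemented: the function names and the Pass/Fail labels are assumptions made for this sketch, and the caller supplies the measured values.

```shell
# Thresholds taken from the list above.
MIN_CPU_THREADS=16
MIN_MEM_GB=32

# Hypothetical helper: $1 is the number of CPU threads on the machine.
check_cpu_threads() {
  if [ "$1" -ge "$MIN_CPU_THREADS" ]; then
    echo "Pass"
  else
    echo "Fail"
  fi
}

# Hypothetical helper: $1 is the total physical memory in whole GB.
check_mem_gb() {
  if [ "$1" -ge "$MIN_MEM_GB" ]; then
    echo "Pass"
  else
    echo "Fail"
  fi
}
```

On a Linux host, check_cpu_threads "$(nproc)" gives a quick local estimate before you run the real check with --enable-cpu.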
When running the checks, if the --apply flag is specified, the program automatically repairs the failed items. Automatic repair is limited to items that can be adjusted by modifying the configuration or system parameters. Items that cannot be repaired automatically must be handled manually according to the actual situation.
Environment checks are not mandatory for deploying a cluster. For the production environment, it is recommended to perform environment checks and pass all check items before deployment. If some check items fail, the cluster might still be deployed and run normally, but it might not achieve the best performance.
Use the system's native SSH client to connect to cluster
All operations above performed on the cluster machine use the SSH client embedded in TiUP to connect to the cluster and execute commands. However, in some scenarios, you might also need to use the SSH client native to the control machine system to perform such cluster operations. For example:
- To use an SSH plug-in for authentication
- To use a customized SSH client
Then you can use the --ssh=system command-line flag to enable the system-native command-line tool:
- Deploy a cluster:
tiup cluster deploy <cluster-name> <version> <topo> --ssh=system
- Start a cluster:
tiup cluster start <cluster-name> --ssh=system
- Upgrade a cluster:
tiup cluster upgrade ... --ssh=system
You can add --ssh=system in all cluster operation commands above to use the system's native SSH client.
To avoid adding such a flag in every command, you can use the TIUP_NATIVE_SSH environment variable to specify whether to use the local SSH client:
export TIUP_NATIVE_SSH=true
# or
export TIUP_NATIVE_SSH=1
# or
export TIUP_NATIVE_SSH=enable
If you specify this environment variable and --ssh at the same time, --ssh has higher priority.
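The precedence between the flag and the environment variable can be sketched as follows. This models only the rule described above and is not TiUP's own code: resolve_ssh_client is a hypothetical function, and "embedded" is a label chosen for this sketch to mean the SSH client built into TiUP, not a documented flag value.

```shell
# Illustrative only: models the precedence described above.
# $1: the value given to --ssh, or an empty string if the flag is absent.
# TIUP_NATIVE_SSH is read from the environment.
resolve_ssh_client() {
  flag="$1"
  # An explicit --ssh value always wins over the environment variable.
  if [ -n "$flag" ]; then
    echo "$flag"
    return 0
  fi
  case "${TIUP_NATIVE_SSH:-}" in
    true|1|enable) echo "system" ;;
    *)             echo "embedded" ;;
  esac
}
```

With TIUP_NATIVE_SSH=true and no flag, the function prints system; with the variable unset and no flag, it prints embedded.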
During cluster deployment, if you need to use a password for connection (-p) or a passphrase is configured in the key file, you must ensure that sshpass is installed on the control machine; otherwise, a timeout error is reported.
Migrate control machine and back up TiUP data
The TiUP data is stored in the .tiup directory in the user's home directory. To migrate the control machine, you can take the following steps to copy the .tiup directory to the corresponding target machine:
1. Execute tar czvf tiup.tar.gz .tiup in the home directory of the original machine.
2. Copy tiup.tar.gz to the home directory of the target machine.
3. Execute tar xzvf tiup.tar.gz in the home directory of the target machine.
4. Add the .tiup directory to the PATH environment variable.
   If you use bash and you are a tidb user, you can add export PATH=/home/tidb/.tiup/bin:$PATH in ~/.bashrc and execute source ~/.bashrc. Then make corresponding adjustments according to the shell and the user you use.
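The last step can be sketched as a helper that appends the export line only once, so rerunning it does not duplicate the entry. This is a minimal sketch: add_tiup_to_path is a hypothetical helper, and the rc file and bin directory are passed in as arguments because they depend on your shell and user.

```shell
# Hypothetical helper: append the TiUP bin directory to PATH in an rc file,
# unless the exact export line is already present.
# $1: path of the rc file (for example ~/.bashrc)
# $2: path of the TiUP bin directory (for example /home/tidb/.tiup/bin)
add_tiup_to_path() {
  rc_file="$1"
  tiup_bin="$2"
  # \$PATH stays literal so that it is expanded when the rc file is sourced.
  line="export PATH=${tiup_bin}:\$PATH"
  touch "$rc_file"
  # -x matches the whole line, -F treats the pattern as a fixed string.
  if ! grep -qxF "$line" "$rc_file"; then
    printf '%s\n' "$line" >> "$rc_file"
  fi
}
```

For the tidb user with bash, this would be add_tiup_to_path ~/.bashrc /home/tidb/.tiup/bin, followed by source ~/.bashrc.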
It is recommended that you back up the .tiup directory regularly to avoid the loss of TiUP data caused by abnormal conditions, such as disk damage of the control machine.
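A regular backup could look like the following sketch, which reuses the tar command from the migration steps and adds a timestamp to the archive name. The function name backup_tiup, the archive naming scheme, and the destination directory are assumptions made for this example, not TiUP features.

```shell
# Hypothetical helper: archive <home>/.tiup into <dest>/tiup-<timestamp>.tar.gz
# and print the path of the archive that was created.
# $1: home directory that contains .tiup
# $2: directory in which to store the backup archive
backup_tiup() {
  home_dir="$1"
  dest_dir="$2"
  stamp="$(date +%Y%m%d-%H%M%S)"
  archive="${dest_dir}/tiup-${stamp}.tar.gz"
  mkdir -p "$dest_dir" || return 1
  # -C keeps the paths in the archive relative to the home directory,
  # so the archive can be extracted directly into another home directory.
  tar czf "$archive" -C "$home_dir" .tiup || return 1
  echo "$archive"
}
```

Storing the archive on a different disk or machine from the control machine is what protects against the disk damage mentioned above.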