- Docs Home
- About TiDB
- Quick Start
- Develop
- Overview
- Quick Start
- Build a TiDB Cluster in TiDB Cloud (Developer Tier)
- CRUD SQL in TiDB
- Build a Simple CRUD App with TiDB
- Example Applications
- Connect to TiDB
- Design Database Schema
- Write Data
- Read Data
- Transaction
- Optimize
- Troubleshoot
- Reference
- Cloud Native Development Environment
- Third-party Support
- Deploy
- Software and Hardware Requirements
- Environment Configuration Checklist
- Plan Cluster Topology
- Install and Start
- Verify Cluster Status
- Test Cluster Performance
- Migrate
- Overview
- Migration Tools
- Migration Scenarios
- Migrate from Aurora
- Migrate MySQL of Small Datasets
- Migrate MySQL of Large Datasets
- Migrate and Merge MySQL Shards of Small Datasets
- Migrate and Merge MySQL Shards of Large Datasets
- Migrate from CSV Files
- Migrate from SQL Files
- Migrate from One TiDB Cluster to Another TiDB Cluster
- Migrate from TiDB to MySQL-compatible Databases
- Advanced Migration
- Integrate
- Maintain
- Monitor and Alert
- Troubleshoot
- TiDB Troubleshooting Map
- Identify Slow Queries
- Analyze Slow Queries
- SQL Diagnostics
- Identify Expensive Queries Using Top SQL
- Identify Expensive Queries Using Logs
- Statement Summary Tables
- Troubleshoot Hotspot Issues
- Troubleshoot Increased Read and Write Latency
- Save and Restore the On-Site Information of a Cluster
- Troubleshoot Cluster Setup
- Troubleshoot High Disk I/O Usage
- Troubleshoot Lock Conflicts
- Troubleshoot TiFlash
- Troubleshoot Write Conflicts in Optimistic Transactions
- Troubleshoot Inconsistency Between Data and Indexes
- Performance Tuning
- Tuning Guide
- Configuration Tuning
- System Tuning
- Software Tuning
- SQL Tuning
- Overview
- Understanding the Query Execution Plan
- SQL Optimization Process
- Overview
- Logic Optimization
- Physical Optimization
- Prepare Execution Plan Cache
- Control Execution Plans
- Tutorials
- TiDB Tools
- Overview
- Use Cases
- Download
- TiUP
- Documentation Map
- Overview
- Terminology and Concepts
- Manage TiUP Components
- FAQ
- Troubleshooting Guide
- Command Reference
- Overview
- TiUP Commands
- TiUP Cluster Commands
- Overview
- tiup cluster audit
- tiup cluster check
- tiup cluster clean
- tiup cluster deploy
- tiup cluster destroy
- tiup cluster disable
- tiup cluster display
- tiup cluster edit-config
- tiup cluster enable
- tiup cluster help
- tiup cluster import
- tiup cluster list
- tiup cluster patch
- tiup cluster prune
- tiup cluster reload
- tiup cluster rename
- tiup cluster replay
- tiup cluster restart
- tiup cluster scale-in
- tiup cluster scale-out
- tiup cluster start
- tiup cluster stop
- tiup cluster template
- tiup cluster upgrade
- TiUP DM Commands
- Overview
- tiup dm audit
- tiup dm deploy
- tiup dm destroy
- tiup dm disable
- tiup dm display
- tiup dm edit-config
- tiup dm enable
- tiup dm help
- tiup dm import
- tiup dm list
- tiup dm patch
- tiup dm prune
- tiup dm reload
- tiup dm replay
- tiup dm restart
- tiup dm scale-in
- tiup dm scale-out
- tiup dm start
- tiup dm stop
- tiup dm template
- tiup dm upgrade
- TiDB Cluster Topology Reference
- DM Cluster Topology Reference
- Mirror Reference Guide
- TiUP Components
- PingCAP Clinic Diagnostic Service
- TiDB Operator
- Dumpling
- TiDB Lightning
- TiDB Data Migration
- About TiDB Data Migration
- Architecture
- Quick Start
- Deploy a DM cluster
- Tutorials
- Advanced Tutorials
- Maintain
- Cluster Upgrade
- Tools
- Performance Tuning
- Manage Data Sources
- Manage Tasks
- Export and Import Data Sources and Task Configurations of Clusters
- Handle Alerts
- Daily Check
- Reference
- Architecture
- Command Line
- Configuration Files
- OpenAPI
- Compatibility Catalog
- Secure
- Monitoring and Alerts
- Error Codes
- Glossary
- Example
- Troubleshoot
- Release Notes
- Backup & Restore (BR)
- TiDB Binlog
- TiCDC
- Dumpling
- sync-diff-inspector
- TiSpark
- Reference
- Cluster Architecture
- Key Monitoring Metrics
- Secure
- Privileges
- SQL
- SQL Language Structure and Syntax
- SQL Statements
ADD COLUMN
ADD INDEX
ADMIN
ADMIN CANCEL DDL
ADMIN CHECKSUM TABLE
ADMIN CHECK [TABLE|INDEX]
ADMIN SHOW DDL [JOBS|QUERIES]
ADMIN SHOW TELEMETRY
ALTER DATABASE
ALTER INDEX
ALTER INSTANCE
ALTER PLACEMENT POLICY
ALTER TABLE
ALTER TABLE COMPACT
ALTER USER
ANALYZE TABLE
BACKUP
BATCH
BEGIN
CHANGE COLUMN
COMMIT
CHANGE DRAINER
CHANGE PUMP
CREATE [GLOBAL|SESSION] BINDING
CREATE DATABASE
CREATE INDEX
CREATE PLACEMENT POLICY
CREATE ROLE
CREATE SEQUENCE
CREATE TABLE LIKE
CREATE TABLE
CREATE USER
CREATE VIEW
DEALLOCATE
DELETE
DESC
DESCRIBE
DO
DROP [GLOBAL|SESSION] BINDING
DROP COLUMN
DROP DATABASE
DROP INDEX
DROP PLACEMENT POLICY
DROP ROLE
DROP SEQUENCE
DROP STATS
DROP TABLE
DROP USER
DROP VIEW
EXECUTE
EXPLAIN ANALYZE
EXPLAIN
FLASHBACK TABLE
FLUSH PRIVILEGES
FLUSH STATUS
FLUSH TABLES
GRANT <privileges>
GRANT <role>
INSERT
KILL [TIDB]
LOAD DATA
LOAD STATS
MODIFY COLUMN
PREPARE
RECOVER TABLE
RENAME INDEX
RENAME TABLE
REPLACE
RESTORE
REVOKE <privileges>
REVOKE <role>
ROLLBACK
SELECT
SET DEFAULT ROLE
SET [NAMES|CHARACTER SET]
SET PASSWORD
SET ROLE
SET TRANSACTION
SET [GLOBAL|SESSION] <variable>
SHOW ANALYZE STATUS
SHOW [BACKUPS|RESTORES]
SHOW [GLOBAL|SESSION] BINDINGS
SHOW BUILTINS
SHOW CHARACTER SET
SHOW COLLATION
SHOW [FULL] COLUMNS FROM
SHOW CONFIG
SHOW CREATE PLACEMENT POLICY
SHOW CREATE SEQUENCE
SHOW CREATE TABLE
SHOW CREATE USER
SHOW DATABASES
SHOW DRAINER STATUS
SHOW ENGINES
SHOW ERRORS
SHOW [FULL] FIELDS FROM
SHOW GRANTS
SHOW INDEX [FROM|IN]
SHOW INDEXES [FROM|IN]
SHOW KEYS [FROM|IN]
SHOW MASTER STATUS
SHOW PLACEMENT
SHOW PLACEMENT FOR
SHOW PLACEMENT LABELS
SHOW PLUGINS
SHOW PRIVILEGES
SHOW [FULL] PROCESSSLIST
SHOW PROFILES
SHOW PUMP STATUS
SHOW SCHEMAS
SHOW STATS_HEALTHY
SHOW STATS_HISTOGRAMS
SHOW STATS_META
SHOW STATUS
SHOW TABLE NEXT_ROW_ID
SHOW TABLE REGIONS
SHOW TABLE STATUS
SHOW [FULL] TABLES
SHOW [GLOBAL|SESSION] VARIABLES
SHOW WARNINGS
SHUTDOWN
SPLIT REGION
START TRANSACTION
TABLE
TRACE
TRUNCATE
UPDATE
USE
WITH
- Data Types
- Functions and Operators
- Overview
- Type Conversion in Expression Evaluation
- Operators
- Control Flow Functions
- String Functions
- Numeric Functions and Operators
- Date and Time Functions
- Bit Functions and Operators
- Cast Functions and Operators
- Encryption and Compression Functions
- Locking Functions
- Information Functions
- JSON Functions
- Aggregate (GROUP BY) Functions
- Window Functions
- Miscellaneous Functions
- Precision Math
- Set Operations
- List of Expressions for Pushdown
- TiDB Specific Functions
- Clustered Indexes
- Constraints
- Generated Columns
- SQL Mode
- Table Attributes
- Transactions
- Garbage Collection (GC)
- Views
- Partitioning
- Temporary Tables
- Cached Tables
- Character Set and Collation
- Placement Rules in SQL
- System Tables
mysql
- INFORMATION_SCHEMA
- Overview
ANALYZE_STATUS
CLIENT_ERRORS_SUMMARY_BY_HOST
CLIENT_ERRORS_SUMMARY_BY_USER
CLIENT_ERRORS_SUMMARY_GLOBAL
CHARACTER_SETS
CLUSTER_CONFIG
CLUSTER_HARDWARE
CLUSTER_INFO
CLUSTER_LOAD
CLUSTER_LOG
CLUSTER_SYSTEMINFO
COLLATIONS
COLLATION_CHARACTER_SET_APPLICABILITY
COLUMNS
DATA_LOCK_WAITS
DDL_JOBS
DEADLOCKS
ENGINES
INSPECTION_RESULT
INSPECTION_RULES
INSPECTION_SUMMARY
KEY_COLUMN_USAGE
METRICS_SUMMARY
METRICS_TABLES
PARTITIONS
PLACEMENT_POLICIES
PROCESSLIST
REFERENTIAL_CONSTRAINTS
SCHEMATA
SEQUENCES
SESSION_VARIABLES
SLOW_QUERY
STATISTICS
TABLES
TABLE_CONSTRAINTS
TABLE_STORAGE_STATS
TIDB_HOT_REGIONS
TIDB_HOT_REGIONS_HISTORY
TIDB_INDEXES
TIDB_SERVERS_INFO
TIDB_TRX
TIFLASH_REPLICA
TIKV_REGION_PEERS
TIKV_REGION_STATUS
TIKV_STORE_STATUS
USER_PRIVILEGES
VIEWS
METRICS_SCHEMA
- UI
- TiDB Dashboard
- Overview
- Maintain
- Access
- Overview Page
- Cluster Info Page
- Top SQL Page
- Key Visualizer Page
- Metrics Relation Graph
- SQL Statements Analysis
- Slow Queries Page
- Cluster Diagnostics
- Search Logs Page
- Instance Profiling
- Session Management and Configuration
- FAQ
- CLI
- Command Line Flags
- Configuration File Parameters
- System Variables
- Storage Engines
- Telemetry
- Errors Codes
- Table Filter
- Schedule Replicas by Topology Labels
- FAQs
- Release Notes
- All Releases
- Release Timeline
- TiDB Versioning
- v6.1
- v6.0
- v5.4
- v5.3
- v5.2
- v5.1
- v5.0
- v4.0
- v3.1
- v3.0
- v2.1
- v2.0
- v1.0
- Glossary
Split Region
For each new table created in TiDB, one Region is segmented by default to store the data of this table. This default behavior is controlled by split-table
in the TiDB configuration file. When the data in this Region exceeds the default Region size limit, the Region starts to split into two.
In the above case, because there is only one Region at the beginning, all write requests occur on the TiKV where the Region is located. If there are a large number of writes for the newly created table, hotspots are caused.
To solve the hotspot problem in the above scenario, TiDB introduces the pre-split function, which can pre-split multiple Regions for a certain table according to the specified parameters and scatter them to each TiKV node.
Synopsis
SplitRegionStmt:
SplitSyntaxOption:
TableName:
PartitionNameListOpt:
SplitOption:
RowValue:
Int64Num:
Usage of Split Region
There are two types of Split Region syntax:
The syntax of even split:
SPLIT TABLE table_name [INDEX index_name] BETWEEN (lower_value) AND (upper_value) REGIONS region_num
BETWEEN lower_value AND upper_value REGIONS region_num
defines the upper boundary, the lower boundary, and the Region amount. Then the current region will be evenly spilt into the number of regions (as specified inregion_num
) between the upper and lower boundaries.The syntax of uneven split:
SPLIT TABLE table_name [INDEX index_name] BY (value_list) [, (value_list)] ...
BY value_list…
specifies a series of points manually, based on which the current Region is spilt. It is suitable for scenarios with unevenly distributed data.
The following example shows the result of the SPLIT
statement:
+--------------------+----------------------+
| TOTAL_SPLIT_REGION | SCATTER_FINISH_RATIO |
+--------------------+----------------------+
| 4 | 1.0 |
+--------------------+----------------------+
TOTAL_SPLIT_REGION
: the number of newly split Regions.SCATTER_FINISH_RATIO
: the completion rate of scattering for newly split Regions.1.0
means that all Regions are scattered.0.5
means that only half of the Regions are scattered and the rest are being scattered.
The following two session variables might affect the behavior of the SPLIT
statement:
tidb_wait_split_region_finish
: It might take a while to scatter the Regions. This duration depends on PD scheduling and TiKV loads. This variable is used to control when executing theSPLIT REGION
statement whether to return the results to the client until all Regions are scattered. If its value is set to1
(by default), TiDB returns the results only after the scattering is completed. If its value is set to0
, TiDB returns the results regardless of the scattering status.tidb_wait_split_region_timeout
: This variable is to set the execution timeout of theSPLIT REGION
statement, in seconds. The default value is 300s. If thesplit
operation is not completed within the duration, TiDB returns a timeout error.
Split Table Region
The key of row data in each table is encoded by table_id
and row_id
. The format is as follows:
t[table_id]_r[row_id]
For example, when table_id
is 22 and row_id
is 11:
t22_r11
Row data in the same table have the same table_id
, but each row has its unique row_id
that can be used for Region split.
Even Split
Because row_id
is an integer, the value of the key to be split can be calculated according to the specified lower_value
, upper_value
, and region_num
. TiDB first calculates the step value (step = (upper_value - lower_value)/region_num
). Then split will be done evenly per each "step" between lower_value
and upper_value
to generate the number of Regions as specified by region_num
.
For example, if you want 16 evenly split Regions split from key rangeminInt64
~maxInt64
for table t, you can use this statement:
SPLIT TABLE t BETWEEN (-9223372036854775808) AND (9223372036854775807) REGIONS 16;
This statement splits table t into 16 Regions between minInt64 and maxInt64. If the given primary key range is smaller than the specified one, for example, 0~1000000000, you can use 0 and 1000000000 take place of minInt64 and maxInt64 respectively to split Regions.
SPLIT TABLE t BETWEEN (0) AND (1000000000) REGIONS 16;
Uneven split
If the known data is unevenly distributed, and you want a Region to be split respectively in key ranges -inf ~ 10000, 10000 ~ 90000, and 90000 ~ +inf, you can achieve this by setting fixed points, as shown below:
SPLIT TABLE t BY (10000), (90000);
Split index Region
The key of the index data in the table is encoded by table_id
, index_id
, and the value of the index column. The format is as follows:
t[table_id]_i[index_id][index_value]
For example, when table_id
is 22, index_id
is 5, and index_value
is abc:
t22_i5abc
The table_id
and index_id
of the same index data in one table is the same. To split index Regions, you need to split Regions based on index_value
.
Even Spilt
The way to split index evenly works the same as splitting data evenly. However, calculating the value of step is more complicated, because index_value
might not be an integer.
The values of upper
and lower
are encoded into a byte array firstly. After removing the longest common prefix of lower
and upper
byte array, the first 8 bytes of lower
and upper
are converted into the uint64 format. Then step = (upper - lower)/num
is calculated. After that, the calculated step is encoded into a byte array, which is appended to the longest common prefix of the lower
and upper
byte array for index split. Here is an example:
If the column of the idx
index is of the integer type, you can use the following SQL statement to split index data:
SPLIT TABLE t INDEX idx BETWEEN (-9223372036854775808) AND (9223372036854775807) REGIONS 16;
This statement splits the Region of index idx in table t into 16 Regions from minInt64
to maxInt64
.
If the column of index idx1 is of varchar type, and you want to split index data by prefix letters.
SPLIT TABLE t INDEX idx1 BETWEEN ("a") AND ("z") REGIONS 25;
This statement splits index idx1 into 25 Regions from a~z. The range of Region 1 is [minIndexValue, b)
; the range of Region 2 is [b, c)
; … the range of Region 25 is [y, minIndexValue]
. For the idx
index, data with the a
prefix is written into Region 1, and data with the b
prefix is written into Region 2, and so on.
In the split method above, both data with the y
and z
prefixes are written into Region 25, because the upper bound is not z
, but {
(the character next to z
in ASCII). Therefore, a more accurate split method is as follows:
SPLIT TABLE t INDEX idx1 BETWEEN ("a") AND ("{") REGIONS 26;
This statement splits index idx1 of the table t
into 26 Regions from a~{
. The range of Region 1 is [minIndexValue, b)
; the range of Region 2 is [b, c)
; … the range of Region 25 is [y, z)
, and the range of Region 26 is [z, maxIndexValue)
.
If the column of index idx2
is of time type like timestamp/datetime, and you want to split index Region by year:
SPLIT TABLE t INDEX idx2 BETWEEN ("2010-01-01 00:00:00") AND ("2020-01-01 00:00:00") REGIONS 10;
This statement splits the Region of index idx2
in table t
into 10 Regions from 2010-01-01 00:00:00
to 2020-01-01 00:00:00
. The range of Region 1 is [minIndexValue, 2011-01-01 00:00:00)
; the range of Region 2 is [2011-01-01 00:00:00, 2012-01-01 00:00:00)
and so on.
If you want to split the index Region by day, see the following example:
SPLIT TABLE t INDEX idx2 BETWEEN ("2020-06-01 00:00:00") AND ("2020-07-01 00:00:00") REGIONS 30;
This statement splits the data of June 2020 of index idex2
in table t
into 30 Regions, each Region representing 1 day.
Region split methods for other types of index columns are similar.
For data Region split of joint indexes, the only difference is that you can specify multiple columns values.
For example, index idx3 (a, b)
contains 2 columns, with column a
of timestamp type and column b
int. If you just want to do a time range split according to column a
, you can use the SQL statement for splitting time index of a single column. In this case, do not specify the value of column b
in lower_value
and upper_velue
.
SPLIT TABLE t INDEX idx3 BETWEEN ("2010-01-01 00:00:00") AND ("2020-01-01 00:00:00") REGIONS 10;
Within the same range of time, if you want to do one more split according to column b column. Just specify the value for column b when splitting.
SPLIT TABLE t INDEX idx3 BETWEEN ("2010-01-01 00:00:00", "a") AND ("2010-01-01 00:00:00", "z") REGIONS 10;
This statement splits 10 Regions in the range of a~z according to the value of column b, with the same time prefix as column a. If the value specified for column a is different, the value of column b might not be used in this case.
Uneven Split
Index data can also be split by specified index values.
For example, there is idx4 (a,b)
, with column a
of the varchar type and column b
of the timestamp type.
SPLIT TABLE t1 INDEX idx4 BY ("a", "2000-01-01 00:00:01"), ("b", "2019-04-17 14:26:19"), ("c", "");
This statement specifies 3 values to split 4 Regions. The range of each Region is as follows:
region1 [ minIndexValue , ("a", "2000-01-01 00:00:01"))
region2 [("a", "2000-01-01 00:00:01") , ("b", "2019-04-17 14:26:19"))
region3 [("b", "2019-04-17 14:26:19") , ("c", "") )
region4 [("c", "") , maxIndexValue )
Split Regions for partitioned tables
Splitting Regions for partitioned tables is the same as splitting Regions for ordinary tables. The only difference is that the same split operation is performed for every partition.
The syntax of even split:
SPLIT [PARTITION] TABLE t [PARTITION] [(partition_name_list...)] [INDEX index_name] BETWEEN (lower_value) AND (upper_value) REGIONS region_num
The syntax of uneven split:
SPLIT [PARTITION] TABLE table_name [PARTITION (partition_name_list...)] [INDEX index_name] BY (value_list) [, (value_list)] ...
Examples of Split Regions for partitioned tables
Create a partitioned table
t
. Suppose that you want to create a Hash table divided into two partitions. The example statement is as follows:create table t (a int,b int,index idx(a)) partition by hash(a) partitions 2;
After creating the table
t
, a Region is split for each partition. Use theSHOW TABLE REGIONS
syntax to view the Regions of this table:show table t regions;
+-----------+-----------+---------+-----------+-----------------+------------------+------------+---------------+------------+----------------------+------------------+ | REGION_ID | START_KEY | END_KEY | LEADER_ID | LEADER_STORE_ID | PEERS | SCATTERING | WRITTEN_BYTES | READ_BYTES | APPROXIMATE_SIZE(MB) | APPROXIMATE_KEYS | +-----------+-----------+---------+-----------+-----------------+------------------+------------+---------------+------------+----------------------+------------------+ | 1978 | t_1400_ | t_1401_ | 1979 | 4 | 1979, 1980, 1981 | 0 | 0 | 0 | 1 | 0 | | 6 | t_1401_ | | 17 | 4 | 17, 18, 21 | 0 | 223 | 0 | 1 | 0 | +-----------+-----------+---------+-----------+-----------------+------------------+------------+---------------+------------+----------------------+------------------+
Use the
SPLIT
syntax to split a Region for each partition. Suppose that you want to split the data in the[0,10000]
range of each partition into four Regions. The example statement is as follows:split partition table t between (0) and (10000) regions 4;
In the above statement,
0
and10000
respectively represent therow_id
of the upper and lower boundaries corresponding to the hotspot data you want to scatter.NoteThis example only applies to scenarios where hotspot data is evenly distributed. If the hotspot data is unevenly distributed in a specified data range, refer to the syntax of uneven split in Split Regions for partitioned tables.
Use the
SHOW TABLE REGIONS
syntax to view the Regions of this table again. You can see that this table now has ten Regions, each partition with five Regions, four of which are the row data and one is the index data.show table t regions;
+-----------+---------------+---------------+-----------+-----------------+------------------+------------+---------------+------------+----------------------+------------------+ | REGION_ID | START_KEY | END_KEY | LEADER_ID | LEADER_STORE_ID | PEERS | SCATTERING | WRITTEN_BYTES | READ_BYTES | APPROXIMATE_SIZE(MB) | APPROXIMATE_KEYS | +-----------+---------------+---------------+-----------+-----------------+------------------+------------+---------------+------------+----------------------+------------------+ | 1998 | t_1400_r | t_1400_r_2500 | 2001 | 5 | 2000, 2001, 2015 | 0 | 132 | 0 | 1 | 0 | | 2006 | t_1400_r_2500 | t_1400_r_5000 | 2016 | 1 | 2007, 2016, 2017 | 0 | 35 | 0 | 1 | 0 | | 2010 | t_1400_r_5000 | t_1400_r_7500 | 2012 | 2 | 2011, 2012, 2013 | 0 | 35 | 0 | 1 | 0 | | 1978 | t_1400_r_7500 | t_1401_ | 1979 | 4 | 1979, 1980, 1981 | 0 | 621 | 0 | 1 | 0 | | 1982 | t_1400_ | t_1400_r | 2014 | 3 | 1983, 1984, 2014 | 0 | 35 | 0 | 1 | 0 | | 1990 | t_1401_r | t_1401_r_2500 | 1992 | 2 | 1991, 1992, 2020 | 0 | 120 | 0 | 1 | 0 | | 1994 | t_1401_r_2500 | t_1401_r_5000 | 1997 | 5 | 1996, 1997, 2021 | 0 | 129 | 0 | 1 | 0 | | 2002 | t_1401_r_5000 | t_1401_r_7500 | 2003 | 4 | 2003, 2023, 2022 | 0 | 141 | 0 | 1 | 0 | | 6 | t_1401_r_7500 | | 17 | 4 | 17, 18, 21 | 0 | 601 | 0 | 1 | 0 | | 1986 | t_1401_ | t_1401_r | 1989 | 5 | 1989, 2018, 2019 | 0 | 123 | 0 | 1 | 0 | +-----------+---------------+---------------+-----------+-----------------+------------------+------------+---------------+------------+----------------------+------------------+
You can also split Regions for the index of each partition. For example, you can split the
[1000,10000]
range of theidx
index into two Regions. The example statement is as follows:split partition table t index idx between (1000) and (10000) regions 2;
Examples of Split Region for a single partition
You can specify the partition to be split.
Create a partitioned table. Suppose that you want to create a Range partitioned table split into three partitions. The example statement is as follows:
create table t ( a int, b int, index idx(b)) partition by range( a ) ( partition p1 values less than (10000), partition p2 values less than (20000), partition p3 values less than (MAXVALUE) );
Suppose that you want to split the data in the
[0,10000]
range of thep1
partition into two Regions. The example statement is as follows:split partition table t partition (p1) between (0) and (10000) regions 2;
Suppose that you want to split the data in the
[10000,20000]
range of thep2
partition into two Regions. The example statement is as follows:split partition table t partition (p2) between (10000) and (20000) regions 2;
You can use the
SHOW TABLE REGIONS
syntax to view the Regions of this table:show table t regions;
+-----------+----------------+----------------+-----------+-----------------+------------------+------------+---------------+------------+----------------------+------------------+ | REGION_ID | START_KEY | END_KEY | LEADER_ID | LEADER_STORE_ID | PEERS | SCATTERING | WRITTEN_BYTES | READ_BYTES | APPROXIMATE_SIZE(MB) | APPROXIMATE_KEYS | +-----------+----------------+----------------+-----------+-----------------+------------------+------------+---------------+------------+----------------------+------------------+ | 2040 | t_1406_ | t_1406_r_5000 | 2045 | 3 | 2043, 2045, 2044 | 0 | 0 | 0 | 1 | 0 | | 2032 | t_1406_r_5000 | t_1407_ | 2033 | 4 | 2033, 2034, 2035 | 0 | 0 | 0 | 1 | 0 | | 2046 | t_1407_ | t_1407_r_15000 | 2048 | 2 | 2047, 2048, 2050 | 0 | 35 | 0 | 1 | 0 | | 2036 | t_1407_r_15000 | t_1408_ | 2037 | 4 | 2037, 2038, 2039 | 0 | 0 | 0 | 1 | 0 | | 6 | t_1408_ | | 17 | 4 | 17, 18, 21 | 0 | 214 | 0 | 1 | 0 | +-----------+----------------+----------------+-----------+-----------------+------------------+------------+---------------+------------+----------------------+------------------+
Suppose that you want to split the
[0,20000]
range of theidx
index of thep1
andp2
partitions into two Regions. The example statement is as follows:split partition table t partition (p1,p2) index idx between (0) and (20000) regions 2;
pre_split_regions
To have evenly split Regions when a table is created, it is recommended you use SHARD_ROW_ID_BITS
together with PRE_SPLIT_REGIONS
. When a table is created successfully, PRE_SPLIT_REGIONS
pre-spilts tables into the number of Regions as specified by 2^(PRE_SPLIT_REGIONS)
.
The value of PRE_SPLIT_REGIONS
must be less than or equal to that of SHARD_ROW_ID_BITS
.
The tidb_scatter_region
global variable affects the behavior of PRE_SPLIT_REGIONS
. This variable controls whether to wait for Regions to be pre-split and scattered before returning results after the table creation. If there are intensive writes after creating the table, you need to set the value of this variable to 1
, then TiDB will not return the results to the client until all the Regions are split and scattered. Otherwise, TiDB writes the data before the scattering is completed, which will have a significant impact on write performance.
Examples of pre_split_regions
create table t (a int, b int,index idx1(a)) shard_row_id_bits = 4 pre_split_regions=2;
After building the table, this statement splits 4 + 1
Regions for table t. 4 (2^2)
Regions are used to save table row data, and 1 Region is for saving the index data of idx1
.
The ranges of the 4 table Regions are as follows:
region1: [ -inf , 1<<61 )
region2: [ 1<<61 , 2<<61 )
region3: [ 2<<61 , 3<<61 )
region4: [ 3<<61 , +inf )
The Region split by the Split Region statement is controlled by the Region merge scheduler in PD. To avoid PD re-merging the newly split Region soon after, you need to dynamically modify configuration items related to the Region merge feature.
MySQL compatibility
This statement is a TiDB extension to MySQL syntax.
See also
- SHOW TABLE REGIONS
- Session variables:
tidb_scatter_region
,tidb_wait_split_region_finish
andtidb_wait_split_region_timeout
.