Insert Data
This document describes how to insert data into TiDB using SQL, with examples in several programming languages.
Before you start
Before reading this document, you need to prepare the following:
- Build a TiDB Cluster in TiDB Cloud (Developer Tier).
- Read Schema Design Overview, Create a Database, Create a Table, and Create Secondary Indexes.
Insert rows
There are two ways to insert multiple rows of data. For example, if you need to insert the data of 3 players, you can use either of the following.
A multi-line insertion statement:

```sql
INSERT INTO `player` (`id`, `coins`, `goods`) VALUES (1, 1000, 1), (2, 230, 2), (3, 300, 5);
```
Multiple single-line insertion statements:

```sql
INSERT INTO `player` (`id`, `coins`, `goods`) VALUES (1, 1000, 1);
INSERT INTO `player` (`id`, `coins`, `goods`) VALUES (2, 230, 2);
INSERT INTO `player` (`id`, `coins`, `goods`) VALUES (3, 300, 5);
```
Generally, the multi-line insertion statement runs faster than multiple single-line insertion statements.
The following examples show how to insert data in SQL, Java, and Golang.
In SQL:

```sql
CREATE TABLE `player` (`id` INT, `coins` INT, `goods` INT);
INSERT INTO `player` (`id`, `coins`, `goods`) VALUES (1, 1000, 1), (2, 230, 2);
```
For more information on how to use this SQL, see Connecting to a TiDB Cluster and follow its steps to enter the SQL statements after connecting to a TiDB cluster with a client.
In Java:

```java
// ds is an instance of com.mysql.cj.jdbc.MysqlDataSource
try (Connection connection = ds.getConnection()) {
    connection.setAutoCommit(false);

    PreparedStatement pstmt = connection.prepareStatement("INSERT INTO player (id, coins, goods) VALUES (?, ?, ?)");

    // first player
    pstmt.setInt(1, 1);
    pstmt.setInt(2, 1000);
    pstmt.setInt(3, 1);
    pstmt.addBatch();

    // second player
    pstmt.setInt(1, 2);
    pstmt.setInt(2, 230);
    pstmt.setInt(3, 2);
    pstmt.addBatch();

    pstmt.executeBatch();
    connection.commit();
} catch (SQLException e) {
    e.printStackTrace();
}
```
Due to the default MySQL JDBC Driver settings, you need to change some parameters to get better bulk insert performance.
| Parameter | Meaning | Recommended Scenario | Recommended Configuration |
|---|---|---|---|
| `useServerPrepStmts` | Whether to use server-side prepared statements | When you need to use a prepared statement more than once | `true` |
| `cachePrepStmts` | Whether the client caches prepared statements | When `useServerPrepStmts=true` | `true` |
| `prepStmtCacheSqlLimit` | Maximum size of a prepared statement (256 characters by default) | When the prepared statement is longer than 256 characters | Configure according to the actual size of the prepared statements |
| `prepStmtCacheSize` | Maximum number of cached prepared statements (25 by default) | When the number of prepared statements is greater than 25 | Configure according to the actual number of prepared statements |
| `rewriteBatchedStatements` | Whether to rewrite batched statements | When batch operations are required | `true` |
| `allowMultiQueries` | Enable batch operations | Required because of a client bug when `rewriteBatchedStatements=true` and `useServerPrepStmts=true` | `true` |
The MySQL JDBC Driver also provides an integrated configuration, `useConfigs`. When it is set to `maxPerformance`, it is equivalent to configuring a set of options. Taking `mysql:mysql-connector-java:8.0.28` as an example, `useConfigs=maxPerformance` contains:
```
cachePrepStmts=true
cacheCallableStmts=true
cacheServerConfiguration=true
useLocalSessionState=true
elideSetAutoCommits=true
alwaysSendSetIsolation=false
enableQueryTimeouts=false
connectionAttributes=none
useInformationSchema=true
```
You can check `mysql-connector-java-{version}.jar!/com/mysql/cj/configurations/maxPerformance.properties` to see the configurations contained in `useConfigs=maxPerformance` for the corresponding version of the MySQL JDBC Driver.
The following is a typical JDBC connection string configuration. In this example, the host is `127.0.0.1`, the port is `4000`, the user name is `root`, the password is empty, and the default database is `test`:

```
jdbc:mysql://127.0.0.1:4000/test?user=root&useConfigs=maxPerformance&useServerPrepStmts=true&prepStmtCacheSqlLimit=2048&prepStmtCacheSize=256&rewriteBatchedStatements=true&allowMultiQueries=true
```
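If you build the connection with a `DataSource`, the `ds` object referenced in the Java example above might be created as follows. This is a minimal sketch, assuming MySQL Connector/J's `com.mysql.cj.jdbc.MysqlDataSource`; the class name `DataSourceExample` and the connection details are illustrative:

```java
import com.mysql.cj.jdbc.MysqlDataSource;

public class DataSourceExample {
    public static MysqlDataSource createDataSource() {
        MysqlDataSource ds = new MysqlDataSource();
        // Same URL as above: local TiDB on port 4000, user root, empty password,
        // default database test, plus the recommended bulk-insert parameters.
        ds.setURL("jdbc:mysql://127.0.0.1:4000/test"
                + "?user=root&useConfigs=maxPerformance&useServerPrepStmts=true"
                + "&prepStmtCacheSqlLimit=2048&prepStmtCacheSize=256"
                + "&rewriteBatchedStatements=true&allowMultiQueries=true");
        return ds;
    }
}
```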
For a complete example in Java, see:
In Golang:

```go
package main

import (
	"database/sql"
	"strings"

	_ "github.com/go-sql-driver/mysql"
)

type Player struct {
	ID    string
	Coins int
	Goods int
}

// bulkInsertPlayers inserts players in batches of batchSize rows
// within a single transaction.
func bulkInsertPlayers(db *sql.DB, players []Player, batchSize int) error {
	tx, err := db.Begin()
	if err != nil {
		return err
	}

	stmt, err := tx.Prepare(buildBulkInsertSQL(batchSize))
	if err != nil {
		return err
	}
	defer stmt.Close()

	// Insert full batches with the prepared statement.
	for len(players) > batchSize {
		if _, err := stmt.Exec(playerToArgs(players[:batchSize])...); err != nil {
			tx.Rollback()
			return err
		}
		players = players[batchSize:]
	}

	// Insert the remaining rows with a statement sized to the remainder.
	if len(players) != 0 {
		if _, err := tx.Exec(buildBulkInsertSQL(len(players)), playerToArgs(players)...); err != nil {
			tx.Rollback()
			return err
		}
	}

	if err := tx.Commit(); err != nil {
		tx.Rollback()
		return err
	}

	return nil
}

// playerToArgs flattens players into a driver-ready argument list.
func playerToArgs(players []Player) []interface{} {
	var args []interface{}
	for _, player := range players {
		args = append(args, player.ID, player.Coins, player.Goods)
	}
	return args
}

// buildBulkInsertSQL builds an INSERT statement with `amount` value tuples.
func buildBulkInsertSQL(amount int) string {
	return "INSERT INTO player (id, coins, goods) VALUES (?, ?, ?)" + strings.Repeat(",(?,?,?)", amount-1)
}
```
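A minimal usage sketch for the function above, not part of the original example: it assumes a local TiDB cluster at `127.0.0.1:4000` with user `root`, an empty password, and the `player` table in the `test` database, and it requires adding `"log"` to the imports:

```go
func main() {
	// The DSN is an assumption: adjust host, port, user, password,
	// and database to match your environment.
	db, err := sql.Open("mysql", "root:@tcp(127.0.0.1:4000)/test")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	players := []Player{
		{ID: "1", Coins: 1000, Goods: 1},
		{ID: "2", Coins: 230, Goods: 2},
		{ID: "3", Coins: 300, Goods: 5},
	}

	// Insert in batches of 2 rows per statement.
	if err := bulkInsertPlayers(db, players, 2); err != nil {
		log.Fatal(err)
	}
}
```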
For a complete example in Golang, see:
Bulk-Insert
If you need to quickly import a large amount of data into a TiDB cluster, it is recommended that you use the data migration tools provided by PingCAP. Using `INSERT` statements is not the best way, because it is not efficient and requires you to handle exceptions and other issues on your own.
The following are the recommended tools for bulk insert:

- Data export: Dumpling. You can export MySQL or TiDB data to local storage or Amazon S3.
- Data import: TiDB Lightning. You can import data exported by Dumpling, a CSV file, or migrate data from Amazon Aurora to TiDB. It also supports reading data from a local disk or Amazon S3.
- Data replication: TiDB Data Migration. You can replicate MySQL, MariaDB, and Amazon Aurora databases to TiDB. It also supports merging and migrating sharded instances and tables from the source databases.
- Data backup and restore: Backup & Restore (BR). Compared to Dumpling, BR is more suitable for big data scenarios.

If you use TiDB Cloud, the following are also available:

- Data import: the Data Import Task page in the TiDB Cloud console. You can import data exported by Dumpling, a CSV file, or migrate data from Amazon Aurora to TiDB. It also supports reading data from a local disk, Amazon S3, or GCS.
- Data backup and restore: the Backup page in the TiDB Cloud console. Compared to Dumpling, backup and restore is more suitable for big data scenarios.
Avoid hotspots
When designing a table, consider whether it will receive a large number of insert operations. If so, avoid hotspots during table design. See the Select primary key section and follow the rules for selecting a primary key.
For more information on how to handle hotspot issues, see Troubleshoot Hotspot Issues.
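For example, one common way to scatter writes across Regions is to use an `AUTO_RANDOM` primary key instead of an auto-increment one. The following is a sketch only; the table and column names are illustrative:

```sql
CREATE TABLE `orders` (
    `id` BIGINT PRIMARY KEY AUTO_RANDOM,
    `amount` DECIMAL(15,2)
);
```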
Insert data to a table with the `AUTO_RANDOM` primary key
If the primary key of the table you insert data into has the `AUTO_RANDOM` attribute, then by default the primary key cannot be specified explicitly. For example, in the `bookshop` database, the `id` field of the `users` table has the `AUTO_RANDOM` attribute.
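For reference, such a table might be defined as follows. This is a sketch only; the actual `bookshop`.`users` schema may differ:

```sql
CREATE TABLE `bookshop`.`users` (
    `id` BIGINT PRIMARY KEY AUTO_RANDOM,
    `balance` DECIMAL(15,2),
    `nickname` VARCHAR(100)
);
```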
In this case, you cannot use SQL like the following to insert data:

```sql
INSERT INTO `bookshop`.`users` (`id`, `balance`, `nickname`) VALUES (1, 0.00, 'nicky');
```
An error will occur:

```
ERROR 8216 (HY000): Invalid auto random: Explicit insertion on auto_random column is disabled. Try to set @@allow_auto_random_explicit_insert = true.
```
It is not recommended to manually specify the `AUTO_RANDOM` column at insertion time.
There are two ways to handle this error:

- (Recommended) Remove this column from the insert statement and use the `AUTO_RANDOM` value that TiDB initializes for you. This fits the semantics of `AUTO_RANDOM`:

  ```sql
  INSERT INTO `bookshop`.`users` (`balance`, `nickname`) VALUES (0.00, 'nicky');
  ```

- If you are sure that you must specify this column, you can use the `SET` statement to change a user variable, allowing the `AUTO_RANDOM` column to be specified at insertion time:

  ```sql
  SET @@allow_auto_random_explicit_insert = true;
  INSERT INTO `bookshop`.`users` (`id`, `balance`, `nickname`) VALUES (1, 0.00, 'nicky');
  ```
Use HTAP
In TiDB, HTAP capabilities save you from performing additional operations when inserting data. There is no additional insertion logic, and TiDB automatically guarantees data consistency. All you need to do is turn on column-oriented replica synchronization after creating the table, and then use the column-oriented replica to speed up your queries.
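For example, turning on a column-oriented (TiFlash) replica is a single DDL statement. This sketch assumes the `bookshop`.`users` table from the previous section and at least one TiFlash node in the cluster:

```sql
ALTER TABLE `bookshop`.`users` SET TIFLASH REPLICA 1;
```

Once the replica is in sync, the optimizer can read from it automatically for queries that benefit from columnar scans.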