- Docs Home
- About TiDB
- Quick Start
- Develop
- Overview
- Quick Start
- Build a TiDB Cluster in TiDB Cloud (Developer Tier)
- CRUD SQL in TiDB
- Build a Simple CRUD App with TiDB
- Example Applications
- Connect to TiDB
- Design Database Schema
- Write Data
- Read Data
- Transaction
- Optimize
- Troubleshoot
- Reference
- Cloud Native Development Environment
- Third-party Support
- Deploy
- Software and Hardware Requirements
- Environment Configuration Checklist
- Plan Cluster Topology
- Install and Start
- Verify Cluster Status
- Test Cluster Performance
- Migrate
- Overview
- Migration Tools
- Migration Scenarios
- Migrate from Aurora
- Migrate MySQL of Small Datasets
- Migrate MySQL of Large Datasets
- Migrate and Merge MySQL Shards of Small Datasets
- Migrate and Merge MySQL Shards of Large Datasets
- Migrate from CSV Files
- Migrate from SQL Files
- Migrate from One TiDB Cluster to Another TiDB Cluster
- Migrate from TiDB to MySQL-compatible Databases
- Advanced Migration
- Integrate
- Maintain
- Monitor and Alert
- Troubleshoot
- TiDB Troubleshooting Map
- Identify Slow Queries
- Analyze Slow Queries
- SQL Diagnostics
- Identify Expensive Queries Using Top SQL
- Identify Expensive Queries Using Logs
- Statement Summary Tables
- Troubleshoot Hotspot Issues
- Troubleshoot Increased Read and Write Latency
- Save and Restore the On-Site Information of a Cluster
- Troubleshoot Cluster Setup
- Troubleshoot High Disk I/O Usage
- Troubleshoot Lock Conflicts
- Troubleshoot TiFlash
- Troubleshoot Write Conflicts in Optimistic Transactions
- Troubleshoot Inconsistency Between Data and Indexes
- Performance Tuning
- Tuning Guide
- Configuration Tuning
- System Tuning
- Software Tuning
- SQL Tuning
- Overview
- Understanding the Query Execution Plan
- SQL Optimization Process
- Overview
- Logic Optimization
- Physical Optimization
- Prepare Execution Plan Cache
- Control Execution Plans
- Tutorials
- TiDB Tools
- Overview
- Use Cases
- Download
- TiUP
- Documentation Map
- Overview
- Terminology and Concepts
- Manage TiUP Components
- FAQ
- Troubleshooting Guide
- Command Reference
- Overview
- TiUP Commands
- TiUP Cluster Commands
- Overview
- tiup cluster audit
- tiup cluster check
- tiup cluster clean
- tiup cluster deploy
- tiup cluster destroy
- tiup cluster disable
- tiup cluster display
- tiup cluster edit-config
- tiup cluster enable
- tiup cluster help
- tiup cluster import
- tiup cluster list
- tiup cluster patch
- tiup cluster prune
- tiup cluster reload
- tiup cluster rename
- tiup cluster replay
- tiup cluster restart
- tiup cluster scale-in
- tiup cluster scale-out
- tiup cluster start
- tiup cluster stop
- tiup cluster template
- tiup cluster upgrade
- TiUP DM Commands
- Overview
- tiup dm audit
- tiup dm deploy
- tiup dm destroy
- tiup dm disable
- tiup dm display
- tiup dm edit-config
- tiup dm enable
- tiup dm help
- tiup dm import
- tiup dm list
- tiup dm patch
- tiup dm prune
- tiup dm reload
- tiup dm replay
- tiup dm restart
- tiup dm scale-in
- tiup dm scale-out
- tiup dm start
- tiup dm stop
- tiup dm template
- tiup dm upgrade
- TiDB Cluster Topology Reference
- DM Cluster Topology Reference
- Mirror Reference Guide
- TiUP Components
- PingCAP Clinic Diagnostic Service
- TiDB Operator
- Dumpling
- TiDB Lightning
- TiDB Data Migration
- About TiDB Data Migration
- Architecture
- Quick Start
- Deploy a DM cluster
- Tutorials
- Advanced Tutorials
- Maintain
- Cluster Upgrade
- Tools
- Performance Tuning
- Manage Data Sources
- Manage Tasks
- Export and Import Data Sources and Task Configurations of Clusters
- Handle Alerts
- Daily Check
- Reference
- Architecture
- Command Line
- Configuration Files
- OpenAPI
- Compatibility Catalog
- Secure
- Monitoring and Alerts
- Error Codes
- Glossary
- Example
- Troubleshoot
- Release Notes
- Backup & Restore (BR)
- TiDB Binlog
- TiCDC
- Dumpling
- sync-diff-inspector
- TiSpark
- Reference
- Cluster Architecture
- Key Monitoring Metrics
- Secure
- Privileges
- SQL
- SQL Language Structure and Syntax
- SQL Statements
ADD COLUMNADD INDEXADMINADMIN CANCEL DDLADMIN CHECKSUM TABLEADMIN CHECK [TABLE|INDEX]ADMIN SHOW DDL [JOBS|QUERIES]ADMIN SHOW TELEMETRYALTER DATABASEALTER INDEXALTER INSTANCEALTER PLACEMENT POLICYALTER TABLEALTER TABLE COMPACTALTER USERANALYZE TABLEBACKUPBATCHBEGINCHANGE COLUMNCOMMITCHANGE DRAINERCHANGE PUMPCREATE [GLOBAL|SESSION] BINDINGCREATE DATABASECREATE INDEXCREATE PLACEMENT POLICYCREATE ROLECREATE SEQUENCECREATE TABLE LIKECREATE TABLECREATE USERCREATE VIEWDEALLOCATEDELETEDESCDESCRIBEDODROP [GLOBAL|SESSION] BINDINGDROP COLUMNDROP DATABASEDROP INDEXDROP PLACEMENT POLICYDROP ROLEDROP SEQUENCEDROP STATSDROP TABLEDROP USERDROP VIEWEXECUTEEXPLAIN ANALYZEEXPLAINFLASHBACK TABLEFLUSH PRIVILEGESFLUSH STATUSFLUSH TABLESGRANT <privileges>GRANT <role>INSERTKILL [TIDB]LOAD DATALOAD STATSMODIFY COLUMNPREPARERECOVER TABLERENAME INDEXRENAME TABLEREPLACERESTOREREVOKE <privileges>REVOKE <role>ROLLBACKSELECTSET DEFAULT ROLESET [NAMES|CHARACTER SET]SET PASSWORDSET ROLESET TRANSACTIONSET [GLOBAL|SESSION] <variable>SHOW ANALYZE STATUSSHOW [BACKUPS|RESTORES]SHOW [GLOBAL|SESSION] BINDINGSSHOW BUILTINSSHOW CHARACTER SETSHOW COLLATIONSHOW [FULL] COLUMNS FROMSHOW CONFIGSHOW CREATE PLACEMENT POLICYSHOW CREATE SEQUENCESHOW CREATE TABLESHOW CREATE USERSHOW DATABASESSHOW DRAINER STATUSSHOW ENGINESSHOW ERRORSSHOW [FULL] FIELDS FROMSHOW GRANTSSHOW INDEX [FROM|IN]SHOW INDEXES [FROM|IN]SHOW KEYS [FROM|IN]SHOW MASTER STATUSSHOW PLACEMENTSHOW PLACEMENT FORSHOW PLACEMENT LABELSSHOW PLUGINSSHOW PRIVILEGESSHOW [FULL] PROCESSSLISTSHOW PROFILESSHOW PUMP STATUSSHOW SCHEMASSHOW STATS_HEALTHYSHOW STATS_HISTOGRAMSSHOW STATS_METASHOW STATUSSHOW TABLE NEXT_ROW_IDSHOW TABLE REGIONSSHOW TABLE STATUSSHOW [FULL] TABLESSHOW [GLOBAL|SESSION] VARIABLESSHOW WARNINGSSHUTDOWNSPLIT REGIONSTART TRANSACTIONTABLETRACETRUNCATEUPDATEUSEWITH
- Data Types
- Functions and Operators
- Overview
- Type Conversion in Expression Evaluation
- Operators
- Control Flow Functions
- String Functions
- Numeric Functions and Operators
- Date and Time Functions
- Bit Functions and Operators
- Cast Functions and Operators
- Encryption and Compression Functions
- Locking Functions
- Information Functions
- JSON Functions
- Aggregate (GROUP BY) Functions
- Window Functions
- Miscellaneous Functions
- Precision Math
- Set Operations
- List of Expressions for Pushdown
- TiDB Specific Functions
- Clustered Indexes
- Constraints
- Generated Columns
- SQL Mode
- Table Attributes
- Transactions
- Garbage Collection (GC)
- Views
- Partitioning
- Temporary Tables
- Cached Tables
- Character Set and Collation
- Placement Rules in SQL
- System Tables
mysql- INFORMATION_SCHEMA
- Overview
ANALYZE_STATUSCLIENT_ERRORS_SUMMARY_BY_HOSTCLIENT_ERRORS_SUMMARY_BY_USERCLIENT_ERRORS_SUMMARY_GLOBALCHARACTER_SETSCLUSTER_CONFIGCLUSTER_HARDWARECLUSTER_INFOCLUSTER_LOADCLUSTER_LOGCLUSTER_SYSTEMINFOCOLLATIONSCOLLATION_CHARACTER_SET_APPLICABILITYCOLUMNSDATA_LOCK_WAITSDDL_JOBSDEADLOCKSENGINESINSPECTION_RESULTINSPECTION_RULESINSPECTION_SUMMARYKEY_COLUMN_USAGEMETRICS_SUMMARYMETRICS_TABLESPARTITIONSPLACEMENT_POLICIESPROCESSLISTREFERENTIAL_CONSTRAINTSSCHEMATASEQUENCESSESSION_VARIABLESSLOW_QUERYSTATISTICSTABLESTABLE_CONSTRAINTSTABLE_STORAGE_STATSTIDB_HOT_REGIONSTIDB_HOT_REGIONS_HISTORYTIDB_INDEXESTIDB_SERVERS_INFOTIDB_TRXTIFLASH_REPLICATIKV_REGION_PEERSTIKV_REGION_STATUSTIKV_STORE_STATUSUSER_PRIVILEGESVIEWS
METRICS_SCHEMA
- UI
- TiDB Dashboard
- Overview
- Maintain
- Access
- Overview Page
- Cluster Info Page
- Top SQL Page
- Key Visualizer Page
- Metrics Relation Graph
- SQL Statements Analysis
- Slow Queries Page
- Cluster Diagnostics
- Search Logs Page
- Instance Profiling
- Session Management and Configuration
- FAQ
- CLI
- Command Line Flags
- Configuration File Parameters
- System Variables
- Storage Engines
- Telemetry
- Errors Codes
- Table Filter
- Schedule Replicas by Topology Labels
- FAQs
- Release Notes
- All Releases
- Release Timeline
- TiDB Versioning
- v6.1
- v6.0
- v5.4
- v5.3
- v5.2
- v5.1
- v5.0
- v4.0
- v3.1
- v3.0
- v2.1
- v2.0
- v1.0
- Glossary
Unstable Result Set
This document describes how to solve unstable result set errors.
GROUP BY
For convenience, MySQL "extends" the GROUP BY syntax to allow the SELECT clause to refer to non-aggregated fields not declared in the GROUP BY clause, that is, the NON-FULL GROUP BY syntax. In other databases, this is considered a syntax ERROR because it causes unstable result sets.
For example, you have two tables:
stu_infostores the student informationstu_scorestores the student test scores.
Then you can write a SQL query statement like this:
SELECT
`a`.`class`,
`a`.`stuname`,
max( `b`.`courscore` )
FROM
`stu_info` `a`
JOIN `stu_score` `b` ON `a`.`stuno` = `b`.`stuno`
GROUP BY
`a`.`class`,
`a`.`stuname`
ORDER BY
`a`.`class`,
`a`.`stuname`;
Result:
+------------+--------------+------------------+
| class | stuname | max(b.courscore) |
+------------+--------------+------------------+
| 2018_CS_01 | MonkeyDLuffy | 95.5 |
| 2018_CS_03 | PatrickStar | 99.0 |
| 2018_CS_03 | SpongeBob | 95.0 |
+------------+--------------+------------------+
3 rows in set (0.00 sec)
The a.class and a.stuname fields are specified in the GROUP BY statement, and the selected columns are a.class, a.stuname and b.courscore. The only column that is not in the GROUP BY condition, b.courscore, is also specified with a unique value using the max() function. There is ONLY ONE result that satisfies this SQL statement without any ambiguity, which is called the FULL GROUP BY syntax.
A counterexample is the NON-FULL GROUP BY syntax. For example, in these two tables, write the following SQL query (delete a.stuname in GROUP BY).
SELECT
`a`.`class`,
`a`.`stuname`,
max( `b`.`courscore` )
FROM
`stu_info` `a`
JOIN `stu_score` `b` ON `a`.`stuno` = `b`.`stuno`
GROUP BY
`a`.`class`
ORDER BY
`a`.`class`,
`a`.`stuname`;
Then two values that match this SQL are returned.
The first returned value:
+------------+--------------+------------------------+
| class | stuname | max( `b`.`courscore` ) |
+------------+--------------+------------------------+
| 2018_CS_01 | MonkeyDLuffy | 95.5 |
| 2018_CS_03 | PatrickStar | 99.0 |
+------------+--------------+------------------------+
The second returned value:
+------------+--------------+------------------+
| class | stuname | max(b.courscore) |
+------------+--------------+------------------+
| 2018_CS_01 | MonkeyDLuffy | 95.5 |
| 2018_CS_03 | SpongeBob | 99.0 |
+------------+--------------+------------------+
There are two results because you did NOT specify how to get the value of the a.stuname field in SQL, and two results are both satisfied by SQL semantics. It results in an unstable result set. Therefore, if you want to guarantee the stability of the result set of the GROUP BY statement, use the FULL GROUP BY syntax.
MySQL provides a sql_mode switch ONLY_FULL_GROUP_BY to control whether to check the FULL GROUP BY syntax or not. TiDB is also compatible with this sql_mode switch.
mysql> select a.class, a.stuname, max(b.courscore) from stu_info a join stu_score b on a.stuno=b.stuno group by a.class order by a.class, a.stuname;
+------------+--------------+------------------+
| class | stuname | max(b.courscore) |
+------------+--------------+------------------+
| 2018_CS_01 | MonkeyDLuffy | 95.5 |
| 2018_CS_03 | PatrickStar | 99.0 |
+------------+--------------+------------------+
2 rows in set (0.01 sec)
mysql> set @@sql_mode='STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION,ONLY_FULL_GROUP_BY';
Query OK, 0 rows affected (0.01 sec)
mysql> select a.class, a.stuname, max(b.courscore) from stu_info a join stu_score b on a.stuno=b.stuno group by a.class order by a.class, a.stuname;
ERROR 1055 (42000): Expression #2 of ORDER BY is not in GROUP BY clause and contains nonaggregated column '' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
Run results: The above example shows the effect when you set ONLY_FULL_GROUP_BY for sql_mode.
ORDER BY
In SQL semantics, the result set is output in order only if the ORDER BY syntax is used. For a single-instance database, since the data is stored on one server, the results of multiple executions are often stable without data reorganization. Some databases (especially the MySQL InnoDB storage engine) can even output the result sets in order of the primary key or index.
As a distributed database, TiDB stores data on multiple servers. In addition, the TiDB layer does not cache data pages, so the result set order of SQL statements without ORDER BY is easily perceived as unstable. To output a sequential result set, you need to explicitly add the order field into the ORDER BY clause, which conforms to SQL semantics.
In the following example, only one field is added to the ORDER BY clause, and TiDB only sorts the results by that one field.
mysql> select a.class, a.stuname, b.course, b.courscore from stu_info a join stu_score b on a.stuno=b.stuno order by a.class;
+------------+--------------+-------------------------+-----------+
| class | stuname | course | courscore |
+------------+--------------+-------------------------+-----------+
| 2018_CS_01 | MonkeyDLuffy | PrinciplesofDatabase | 60.5 |
| 2018_CS_01 | MonkeyDLuffy | English | 43.0 |
| 2018_CS_01 | MonkeyDLuffy | OpSwimming | 67.0 |
| 2018_CS_01 | MonkeyDLuffy | OpFencing | 76.0 |
| 2018_CS_01 | MonkeyDLuffy | FundamentalsofCompiling | 88.0 |
| 2018_CS_01 | MonkeyDLuffy | OperatingSystem | 90.5 |
| 2018_CS_01 | MonkeyDLuffy | PrincipleofStatistics | 69.0 |
| 2018_CS_01 | MonkeyDLuffy | ProbabilityTheory | 76.0 |
| 2018_CS_01 | MonkeyDLuffy | Physics | 63.5 |
| 2018_CS_01 | MonkeyDLuffy | AdvancedMathematics | 95.5 |
| 2018_CS_01 | MonkeyDLuffy | LinearAlgebra | 92.5 |
| 2018_CS_01 | MonkeyDLuffy | DiscreteMathematics | 89.0 |
| 2018_CS_03 | SpongeBob | PrinciplesofDatabase | 88.0 |
| 2018_CS_03 | SpongeBob | English | 79.0 |
| 2018_CS_03 | SpongeBob | OpBasketball | 92.0 |
| 2018_CS_03 | SpongeBob | OpTennis | 94.0 |
| 2018_CS_03 | PatrickStar | LinearAlgebra | 6.5 |
| 2018_CS_03 | PatrickStar | AdvancedMathematics | 5.0 |
| 2018_CS_03 | SpongeBob | DiscreteMathematics | 72.0 |
| 2018_CS_03 | PatrickStar | ProbabilityTheory | 12.0 |
| 2018_CS_03 | PatrickStar | PrincipleofStatistics | 20.0 |
| 2018_CS_03 | PatrickStar | OperatingSystem | 36.0 |
| 2018_CS_03 | PatrickStar | FundamentalsofCompiling | 2.0 |
| 2018_CS_03 | PatrickStar | DiscreteMathematics | 14.0 |
| 2018_CS_03 | PatrickStar | PrinciplesofDatabase | 9.0 |
| 2018_CS_03 | PatrickStar | English | 60.0 |
| 2018_CS_03 | PatrickStar | OpTableTennis | 12.0 |
| 2018_CS_03 | PatrickStar | OpPiano | 99.0 |
| 2018_CS_03 | SpongeBob | FundamentalsofCompiling | 43.0 |
| 2018_CS_03 | SpongeBob | OperatingSystem | 95.0 |
| 2018_CS_03 | SpongeBob | PrincipleofStatistics | 90.0 |
| 2018_CS_03 | SpongeBob | ProbabilityTheory | 87.0 |
| 2018_CS_03 | SpongeBob | Physics | 65.0 |
| 2018_CS_03 | SpongeBob | AdvancedMathematics | 55.0 |
| 2018_CS_03 | SpongeBob | LinearAlgebra | 60.5 |
| 2018_CS_03 | PatrickStar | Physics | 6.0 |
+------------+--------------+-------------------------+-----------+
36 rows in set (0.01 sec)
Results are unstable when the ORDER BY values are the same. To reduce randomness, ORDER BY values should be unique. If you can't guarantee the uniqueness, you need to add more ORDER BY fields until the combination of the ORDER BY fields in ORDER BY is unique, then the result will be stable.
The result set is unstable because order by is not used in GROUP_CONCAT()
The result set is unstable because TiDB reads data from the storage layer in parallel, so the result set order returned by GROUP_CONCAT() without ORDER BY is easily perceived as unstable.
To let GROUP_CONCAT() get the result set output in order, you need to add the sorting fields to the ORDER BY clause, which conforms to the SQL semantics. In the following example, GROUP_CONCAT() that splices customer_id without ORDER BY causes an unstable result set.
Excluded
ORDER BYFirst query:
mysql> select GROUP_CONCAT( customer_id SEPARATOR ',' ) FROM customer where customer_id like '200002%'; +-------------------------------------------------------------------------+ | GROUP_CONCAT(customer_id SEPARATOR ',') | +-------------------------------------------------------------------------+ | 20000200992,20000200993,20000200994,20000200995,20000200996,20000200... | +-------------------------------------------------------------------------+Second query:
mysql> select GROUP_CONCAT( customer_id SEPARATOR ',' ) FROM customer where customer_id like '200002%'; +-------------------------------------------------------------------------+ | GROUP_CONCAT(customer_id SEPARATOR ',') | +-------------------------------------------------------------------------+ | 20000203040,20000203041,20000203042,20000203043,20000203044,20000203... | +-------------------------------------------------------------------------+Include
ORDER BYFirst query:
mysql> select GROUP_CONCAT( customer_id order by customer_id SEPARATOR ',' ) FROM customer where customer_id like '200002%'; +-------------------------------------------------------------------------+ | GROUP_CONCAT(customer_id SEPARATOR ',') | +-------------------------------------------------------------------------+ | 20000200000,20000200001,20000200002,20000200003,20000200004,20000200... | +-------------------------------------------------------------------------+Second query:
mysql> select GROUP_CONCAT( customer_id order by customer_id SEPARATOR ',' ) FROM customer where customer_id like '200002%'; +-------------------------------------------------------------------------+ | GROUP_CONCAT(customer_id SEPARATOR ',') | +-------------------------------------------------------------------------+ | 20000200000,20000200001,20000200002,20000200003,20000200004,20000200... | +-------------------------------------------------------------------------+
Unstable results in SELECT * FROM T LIMIT N
The returned result is related to the distribution of data on the storage node (TiKV). If multiple queries are performed, different storage units (Regions) of the storage nodes (TiKV) return results at different speeds, which can cause unstable results.