Basically, just do background development , Will be exposed to paging this requirement or function . Basically everyone can use MySQL Of LIMIT To deal with it , And the project I'm in charge of is written in the same way . But once the data is measured , Actually LIMIT It's going to be extremely inefficient , This article will talk about LIMIT Clause optimized .
Many business scenarios need to use paging , Basically, they use LIMIT To achieve .
Create a table and insert 200 Ten thousand data ：
# Create a new one t5 surface CREATE TABLE `t5` ( `id` int NOT NULL AUTO_INCREMENT, `name` varchar(50) NOT NULL, `text` varchar(100) NOT NULL, PRIMARY KEY (`id`), KEY `ix_name` (`name`), KEY `ix_test` (`text`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8; # Create a stored procedure to insert 200 All the data CREATE PROCEDURE t5_insert_200w() BEGIN DECLARE i INT; SET i=1000000; WHILE i<=3000000 DO INSERT INTO t5(`name`,text) VALUES('god-jiang666',concat('text', i)); SET i=i+1; END WHILE; END; # Call the stored procedure to insert 200 All the data call t5_insert_200w();
In the case of less page turning ,LIMIT There will be no performance problems .
But if the user needs to find the last page number ？
Usually , We want to ensure that all pages can jump normally , Because I can't use order by xxx desc In this reverse order SQL To check the number of pages that follow , Instead, it uses the positive order to do paging query ：
select * from t5 order by text limit 100000, 10;
Adopt this kind of SQL Query paging , from 200 Take this out of ten thousand data 10 The cost of row data is very big , We need to sort it out first 1000010 Bar record , And then abandon the front 1000000 strip . my macbook pro It's going to cost you 5.578 second .
Let's take a look at , This one above SQL Statement execution plan ：
explain select * from t5 order by text limit 1000000, 10;
As can be seen from the implementation plan , In the case of large pagination ,MySQL No index scanning , Even if text I have added index to the field .
Why is that ？
go back to MySQL Indexes （ Two ） How to design index It is mentioned in ,MySQL The query optimizer of database is based on cost , And the cost of query is estimated based on CPU cost and IO cost .
If MySQL In query cost estimation , If you think that the whole table scanning method is more efficient than the index scanning method , You're going to give up the index , Direct full table scan .
That's why it's on big pages SQL Querying , Clearly index the field , however MySQL But the reason why we went through the full table scan .
Then we continue to use the above query SQL To test my guess ：
explain select * from t5 order by text limit 7774, 10;
explain select * from t5 order by text limit 7775, 10;
All the above experiments are in my mbp Running on , stay 7774 At this critical point ,MySQL Index scan and full table scan are used to optimize the query .
So you can say MySQL It queries the optimizer at its own cost to determine whether to use the index .
because MySQL The core of the algorithm of query optimizer is that we can't intervene manually , So our optimization idea is to start with how to maintain paging at the best paging critical point .
If one SQL sentence , The results of the query can be obtained directly through the index , No need to return to the table query , This index is called the overlay index .
stay MySQL Use in the database explain Keyword view execution plan , If extra This column shows Using index, It means this one SQL The statement uses an override index .
Let's compare the use of overlay indexes , How much more performance will be improved .
# No override index used select * from t5 order by text limit 1000000, 10;
This inquiry took 3.690 second , Let's see how much performance can be improved with overlay index optimization .
# Override index used select id, `text` from t5 order by text limit 1000000, 10;
From the contrast above , Large paging query , After using the overlay index , It took 0.201 second , Instead of using the overlay index 3.690 second , Improved 18 More than double , This is in actual development , It's a big performance optimization .（ The data is in my mbp Run on the top to get ）
Because in actual development , use SELECT Querying a column or two is very rare , Therefore, the scope of application of the above coverage index is relatively limited .
So we can sort out the page SQL The method of rewriting statements into subqueries can improve the performance .
select * from t5 where id>=(select id from t5 order by text limit 1000000, 1) limit 10;
In fact, in this way , The improved efficiency is basically the same as the coverage index used above .
But there are limitations in this optimization method ：
- This kind of writing , Primary key required ID It has to be continuous
- Where Clause does not allow additional conditions to be added
Similar to the above sub query method , We can use JOIN, Page operation is done on the index column first , Then go back to the table to get the columns you want .
select a.* from t5 a inner join (select id from t5 order by text limit 1000000, 10) b on a.id=b.id;
It can be concluded from the experiment that , When using JOIN After rewriting , Both of the above limitations have been lifted , and SQL And there is no loss in the efficiency of execution .
It's different from the method used above , The optimization idea of recording the last ending position is to use some variable to record the position of the last data , The next page is scanned directly from the location of this variable , To avoid MySQL Scan a lot of data and discard the operation .
select * from t5 where id>=1000000 limit 10;
According to the above experiment , It's not hard to get out , Because the primary key index is used for paging operations ,SQL Is the fastest performance .
- This paper introduces the causes of the poor performance of large paging query , And I share some ideas for optimization
- The optimization idea of large paging is to make paging SQL Try to perform in the best performance range , Don't trigger a full table scan
- Hope to share the above , You can be in MySQL Don't take detours on this road ～～～
- 《MySQL performance optimization 》 Chapter six Query optimization performance
- 《 The art of database query optimizer 》