编程知识 cdmana.com

Have you stepped on the 8 most flawed SQL usage?

sql The order in which statements are executed :

FROM <1>
ON <2>
JOIN <3>
WHERE <4>
GROUP BY <5>
HAVING <6>
SELECT <7>
DISTINCT <8>
ORDER BY <9>
LIMIT <10>
 Copy code 

1、LIMIT sentence
Paging query is one of the most common scenarios , But it's also usually the most problematic place . For example, for the following simple statement , commonly DBA The way to think of it is to type, name, create_time Add the combination index to the field . In this way, conditional sorting can effectively utilize index , Rapid performance improvement .

SELECT * 
FROM   operation 
WHERE  type = 'SQLStats' 
       AND name = 'SlowLog' 
ORDER  BY create_time 
LIMIT  1000, 10; 
 Copy code 

ok , Probably 90% The above DBA This is the end of the problem . But when LIMIT Clause becomes “LIMIT 1000000,10” when , Programmers still complain : I only take 10 Why are records still slow ?

You don't know the database 1000000 Where do the records start , Even if there is an index, it needs to be calculated from the beginning . There's this performance problem , Most of the time it's programmers who are lazy .

Browse the front-end data page , Or batch export of big data , The maximum value of the previous page can be used as a parameter as a query condition .SQL Redesigned as follows :

SELECT   * 
FROM     operation 
WHERE    type = 'SQLStats' 
AND      name = 'SlowLog' 
AND      create_time > '2017-03-16 14:00:00' 
ORDER BY create_time limit 10;
 Copy code 

Under the new design, the query time is basically fixed , It doesn't change with the amount of data .

2、 Implicit conversion
SQL Another common error is that query variables and field definition types don't match . Take the following statement :

mysql> explain extended SELECT * 
     > FROM   my_balance b 
     > WHERE  b.bpn = 14000000123 
     >       AND b.isverified IS NULL ;
mysql> show warnings;
| Warning | 1739 | Cannot use ref access on index 'bpn' due to type or collation conversion on field 'bpn'
 Copy code 

Which field bpn For the definition of varchar(20),MySQL The strategy is to convert a string to a number and then compare . Functions act on table fields , Index failure .

This may be the parameters that the application framework automatically fills in , Not the programmer's original intention . Now the application framework is very complex , It's convenient to use, but also be careful that it may dig a hole for itself .

3、 Association update 、 Delete
although MySQL5.6 Physical and chemical properties are introduced , But we need to pay special attention to the fact that it is only for query statement optimization . For updates or deletions, you need to manually rewrite JOIN.

For example, below UPDATE sentence ,MySQL What's actually going on is the loop / nested subqueries (DEPENDENT SUBQUERY), Its execution time can be imagined .

UPDATE operation o 
SET    status = 'applying' 
WHERE  o.id IN (SELECT id 
                FROM   (SELECT o.id, 
                               o.status 
                        FROM   operation o 
                        WHERE  o.group = 123 
                               AND o.status NOT IN ( 'done' ) 
                        ORDER  BY o.parent, 
                                  o.id 
                        LIMIT  1) t); 
 Copy code 

Implementation plan :

+----+--------------------+-------+-------+---------------+---------+---------+-------+------+-----------------------------------------------------+
| id | select_type        | table | type  | possible_keys | key     | key_len | ref   | rows | Extra                                               |
+----+--------------------+-------+-------+---------------+---------+---------+-------+------+-----------------------------------------------------+
| 1  | PRIMARY            | o     | index |               | PRIMARY | 8       |       | 24   | Using where; Using temporary                        |
| 2  | DEPENDENT SUBQUERY |       |       |               |         |         |       |      | Impossible WHERE noticed after reading const tables |
| 3  | DERIVED            | o     | ref   | idx_2,idx_5   | idx_5   | 8       | const | 1    | Using where; Using filesort                         |
+----+--------------------+-------+-------+---------------+---------+---------+-------+------+-----------------------------------------------------+
 Copy code 

Rewrite as JOIN after , Subquery selection mode from DEPENDENT SUBQUERY become DERIVED, The execution speed is much faster , from 7 Seconds down to 2 millisecond .

UPDATE operation o 
       JOIN  (SELECT o.id, 
                            o.status 
                     FROM   operation o 
                     WHERE  o.group = 123 
                            AND o.status NOT IN ( 'done' ) 
                     ORDER  BY o.parent, 
                               o.id 
                     LIMIT  1) t
         ON o.id = t.id 
SET    status = 'applying' 
 Copy code 

The implementation plan is simplified to :

+----+-------------+-------+------+---------------+-------+---------+-------+------+-----------------------------------------------------+
| id | select_type | table | type | possible_keys | key   | key_len | ref   | rows | Extra                                               |
+----+-------------+-------+------+---------------+-------+---------+-------+------+-----------------------------------------------------+
| 1  | PRIMARY     |       |      |               |       |         |       |      | Impossible WHERE noticed after reading const tables |
| 2  | DERIVED     | o     | ref  | idx_2,idx_5   | idx_5 | 8       | const | 1    | Using where; Using filesort                         |
+----+-------------+-------+------+---------------+-------+---------+-------+------+-----------------------------------------------------+
 Copy code 

4、 Mixed sort
MySQL You can't use indexes for mixed sorting . But in some cases , There is still a chance to use special methods to improve performance .

SELECT * 
FROM   my_order o 
       INNER JOIN my_appraise a ON a.orderid = o.id 
ORDER  BY a.is_reply ASC, 
          a.appraise_time DESC 
LIMIT  0, 20 
 Copy code 

The execution plan is displayed as a full table scan :

+----+-------------+-------+--------+-------------+---------+---------+---------------+---------+-+
| id | select_type | table | type   | possible_keys     | key     | key_len | ref      | rows    | Extra    
+----+-------------+-------+--------+-------------+---------+---------+---------------+---------+-+
|  1 | SIMPLE      | a     | ALL    | idx_orderid | NULL    | NULL    | NULL    | 1967647 | Using filesort |
|  1 | SIMPLE      | o     | eq_ref | PRIMARY     | PRIMARY | 122     | a.orderid |       1 | NULL           |
+----+-------------+-------+--------+---------+---------+---------+-----------------+---------+-+
 Copy code 

because is_reply Only 0 and 1 Two kinds of state , We rewrite it as follows , Execution time from 1.58 Seconds down to 2 millisecond .

SELECT * 
FROM   ((SELECT *
         FROM   my_order o 
                INNER JOIN my_appraise a 
                        ON a.orderid = o.id 
                           AND is_reply = 0 
         ORDER  BY appraise_time DESC 
         LIMIT  0, 20) 
        UNION ALL 
        (SELECT *
         FROM   my_order o 
                INNER JOIN my_appraise a 
                        ON a.orderid = o.id 
                           AND is_reply = 1 
         ORDER  BY appraise_time DESC 
         LIMIT  0, 20)) t 
ORDER  BY  is_reply ASC, 
          appraisetime DESC 
LIMIT  20; 
 Copy code 

5、EXISTS sentence
MySQL treat EXISTS When clause , The nested subquery is still used . Like the one below SQL sentence :

SELECT *
FROM   my_neighbor n 
       LEFT JOIN my_neighbor_apply sra 
              ON n.id = sra.neighbor_id 
                 AND sra.user_id = 'xxx' 
WHERE  n.topic_status < 4 
       AND EXISTS(SELECT 1 
                  FROM   message_info m 
                  WHERE  n.id = m.neighbor_id 
                         AND m.inuser = 'xxx') 
       AND n.topic_type <> 5 
 Copy code 

The execution plan is :

+----+--------------------+-------+------+-----+------------------------------------------+---------+-------+---------+ -----+
| id | select_type        | table | type | possible_keys     | key   | key_len | ref   | rows    | Extra   |
+----+--------------------+-------+------+ -----+------------------------------------------+---------+-------+---------+ -----+
|  1 | PRIMARY            | n     | ALL  |  | NULL     | NULL    | NULL  | 1086041 | Using where                   |
|  1 | PRIMARY            | sra   | ref  |  | idx_user_id | 123     | const |       1 | Using where          |
|  2 | DEPENDENT SUBQUERY | m     | ref  |  | idx_message_info   | 122     | const |       1 | Using index condition; Using where |
+----+--------------------+-------+------+ -----+------------------------------------------+---------+-------+---------+ -----+
 Copy code 

Get rid of exists Change to join, Avoid nested subqueries , Will execute from 1.93 Seconds reduced to 1 millisecond .

SELECT *
FROM   my_neighbor n 
       INNER JOIN message_info m 
               ON n.id = m.neighbor_id 
                  AND m.inuser = 'xxx' 
       LEFT JOIN my_neighbor_apply sra 
              ON n.id = sra.neighbor_id 
                 AND sra.user_id = 'xxx' 
WHERE  n.topic_status < 4 
       AND n.topic_type <> 5 
 Copy code 

New execution plan :

+----+-------------+-------+--------+ -----+------------------------------------------+---------+ -----+------+ -----+
| id | select_type | table | type   | possible_keys     | key       | key_len | ref   | rows | Extra                 |
+----+-------------+-------+--------+ -----+------------------------------------------+---------+ -----+------+ -----+
|  1 | SIMPLE      | m     | ref    | | idx_message_info   | 122     | const    |    1 | Using index condition |
|  1 | SIMPLE      | n     | eq_ref | | PRIMARY   | 122     | ighbor_id |    1 | Using where      |
|  1 | SIMPLE      | sra   | ref    | | idx_user_id | 123     | const     |    1 | Using where           |
+----+-------------+-------+--------+ -----+------------------------------------------+---------+ -----+------+ -----+
 Copy code 

6、 Push... Under conditions
External query conditions cannot be pushed down to complex views or subqueries :

1、 Aggregate subquery ;2、 contain LIMIT Subquery of ;3、UNION or UNION ALL Subquery ;4、 Subqueries in output fields ;

As the following statement , From the execution plan, we can see that its conditions act on the aggregation sub query :

SELECT * 
FROM   (SELECT target, 
               Count(*) 
        FROM   operation 
        GROUP  BY target) t 
WHERE  target = 'rm-xxxx' 

+----+-------------+------------+-------+---------------+-------------+---------+-------+------+-------------+
| id | select_type | table      | type  | possible_keys | key         | key_len | ref   | rows | Extra       |
+----+-------------+------------+-------+---------------+-------------+---------+-------+------+-------------+
|  1 | PRIMARY     |  | ref   | <auto_key0>   |  | 514     | const |    2 | Using where |
|  2 | DERIVED     | operation  | index | idx_4         | idx_4       | 519     | NULL  |   20 | Using index |
+----+-------------+------------+-------+---------------+-------------+---------+-------+------+-------------+
 Copy code 

Confirm that the semantic query criteria can be directly pushed back , Rewrite as follows :

SELECT target, 
       Count(*) 
FROM   operation 
WHERE  target = 'rm-xxxx' 
GROUP  BY target
 Copy code 

The execution plan becomes :

+----+-------------+-----------+------+---------------+-------+---------+-------+------+--------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------+------+---------------+-------+---------+-------+------+--------------------+
| 1 | SIMPLE | operation | ref | idx_4 | idx_4 | 514 | const | 1 | Using where; Using index |
+----+-------------+-----------+------+---------------+-------+---------+-------+------+--------------------+
 Copy code 

7、 Reduce the scope ahead of time **
Start with the beginning SQL sentence :

SELECT * 
FROM   my_order o 
       LEFT JOIN my_userinfo u 
              ON o.uid = u.uid
       LEFT JOIN my_productinfo p 
              ON o.pid = p.pid 
WHERE  ( o.display = 0 ) 
       AND ( o.ostaus = 1 ) 
ORDER  BY o.selltime DESC 
LIMIT  0, 15 
 Copy code 

The SQL The original meaning of the sentence is : Start with a series of left connections , Then sort before 15 Bar record . As can be seen from the implementation plan , The last step estimates the number of sorting records as 90 ten thousand , Time consumption is 12 second .

+----+-------------+-------+--------+---------------+---------+---------+-----------------+--------+----------------------------------------------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref             | rows   | Extra                                              |
+----+-------------+-------+--------+---------------+---------+---------+-----------------+--------+----------------------------------------------------+
|  1 | SIMPLE      | o     | ALL    | NULL          | NULL    | NULL    | NULL            | 909119 | Using where; Using temporary; Using filesort       |
|  1 | SIMPLE      | u     | eq_ref | PRIMARY       | PRIMARY | 4       | o.uid |      1 | NULL                                               |
|  1 | SIMPLE      | p     | ALL    | PRIMARY       | NULL    | NULL    | NULL            |      6 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+--------+---------------+---------+---------+-----------------+--------+----------------------------------------------------+
 Copy code 

Because in the end WHERE Conditions and sorting are for the leftmost main table , So we can first correct my_order Sorting reduces the amount of data in advance and then makes left connection .SQL Rewrite it as follows , The execution time is reduced to 1 Millisecond or so .

SELECT * 
FROM (
SELECT * 
FROM   my_order o 
WHERE  ( o.display = 0 ) 
       AND ( o.ostaus = 1 ) 
ORDER  BY o.selltime DESC 
LIMIT  0, 15
) o 
     LEFT JOIN my_userinfo u 
              ON o.uid = u.uid 
     LEFT JOIN my_productinfo p 
              ON o.pid = p.pid 
ORDER BY  o.selltime DESC
limit 0, 15
 Copy code 

Check the execution plan again : After materialization of subquery (select_type=DERIVED) Participate in JOIN. Although the estimated line scan is still 90 ten thousand , But using indexes and LIMIT After clause , The actual execution time becomes very small .

+----+-------------+------------+--------+---------------+---------+---------+-------+--------+----------------------------------------------------+
| id | select_type | table      | type   | possible_keys | key     | key_len | ref   | rows   | Extra                                              |
+----+-------------+------------+--------+---------------+---------+---------+-------+--------+----------------------------------------------------+
|  1 | PRIMARY     |  | ALL    | NULL          | NULL    | NULL    | NULL  |     15 | Using temporary; Using filesort                    |
|  1 | PRIMARY     | u          | eq_ref | PRIMARY       | PRIMARY | 4       | o.uid |      1 | NULL                                               |
|  1 | PRIMARY     | p          | ALL    | PRIMARY       | NULL    | NULL    | NULL  |      6 | Using where; Using join buffer (Block Nested Loop) |
|  2 | DERIVED     | o          | index  | NULL          | idx_1   | 5       | NULL  | 909112 | Using where                                        |
+----+-------------+------------+--------+---------------+---------+---------+-------+--------+----------------------------------------------------+
 Copy code 

8、 Intermediate result set push down
Let's look at the following example that has been preliminarily optimized ( The primary table in the left join takes precedence over the query criteria ):

SELECT    a.*, 
          c.allocated 
FROM      ( 
              SELECT   resourceid 
              FROM     my_distribute d 
                   WHERE    isdelete = 0 
                   AND      cusmanagercode = '1234567' 
                   ORDER BY salecode limit 20) a 
LEFT JOIN 
          ( 
              SELECT   resourcesid, sum(ifnull(allocation, 0) * 12345) allocated 
              FROM     my_resources 
                   GROUP BY resourcesid) c 
ON        a.resourceid = c.resourcesid
 Copy code 

Is there any other problem with this statement ? It's not hard to see the subquery c It's a full table aggregate query , When the number of tables is very large, the performance of the whole statement will be degraded .

In fact, for subqueries c, The left join final result set only cares about the ability and the main table resourceid Matching data . So we can rewrite the statement as follows , Execution time from original 2 Seconds down to 2 millisecond .

SELECT    a.*, 
          c.allocated 
FROM      ( 
                   SELECT   resourceid 
                   FROM     my_distribute d 
                   WHERE    isdelete = 0 
                   AND      cusmanagercode = '1234567' 
                   ORDER BY salecode limit 20) a 
LEFT JOIN 
          ( 
                   SELECT   resourcesid, sum(ifnull(allocation, 0) * 12345) allocated 
                   FROM     my_resources r, 
                            ( 
                                     SELECT   resourceid 
                                     FROM     my_distribute d 
                                     WHERE    isdelete = 0 
                                     AND      cusmanagercode = '1234567' 
                                     ORDER BY salecode limit 20) a 
                   WHERE    r.resourcesid = a.resourcesid 
                   GROUP BY resourcesid) c 
ON        a.resourceid = c.resourcesid
 Copy code 

But subquery a In our SQL Many times in the sentence . There is not only an extra cost in this way of writing , It also makes the whole sentence complex . Use WITH The statement rewrites :

WITH a AS 
( 
         SELECT   resourceid 
         FROM     my_distribute d 
         WHERE    isdelete = 0 
         AND      cusmanagercode = '1234567' 
         ORDER BY salecode limit 20)
SELECT    a.*, 
          c.allocated 
FROM      a 
LEFT JOIN 
          ( 
                   SELECT   resourcesid, sum(ifnull(allocation, 0) * 12345) allocated 
                   FROM     my_resources r, 
                            a 
                   WHERE    r.resourcesid = a.resourcesid 
                   GROUP BY resourcesid) c 
ON        a.resourceid = c.resourcesid
 Copy code 

summary
The database compiler generates the execution plan , Decide to SQL The actual implementation of . But the compiler just tries to serve , All database compilers are not perfect .
Most of the scenarios mentioned above , There are also performance problems in other databases . Understand the features of database compiler , To avoid its shortcomings , Write high performance SQL sentence .
In designing data models and writing SQL When the sentence is , Bring in the idea or consciousness of algorithm .
The writing is complicated SQL You should cultivate the use of WITH The habit of statement . Concise and clear thinking SQL Statement can also reduce the burden of database .
The above is all the content shared this time , Want to know more welcome to the official account :Python Programming learning circle , Daily dry goods sharing

版权声明
本文为[Programming learning network]所创,转载请带上原文链接,感谢
https://cdmana.com/2022/134/202205141227435733.html

Scroll to Top