Preface:

We all know the properties a transaction must guarantee. To maintain these properties, especially consistency and isolation, databases generally rely on locking. At the same time, a database is a highly concurrent application with many simultaneous accesses, and locking too aggressively greatly reduces its ability to process work concurrently. The handling of locks can therefore be said to be the essence of how a database processes transactions. By analyzing the lock mechanism of MySQL's InnoDB engine, this article hopes to serve as a starting point that helps readers better understand what the database actually does when it processes transactions.

# One-phase locking or two-phase locking?
Because there is a lot of concurrent access, to prevent deadlocks, one-phase locking is generally recommended in applications: at the start of a method, you already know in advance which data will be used and lock all of it, and after the method finishes you unlock it all again. This approach effectively avoids circular deadlocks, but it does not work in a database, because at the beginning of a transaction the database has no idea which data will be used.
Databases instead follow the two-phase locking protocol, which divides a transaction into two phases, a locking phase and an unlocking phase (hence the name two-phase locking):

  • Locking phase: lock operations can only be performed in this phase. A transaction must apply for and obtain an S lock (shared lock; other transactions can still add shared locks, but cannot add an exclusive lock) before reading any data, and must apply for and obtain an X lock (exclusive lock; no other transaction can obtain any lock on the data) before writing it. If locking fails, the transaction enters a waiting state until the lock is acquired.
  • Unlocking phase: once the transaction releases any lock, it enters the unlocking phase. In this phase only unlock operations may be performed; no further lock operations are allowed.
| Transaction | Lock / unlock handling |
| --- | --- |
| begin; | |
| insert into test ..... | acquire the locks needed by the insert |
| update test set ... | acquire the locks needed by the update |
| delete from test .... | acquire the locks needed by the delete |
| commit; | the transaction commits, and the locks taken by the insert, update, and delete are released at the same time |

Although this approach cannot avoid deadlocks, the two-phase locking protocol does guarantee that concurrent schedules of transactions are serializable (serializability matters, especially during data recovery and backup).
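To make the two lock modes above concrete, here is a minimal sketch, assuming the test table from the example has an integer id column; InnoDB exposes both modes through locking reads:

begin;
-- S lock: other transactions may also take S locks on the same rows,
-- but none of them can take an X lock until this transaction ends
select * from test where id = 1 lock in share mode;
-- X lock: blocks both S and X lock requests from other transactions
select * from test where id = 1 for update;
-- following the two-phase protocol, every lock taken above is held
-- until the transaction commits (or rolls back)
commit;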

# Lock modes in transactions

## The four isolation levels of transactions
Transaction isolation levels were proposed to effectively guarantee the correctness of concurrently read data. The locks in our database exist to build these isolation levels.

| Isolation level | Dirty read (Dirty Read) | Non-repeatable read (NonRepeatable Read) | Phantom read (Phantom Read) |
| --- | --- | --- | --- |
| Read uncommitted | Possible | Possible | Possible |
| Read committed | Not possible | Possible | Possible |
| Repeatable read | Not possible | Not possible | Possible |
| Serializable | Not possible | Not possible | Not possible |

  • Read Uncommitted: dirty reads are allowed, that is, a transaction may read data that other sessions have modified but not yet committed
  • Read Committed: only committed data can be read. This is the default level of Oracle and most other databases (non-repeatable reads can still occur)
  • Repeatable Read: repeatable reads. Queries within the same transaction are consistent with the state at the start of the transaction; this is InnoDB's default level. In the SQL standard this isolation level eliminates non-repeatable reads, but phantom reads can still occur
  • Serializable: fully serialized reads. Every read needs to obtain a table-level shared lock, and reads and writes block each other

Databases rarely use the Read Uncommitted level in practice, and reads at that level take no locks, so it is not discussed here.
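For reference, the isolation level can be inspected and changed per session. A small sketch (the variable is named transaction_isolation from MySQL 5.7.20 / 8.0 on; older versions call it tx_isolation):

-- check the current session's level
select @@session.transaction_isolation;
-- change it for the current session only
set session transaction isolation level repeatable read;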

## Types of locks in MySQL
MySQL has many kinds of locks. There are the common table locks and row locks, as well as newer ones such as the metadata lock (MDL). A table lock locks an entire table; although it can be divided into read locks and write locks, it still locks the whole table and therefore reduces concurrency, so it is usually used for DDL operations.

A row lock locks individual data rows. This locking approach is more complex, but because only a limited set of rows is locked and other rows are unrestricted, it allows much higher concurrency. MySQL generally uses row locks to handle concurrent transactions, and row locks are the main topic discussed here.
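To make the contrast concrete, a small sketch, again assuming the test table is an InnoDB table with an integer id column: the table lock blocks the whole table, while the row lock only blocks the touched rows.

-- table lock: writers on any row are blocked until UNLOCK TABLES
lock tables test read;
unlock tables;

-- row lock: only the rows matched through the index are blocked,
-- and the locks are released when the transaction ends
begin;
select * from test where id = 1 for update;
commit;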

### Read Committed (RC)
At the RC level, reads are not locked, but inserting, modifying, and deleting data requires locks. The effect is shown below.

mysql> show create table class_teacher \G
Table: class_teacher
Create Table: CREATE TABLE `class_teacher` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `class_name` varchar(100) COLLATE utf8mb4_unicode_ci NOT NULL,
  `teacher_id` int(11) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `idx_teacher_id` (`teacher_id`)
) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
1 row in set (0.02 sec)

mysql> select * from class_teacher;
+----+-----------------+------------+
| id | class_name      | teacher_id |
+----+-----------------+------------+
|  1 | Grade 3 Class 1 |          1 |
|  3 | Grade 2 Class 1 |          2 |
|  4 | Grade 2 Class 2 |          2 |
+----+-----------------+------------+

Because MySQL's InnoDB uses the RR level by default, we first have to set the session to the RC level and set the binlog format accordingly:

SET SESSION transaction isolation level read committed;
SET SESSION binlog_format = 'ROW'; -- or MIXED
| Transaction A | Transaction B |
| --- | --- |
| begin; | begin; |
| update class_teacher set class_name='Grade 3 Class 2' where teacher_id=1; | update class_teacher set class_name='Grade 3 Class 3' where teacher_id=1; |
| | ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction |
| commit; | |

To prevent modification conflicts during concurrent execution, MySQL locks the rows with teacher_id=1 for transaction A. Until transaction A commits (and releases the lock), transaction B cannot acquire the row lock and waits until it times out.

Note that teacher_id is indexed. What happens if we filter on class_name, which has no index? update class_teacher set teacher_id=3 where class_name='Grade 3 Class 1';
In that case MySQL locks every row in the whole table. That may sound incredible, but while the SQL runs, MySQL does not know which rows have class_name='Grade 3 Class 1' (there is no index on that column). If a condition cannot be filtered quickly through an index, the storage engine locks all the records it returns, and the MySQL server layer filters them afterwards.

In practice, however, MySQL has made an improvement: after the MySQL server layer applies the filter and finds that a record does not match, it calls the unlock_row method to release the lock on that record (which violates the two-phase locking protocol). Doing so guarantees that, in the end, only locks on matching records are held, but locking each record in the first place cannot be skipped. So even MySQL bends the rules for the sake of efficiency. (See High Performance MySQL, Third Edition, p. 181.)

The same applies at MySQL's default isolation level, RR. So when batch-modifying a table with a large amount of data, if no suitable index can be used, the MySQL server filters the data very slowly, and rows that are never modified still end up being locked.
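In practice it is worth checking, before running such a batch update, whether it can use an index. A hedged sketch against the class_teacher table above (the index name idx_class_name is made up for illustration):

-- if the plan shows a full scan (type: ALL, rows close to the table size),
-- every row will be locked at the engine level while the server filters on class_name
explain update class_teacher set teacher_id=3 where class_name='Grade 3 Class 1';

-- adding an index lets InnoDB lock only the matching index records
alter table class_teacher add index idx_class_name (class_name);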

### Repeatable Read (RR)
This is the default isolation level of InnoDB in MySQL. We will explain it in two parts: "read" and "write".

#### Read
Read here means repeatable read: when multiple instances of a transaction read data concurrently, they see the same rows. That is a bit abstract, so let's look at the effect.

Behaviour at the RC (non-repeatable read) level:

| Transaction A | Transaction B |
| --- | --- |
| begin; | begin; |
| select id,class_name,teacher_id from class_teacher where teacher_id=1; | |
| → id=1, Grade 3 Class 2; id=2, Grade 3 Class 1 | |
| | update class_teacher set class_name='Grade 3 Class 3' where id=1; |
| | commit; |
| select id,class_name,teacher_id from class_teacher where teacher_id=1; | |
| → id=1, Grade 3 Class 3; id=2, Grade 3 Class 1 | |
| The data modified by transaction B is read; the result differs from the first query, so the read is not repeatable. | |
| commit; | |


After transaction B modifies the row with id=1 and commits, transaction A runs the same query and gets a result different from the previous one. That is a non-repeatable read (repeating the read gives different results), and it can cause problems, so let's look at how MySQL behaves at the RR level:

| Transaction A | Transaction B | Transaction C |
| --- | --- | --- |
| begin; | begin; | begin; |
| select id,class_name,teacher_id from class_teacher where teacher_id=1; | | |
| → id=1, Grade 3 Class 2; id=2, Grade 3 Class 1 | | |
| | update class_teacher set class_name='Grade 3 Class 3' where id=1; | |
| | commit; | |
| | | insert into class_teacher values (null,'Grade 3 Class 3',1); |
| | | commit; |
| select id,class_name,teacher_id from class_teacher where teacher_id=1; | | |
| → id=1, Grade 3 Class 2; id=2, Grade 3 Class 1 | | |
| The data modified by transaction B is not read; the result is the same as the first query, so the read is repeatable. The row newly inserted by transaction C is not read either. | | |
| commit; | | |

Notice that for teacher_id=1, transaction A reads first, transaction B then modifies the row with id=1 and commits, yet transaction A's second read is exactly the same as the first. That is repeatable read. How does MySQL achieve it? Let's hold that thought and keep reading.

#### The difference between non-repeatable reads and phantom reads
Many people tend to confuse non-repeatable reads with phantom reads, and the two are indeed similar. But non-repeatable reads focus on update and delete, while phantom reads focus on insert.

Suppose we used only locks to implement these two isolation levels. For repeatable read, after a SQL statement reads some rows for the first time, those rows are locked so that other transactions cannot modify them, which achieves repeatable reads. But this approach cannot lock rows that do not exist yet, that is, data about to be inserted. So even if transaction A locks all the rows it has read, or modifies all of them, transaction B can still insert new rows and commit, and transaction A will then find an inexplicable row that did not exist before: that is a phantom read, and it cannot be avoided with row locks alone. It takes the Serializable isolation level, where reads take read locks, writes take write locks, and read and write locks are mutually exclusive, to effectively avoid phantom reads, non-repeatable reads, dirty reads, and so on, but that greatly reduces the database's concurrency.

So the biggest difference between non-repeatable reads and phantom reads lies in how a locking mechanism has to solve each problem.

Above, a pessimistic locking mechanism was used to deal with these two problems, but mature databases such as MySQL, Oracle, and PostgreSQL, for performance reasons, use the optimistic-locking idea of MVCC (multi-version concurrency control) to avoid them.

#### Pessimistic locking and optimistic locking

  • Pessimistic locking

As the name suggests, pessimistic locking takes a conservative attitude toward the possibility of the data being modified by the outside world (including other current transactions of the same system as well as transactions from external systems), so the data is kept locked throughout the whole processing. A pessimistic lock implementation usually relies on the locking mechanism provided by the database (only a database-level locking mechanism can truly guarantee exclusive access to the data; otherwise, even if the application implements its own locking, there is no guarantee that an external system will not modify the data).

With pessimistic locking, to guarantee transaction isolation you need consistent locking reads: lock the data when reading it so other transactions cannot modify it, and also lock it when modifying or deleting it so other transactions cannot read it.

  • Optimistic locking

Compared with pessimistic locking, optimistic locking uses a more relaxed mechanism. Pessimistic locking mostly relies on the database's lock mechanism to guarantee the greatest degree of exclusivity for the operation, but this brings a large cost in database performance, especially for long transactions, a cost that is often unaffordable.

Optimistic locking solves this problem to some extent. It is mostly implemented with a data version (version) recording mechanism. What is a data version? A version identifier is added to the data; in a table-based version solution, this is usually done by adding a "version" column. When data is read out, the version number is read with it, and when it is updated later, the version number is incremented by one. At that point, the version of the data being submitted is compared with the current version recorded in the table: if the submitted version number is greater than the current version number in the table, the update is applied; otherwise, the data is considered stale.
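A minimal table-based sketch of this idea (the account table and its columns are made up for illustration, and this variant simply treats any version mismatch as stale data):

-- read the row together with its version number
select id, balance, version from account where id = 1;   -- suppose this returns version = 7

-- write back only if the version is still the one we read
update account
   set balance = balance - 100,
       version = version + 1
 where id = 1
   and version = 7;
-- 0 affected rows means another session updated the row first,
-- so the application should re-read and retry (or give up)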

It should be noted that there is no fixed specification for how MVCC is implemented; every database implements it differently. What is discussed here is InnoDB's MVCC.

#### MVCC in MySQL's InnoDB
In InnoDB, two extra hidden values are stored with each row to implement MVCC: one records when the row was created, the other records when the row expires (or was deleted). In practice what is stored is not a time but the transaction version number; each time a new transaction starts, the transaction version number is incremented. Under the Repeatable Read isolation level:

  • SELECT: only read rows whose creation version number is <= the current transaction version number and whose deletion version number is either empty or > the current transaction version number (see the sketch after this list)
  • INSERT: store the current transaction version number as the row's creation version number
  • DELETE: store the current transaction version number as the row's deletion version number
  • UPDATE: insert a new row whose creation version number is the current transaction version number, and at the same time store the current transaction version number as the deletion version number of the original row
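The SELECT rule above can be simulated with a plain table. This is only a toy model of the visibility rule, not how InnoDB stores its hidden fields, and the sim_row table and its columns are made up for illustration:

-- a toy model of the versioning rule described above
create table sim_row (
  id int,
  class_name varchar(100),
  teacher_id int,
  create_version int,           -- trx version that created this row version
  delete_version int            -- trx version that deleted it (null = still alive)
);

-- what a SELECT in a transaction with version number 5 would see
select id, class_name, teacher_id
  from sim_row
 where create_version <= 5
   and (delete_version is null or delete_version > 5);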

With MVCC, although each row needs extra storage space, more checking work when rows are read, and some additional maintenance work, the use of locks is reduced: most read operations need no locks at all, reading data is simple and fast, it is still guaranteed that only rows satisfying the visibility rules are read, and only the rows that truly need it are locked.

Whether we learned databases from textbooks or from articles online, most of them describe the four isolation levels the same way: the RR level gives repeatable reads but cannot solve phantom reads, and only the Serializable level can. That is why transaction C was added above, to show the effect. Transaction C inserts a row with teacher_id=1 and commits; at the RR level there should then be a phantom read, meaning transaction A's query on teacher_id=1 would see the new row inserted by transaction C. But the test shows that this is not the case in MySQL: after transaction C commits, transaction A still does not read that row. So at MySQL's RR level, the read side of the phantom read problem is solved. See the figure below.

[figure: innodb_lock_1]

With the read problem solved, then, according to the definition of MVCC, conflicts arise when data is submitted concurrently. How are those conflicts resolved? Let's look at how InnoDB handles writes at the RR level.

#### The difference between "read" and "read"
Some readers may wonder: transaction isolation levels are really definitions about reading data, so why is the explanation here split into a read module and a write module? This is mainly because "read" in MySQL and "read" in the transaction isolation levels are not the same thing.

As we have seen, at the RR level the MVCC mechanism makes data reads repeatable, but the data we read may be historical data that is no longer timely rather than the data currently in the database. For businesses that are particularly sensitive to the timeliness of data, this can easily cause problems.

This way of reading historical data is called a snapshot read, and reading the current version of the data in the database is called a current read. Clearly, in MVCC:

  • Snapshot read: the plain select
    • select * from table ....;
  • Current read: special locking reads, plus insert/update/delete operations, are current reads; they read the latest data and need to take locks.
    • select * from table where ? lock in share mode;
    • select * from table where ? for update;
    • insert;
    • update;
    • delete;

What the transaction isolation levels actually define is the level for current reads. To reduce the time spent handling locks (including waiting for other locks) and improve concurrency, MySQL introduced the concept of the snapshot read, so that a plain select does not need to lock anything. For update, insert, and the other "current reads", another mechanism is needed, which is what the next module covers.
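The difference is easy to observe in a session at the RR level; a sketch: after another session inserts a new teacher_id=1 row and commits, the plain select (snapshot read) still returns the old result, while a locking read (current read) also returns the newly committed row.

begin;
select * from class_teacher where teacher_id = 1;   -- snapshot read
-- ... another session inserts a teacher_id = 1 row and commits ...
select * from class_teacher where teacher_id = 1;   -- still the old snapshot
select * from class_teacher where teacher_id = 1 lock in share mode;   -- current read: sees the new row
commit;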

### Write (" The current reading ")
Although the isolation level of transaction only defines the requirements for reading data , In fact, it's also a requirement to write data . The above “ read ”, Actually speaking, snapshot reading ; And here it is “ Write ” It's the current reading .
In order to solve the problem of unreal reading ,MySQL Transaction used Next-Key lock .

#### Next-Key lock
A Next-Key lock is the combination of a row lock and a GAP (gap) lock. Row locks were introduced above; next up is the GAP lock.

Row locks can prevent conflicts when different transaction versions modify and commit the same rows. But how to prevent other transactions from inserting data becomes a problem. We can compare the RR level with the RC level:

RC level:

| Transaction A | Transaction B |
| --- | --- |
| begin; | begin; |
| select id,class_name,teacher_id from class_teacher where teacher_id=30; | |
| → id=2, Grade 3 Class 2, teacher_id=30 | |
| update class_teacher set class_name='Grade 3 Class 4' where teacher_id=30; | |
| | insert into class_teacher values (null,'Grade 3 Class 2',30); |
| | commit; |
| select id,class_name,teacher_id from class_teacher where teacher_id=30; | |
| → id=2, Grade 3 Class 4, teacher_id=30; id=10, Grade 3 Class 2, teacher_id=30 | |


RR level:

| Transaction A | Transaction B |
| --- | --- |
| begin; | begin; |
| select id,class_name,teacher_id from class_teacher where teacher_id=30; | |
| → id=2, Grade 3 Class 2, teacher_id=30 | |
| update class_teacher set class_name='Grade 3 Class 4' where teacher_id=30; | |
| | insert into class_teacher values (null,'Grade 3 Class 2',30); |
| | waiting.... |
| select id,class_name,teacher_id from class_teacher where teacher_id=30; | |
| → id=2, Grade 3 Class 4, teacher_id=30 | |
| commit; | After transaction A commits, transaction B's insert executes. |

The comparison shows that at the RC level, transaction A has modified all rows with teacher_id=30, yet after transaction B inserts a new row, transaction A finds one more teacher_id=30 row that was not touched by its update statement. This is the phantom read under "current read".

At the RR level, transaction A holds locks after the update, so transaction B cannot insert the new row; the data transaction A reads before and after its update stays consistent, and the phantom read is avoided. The lock that makes this possible is the gap lock.

MySQL achieves it like this:

In the class_teacher table, teacher_id is an index, so InnoDB maintains a B+ tree for it. To keep things simple we draw it as a linked structure (it is actually a tree, but the principle is the same).

[figure: innodb_lock_2]

As shown in the figure, InnoDB uses a clustered index, and teacher_id, as a secondary index, maintains a tree structure of the index column together with the primary key id (drawn here as a linked list), kept in sorted order.

InnoDB divides the data into several intervals:

  • (negative infinity, 5]
  • (5, 30]
  • (30, positive infinity)

The statement update class_teacher set class_name='Grade 3 Class 4' where teacher_id=30; therefore not only takes row locks on the matching rows, it also adds gap locks on the intervals on both sides, (5, 30] and (30, positive infinity). As a result, transaction B cannot insert new rows into either of these two ranges.
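If you want to see these locks for yourself, MySQL 8.0 exposes them in the performance_schema.data_locks table (on 5.7, similar information is available from SHOW ENGINE INNODB STATUS). A sketch to run in another session while transaction A is still open:

-- lock_mode values such as X,GAP mark the gap locks taken on idx_teacher_id
select object_name, index_name, lock_type, lock_mode, lock_data
  from performance_schema.data_locks;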

Limited by this implementation, InnoDB often locks ranges that do not actually need to be locked, as shown below:

| Transaction A | Transaction B | Transaction C |
| --- | --- | --- |
| begin; | begin; | begin; |
| select id,class_name,teacher_id from class_teacher; | | |
| → id=1, Grade 3 Class 1, teacher_id=5; id=2, Grade 3 Class 2, teacher_id=30 | | |
| update class_teacher set class_name='Grade 1 Class 1' where teacher_id=20; | | |
| | insert into class_teacher values (null,'Grade 3 Class 5',10); | |
| | waiting ..... | |
| | | insert into class_teacher values (null,'Grade 3 Class 5',40); |
| | | commit; (the insert succeeds immediately) |
| commit; | After transaction A commits, the insert executes, then commit; | |

The update's condition teacher_id=20 falls in the interval (5, 30]. Even though no rows are actually modified, InnoDB still adds a gap lock to this range; the other intervals are unaffected, so transaction C's insert proceeds normally.

If the condition uses a column without an index, for example update class_teacher set teacher_id=7 where class_name='Grade 3 Class 8' (even if it matches no rows at all), gap locks are added across the whole table. Moreover, unlike the row-lock case above, MySQL Server cannot automatically release the locks on non-matching records after filtering, because without an index the values are not sorted and there are no intervals to speak of. Unless the transaction commits, no other transaction can insert any data.

Row locks prevent other transactions from modifying or deleting rows, GAP locks prevent other transactions from inserting new rows, and their combination, the Next-Key lock, is what lets the RR level avoid phantom reads when writing data.

### Serializable
This level is very simple: reads take shared locks, writes take exclusive locks, and reads and writes are mutually exclusive. It follows pessimistic locking theory, is simple to implement, and keeps data safer, but concurrency is very poor. You can use this mode if your business has very little or no concurrency and also requires the data to be timely and reliable.

One more thing worth complaining about: do not assume that a select never locks anything. At the Serializable level, it still does!
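A quick way to convince yourself, sketched with two sessions on the class_teacher table: at this level a plain select inside a transaction behaves like a locking read, so a concurrent update from another session blocks until the reading transaction commits.

-- session 1
set session transaction isolation level serializable;
begin;
select * from class_teacher where teacher_id = 30;   -- takes shared locks

-- session 2: blocks until session 1 commits
update class_teacher set class_name = 'Grade 3 Class 4' where teacher_id = 30;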
