
Elasticsearch Advanced: Inside a Shard

Once we understand Elasticsearch's basic concepts, it is natural to want to know more about how it works internally: for example, how it achieves near-real-time search, and how it handles the persistence of data. This article walks through the mechanisms a shard uses internally to solve these problems.

Immutability

In the earlier article on Elasticsearch analysis we explained that, to make text searchable, we need to build an inverted index; we will not repeat the details of that concept here. One interesting property is immutability: an inverted index, once written to disk, is never modified again. This brings several benefits:

  1. Reads need no locking. Since nothing ever modifies the index, any number of readers can access it freely.
  2. It is very cache-friendly. As long as there is enough space, it can stay in the cache indefinitely.

Obviously, reality is not that ideal: we always need newly added documents to become searchable, which means the index has to be updated. So how can we do that?

Dynamically updating the index

The problem we now face is how to update the inverted index dynamically without losing the benefits of immutability described above. A common approach is to use more than one index: content that needs to be added goes into a new inverted index, and a query searches every index in turn and combines the results. This way we keep the advantages of immutability and can still add new documents.

This is exactly how Lucene implements it. You can think of the inverted index as being divided into segments, with a commit point recording which segments can currently be queried. As shown in the figure below:

When new documents are added, they first go into an in-memory index buffer, as shown in the figure below:

When the buffer has accumulated enough data, it is committed under the commit point:

  1. A new segment is created and written to disk.
  2. A new commit point that includes this segment is written to disk.
  3. The disk writes are flushed (fsynced) to make sure the data is physically persisted.

After this, the segment can be searched, as shown in the figure below:
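If you want to see these segments and their status for yourself, Elasticsearch exposes them through the cat segments API. Below is a minimal sketch, assuming a node running locally on the default port 9200 and an index named my_index (both are placeholders, not something from the original article):

```
# List the segments of my_index; the "committed" column shows whether a
# segment has been fsynced to disk, and "searchable" whether it is visible
# to queries. "docs.count" and "docs.deleted" report live and deleted docs.
curl -X GET "localhost:9200/_cat/segments/my_index?v"
```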

At this point you may well ask: what about modifying or deleting existing documents? We said earlier that segments cannot be modified, but we can achieve the same effect another way:

For deletion, a .del file is used: every delete appends an entry to this file. During a search, deleted documents are still matched inside their segments; only before the results are returned is the .del file consulted, and any document marked as deleted is filtered out of the result set.

An update is handled the same way, except that two things happen at once: a new version of the document is written into a new segment, and the old version is marked as deleted. Both versions will be matched by a search, but the old one is filtered out because of its .del entry, so it behaves as if it had really been deleted.
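To make this concrete, here is a minimal sketch of a delete and an update through the REST API, assuming a local node on port 9200, an index called my_index, and a document with id 1 (all placeholders). Neither request rewrites an existing segment; the old version of the document is only marked as deleted and filtered out at search time:

```
# Delete a document: it is marked as deleted, not physically removed
# from its segment.
curl -X DELETE "localhost:9200/my_index/_doc/1"

# Update a document (recent Elasticsearch versions): a new version is
# written, and the previous version is marked as deleted.
curl -X POST "localhost:9200/my_index/_update/1" -H 'Content-Type: application/json' -d'
{
  "doc": { "title": "new title" }
}'

# The docs.deleted column counts documents that are only marked as deleted.
curl -X GET "localhost:9200/_cat/segments/my_index?v&h=segment,docs.count,docs.deleted"
```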

Near-real-time search

With the per-segment approach above, the delay between adding a new document and it becoming searchable is already much shorter — on the order of minutes — but that is clearly still not fast enough. Is there a good way to optimize further?

Before thinking about optimizations, let's look at where the performance bottleneck of the implementation above lies. The main cost is writing to disk: before a segment is added to the commit point, we want it to be fully persisted to disk, so that even if the power goes down no data is lost.

Now that we know the bottleneck is the disk write, is there a good way around it? The approach Lucene takes is to write the new segment into the filesystem cache first; as soon as that write completes, the segment can be queried (see the grey part of the figure below). The segment is then written to disk in the background, and only once that write completes is it added to the commit point:

In this way, the time it takes for new documents to become searchable is greatly reduced.

This write into the filesystem cache is called a refresh. By default, a refresh happens once per second, which is why we say Elasticsearch offers near-real-time search. Of course, not every scenario needs updates this frequent; the refresh interval can be changed with the command below, and setting it to -1 turns automatic refresh off entirely.
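A typical form of that command uses the index settings API; the sketch below assumes a node at localhost:9200 and an index named my_index (both placeholders, not from the original article):

```
# Turn off automatic refresh, for example during a large bulk import.
curl -X PUT "localhost:9200/my_index/_settings" -H 'Content-Type: application/json' -d'
{
  "index": { "refresh_interval": "-1" }
}'

# Restore the default of one refresh per second.
curl -X PUT "localhost:9200/my_index/_settings" -H 'Content-Type: application/json' -d'
{
  "index": { "refresh_interval": "1s" }
}'

# Trigger a refresh manually so recently indexed documents become
# searchable immediately.
curl -X POST "localhost:9200/my_index/_refresh"
```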

Persisting changes

We mentioned earlier that a segment is only added to the commit point after its data has actually been written to disk, so that even an accident such as a power failure cannot lose data. Indeed, every time Elasticsearch starts, it uses the commit point to decide which segments have been committed. But now that we have introduced refresh to speed up query visibility, there is a new problem to solve: how do we make sure that the data sitting in the filesystem cache, which has not yet been flushed to disk, is not lost?

Here we introduce a new concept: the translog, which records every operation performed against Elasticsearch. Sound familiar? Right — almost every database persists changes this way, and Elasticsearch is no exception. With the translog in place, the whole process becomes:

  1. When a document is indexed, it is added to the in-memory buffer and also appended to the translog.
  2. The following steps are the same as before: the data is refreshed into the filesystem cache, becomes searchable, and so on.
  3. As time goes on and the translog grows larger and larger, the whole index is flushed to disk and a new translog is started; once the flush to disk completes, the old translog can be deleted (see the sketch after this list for triggering a flush by hand).
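A minimal sketch of requesting a flush explicitly, again assuming a local node and an index named my_index as placeholders:

```
# Force a flush: buffered segments are fsynced to disk, a new commit point
# is written, and the translog is cleared.
curl -X POST "localhost:9200/my_index/_flush"
```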

Thus, every time Elasticsearch starts, besides restoring state from the commit point, it also replays the corresponding translog once.

As for when this flush happens: besides the case where the translog becomes very large, there is also a fixed interval, generally 30 minutes. That is, even if little translog activity occurs, a flush still runs every 30 minutes by default.
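The "very large" threshold is configurable. In recent Elasticsearch versions the relevant dynamic index setting is index.translog.flush_threshold_size; the exact default depends on the version, so treat the value below purely as an illustration (my_index and the node address are placeholders):

```
# Flush once the translog reaches 1 GB.
curl -X PUT "localhost:9200/my_index/_settings" -H 'Content-Type: application/json' -d'
{
  "index": { "translog": { "flush_threshold_size": "1gb" } }
}'
```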

Segment merging

As time goes by, you will find that the number of segments keeps growing, and maintaining too many segments is clearly not a good idea. So there is a merge process: Elasticsearch runs a background thread that merges the small segments. Some optimization can be done here as well — for example, documents that have been deleted do not need to be carried over into the merged segment; only the final state is kept. How is this merge implemented? Roughly in the following steps (a way to trigger a merge manually is sketched after the steps):

  1. The existing segments are merged into one big segment. While the merge is in progress, this big segment cannot yet be searched.
  2. The big segment is flushed to disk, and the commit point is modified to include the new big segment and to exclude the small segments that were merged into it. The old segments can then be deleted.
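Merging normally runs automatically in the background, but as mentioned it can also be requested explicitly, which is sometimes useful for an index that no longer receives writes. A minimal sketch, with my_index and the local node address as placeholders:

```
# Merge the index down to a single segment; documents that were only marked
# as deleted are dropped during the merge.
curl -X POST "localhost:9200/my_index/_forcemerge?max_num_segments=1"
```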

Summary

That covers the internal implementation details of a shard. I hope it has been helpful.

Copyright notice
This article was created by [Dongge it notes]. Please include a link to the original when reposting. Thank you.
https://cdmana.com/2021/10/20211002145749111t.html
