
A Deep Dive into Elasticsearch Hot Threads (hot_threads)

1、 Where the questions came from

Question 1: Experts, is there any article that explains the output of GET /_nodes/hot_threads? The request returns a pile of stack traces that I can't make sense of...

Question 2: Only one machine in the ES cluster has very high CPU usage, while IO and heap_mem are both normal. What is going on? I checked hot_threads and got a wall of output. SOS

(Source: the WeChat group of the author's Elasticsearch Knowledge Planet community)

Hence this article.

2、 What does hot_threads do?

In real business scenarios, when a cluster responds more slowly than usual and CPU usage is high, we need to troubleshoot and find the root cause so the cluster can go back to running "as smooth as silk".

Elasticsearch provides a way to monitor hot threads to help understand such problems.

In Java, a hot thread is a thread that uses a lot of CPU and takes a long time to execute.

The most common way to investigate these problems is the hot_threads API:

GET /_nodes/hot_threads

GET /_nodes/<node_id>/hot_threads

The Hot Threads API reports, from the CPU perspective, which parts of the Elasticsearch code are hot spots, or shows where the cluster is currently stuck for some reason.

3、 Parameters supported by hot_threads

  • ignore_idle_threads

(Optional, Boolean) If true, known idle threads are filtered out (for example, threads waiting in a socket select or fetching a task from an empty queue).

The default is true.

  • interval

(Optional, time units) Interval between the first and the second sampling of threads.

The default is 500 milliseconds.

  • snapshots

(Optional, integer) Number of stack trace samples to take. A stack trace is the sequence of nested method calls at a specific point in time.

The default is 10.

  • threads

(Optional, integer) Number of "hottest" threads to report, ranked by the metric selected with the type parameter.

The hottest threads are often where our problem lies.

The default is 3, that is, the top 3 hot threads are returned.

  • master_timeout

(Optional, time units) How long to wait for a connection to the master node.

If no response is received before the timeout expires, the request fails with an error.

The default is 30 seconds.

  • timeout

(Optional, time units) How long to wait for a response.

If no response is received before the timeout expires, the request fails with an error.

The default is 30 seconds.

  • type

(Optional, string) The type of activity to sample.

The available options are:

1) block: time the thread spent in the blocked state.

2) cpu: CPU time consumed by the thread.

3) wait: time the thread spent in the waiting state.

To learn more about thread states, see:

https://docs.oracle.com/javase/6/docs/api/java/lang/Thread.State.html

The default is cpu.

4、 A practical hot_threads example

Putting the parameters together, here is a real example. The following command tells Elasticsearch to examine threads in the WAITING state, using a one-second sampling interval.

GET /_nodes/hot_threads?type=wait&interval=1s
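The same request path can also be assembled programmatically. The following is a minimal Python sketch, not an official client: the helper name hot_threads_path is hypothetical, and the set of accepted type values is simply the three options listed in section 3.

```python
from urllib.parse import urlencode

# The three `type` values listed in section 3 of this article.
VALID_TYPES = {"cpu", "wait", "block"}


def hot_threads_path(node_id=None, **params):
    """Build the request path for the hot_threads API (hypothetical helper).

    node_id is optional; without it, all nodes are queried.
    params are passed through as query-string parameters,
    for example type="wait", interval="1s", threads=5.
    """
    sample_type = params.get("type", "cpu")
    if sample_type not in VALID_TYPES:
        raise ValueError(f"unsupported type: {sample_type!r}")

    if node_id is None:
        path = "/_nodes/hot_threads"
    else:
        path = f"/_nodes/{node_id}/hot_threads"

    query = urlencode(params)
    return f"{path}?{query}" if query else path


print(hot_threads_path(type="wait", interval="1s"))
```

The resulting path can be sent with any HTTP client to the cluster's HTTP address. Remember that the response is plain text, not JSON.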

5、 How the hot_threads API works

Unlike other APIs, which return JSON, the Hot Threads API returns formatted text made up of several distinguishable parts. This is why the question at the start of the article described the output as "a pile of stack traces that I can't make sense of".

Before looking at the returned stack information, let's look at the logic behind the Hot Threads API.

Elasticsearch takes all running threads and collects information about each one: the CPU time it has consumed, the number of times it has been blocked or waiting, how long it was blocked or waiting, and so on.

It then waits for a specific interval (set by the interval parameter), collects the same information again, and sorts the threads in descending order by the time each one spent.

Note that the time in question is the time for the operation type given by the type parameter.

Next, Elasticsearch analyzes the first N threads (where N is the number given by the threads parameter).

For these threads, Elasticsearch takes a snapshot of the stack trace every few milliseconds (the number of snapshots is set by the snapshots parameter).

Finally, the stack traces are grouped to show how the thread's state changes. This grouped output is the result we see when we call the API.
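The "sample twice, take the difference, sort descending, keep the top N" logic described above can be sketched in a few lines of Python. This is only a simplified illustration with made-up numbers, not Elasticsearch's actual implementation. The function name hottest_threads and the thread names are hypothetical.

```python
def hottest_threads(first_sample, second_sample, top_n=3):
    """Rank threads by the time they accumulated between two samples.

    Each sample maps a thread name to its cumulative time in milliseconds
    for the chosen type (CPU, wait, or block time).
    """
    deltas = {
        name: second_sample[name] - first_sample[name]
        for name in second_sample
        if name in first_sample  # skip threads that appeared mid-interval
    }
    ranked = sorted(deltas.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]


# Made-up cumulative CPU times (ms), sampled 500 ms apart.
interval_ms = 500
before = {"search-1": 1000.0, "write-1": 400.0, "merge-1": 250.0, "refresh-1": 90.0}
after = {"search-1": 1390.0, "write-1": 520.0, "merge-1": 260.0, "refresh-1": 91.0}

for name, delta in hottest_threads(before, after, top_n=3):
    share = delta / interval_ms
    print(f"{share:.1%} ({delta}ms out of {interval_ms}ms) cpu usage by thread '{name}'")
```

Only the threads selected in this step would then have their stack traces sampled, which is why the snapshots and threads parameters together determine how much text comes back.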

With the hot_threads API and its parameters now tied together, you should have a general understanding of hot_threads.

Still not sure how to read the returned results? Don't worry, the interpretation follows.

6、 Results returned by the hot_threads API

Now we finally arrive at the results returned by the hot_threads API.


6.1 The first part of the response

The first part contains basic information about the node, as shown below:

 {Data-(110.188)-1}{67A1DwgCR_eM5eFS-6MR1Q}{qTPWEpF-Q4GTZIlWr3qUqA}{10.6.110.188}{10.6.110.188:9301}{dil}

From this information we can tell which Elasticsearch node the hot threads belong to. This is very convenient when the hot threads API call involves more than one node.
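When the output covers many nodes, it can help to split this header line into fields. The sketch below does so with a regular expression. The field names in FIELD_NAMES are assumptions made for illustration (the source does not name them), so check them against the node information of your own Elasticsearch version.

```python
import re

# Assumed field order, for illustration only; not confirmed by the article.
FIELD_NAMES = ["name", "node_id", "ephemeral_id", "host", "transport_address", "roles"]


def parse_node_header(line):
    """Split a '{a}{b}{c}...' node header line into named fields."""
    values = re.findall(r"\{([^{}]*)\}", line)
    return dict(zip(FIELD_NAMES, values))


header = (
    "{Data-(110.188)-1}{67A1DwgCR_eM5eFS-6MR1Q}{qTPWEpF-Q4GTZIlWr3qUqA}"
    "{10.6.110.188}{10.6.110.188:9301}{dil}"
)
node = parse_node_header(header)
print(node["name"], node["transport_address"], node["roles"])
```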

6.2 The second part of the response

The next few lines can be broken down into subsections.

6.2.1 Breaking down the first line

78.4% (391.7ms out of 500ms) cpu usage by thread 'elasticsearch[Data-(110.188)-1][search][T#38]'
  • [search]: indicates that the thread belongs to the search thread pool.

  • 78.4%: the thread named search accounted for 78.4% of the CPU time in the sampled interval (391.7ms out of 500ms).

  • cpu usage: indicates that the sampling type is cpu, so the figure is the thread's CPU usage.

  • block usage: shown instead when type=block; the usage figure for time the thread spent in the blocked state.

  • wait usage: shown instead when type=wait; the usage figure for time the thread spent in the waiting state.

Note: the thread name matters here, because it lets us guess which Elasticsearch feature is causing the problem.

From the example above, we can draw a preliminary conclusion that the search thread is using a lot of CPU.

In practice there are other threads besides search, including:

  • recovery_stream: used for recovery module events

  • cache: used for cache events

  • merge: used for segment merging

  • index: used for data indexing (writes), and so on.
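The summary line from this section can be picked apart mechanically, which is handy when scanning a long response for one thread pool. The following Python sketch parses the sample line shown above. The regular expression and the helper name parse_thread_line are illustrative assumptions based only on that one sample, and the thread-name layout elasticsearch[node][pool][T#n] is likewise inferred from it.

```python
import re

# Pattern inferred from the single sample line in this article.
THREAD_LINE = re.compile(
    r"(?P<percent>[\d.]+)% "
    r"\((?P<used>[\d.]+)(?P<used_unit>[a-z]+) out of "
    r"(?P<interval>[\d.]+)(?P<interval_unit>[a-z]+)\) "
    r"(?P<type>cpu|block|wait) usage by thread '(?P<thread>[^']+)'"
)


def parse_thread_line(line):
    """Extract percentage, timings, type, and pool from a thread summary line."""
    match = THREAD_LINE.search(line)
    if match is None:
        raise ValueError(f"not a thread summary line: {line!r}")
    info = match.groupdict()
    info["percent"] = float(info["percent"])
    # Bracketed parts of the thread name; the pool is assumed to be second to last.
    parts = re.findall(r"\[([^\]]*)\]", info["thread"])
    info["pool"] = parts[-2] if len(parts) >= 2 else None
    return info


sample = (
    "78.4% (391.7ms out of 500ms) cpu usage by thread "
    "'elasticsearch[Data-(110.188)-1][search][T#38]'"
)
info = parse_thread_line(sample)
print(info["pool"], info["type"], info["percent"])
```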

6.2.2 Breaking down the second part

The next part of the Hot Threads API response starts with the following information:

5/10 snapshots sharing following 35 elements

As shown above, the thread summary line is followed by stack trace information.

In our example:

  • 5/10: 5 of the 10 snapshots taken share the same stack trace.

In most cases this means that the current thread spent about half of the sampled time in the same part of the Elasticsearch code.

7、 Summary

Troubleshooting high CPU usage in Elasticsearch usually relies on the hot_threads API, or on top plus jstack, to locate the offending thread stack.
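The "half of the sampled time" reading comes straight from the fraction in the snapshot line. A minimal Python sketch of that arithmetic follows. The helper name parse_snapshot_line is hypothetical, and the pattern is based only on the sample line above.

```python
import re

SNAPSHOT_LINE = re.compile(r"(\d+)/(\d+) snapshots sharing following (\d+) elements")


def parse_snapshot_line(line):
    """Return (shared, total, elements) from a snapshot summary line."""
    match = SNAPSHOT_LINE.search(line)
    if match is None:
        raise ValueError(f"not a snapshot summary line: {line!r}")
    shared, total, elements = (int(group) for group in match.groups())
    return shared, total, elements


shared, total, elements = parse_snapshot_line("5/10 snapshots sharing following 35 elements")
print(f"{shared} of {total} snapshots ({shared / total:.0%}) share {elements} stack frames")
```

A high fraction suggests the thread stayed in one code path for most of the sampling window, so the stack frames that follow are the place to start reading.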

This article has explained in detail the use cases of the hot_threads API, how to use it, and how to read the results it returns. I hope it helps.

You are welcome to leave a comment with your own understanding of hot threads or your practical experience.

If you have questions like the ones at the start of the article and the official documentation did not make things clear, you are also welcome to leave a message. We will write follow-up articles based on the comments received.

References:

《Mastering Elasticsearch》

《Elasticsearch 7.0 Cookbook》

https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-nodes-hot-threads.html



Copyright notice
This article was written by [Mingyi world]. Please include a link to the original when reposting. Thank you.
