
Understanding high performance and high concurrency from the root (1): deep into the computer, understanding threads and thread pools

This article was originally published under a different title, 《Chatting about the time cost of TCP connections》. It is included in this collection with the author's approval; please contact the author before reprinting. A few changes have been made.

1、 Introduction to the series

1.1 The purpose of the article

As instant messaging developers, we have long been familiar with the concepts surrounding high performance and high concurrency: thread pools, zero copy, multiplexing, event-driven I/O, epoll, and so on. The framework you use may well advertise these very features: Java's Netty, PHP's workman, Go's nget, and the like. Yet when an interview question or a real engineering problem digs a little deeper and leaves doubts you cannot shake off, you realize that what you have mastered is only the surface.

Going back to basics: what are the underlying principles behind these techniques? How can they be understood easily and effortlessly, yet genuinely? That is what this series, 《Understanding high performance and high concurrency from the root》, wants to share.

1.2 The origin of the article

I (Jack Jiang) have collected and organized a lot of resources and articles on IM, message push, and other instant messaging technologies: from the open-source IM framework MobileIMSDK, to the online edition of the network programming classic 《TCP/IP Detailed Explanation》, to the programmatic guide 《One Beginner Article Is Enough: Develop Mobile IM from Scratch》, and then the network programming series, from shallow to deep: 《Network Programming for Lazy People》, 《Brain-Dead Introduction to Network Programming》, 《High Performance Network Programming》 and 《Unknown Network Programming》.

The deeper I dug into this knowledge, the more I felt how little I actually knew about instant messaging technology. So later, to help developers understand network characteristics (especially those of mobile networks) from the perspective of basic telecommunication technology, I collected and organized the cross-disciplinary series 《Introduction to Zero-Basics Communication Technology for IM Developers》. That series already marks the boundary of network communication knowledge an ordinary instant messaging developer needs; together with the earlier network programming material, it is enough to cover the blind spots in network communication.

For IM system development, network communication knowledge is indeed very important. But let's return to the nature of the technology itself: the techniques that implement network communication, including the thread pools, zero copy, multiplexing and event-driven I/O mentioned above, what is their essence? What are the underlying principles? Answering that is the purpose of this series, and I hope it helps you.

1.3 List of articles

《Understanding high performance and high concurrency from the root (1): deep into the computer, understanding threads and thread pools》 (* this article)
《Understanding high performance and high concurrency from the root (2): deep into the operating system, understanding I/O and zero-copy technology》 (to be released)
《Understanding high performance and high concurrency from the root (3): deep into the operating system, thoroughly understanding I/O multiplexing》 (to be released)
《Understanding high performance and high concurrency from the root (4): deep into the operating system, thoroughly understanding synchronization and asynchrony》 (to be released)
《Understanding high performance and high concurrency from the root (5): how to implement a high-concurrency, high-performance server》 (to be released)

1.4 An overview of this article

As the opening article of the series, this one explains the principles of multithreading and thread pools starting from the CPU level, trying to avoid convoluted technical concepts and to stay easy to understand for readers of all levels.

2、 The author of this article

At the author's request, his real name is withheld and no personal photo is provided.

The author's main technical areas are Internet back ends, high-concurrency and high-performance servers, and search engine technology. His net name and official account are both "Manon's island for survival". Thanks to him for the selfless sharing.

3、 Everything starts with the CPU

You may wonder: why does a discussion of multithreading start with the CPU? The reason is simple: there are no fashionable concepts down there, and you can see the essence of the problem more clearly.

The reality is this: the CPU has no idea what threads or processes are.

The CPU does only two things:

1) fetch an instruction from memory;

2) execute the instruction, then go back to 1).

You see, at this level the CPU really knows nothing about concepts like processes and threads.

The next question: where does the CPU fetch instructions from? The answer: from a register called the Program Counter (PC for short), the well-known program counter. Don't regard the register as anything mysterious here; you can simply think of a register as memory, just with much faster access.

What is stored in the PC register? The memory address of an instruction. Which instruction? The next one the CPU is going to execute.

So who sets the instruction address in the PC register?

It turns out the address in the PC register is, by default, automatically incremented by 1, which of course makes sense, because most of the time the CPU executes instructions one after another in sequence. When an if or else is encountered, that sequential order is broken: while executing such instructions the CPU dynamically changes the value in the PC register according to the result of the computation, so that it can jump to the correct next instruction.
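To make the fetch-execute cycle and the PC's role concrete, here is a toy model in C++. It is only an illustration under assumed conventions (the three-instruction machine ADD/JUMP/HALT is invented for this sketch), not how real hardware is built:

    #include <cstdio>
    #include <vector>

    enum OpCode { ADD, JUMP, HALT };
    struct Instruction { OpCode op; long operand; };

    int main() {
        // "memory" holding the program; a JUMP operand is an instruction address
        std::vector<Instruction> memory = {
            {ADD, 10}, {ADD, 32}, {JUMP, 4}, {ADD, 999}, {HALT, 0}
        };
        long pc  = 0;  // Program Counter: address of the next instruction
        long acc = 0;  // a single accumulator register, purely for illustration
        while (true) {
            Instruction ins = memory[pc];  // 1) fetch the instruction PC points to
            pc += 1;                       //    by default PC simply moves to the next one
            if (ins.op == ADD)  acc += ins.operand;  // 2) execute it
            if (ins.op == JUMP) pc = ins.operand;    //    a jump rewrites PC instead
            if (ins.op == HALT) break;
        }
        std::printf("acc = %ld\n", acc);  // prints 42; the ADD 999 was jumped over
        return 0;
    }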

The smart reader will then ask: how is the initial value of the PC set?

Before answering that, we need to know where the CPU's instructions come from in the first place. They come from memory; the instructions in memory were loaded from an executable program stored on disk; the executable on disk was generated by the compiler; and what does the compiler generate those machine instructions from? The answer: the functions we define.

Note that it is functions: once compiled, a function becomes the instructions the CPU executes. So, naturally, how do we get the CPU to execute a function? Obviously we only need to find the first instruction the function was compiled into; that first instruction is the function's entry point.

Now you should see it: to have the CPU execute a function, we only need to write the address of the first machine instruction of that function into the PC register, and the function we wrote starts being executed by the CPU.
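A small C++ sketch of the idea: in C and C++ a function name essentially stands for the address of its entry point, and calling through that address is exactly what "pointing the CPU at the function" means (printing a function address this way is a common but formally implementation-defined trick):

    #include <cstdio>

    void greet() {
        std::puts("hello from greet()");
    }

    int main() {
        void (*entry)() = &greet;  // the address of the function's entry point
        std::printf("entry address of greet(): %p\n", (void*)entry);
        entry();                   // jump there, i.e. call the function through its address
        return 0;
    }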

You may then ask: what does all this have to do with threads?

4、 From the CPU to the operating system

In the previous section we saw how the CPU works: to make it execute a function, we only need to load the address of the function's first machine instruction into the PC register. So even without an operating system we can get the CPU to execute a program. It is doable, but it is an extremely tedious process.

We would need to:

1) find a suitably sized region in memory and load the program into it;

2) find the entry function, point the PC register at it, and let the CPU start running the program.

These two steps are anything but easy. If programmers had to perform them by hand every time a program ran, they would go crazy. So a clever programmer will naturally want to write a program that carries out these two steps automatically.

Machine instructions must be loaded into memory to be executed, so we need to record the starting address and length of that memory region; we also need to find the entry function's address and write it into the PC register. Think about it: don't we need a data structure to record exactly this information?

Roughly something like this:

    struct *** {
        void* start_addr;   // where in memory the program was loaded
        int   len;          // length of that memory region
        void* start_point;  // address of the entry function, to be written into PC
        ...
    };

Then comes the moment of naming.

This data structure needs a name. What information does it record? It records the runtime state of a program once it has been loaded into memory. What would be a good name for a program that has gone from disk into memory and is running? Let's just call it a process (Process). Our guiding principle for naming is to sound mysterious, in short not to be easily understood by everyone; I call this the "principle of incomprehensibility".

And so the process was born.

The first function the CPU executes also deserves a name. The first function to run sounds important, so let's just call it the main function.

The program that performs the two steps above should be given a name too. Following the "principle of incomprehensibility", this "simple" program is called the operating system (Operating System).

And so the operating system was born; programmers no longer need to load programs by hand in order to run them.

Now we have processes and an operating system; everything looks perfect.

5、 From single core to multi core: how to make full use of multiple cores

One great trait of human beings is that the tinkering never stops as long as life goes on, and so the CPU went from single core to multi core.

Now suppose we want to write a program that makes use of multiple cores. What should we do?

Some of you may say: we already have processes, so just start a few more processes?

That sounds reasonable, but there are several problems:

1) a process occupies memory space (as we saw in the previous section); if multiple processes are based on the same executable, the contents of their memory regions are nearly identical, which is obviously a waste of memory;

2) the tasks a computer handles may be more complicated and involve communication between processes; since each process lives in a different memory address space, inter-process communication has to go through the operating system, which both increases the difficulty of programming and adds system overhead.

So what do we do?

6、 From process to thread

Let's think about it again. A process is, in the end, nothing more than a region of memory holding the machine instructions the CPU executes and the stack information of running functions. To make a process run, we just write the address of the first machine instruction of its main function into the PC register, and off it goes.

The drawback of a process is that it has only one entry function, the main function, so the machine instructions of a process can only be executed by one CPU. Is there a way to let multiple CPUs execute machine instructions belonging to the same process?

Smart as you are, you can probably see it: since we can write the address of the first instruction of the main function into the PC register, what makes other functions any different from main?

The answer is: nothing. The only thing special about main is that it is the first function the CPU executes; beyond that there is nothing special about it. We can point the PC register at main, and we can just as well point the PC register at any other function.

The moment we point the PC register at a function other than main, a thread is born.

At this point our thinking is liberated: a process can have multiple entry functions, which means that machine instructions belonging to the same process can be executed by several CPUs at the same time.

Note that this is a different notion from the process. When a process is created, we find a region of memory to load the program into and then point the CPU's PC register at the main function; in other words, a process has only one flow of execution.

Not any more: now multiple CPUs can, under the same roof (the region of memory occupied by the process), simultaneously execute multiple entry functions belonging to that process. In other words, a single process can now contain multiple flows of execution.

But "flow of execution" sounds a bit too easy to understand, so let's invoke the "principle of incomprehensibility" again and give it a less obvious name: let's call it a thread.

And that is where threads come from.

The operating system maintains a pile of information for each process, recording things such as the process's memory space; call this pile of information dataset A.

Likewise, the operating system maintains a pile of information for each thread, recording things such as the thread's entry function and stack; call this pile of information dataset B.

Obviously dataset B contains less than dataset A. And unlike creating a process, creating a thread does not require searching for a free region of memory, because the thread runs inside the address space of its process, which was already created when the program started; threads themselves are created at runtime (after the process has started), so by the time a thread begins to run this address space already exists and can be used directly. This is why every textbook says creating a thread is faster than creating a process (there are other reasons too, of course).

It is worth noting that with the concept of threads, we only need to create multiple threads after the process has started in order to keep all the CPUs busy, and that is the essence of so-called high performance and high concurrency.

It's that simple: just create an appropriate number of threads.

Another point to note: because threads share the process's memory address space, communication between threads does not need to rely on the operating system. This brings programmers great convenience, but also endless trouble. Most of the problems encountered in multithreading stem precisely from the fact that inter-thread communication is so convenient that it becomes extremely error-prone. The root of those errors is that the CPU has no concept of threads when it executes instructions; the mutual exclusion and synchronization problems of multithreaded programming have to be solved by the programmer. We won't expand on mutual exclusion and synchronization here; most operating systems books explain them in detail.
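A classic illustration of how error-prone that convenience is, as a minimal C++ sketch: two threads incrementing a shared counter without synchronization usually lose updates (a data race), while adding a mutex restores correctness:

    #include <cstdio>
    #include <mutex>
    #include <thread>

    long counter = 0;            // shared by both threads
    std::mutex counter_mutex;

    void add_unsafe() {          // data race: ++ is read-modify-write, not atomic
        for (int i = 0; i < 1000000; ++i) counter++;
    }

    void add_safe() {            // mutual exclusion fixes it, at some cost in speed
        for (int i = 0; i < 1000000; ++i) {
            std::lock_guard<std::mutex> lock(counter_mutex);
            counter++;
        }
    }

    int main() {
        std::thread t1(add_unsafe), t2(add_unsafe);
        t1.join(); t2.join();
        std::printf("without a mutex: %ld (usually less than 2000000)\n", counter);

        counter = 0;
        std::thread t3(add_safe), t4(add_safe);
        t3.join(); t4.join();
        std::printf("with a mutex:    %ld (always 2000000)\n", counter);
        return 0;
    }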

One last point: although the thread illustrations above involved more than one CPU, having multiple cores is not a prerequisite for using multithreading. Threads can be created on a single core as well, because threads are an operating-system-level construct and have nothing to do with how many cores there are; when the CPU executes machine instructions it is not even aware of which thread they belong to. Even with only one CPU, the operating system can make every thread appear to move forward "at the same time" through thread scheduling: the CPU's time slices are handed back and forth among the threads, so multiple threads look as if they were running "simultaneously", although at any given moment only one thread is actually running.

7、 Threads and memory

In the discussion above we saw the relationship between threads and the CPU, namely: point the CPU's PC register at a thread's entry function and the thread starts running. That is why you must specify an entry function when you create a thread.

Whatever programming language you use, creating a thread looks much the same:

    // set DoSomething as the thread's entry function
    thread = CreateThread(DoSomething);
    // let the thread run
    thread.Run();
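For reference, here is a concrete version of that pseudo code using C++11's std::thread (CreateThread and Run above are pseudo names; a std::thread starts running as soon as it is constructed):

    #include <cstdio>
    #include <thread>

    void DoSomething() {
        std::printf("running in a separate thread\n");  // the thread's entry function
    }

    int main() {
        std::thread worker(DoSomething);  // create the thread with DoSomething as its entry
        worker.join();                    // wait for it to finish before main returns
        return 0;
    }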

So what do threads have to do with memory?

We know that a running function produces data such as its parameters, local variables, and the return address, and all of this is stored on the stack. Before the concept of threads appeared, a process had only one flow of execution, so there was only one stack, and at the bottom of that stack sat the process's entry function, that is, the main function.

Suppose main calls funcA, and funcA in turn calls funcB; each call pushes a new frame onto that single stack, as sketched below:
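A minimal C++ sketch of that call chain (each call pushes a frame holding parameters, locals and the return address; each return pops it off again):

    #include <cstdio>

    void funcB(int x) {          // funcB's frame sits on top of funcA's frame
        std::printf("funcB sees %d\n", x);
    }                            // frame popped when funcB returns

    void funcA(int x) {          // funcA's frame sits on top of main's frame
        funcB(x + 1);
    }

    int main() {                 // main's frame sits at the bottom of the stack
        funcA(1);
        return 0;
    }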

So what happens once we have threads?

With threads, a process has multiple execution entries, that is, multiple flows of execution at the same time. If a process with a single flow of execution needs one stack to hold its runtime information, then multiple flows of execution clearly need multiple stacks, one for each flow. In other words, the operating system allocates a stack for every thread inside the process's address space: each thread has its own stack. Being aware of this is crucial.

We can also see that creating threads consumes the process's memory space, which is worth keeping in mind as well.

8、 Use of threads

Now that we have the concept of threads, how do we as programmers actually use them?

From a life-cycle perspective, the tasks a thread handles fall into two types: long tasks and short tasks.

1) Long task (long-lived tasks):

As the name suggests, these are tasks that stay alive for a long time. Take the familiar Word as an example: the text we edit needs to be saved to disk, and writing data to disk is such a task. A good approach is to create a dedicated write thread whose life cycle matches that of the Word process: the write thread is created when Word starts and destroyed when the user closes Word. This is a long task.

This scenario is well suited to creating a dedicated thread for a specific job, and it is the relatively simple case.

Besides long tasks there are also short tasks.

2) Short task (short-lived tasks):

The concept is equally simple: these are tasks whose processing time is very short, such as a single network request or a single database query, which can be finished quickly. Short tasks are most common in all kinds of servers, such as web servers, database servers, file servers and mail servers. This is also the scenario most familiar to people working in the Internet industry, and the one we want to focus on.

This scenario has two characteristics: each task takes a short time to process, and the number of tasks is huge.

What would you do if you were asked to handle this kind of task?

You might think it's simple: whenever the server receives a request, create a thread to handle it, and destroy the thread once it is done. So easy.

This approach is often called thread-per-request, that is, one thread is created per request, as in the sketch below:
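A minimal sketch of the idea, where AcceptRequest and HandleRequest are hypothetical stand-ins for whatever your server framework actually provides:

    #include <thread>

    struct Request { /* ... request data ... */ };

    // Stubs standing in for the real server plumbing (e.g. accept() on a listening socket).
    Request AcceptRequest()        { return Request{}; }
    void    HandleRequest(Request) { /* the short task itself */ }

    void ServeForever() {
        while (true) {
            Request r = AcceptRequest();
            std::thread t(HandleRequest, r);  // a brand-new thread for every request...
            t.detach();                       // ...which dies as soon as the task finishes
        }
    }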

For long tasks this approach works well, but for a huge number of short tasks it is easy to implement yet has clear drawbacks.

Specifically:

1) as the earlier sections showed, a thread is an operating-system concept (user-mode thread implementations, coroutines and the like are not discussed here), so creating one requires the operating system's help, and the operating system needs time to create and destroy threads;

2) each thread needs its own independent stack, so creating a large number of threads consumes excessive memory and other system resources.

It is as if you were a factory owner (a pleasant thing to imagine) with plenty of orders in hand: every time an order arrives you recruit a batch of workers; the product is simple, so the workers finish quickly; once that batch of orders is done you dismiss the workers you took such pains to hire; and when a new order comes in you go through all the trouble of recruiting again, working for 5 minutes and hiring for 10 hours. Unless you are determined to run the business into the ground, you probably wouldn't do it this way.

A better strategy is to recruit a group of workers and keep them around: they handle orders when orders come in, and stay idle when there are none.

This is where the thread pool comes from.

9、 From multithreading to thread pool

The idea of a thread pool is very simple: create a batch of threads and never release them. Tasks are submitted to these threads for processing, so threads no longer need to be created and destroyed over and over; and because the number of threads in the pool is usually fixed, it won't consume too much memory either. The idea here is reuse and controllability.

10、 How thread pools work

Some of you may ask: how are tasks submitted to the thread pool, and how are these tasks handed to the threads in the pool?

Obviously, the queue from your data structures course is a natural fit for this scenario: the parties submitting tasks are the producers, and the threads consuming tasks are the consumers. This is in fact the classic producer-consumer problem.

Now you know why operating systems courses teach this problem and why interviewers ask about it: if you don't truly understand the producer-consumer problem, you essentially cannot write a thread pool correctly.

Space does not allow a detailed treatment of the producer-consumer problem here; any operating systems textbook will give you the answer. Instead, let's look at what a task submitted to a thread pool generally looks like.

In general, a task submitted to a thread pool consists of two parts:

1) the data to be processed;

2) the function that processes the data.

In pseudo code:

    struct task {
        void* data;      // the data the task carries
        handler handle;  // the function that processes the data
    };

(Note: you can also read the struct in the code as a class, that is, as an object.)

The threads in the pool block on the queue. When a producer writes data into the queue, one thread in the pool is woken up; it takes the structure (or object) above off the queue, uses the data in the structure (or object) as the argument, and calls the processing function.

In pseudo code:

    while(true) {
        struct task* task = GetFromQueue();  // take a task off the queue, blocking if it is empty
        task->handle(task->data);            // process the data with the task's handler
    }

That is the core of a thread pool.

Understand this and you understand how a thread pool works.
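Putting the two pieces of pseudo code together, here is a minimal C++11 sketch of such a pool: a fixed set of worker threads blocking on a task queue, the classic producer-consumer arrangement (shutdown and error handling are deliberately omitted to keep it short):

    #include <condition_variable>
    #include <functional>
    #include <mutex>
    #include <queue>
    #include <thread>
    #include <vector>

    class ThreadPool {
    public:
        explicit ThreadPool(size_t n) {
            for (size_t i = 0; i < n; ++i)
                workers_.emplace_back([this] { WorkerLoop(); });
        }

        // Producer side: submit "data plus the function that processes it" as one task.
        void Submit(std::function<void()> task) {
            {
                std::lock_guard<std::mutex> lock(mutex_);
                tasks_.push(std::move(task));
            }
            cv_.notify_one();  // wake one worker blocked on the empty queue
        }

    private:
        // Consumer side: each worker blocks until a task appears, then runs it.
        void WorkerLoop() {
            while (true) {
                std::function<void()> task;
                {
                    std::unique_lock<std::mutex> lock(mutex_);
                    cv_.wait(lock, [this] { return !tasks_.empty(); });
                    task = std::move(tasks_.front());
                    tasks_.pop();
                }
                task();  // the equivalent of task->handle(task->data)
            }
        }

        std::vector<std::thread> workers_;
        std::queue<std::function<void()>> tasks_;
        std::mutex mutex_;
        std::condition_variable cv_;
    };

Usage is then as simple as constructing ThreadPool pool(4); and calling pool.Submit(...) for each incoming piece of work.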

11、 The number of threads in the thread pool

Now we have a thread pool, but how many threads should it contain?

Think about that yourself before reading on; if you have made it this far, at least you are not asleep.

Too few threads and the CPU cannot be fully utilized; too many and performance drops instead, because of excessive memory consumption, the overhead of thread switching, and so on. So the number can be neither too large nor too small. What should it be?

To answer that, you need to know what kinds of tasks the thread pool handles. Didn't we just say there are two kinds, long tasks and short tasks? That was from the life-cycle perspective. From the perspective of the resources a task needs there are also two kinds: CPU intensive and I/O intensive.

1) CPU intensive:

CPU-intensive means a task can be processed without relying on external I/O, such as scientific computing or matrix operations. In this case, as long as the number of threads is roughly equal to the number of cores, the CPU can be fully utilized.

2) I/O intensive:

These tasks spend relatively little time computing; most of their time goes to waiting on, say, disk I/O or network I/O.

This case is a bit more complicated. You need a performance profiling tool to measure the time a task spends waiting on I/O, written here as WT (wait time), and the CPU time it consumes, written here as CT (computing time). For an N-core system, the appropriate number of threads is then roughly N * (1 + WT/CT). Suppose the I/O wait time equals the computing time; then you need roughly 2N threads to fully utilize the CPU. Note that this is only a theoretical value; the real setting needs to be tuned against your actual business scenario. A small sketch of the arithmetic follows below.
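As a back-of-the-envelope sketch (the WT and CT values here are assumptions standing in for numbers you would obtain from profiling):

    #include <cstdio>
    #include <thread>

    int main() {
        unsigned n  = std::thread::hardware_concurrency();  // number of cores, N
        double   wt = 5.0;  // assumed I/O wait time per task, from profiling
        double   ct = 5.0;  // assumed CPU time per task, from profiling
        unsigned threads = static_cast<unsigned>(n * (1 + wt / ct));
        std::printf("suggested pool size: %u for %u cores\n", threads, n);
        return 0;
    }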

Of course, fully utilizing the CPU is not the only thing to consider. As the number of threads grows, memory footprint, system scheduling, the number of open files, the number of open sockets, the number of open database connections and so on all need to be taken into account.

So there is no universal formula; analyze your specific situation.

12、 Thread pools are not a cure-all

A thread pool is just one way of using multiple threads, so it cannot escape the problems of multithreading itself, such as deadlocks and race conditions. Here again you can consult operating systems material for the answers; the fundamentals really do matter, old friend.

13、 Best practices for using thread pools

The thread pool is a powerful weapon in a programmer's hands; you can find thread pools in the servers of virtually every Internet company.

But before using one, you need to think about:

1) fully understand your tasks: are they long or short, CPU intensive or I/O intensive? If you have both kinds, a better approach may be to put them into separate thread pools, which also makes it easier to choose the number of threads for each;

2) if a task in the pool performs I/O, be sure to set a timeout on that operation, otherwise the thread processing the task may block forever;

3) tasks in the pool had better not wait synchronously for the results of other tasks in the same pool, otherwise the pool can starve or even deadlock.

14、 Summary of this article

In this article we started from the CPU and worked our way up to the everyday thread pool, from bottom to top, from hardware to software.

Note that this article deliberately avoids any particular programming language. A thread is not a language-level concept (still leaving user-level threads aside), but once you truly understand threads, you can use multithreading in any language. What you need to grasp is the Tao, the principles; the techniques come after that.

I hope this article will help you understand threads and thread pools .

The next article will cover another key technology that works hand in hand with thread pools to achieve high performance and high concurrency: 《Understanding high performance and high concurrency from the root (2): deep into the operating system, understanding I/O and zero-copy technology》. Stay tuned.


Copyright notice
This article was created by [JackJiang]. Please include a link to the original when reprinting. Thank you.
https://cdmana.com/2020/12/20201225111921843P.html
