Well, the interview series was supposed to be over, but it turns out there's still plenty left. Hmm. I realized I never wrote up the Java fundamentals, so this is a sequel — consider it part one of the sequel, with more to come.
What's the difference between a process and a thread?
A process is a single execution of a program. It is an independent unit of resource allocation and scheduling, and its purpose is to let programs execute concurrently, improving resource utilization and throughput.

Because the process is the basic unit of resource allocation and scheduling, creating, destroying, and switching processes incurs substantial time and space overhead, so the number of processes can't grow too large. A thread is a smaller unit of independent execution than a process — an entity within a process — and it reduces the time and space cost of concurrent execution, giving the operating system better concurrency.

Threads own essentially no system resources of their own, only the few things indispensable at run time, such as a program counter, registers, and a stack; the process owns the heap and the stack.
Do you know how synchronized works?
synchronized is the atomic built-in lock that Java provides. This built-in lock, invisible to the user, is also known as the monitor lock. When you use synchronized, the compiler emits monitorenter and monitorexit bytecode instructions around the block, and the lock relies on the operating system's underlying mutex for its implementation. Its main job is to make operations atomic and to solve the memory-visibility problem for shared variables.

Executing monitorenter attempts to acquire the object's lock. If the object is not locked, or the current thread already holds the lock, the lock counter is incremented by 1; meanwhile, other threads competing for the lock enter the waiting queue.

Executing monitorexit decrements the counter by 1. When the counter reaches 0, the lock is released and the threads in the waiting queue resume competing for it.

synchronized is an exclusive lock: once one thread holds it, every other thread must wait for that thread to release it before they can acquire it. And because Java threads map one-to-one onto native operating-system threads, blocking or waking a thread requires a switch from user mode to kernel mode, and that transition is very expensive.

In terms of memory semantics, acquiring the lock clears the shared variables from working memory and re-reads them from main memory; releasing the lock writes the shared variables in working memory back to main memory.
Honestly, most of the time I think mentioning monitorenter is enough, but for a clearer picture, let's be more specific.

If you dig further into the source, synchronized actually involves two queues, waitSet and entryList:
- When multiple threads enter a synchronized block, they first enter the entryList.
- When a thread acquires the monitor lock, the monitor's owner is set to that thread and the counter is incremented by 1.
- If the thread calls wait(), it releases the lock: the owner is set to null, the counter is decremented by 1, and the thread enters the waitSet to wait to be woken up. After notify() or notifyAll(), it moves back to the entryList to compete for the lock again.
- When the thread finishes executing, it likewise releases the lock: the counter is decremented by 1 and the owner is set to null.
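Tying the steps together, here's a minimal one-slot buffer that exercises exactly this entryList/waitSet flow. The class and its names are illustrative, not from any library:

```java
// A minimal bounded-buffer sketch using synchronized + wait/notify.
public class OneSlotBuffer {
    private Object slot;            // shared state guarded by this object's monitor

    public synchronized void put(Object value) throws InterruptedException {
        while (slot != null) {      // buffer full: release the monitor, enter the waitSet
            wait();
        }
        slot = value;
        notifyAll();                // waiting threads move back to the entryList
    }

    public synchronized Object take() throws InterruptedException {
        while (slot == null) {      // buffer empty: wait to be notified
            wait();
        }
        Object value = slot;
        slot = null;
        notifyAll();
        return value;
    }

    public static void main(String[] args) throws InterruptedException {
        OneSlotBuffer buffer = new OneSlotBuffer();
        Thread producer = new Thread(() -> {
            try { buffer.put("hello"); } catch (InterruptedException ignored) { }
        });
        producer.start();
        System.out.println(buffer.take()); // prints "hello"
        producer.join();
    }
}
```

Note the `while` (not `if`) around `wait()`: a thread re-entering from the waitSet must re-check the condition, because another thread may have won the competition in the entryList first.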
Do you know about the lock optimization mechanisms?

Starting with JDK 1.6, synchronized has been continuously optimized, so in some cases it's no longer such a heavyweight lock. The optimization mechanisms include adaptive spinning, spin locks, lock elimination, lock coarsening, lightweight locks, and biased locks.

From low to high, the lock states are: unlocked -> biased lock -> lightweight lock -> heavyweight lock. Upgrading goes from low to high; under certain conditions, downgrading is also possible.
Spin locks: most of the time a lock is held only briefly and shared variables stay locked for a very short time, so there's no need to suspend the thread — the back-and-forth context switches between user mode and kernel mode would seriously hurt performance. Spinning means letting the thread run a busy loop, which you can think of as doing nothing, to avoid the transition from user mode to kernel mode. Spin locks can be enabled with -XX:+UseSpinning; the default spin count is 10 and can be configured with -XX:PreBlockSpin.

Adaptive spinning: an adaptive lock is an adaptive spin lock. The spin time is not fixed; it is determined by the previous spin times on the same lock and the state of the lock's holder.

Lock elimination: lock elimination means that when the JVM detects a synchronized block in which no data race is possible — that is, locking is unnecessary — it removes the lock.

Lock coarsening: lock coarsening means that when a sequence of operations repeatedly locks the same object, the lock's synchronization scope is extended to cover the whole operation sequence.

Biased locks: when a thread acquires the lock on entering a synchronized block, the ID of the thread the lock is biased toward is stored in the object header and in the lock record in the stack frame. From then on, that thread can enter the block again without using CAS to lock and unlock. A biased lock always favors the first thread that acquired it: if no other thread ever competes for the lock, the holder never needs to synchronize at all. Conversely, when another thread does compete for the biased lock, the holding thread revokes and releases the bias. Biased locking can be enabled with -XX:+UseBiasedLocking.

Lightweight locks: the JVM stores some lock flag bits in the object header. When code enters a synchronized block, the JVM uses CAS to try to acquire the lock; if the update succeeds, the state bits in the object header are marked as a lightweight lock. If the update fails, the current thread spins trying to acquire the lock.
The full lock-escalation process is quite complicated; I've tried to strip out the inessential steps and describe the upgrade mechanism simply.

In short: a biased lock just compares the biased thread ID in the object header, so not even CAS is needed; a lightweight lock works mainly by CAS-updating the lock record in the object header plus spinning; and a heavyweight lock blocks every thread except the one holding the lock.
What exactly does the object header contain?

In the HotSpot VM we commonly use, an object's layout in memory actually has three parts:
- Object header
- Instance data
- Alignment padding
The object header itself has two parts. Since the contents of the Mark Word change with the lock flag bits, I'll just describe its storage structure here:

- The object's own runtime data, known as the Mark Word — this is the key to lightweight and biased locking. Its contents include the object's hashcode, generational GC age, lightweight-lock pointer, heavyweight-lock pointer, GC mark, biased-lock thread ID, and biased-lock timestamp.
- A type pointer, i.e., a pointer to the class metadata, used to determine which class the object belongs to.

If the object is an array, the header also stores the array length.
Speaking of locks — how does ReentrantLock work, and how is it different from synchronized?

Compared with synchronized, ReentrantLock requires you to acquire and release the lock explicitly. On the JDK 7 and JDK 8 versions in common use today, the efficiency of ReentrantLock and synchronized is basically on par. Their main differences are:

- Interruptible waiting: when the thread holding the lock doesn't release it for a long time, a waiting thread can choose to give up waiting and do something else instead.
- Fair locking: synchronized and ReentrantLock both default to unfair locks, but ReentrantLock can be made fair via a constructor parameter — just note that fair locks can cause a sharp drop in performance.
- Multiple conditions: a ReentrantLock can be bound to several Condition objects at the same time.
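A small sketch of those three differences in code, using only the standard ReentrantLock and Condition APIs (the class name is made up):

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Demonstrates explicit lock/unlock, the fairness flag, and multiple conditions.
public class LockDemo {
    public static void main(String[] args) throws InterruptedException {
        ReentrantLock lock = new ReentrantLock(true);  // true = fair lock
        Condition notEmpty = lock.newCondition();      // one lock can hand out
        Condition notFull  = lock.newCondition();      // several independent conditions

        lock.lockInterruptibly();  // interruptible waiting, unlike synchronized
        try {
            System.out.println("fair? " + lock.isFair()); // prints "fair? true"
            notEmpty.signalAll();  // the lock must be held to use its conditions
        } finally {
            lock.unlock();         // release must be explicit, always in finally
        }
    }
}
```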
ReentrantLock is built on AQS (AbstractQueuedSynchronizer, the abstract queued synchronizer). Don't bother asking — I know what's coming next, so I'll explain how AQS works.

AQS internally maintains a state field. Acquiring the lock means modifying state via CAS (compare-and-swap): if it's successfully set from 0 to 1 and the current thread is recorded as the owner, the acquisition succeeds. Once one thread holds the lock, other threads spin briefly and are then blocked, entering the wait queue. When the thread holding the lock releases it, it resets state to 0, clears the owner thread, and wakes up the threads in the blocked queue.
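That state/CAS flow can be sketched by building a toy exclusive lock directly on AQS. This mirrors the Mutex example in the AQS Javadoc, heavily simplified — no reentrancy and no owner checks, so it's a sketch, not ReentrantLock's actual implementation:

```java
import java.util.concurrent.locks.AbstractQueuedSynchronizer;

// A minimal exclusive lock on AQS: state 0 = free, 1 = held.
public class SimpleMutex {
    private static class Sync extends AbstractQueuedSynchronizer {
        @Override
        protected boolean tryAcquire(int arg) {
            if (compareAndSetState(0, 1)) {            // CAS state 0 -> 1
                setExclusiveOwnerThread(Thread.currentThread()); // record the owner
                return true;
            }
            return false; // AQS parks the thread into the wait queue on failure
        }
        @Override
        protected boolean tryRelease(int arg) {
            setExclusiveOwnerThread(null); // clear the owner...
            setState(0);                   // ...then reset state; AQS wakes a waiter
            return true;
        }
    }
    private final Sync sync = new Sync();
    public void lock()   { sync.acquire(1); }
    public void unlock() { sync.release(1); }

    public static void main(String[] args) throws InterruptedException {
        SimpleMutex mutex = new SimpleMutex();
        int[] counter = {0};
        Runnable task = () -> {
            for (int i = 0; i < 10_000; i++) {
                mutex.lock();
                try { counter[0]++; } finally { mutex.unlock(); }
            }
        };
        Thread t1 = new Thread(task), t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join();  t2.join();
        System.out.println(counter[0]); // prints 20000: increments never interleave
    }
}
```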
How does CAS work?

CAS stands for CompareAndSwap — compare and exchange. It relies on processor instructions to guarantee the atomicity of the operation, and it involves three operands:

- V: the memory address of the variable
- A: the expected old value
- B: the new value to be set

When the CAS instruction executes, V is updated to B only if the current value at V equals A; otherwise, no update is performed.
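A quick demonstration with AtomicInteger, whose compareAndSet is exactly this V/A/B operation:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CasDemo {
    public static void main(String[] args) {
        AtomicInteger v = new AtomicInteger(0); // V: the variable
        boolean first  = v.compareAndSet(0, 1); // A = 0 matches, so V becomes B = 1
        boolean second = v.compareAndSet(0, 2); // A = 0 no longer matches (V is 1): no update
        System.out.println(first + " " + second + " " + v.get()); // prints "true false 1"
    }
}
```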
So does CAS have any drawbacks?

CAS has three main drawbacks:

The ABA problem: during a CAS update, the value read is A, and when the assignment is about to happen it is still A — but in between, A may have been changed to B and then back to A. This loophole in CAS updates is called ABA. That said, in most scenarios ABA doesn't affect the final outcome of the concurrent operation.

Java provides AtomicStampedReference to solve this. It adds an expected-stamp and an update-stamp field, so an update checks not only the value but also that the current stamp equals the expected stamp; only if both match does the update go through.

Long spins are expensive: if a spinning CAS keeps failing for a long time, it imposes a heavy cost on the CPU.

Atomicity is guaranteed for only one shared variable: CAS can make an operation on a single shared variable atomic, but not an operation spanning several. For multiple variables, you can wrap them in one object handled via AtomicReference, or fall back to a lock such as synchronized.
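Here's the ABA fix from above in code, using AtomicStampedReference (the values are interned string literals, since the class compares references with ==):

```java
import java.util.concurrent.atomic.AtomicStampedReference;

public class AbaDemo {
    public static void main(String[] args) {
        AtomicStampedReference<String> ref = new AtomicStampedReference<>("A", 0);

        // Another thread performs A -> B -> A, bumping the stamp each time.
        ref.compareAndSet("A", "B", 0, 1);
        ref.compareAndSet("B", "A", 1, 2);

        // The value is "A" again, but the stamp exposes the intermediate change:
        boolean swapped = ref.compareAndSet("A", "C", 0, 1); // expected stamp 0, actual 2
        System.out.println(swapped + " " + ref.getStamp());  // prints "false 2"
    }
}
```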
OK — explain how HashMap works?

HashMap is built mainly from an array plus linked lists, and it is not thread safe. The core points are the put flow for inserting data, the get flow for lookups, and resizing. The main changes between JDK 1.7 and 1.8 are the switch from head insertion to tail insertion — head insertion could easily produce an infinite loop in HashMap's linked lists — and 1.8's addition of red-black trees to improve performance.
The put flow

When inserting an element into the map, the key is hashed first and then ANDed with the array length minus 1 ((n - 1) & hash). The capacity is always a power of 2, so this is equivalent to taking the modulus, but the bit operation is more efficient. Having located the slot in the array: if it's empty, the entry is stored directly; otherwise the keys are compared — an equal key means the value is overwritten, otherwise the entry is appended to the tail of the linked list. If the list length exceeds 8, it is converted to a red-black tree. Finally, if the element count exceeds the default capacity times the load factor — 16 * 0.75 = 12 — the map is resized.
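A tiny sketch of why (n - 1) & hash can stand in for the modulus when n is a power of two (this equivalence holds for non-negative hashes):

```java
// When n is a power of two, n - 1 is a mask of all-ones low bits,
// so (n - 1) & hash keeps exactly hash mod n.
public class IndexDemo {
    public static void main(String[] args) {
        int n = 16; // HashMap's default capacity, always a power of two
        for (int hash : new int[]{7, 16, 35, 123456}) {
            int byMask = (n - 1) & hash;
            int byMod  = hash % n;
            System.out.println(hash + " -> " + byMask + " == " + byMod);
        }
    }
}
```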
The get flow

Lookup is comparatively easy: compute the hash, index into the array, and then — if the bin is a red-black tree, search the tree; if it's a linked list, traverse the list.

The resize flow

Resizing means recomputing each key's hash and copying the data into a new, larger array.
How do you use a Map in a multithreaded environment? Do you know ConcurrentHashMap?

In a multithreaded environment you can use Collections.synchronizedMap for synchronized locking, or use Hashtable, but that style of coarse synchronization clearly doesn't measure up; ConcurrentHashMap is much better suited to high-concurrency scenarios.

ConcurrentHashMap changed a great deal between JDK 1.7 and 1.8. Version 1.7 implemented segmented locking with Segment + HashEntry; 1.8 abandoned Segment in favor of CAS + synchronized + Node, and also added red-black trees to avoid the performance problems of overly long linked lists.

Segmented locking in 1.7

Structurally, the 1.7 ConcurrentHashMap uses a segmented-lock mechanism. It contains a Segment array; Segment extends ReentrantLock and contains a HashEntry array; and HashEntry is itself a linked-list node, able to store a key and value and point to the next node.

In effect, each Segment is its own HashMap. The default Segment count is 16, which means 16 threads can write concurrently, and Segments don't affect one another.
The put flow

The overall flow is very similar to HashMap's, except that it first locates the Segment and then operates under its ReentrantLock. I've simplified the steps, since they're basically the same as HashMap's:

- Compute the hash and locate the Segment; if the Segment is empty, initialize it first.
- Acquire the Segment's ReentrantLock; if acquisition fails, try spinning, and after too many spins fall back to blocking acquisition, ensuring the lock is eventually obtained.
- Traverse the HashEntry list exactly as in HashMap: if the key and hash match an existing entry, replace its value directly; otherwise insert into the linked list, again just like HashMap.

The get flow

get is also very simple: hash the key to locate the Segment, then traverse the linked list to find the element. Note that value is volatile, so get needs no locking.
Version 1.8 discards segmented locking in favor of CAS + synchronized, renames HashEntry to Node, and adds the red-black tree implementation. The main thing to look at is the put flow.

The put flow

- First compute the hash and walk the node array; if the array is null, initialize it via CAS plus spinning.
- If the target array slot is empty, write the data with a CAS spin.
- If hash == MOVED, a resize is needed, so help carry out the resize.
- Otherwise, write the data under synchronized, again distinguishing linked list from red-black tree. List insertion works as in HashMap: an entry with the same key and hash is overwritten; otherwise the node is appended at the tail; and a list longer than 8 is converted to a red-black tree.

The get flow

get is simple: hash the key; if the bin head's key and hash match, return it; if the bin is a red-black tree, search the tree; otherwise traverse the linked list.
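For completeness, a minimal usage sketch: the payoff of all this is that per-key atomic operations such as merge need no external locking (the class and key names below are arbitrary):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Two threads bump a shared counter; merge is atomic per bin, so no lock is needed.
public class CounterDemo {
    public static void main(String[] args) throws InterruptedException {
        ConcurrentMap<String, Integer> counts = new ConcurrentHashMap<>();
        Runnable task = () -> {
            for (int i = 0; i < 1000; i++) {
                counts.merge("hits", 1, Integer::sum); // atomic read-modify-write
            }
        };
        Thread t1 = new Thread(task), t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join();  t2.join();
        System.out.println(counts.get("hits")); // prints 2000, never a lost update
    }
}
```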
Do you know how volatile works?

Compared with solving shared-memory problems via synchronized, volatile is a lighter-weight choice, with none of the extra overhead of context switching. A variable declared volatile guarantees that an updated value is immediately visible to other threads. volatile uses memory barriers to prevent instruction reordering, solving the memory-visibility problem.

We know that threads read shared variables from main memory into working memory, operate on them, and write the results back to main memory afterward — but this creates visibility problems. For example, suppose we have a dual-core CPU architecture with two cache levels, L1 and L2:

- Thread A reads variable X first. Since both cache levels start out empty, it reads X directly from main memory. Assume X's initial value is 0; thread A then changes X to 1 and writes it back to main memory. The caches and main memory now look like the figure below.
- Thread B also reads X. Since the L2 cache already holds X = 1, it reads directly from L2. Thread B then changes X to 2 and writes it back to both L2 and main memory. X's value is now as shown in the figure below.
- So if thread A wants X again, its L1 cache still holds X = 1: the update is invisible to A, and B's change to 2 goes completely unnoticed.

But if X is declared volatile, then when thread A reads X again, the cache coherence protocol forces thread A to reload the latest value from main memory into its working memory rather than simply using the cached value.
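The classic sketch of this scenario — the flag class below is illustrative. With volatile the reader is guaranteed to see the write; without it, the loop may spin forever on some JITs and platforms:

```java
// Without volatile on `stopped`, the reader thread may never observe the update;
// with it, the write is forced to be visible to the spinning reader.
public class VolatileFlag {
    volatile boolean stopped = false; // try removing volatile: the loop may never exit

    public static void main(String[] args) throws InterruptedException {
        VolatileFlag demo = new VolatileFlag();
        Thread reader = new Thread(() -> {
            while (!demo.stopped) { /* busy-wait until the write becomes visible */ }
            System.out.println("reader saw the update");
        });
        reader.start();
        Thread.sleep(100);   // give the reader time to start spinning
        demo.stopped = true; // volatile write: immediately visible to the reader
        reader.join();
    }
}
```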
Now let's talk about memory barriers. volatile causes different memory barriers to be inserted, guaranteeing that the visibility rules execute correctly. The barriers listed here follow the textbook description; in reality, because CPU architectures differ, reordering strategies differ and so do the barriers provided — on the x86 platform, for example, only the StoreLoad barrier exists.

- StoreStore barrier: ensures that ordinary writes before it are not reordered with the volatile write after it.
- StoreLoad barrier: ensures that a volatile write is not reordered with a volatile read or write that may follow it.
- LoadLoad barrier: forbids reordering a volatile read with the ordinary reads that follow it.
- LoadStore barrier: forbids reordering a volatile read with the ordinary writes that follow it.
So tell me about your understanding of the JMM memory model. Why do we need the JMM?

CPUs and memory developed at very different speeds — the CPU ended up far faster than memory — so CPUs gained caches, typically split into three levels: L1, L2, and L3. As the example above shows, caching creates cache-coherence problems, hence cache coherence protocols; it also creates memory-visibility problems; and reordering by the compiler and CPU creates atomicity and ordering problems. The JMM memory model is simply a set of specifications constraining multithreaded operation. Since it's impossible for everyone's code to accommodate every CPU, the JMM shields us from the memory-access differences between hardware platforms and operating systems, guaranteeing that Java programs see consistent memory-access behavior on every platform while still ensuring programs execute correctly under efficient concurrency.

Atomicity: the Java memory model guarantees atomic operations through read, load, assign, use, store, and write, plus lock and unlock, which correspond directly to the monitorenter and monitorexit bytecode instructions behind the synchronized keyword.

Visibility: visibility was covered in the answer above. Java guarantees visibility through volatile, synchronized, and final.

Ordering: ordering problems come from processor and compiler reordering; Java guarantees ordering through volatile and synchronized.
The happens-before rules

Although instruction reordering improves concurrent performance, the Java virtual machine imposes rules on it — the execution position of instructions can't be changed arbitrarily. The main rules are:

- Program order: within a single thread, each operation happens-before any subsequent operation in that thread.
- A volatile write happens-before subsequent reads of that variable.
- Unlocking a synchronized lock happens-before subsequently locking that same lock.
- A write to a final field happens-before reading a reference to the object that holds it, which in turn happens-before subsequent reads of that final field.
- Transitivity: if A happens-before B and B happens-before C, then A happens-before C.
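The volatile rule plus transitivity combine into the standard safe-publication idiom — a sketch (the class name is made up):

```java
// write(data) happens-before write(ready) [program order],
// write(ready) happens-before read(ready)  [volatile rule],
// read(ready)  happens-before read(data)   [program order],
// so by transitivity the reader is guaranteed to see data == 42.
public class HappensBefore {
    int data = 0;                   // plain field, safely published via the flag below
    volatile boolean ready = false;

    public static void main(String[] args) throws InterruptedException {
        HappensBefore hb = new HappensBefore();
        Thread reader = new Thread(() -> {
            while (!hb.ready) { }        // spin on the volatile read
            System.out.println(hb.data); // prints 42, never 0
        });
        reader.start();
        hb.data = 42;    // ordinary write, made visible by the volatile write below
        hb.ready = true; // volatile write
        reader.join();
    }
}
```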
You've been talking for a while — what exactly are working memory and main memory?

Main memory can be thought of as physical RAM; in the Java memory model it actually corresponds to part of the virtual machine's memory. Working memory corresponds to the CPU cache — it might be registers, or the L1/L2/L3 caches; any of these is possible.
Explain how ThreadLocal works?

ThreadLocal can be understood as a thread-local variable: it creates a copy in each thread, and each thread then works with its own internal copy. The threads are isolated from one another — compared with synchronized, the idea is to trade space for time.

ThreadLocal has a static inner class, ThreadLocalMap, which in turn holds an Entry array. Entry itself is a weak reference: its key is a weak reference to the ThreadLocal instance, and the Entry stores the key-value pair.

The point of the weak reference is to help prevent memory leaks. With a strong reference, the ThreadLocal object could never be collected until the thread ended; a weak reference is collected at the next GC.

But a memory leak is still possible: after the key — the ThreadLocal object — is collected, the Entry still exists with a null key and a live value. That Entry can never be accessed again, yet it won't be freed unless the thread finishes running.

As long as ThreadLocal is used properly, though — calling the remove method after use to delete the Entry — this won't actually happen.
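A minimal sketch of the per-thread isolation plus the remove() hygiene described above (names are illustrative):

```java
// Each thread sees its own copy; remove() clears the Entry to avoid the
// leak described above — this matters most for long-lived pool threads.
public class ThreadLocalDemo {
    private static final ThreadLocal<Integer> LOCAL = ThreadLocal.withInitial(() -> 0);

    public static void main(String[] args) throws InterruptedException {
        LOCAL.set(1); // main thread's copy
        Thread other = new Thread(() -> {
            // this thread's copy is independent: it still holds the initial value
            System.out.println("other thread: " + LOCAL.get()); // prints "other thread: 0"
            LOCAL.remove(); // clean up before the (hypothetically pooled) thread is reused
        });
        other.start();
        other.join();
        System.out.println("main thread: " + LOCAL.get()); // prints "main thread: 1"
        LOCAL.remove();
    }
}
```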
What reference types are there? What are the differences?

There are four reference types: strong, soft, weak, and phantom:

- A strong reference is the ordinary assignment we write in code, such as A a = new A(). An object with a live strong reference will never be reclaimed by GC.
- A soft reference, described by SoftReference, refers to objects that are useful but not essential. The system reclaims such objects before a memory overflow would otherwise occur.
- A weak reference, described by WeakReference, is a notch weaker than a soft reference: a weakly referenced object is reclaimed at the next GC, regardless of whether memory is sufficient.
- A virtual reference, also called a phantom reference, is the weakest relationship of all, described by PhantomReference. It must be used together with a ReferenceQueue; likewise, when GC runs, phantom-referenced objects are reclaimed. Phantom references can be used to manage off-heap memory.
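A small best-effort sketch with WeakReference — note that System.gc() is only a hint to the JVM, so the post-GC outcome is typical but not guaranteed:

```java
import java.lang.ref.WeakReference;

// A weakly referenced object becomes eligible for collection as soon as
// no strong reference to it remains.
public class ReferenceDemo {
    public static void main(String[] args) {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        System.out.println(weak.get() != null); // prints true: strong ref still alive

        strong = null; // drop the strong reference
        System.gc();   // hint only; a weakly reachable object is cleared on the next GC
        System.out.println(weak.get()); // usually null once the GC has actually run
    }
}
```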
Do you know how thread pools work?

First, a thread pool has several core parameters:

- maximumPoolSize: the maximum number of threads
- corePoolSize: the number of core threads
- keepAliveTime: the idle keep-alive time
- workQueue: the blocking queue
- RejectedExecutionHandler: the rejection policy
When a new task is submitted to the thread pool, the execution flow is as follows:

- When we submit tasks, the pool creates threads up to corePoolSize to execute them.
- When the number of tasks exceeds corePoolSize, subsequent tasks enter the blocking queue.
- When the blocking queue is full, the pool goes on to create up to (maximumPoolSize - corePoolSize) additional threads to execute tasks. Once the work is done, those extra threads wait for keepAliveTime and are then destroyed.
- If maximumPoolSize has been reached and the blocking queue is still full, tasks are handled according to the configured rejection policy.
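The parameters and flow above, wired directly into a ThreadPoolExecutor — the sizes here are arbitrary, purely for illustration:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolDemo {
    public static void main(String[] args) throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2,                                   // corePoolSize
                4,                                   // maximumPoolSize
                60, TimeUnit.SECONDS,                // keepAliveTime for extra threads
                new ArrayBlockingQueue<>(10),        // workQueue: holds overflow tasks
                new ThreadPoolExecutor.AbortPolicy() // rejection policy (the default)
        );
        for (int i = 0; i < 8; i++) {
            int id = i;
            pool.execute(() ->
                    System.out.println("task " + id + " on " + Thread.currentThread().getName()));
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

With these numbers, the first 2 tasks get core threads, the next 10 would queue, and only then would extra threads be created up to 4; an 15th concurrent task would be rejected by AbortPolicy.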
What rejection policies are there?

There are four main rejection policies:

- AbortPolicy: discard the task and throw an exception — this is the default policy.
- CallerRunsPolicy: run the task on the caller's own thread.
- DiscardOldestPolicy: discard the oldest task at the head of the waiting queue, then try to execute the current task.
- DiscardPolicy: discard the task silently, without throwing an exception.
- END -