# First, a bowl of chicken soup

When we are in the minority, we must have the courage to be ourselves; when we are in the majority, we must have the open mind to tolerate others.

-- Ralph W. Sockman
Today I summarize and organize what I know about operating system memory management.
1. What is physical memory?
2. What are the disadvantages of using physical memory directly?
3. What is virtual memory?
4. How is virtual memory mapped to physical memory?
5. What is paged memory management?
6. What is a page fault?
7. What are the page replacement algorithms?
8. What is segmented memory management?
## What is physical memory?
Physical memory size usually refers to the size of the installed memory modules. When you buy a computer you see how much RAM it ships with, but can all of it be used? Not necessarily: it also depends on the width of the CPU's address bus. If the address bus is only 20 bits wide, the addressable space is just 1 MB; even if you installed 100 GB of RAM, it would make no difference, because the CPU can only see 1 MB of physical memory.
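As a quick check of the point above, here is a minimal Python sketch (not from the original article) computing the addressable range from the address bus width:

```python
# The number of addressable bytes is 2**bus_width, regardless of how
# much RAM is physically installed.
def addressable_bytes(bus_width_bits: int) -> int:
    return 2 ** bus_width_bits

print(addressable_bytes(20))            # 20-bit bus -> 1048576 bytes (1 MB)
print(addressable_bytes(32) // 2**30)   # 32-bit bus -> 4 (GB)
```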
## What are the disadvantages of using physical memory directly?
If every program accesses physical memory directly, there are two situations:
1. Only one process is running in the system: if the user program can manipulate any address in the physical address space, it can easily corrupt the operating system by accident, and all kinds of strange problems appear in the system;
2. Multiple processes run at the same time: ideally, as shown in the figure, process A and process B would each occupy their own part of physical memory without interfering with each other. But that is only the ideal; who can guarantee a program has no bugs? While process B runs normally in the background, a programmer debugging process A may accidentally write to the physical memory process B is using, causing B to misbehave. If both programs operate on the same address, the first writes one value and the second overwrites it with a different one, and both programs break. So using physical memory directly means no process's safety can be guaranteed.
How do we solve these problems? By creating a new abstraction for memory: the address space. An address space is the set of addresses a process can use to address memory; it gives each program its own abstract memory. Each process has its own address space, independent of every other process's.
How do we give each program a separate address space?

The simplest way is to map each process's address space onto a different part of physical memory, which guarantees that different processes use disjoint address ranges.

Give each process a base address A and a bound (length) B. If the process uses address x, the corresponding physical address is A + x, and we must guarantee x < B; an access beyond the bound raises an error and is aborted. For this purpose the CPU provides two special hardware registers, the base register and the bound register. When a process runs, its starting physical address and its length are loaded into the base and bound registers; on every memory access, the address is checked against the bound and the contents of the base register are added before it is sent to memory.

Drawback: every memory access requires a comparison and an addition. The comparison is fast, but the addition is slow without special circuitry because of carry propagation.
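The base-and-bound scheme above can be sketched in a few lines of Python. This is a toy model, not real hardware: `base` and `bound` stand in for the two registers, and the bound is treated as the process's length:

```python
# Minimal sketch of base-and-bound relocation: every virtual address x
# is checked against the bound and then offset by the base register.
def translate(x: int, base: int, bound: int) -> int:
    if x >= bound:                 # address outside the process's space
        raise MemoryError("out-of-bounds access")
    return base + x                # hardware adds the base on every access

print(hex(translate(0x10, base=0x4000, bound=0x1000)))  # -> 0x4010
```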
Besides, each process occupies some physical memory. If physical memory were large enough, many processes could run at the same time; in reality its size is limited and it may run out. What then?

Method 1: if a single program is too big to fit in memory, use manual overlay technology, keeping only the currently needed instructions and data in memory.

Method 2: if there are too many programs and together they exceed memory capacity, use automatic swapping, moving programs that do not need to run right now out to external storage.
With overlays, the program is divided, following its own logical structure, into several functionally independent modules; modules that never execute at the same time can share the same memory area and run in turn:
- Code and data needed by common functions stay resident in memory;
- Infrequently used functions are split into independent modules, normally kept in external storage and loaded into memory only when needed;
- Modules that have no calling relationship do not need to be in memory together; they can share one memory area, loaded when needed and swapped out to external storage when not.
When multiple programs run at once, programs that temporarily cannot run can be moved to external storage to free up memory: the operating system swaps out the entire address space of one process to external storage, then swaps in the entire address space of another process from external storage into memory. The unit swapped in and out is the whole address space of a program.
Swapping has several implementation difficulties:
- Deciding when to swap: a simple policy is to swap some programs out whenever memory runs short;
- The swap area must be large enough: when multiple programs run, the swap area (external storage) must be big enough to hold the address space of every program that needs it;
- How a program swaps back in: when a program is swapped out and later swapped back in, it may not land at the same memory location as before, which requires a dynamic address relocation mechanism.
Comparing overlays with swapping:

Overlays can only happen between program modules that do not call each other, so the programmer must specify the logical overlay structure between the modules.

Swapping works on whole programs in memory and does not require the programmer to specify any overlay structure between modules.

In general: overlays happen inside one program, swapping happens between programs.
But both techniques have shortcomings:
- Overlays: the programmer must split the whole program into small functional modules and work out the overlay relationships between them, which burdens programmers, and few programmers are good at it;
- Swapping: the unit of exchange is the process, so the entire address space must be swapped in and out, which increases processor overhead and requires large enough external storage.
Is there a better way to solve these problems? The answer is virtual memory technology.
## What is virtual memory?
The basic idea of virtual memory is to give each program its own address space. This address space is divided into blocks, each a contiguous range of addresses; physical memory is divided into blocks of the same size. The operating system automatically maps the virtual address space onto the physical address space. The program deals only with virtual memory: it requests virtual memory, while what is actually used underneath is physical memory.

Virtual memory has the effect of overlays: it does not keep everything in memory, so it can run programs larger than the currently free memory. It is better than overlays because the whole process is handled automatically by the operating system, with no programmer intervention.

Virtual memory has the effect of swapping: it exchanges data between memory and external storage to obtain more free memory. It is better than swapping because it exchanges only parts of a process between memory and external storage.
How virtual memory is implemented:

Virtual memory is generally built on top of paged memory management (introduced below):
- When a program is loaded, it does not have to be loaded into memory in full; loading just the pages needed to start executing is enough to start the program;
- During execution, if an instruction to execute or data to access is not yet in memory (this is called a page fault), the processor notifies the operating system to bring the corresponding page into memory, and execution then continues;
- Meanwhile, the operating system saves pages in memory that are temporarily unused out to external storage, freeing more space for programs to be loaded and pages to be brought in.
Characteristics of virtual memory:
- Large user space: by combining physical memory with external storage, the virtual space offered to users is usually larger than the actual physical memory, decoupling the two. For example, a 32-bit virtual address can in theory address 4 GB even though there may be only 256 MB of physical memory, as long as the disk makes up the difference;
- Partial exchange: unlike swapping, virtual memory transfers only parts of the virtual address space in and out;
- Continuity: a program can use a series of contiguous virtual addresses to access a large buffer, even if that buffer is not contiguous in physical memory;
- Safety: the virtual addresses used by different processes are isolated from each other; code in one process cannot modify physical memory being used by another process or by the operating system.
## How is virtual memory mapped to physical memory?
As shown in the figure, the CPU contains a Memory Management Unit (MMU). Virtual addresses are not sent directly onto the memory bus; they go first to the MMU, which maps each virtual address to a physical address. The program only needs to deal with virtual memory; the mapping logic is handled automatically.

How does the operating system keep track of which memory is occupied and which is free?
## Paged memory management
Divide the virtual address space into fixed-size blocks, and divide the physical address space into blocks of the same fixed size. The blocks of the virtual address space are called pages; the blocks of the physical address space are called page frames.

This raises a question: what page size is appropriate? Pages that are too large waste space: if a program uses only 1 byte but is given a 10 MB page, the waste is enormous. Pages that are too small make the page table (introduced below) take up too much space. So the page size is a trade-off, and most systems today use 4 KB pages.
The discussion above of mapping virtual memory to physical memory only introduced the MMU without describing how it works. The MMU converts virtual addresses to physical addresses through a page table. A 32-bit virtual address is split into two parts, a virtual page number and an offset; the MMU looks up the physical page number corresponding to the virtual page number in the page table, and physical page number plus offset gives the actual physical address.

See the figure for details:

The figure only shows the general function of a page table; the actual structure of a page table is more complicated and is introduced in detail below.
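The page-number/offset split described above can be sketched in Python. The page table here is a hypothetical dict mapping one made-up virtual page number to a made-up frame number, just for illustration:

```python
PAGE_SIZE = 4096                  # 4 KB pages, as in the text
OFFSET_BITS = 12                  # log2(4096)

def split(vaddr: int):
    """Split a 32-bit virtual address into (virtual page number, offset)."""
    return vaddr >> OFFSET_BITS, vaddr & (PAGE_SIZE - 1)

# Hypothetical single-level page table: virtual page number -> frame number
page_table = {0x12345: 0x00042}

def translate(vaddr: int) -> int:
    vpn, offset = split(vaddr)
    frame = page_table[vpn]       # a miss here would be a page fault
    return (frame << OFFSET_BITS) | offset

print(hex(translate(0x12345678)))  # -> 0x42678
```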
The purpose of a page table is to map virtual pages to page frames in physical memory. A page table can be understood as a mathematical function whose input is a virtual page number and whose output is a physical page number; applying the function maps a virtual page to a physical page number and hence determines the physical address. Page table structures differ between machines, but a page table entry generally contains:
- Page frame number: the most important field; the main purpose of the page table is to find the physical page number;
- Present/valid bit: 1 means the entry is valid; 0 means the virtual page for this entry is not currently in memory, and accessing it triggers a page fault, after which the OS finds an available page frame in physical memory and fills it back into the page table;
- Protection bits: indicate what kinds of access the page allows: read, write, or execute;
- Modified (dirty) bit: reflects the state of the page and is useful when the operating system reclaims page frames. The hardware sets this bit automatically when the page is written. When a frame is reclaimed, a modified (dirty) page must be written back to disk, while an unmodified page is clean, its copy on disk is still valid, and it can simply be discarded;
- Referenced bit: mainly helps the operating system choose a victim page on a page fault; pages no longer in use are obviously better candidates for eviction than pages in active use. This bit plays an important role in page replacement algorithms;
- Cache-disable bit: prevents the page from being cached.
How can address mapping be sped up?

Every memory access requires mapping a virtual address to a physical address, and every mapping requires an access to the page table. All instructions are fetched from memory, and many also access operands in memory, so executing one instruction performs several page table lookups. For programs to run fast, instructions must execute in a very short time, and page table lookups must not become the bottleneck of instruction execution, so the lookup must be made faster.

How? Provide a cache for the page table: mapping through the cache is faster than walking the page table. The cache is a small hardware device called the Translation Lookaside Buffer (TLB). Whenever the MMU translates a virtual address, it searches the TLB first. If a valid entry is found, the physical frame is returned directly; if not, a normal page table walk is performed, and when the entry is found in the page table, the TLB is updated: one TLB entry is evicted and replaced with the newly found one. The next time the same page is accessed, it hits in the TLB and the physical address is found quickly, with no need to touch the page table.
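A toy model of the TLB-then-page-table lookup might look like this. The TLB size, the FIFO-style eviction, and the made-up page table contents are all assumptions for illustration (real TLBs use hardware associative lookup):

```python
# Toy TLB in front of a page table: check the small cache first,
# fall back to the page table on a miss and install the entry.
from collections import OrderedDict

TLB_SIZE = 4
tlb = OrderedDict()                                  # vpn -> frame, oldest first
page_table = {vpn: vpn + 100 for vpn in range(64)}   # made-up mapping

def lookup(vpn: int) -> int:
    if vpn in tlb:                    # TLB hit: no page-table walk needed
        return tlb[vpn]
    frame = page_table[vpn]           # TLB miss: walk the page table
    if len(tlb) >= TLB_SIZE:
        tlb.popitem(last=False)       # evict the oldest TLB entry
    tlb[vpn] = frame                  # install the new translation
    return frame
```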
This works because the TLB relies on the principle of locality: during a short period of program execution, the addresses of the instructions executed and the data accessed are usually confined to a limited region. Locality comes in two forms:
- Temporal locality: one execution of an instruction and the next, or one access to a datum and the next, are concentrated within a short period of time;
- Spatial locality: the current instruction and its neighboring instructions, or the currently accessed data and nearby data, are concentrated in a small region.

For a detailed introduction to the principle of locality and its applications, see my earlier article.
The TLB speeds up virtual-to-physical translation, but one more problem remains. Today's operating systems are 64-bit with very large virtual address spaces; a large virtual address space means a large page table, and with one page table per process, much of the machine's memory would be spent storing page tables. Is there a better way to handle large page tables? The answer is the multi-level page table.

Tip: why is the page table large? In a 32-bit environment the virtual address space is 4 GB and a page is 4 KB, so the page table needs about one million (2^20) entries. Each entry takes 4 bytes, so the whole page table needs 4 MB of memory. Since every process has its own page table, with many processes this becomes a disaster.
As shown in the figure, take a two-level page table for a 32-bit virtual address as an example. The 32-bit virtual address is divided into a 10-bit PT1 field, a 10-bit PT2 field, and a 12-bit offset field. When a virtual address is sent to the MMU, the MMU first extracts the PT1 field and uses its value as an index into the first-level page table, then extracts the PT2 field and uses its value as an index into the second-level page table, and finally applies the offset to form the physical address.

For a 32-bit virtual address space, with 4 KB pages and 4-byte page table entries:

- Single-level page table: a process needs 1M entries (4 GB / 4 KB = 1M = 2^20 entries), so the page table (one per process) occupies 4 MB (1M * 4 B) of memory.

- Two-level page table: each first-level entry maps 4 MB (2^22 bytes) and each second-level entry maps 4 KB. We need 1K first-level entries (4 GB / 4 MB = 1K = 2^10), and each first-level entry corresponds to a second-level table of 1K entries (4 MB / 4 KB = 1K). Fully populated, this takes 4.004 MB (1K * 4 B + 1K * 1K * 4 B) of memory.
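The 10/10/12 split and the two-level walk can be sketched as follows. The table contents (PT1 index 1 pointing at a single level-2 table) are made up for illustration:

```python
# Sketch of the 10/10/12 address split described above.
def split_fields(vaddr: int):
    pt1 = (vaddr >> 22) & 0x3FF     # top 10 bits: index into level-1 table
    pt2 = (vaddr >> 12) & 0x3FF     # next 10 bits: index into level-2 table
    offset = vaddr & 0xFFF          # low 12 bits: offset within the page
    return pt1, pt2, offset

# Two-level page table as nested dicts; only populated level-2 tables
# exist, which is exactly where the memory saving comes from.
level1 = {1: {3: 0x2A}}             # PT1=1 -> a level-2 table; PT2=3 -> frame 0x2A

def translate(vaddr: int) -> int:
    pt1, pt2, offset = split_fields(vaddr)
    frame = level1[pt1][pt2]        # a KeyError here would mean a page fault
    return (frame << 12) | offset

print(hex(translate(0x00403AB4)))   # frame 0x2A -> 0x2aab4
```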
The two-level page table seems to take more space, so why do we say multi-level page tables save memory?

Every process has a 4 GB virtual address space, but obviously most programs use far less than 4 GB. Why map space that is never used?

In other words, the first-level page table covers the entire 4 GB virtual address space, but if a first-level entry is unused, there is no need to create the second-level table it points to; second-level tables can be created on demand. A quick calculation: assuming only 20% of the first-level entries are used, the page tables occupy only about 0.804 MB (1K * 4 B + 0.2 * 1K * 1K * 4 B). Compared with the 4 MB of a single-level page table, isn't that a huge saving?

Then why can't a single-level page table save memory the same way? By its nature, the page table in main memory is responsible for translating virtual addresses to physical addresses; if a virtual address has no corresponding entry, the system cannot work. So the page table must cover the whole virtual address space: an unlevelled page table needs all 1M entries up front, while a two-level scheme needs only the 1K first-level entries (the first level covers the whole virtual address space, and second-level tables are created only when needed).

Second-level page tables need not even be in memory. In effect, each second-level table is treated like a page: load it from disk when it is needed, and page it out to disk when memory is full, exploiting locality once again. Virtual memory addresses exhibit locality, so of course the page table entries that map them exhibit locality too. So consider the second level: by the principle of locality, of the 1024 second-level tables, only a small fraction is in use at any moment. Why not keep second-level tables on disk and bring them into memory on demand?

Taking this to the extreme, with only the first-level table and a single second-level table in memory and the rest on disk (however inefficient that would be), the page tables occupy just 8 KB (1K * 4 B + 1 * 1K * 4 B). Compared with the earlier 0.804 MB, the footprint shrinks many times over! (This follows the answer in the link below.)
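The arithmetic above can be checked with a few lines of Python, under the same assumptions as the text (32-bit addresses, 4 KB pages, 4-byte entries, 20% of first-level entries in use):

```python
# Reproducing the page table size arithmetic from the text.
ENTRY = 4                                   # bytes per page table entry
L1 = 2**10                                  # 1K first-level entries
L2 = 2**10                                  # 1K entries per second-level table

single_level = 2**20 * ENTRY                # flat table: 1M entries
full_two_level = L1 * ENTRY + L1 * L2 * ENTRY
sparse = L1 * ENTRY + 0.2 * L1 * L2 * ENTRY  # only 20% of L1 entries used

print(single_level // 2**20)                # -> 4 (MB)
print(round(full_two_level / 2**20, 3))     # -> 4.004 (MB)
print(round(sparse / 2**20, 3))             # -> 0.804 (MB)
```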
## What is a page fault?

A page fault means the page being accessed is not in main memory, and the operating system must load it into main memory before the access can proceed. Execution of the current instruction is suspended, a page-not-present exception is raised, and the corresponding exception handler brings the chosen page into memory; once it is in memory, the faulting instruction can continue to execute.
Page fault handling proceeds as follows:
1. If memory has a free physical page frame, allocate a frame r and go to step 4; otherwise go to step 2;
2. Using a page replacement algorithm, select a physical page frame r to be replaced, whose corresponding logical page is q; if that page was modified while in memory, write it back to external storage;
3. Modify q's page table entry, clearing its present bit to 0;
4. Load the page p being accessed into physical frame r;
5. Modify p's page table entry: set the present bit to 1 and the physical frame number to r;
6. Re-execute the interrupted instruction.
## What are the page replacement algorithms?

When a page fault occurs and a new page must be loaded into memory that is already full, choosing which physical page to replace is a science in itself. This gives rise to a variety of page replacement algorithms, all committed to minimizing the number of pages swapped in and out (the number of page faults). They try to replace pages that will not be used again, or will be used least, in the near future, usually predicting from past statistics under the guidance of the principle of locality.
Optimal page replacement: on a page fault, for every logical page in memory, compute how long it will be until its next access, and replace the page with the longest wait. Note that this is only an ideal; it cannot be implemented in a real system, because the operating system cannot predict the future and does not know when each page will next be accessed. The algorithm serves as a baseline for evaluating other algorithms (run a program once on a simulator, recording every page access, and the optimal algorithm can be applied on a second run).
FIFO: evict the page that entered memory first. The algorithm is very simple, so no more needs to be said.

Least recently used, the legendary LRU algorithm: on a page fault, choose the page that has gone unused for the longest time. The algorithm gives each page a field recording the time T since its last access; when a page must be evicted, it selects the resident page with the largest T.
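A minimal LRU sketch in Python, using an `OrderedDict` as the recency list (the reference string and frame count are made up for illustration):

```python
# Minimal LRU page replacement: most recently used pages live at the
# end of the OrderedDict, the victim is popped from the front.
from collections import OrderedDict

def lru_faults(refs, frames):
    mem = OrderedDict()
    faults = 0
    for page in refs:
        if page in mem:
            mem.move_to_end(page)          # refresh recency on a hit
        else:
            faults += 1                    # page fault
            if len(mem) >= frames:
                mem.popitem(last=False)    # evict least recently used
            mem[page] = True
    return faults

print(lru_faults([1, 2, 3, 1, 4, 5], frames=3))  # -> 5
```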
Second-chance page replacement: an improved FIFO, changed only slightly, because plain FIFO may evict frequently used pages. This method gives such pages one more chance by means of a reference bit R per page. Each time the oldest page is about to be evicted, check its R bit: if R is 0, the page is old and has not been used again, so evict it directly; if R is 1, the page has been accessed recently, so clear R to 0 and move the page to the tail of the list as if it had just arrived, then continue evicting the oldest page in the same way.
Clock page replacement: an upgrade of second chance. Although second chance is a reasonable algorithm, it constantly moves pages around a linked list, which is inefficient. The clock algorithm, shown in the figure, keeps all pages in a circular, clock-like list with a hand pointing at the oldest page. On a page fault, the algorithm first examines the page the hand points to: if its R bit is 0, evict that page, insert the new page in its place, and advance the hand; if R is 1, clear R to 0 and advance the hand, repeating until a page with R equal to 0 is found and evicted.
Clock page replacement algorithm
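The clock algorithm described above can be sketched like this; the ring is a fixed-size array, `access` returns whether the access faulted, and the choice to set R to 1 on a newly loaded page is an assumption of this sketch:

```python
# Sketch of the clock algorithm: pages sit in a fixed ring; the hand
# skips pages with R=1 (clearing R) and evicts the first page with R=0.
class Clock:
    def __init__(self, nframes):
        self.frames = [None] * nframes     # page stored in each frame
        self.ref = [0] * nframes           # reference (R) bit per frame
        self.hand = 0
        self.where = {}                    # page -> frame index

    def access(self, page):
        if page in self.where:             # hit: set the R bit
            self.ref[self.where[page]] = 1
            return False
        while True:                        # fault: advance the hand
            if self.ref[self.hand] == 0:
                victim = self.frames[self.hand]
                if victim is not None:
                    del self.where[victim]  # evict the old page
                self.frames[self.hand] = page
                self.ref[self.hand] = 1
                self.where[page] = self.hand
                self.hand = (self.hand + 1) % len(self.frames)
                return True
            self.ref[self.hand] = 0        # second chance: clear R, move on
            self.hand = (self.hand + 1) % len(self.frames)
```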
Least frequently used (LFU): on a page fault, evict the page with the fewest accesses. Give each page a counter; on each access, increment that page's counter by 1; at eviction time, select the page with the smallest counter value.

There is a problem here: a page that is accessed heavily at the start but never again may never be evicted, even though it should be. What to do? Periodically decay every page's count; a common method is to shift each page's counter right by one bit periodically.

Tip: the difference between least frequently used (LFU) and least recently used (LRU): LRU looks at the time since the last access (the shorter the better), while LFU looks at the number or frequency of accesses (the more the better).
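A tiny sketch of LFU with the right-shift aging mentioned above; the counter values are made up:

```python
# LFU sketch with periodic "aging": right-shifting the counters keeps
# pages that were hot long ago from staying unevictable forever.
def lfu_victim(counters):
    """Pick the page with the smallest access counter."""
    return min(counters, key=counters.get)

def age(counters):
    """Periodically halve every counter by shifting right one bit."""
    for page in counters:
        counters[page] >>= 1

counters = {"a": 9, "b": 2, "c": 5}
print(lfu_victim(counters))   # -> b (fewest accesses)
age(counters)                 # counters become a: 4, b: 1, c: 2
```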
Working set page replacement

Before introducing the algorithm, let's first introduce what a working set is.

A working set is the set of pages a process is currently using; it can be expressed as a two-argument function W(t, s):
- t is the current execution time; s is the working set window, a fixed length of time;
- W(t, s) is the set of all pages accessed during the window of length s ending at the current moment t.
The working set changes over time, as shown in the figure:
- After a process starts executing, a fairly stable working set is gradually established as new pages are accessed;
- When the locality of memory accesses is roughly stable (the same pages are accessed, with little change), the working set size is also roughly stable;
- When the locality shifts (the process finishes one piece of work and starts the next), the working set quickly grows and shrinks as it transitions to the next stable value.
Working set replacement evicts the pages that are no longer in the working set. A sample trace (window s = 4) is shown in the figure:
- Time 0, access e: page fault, load e
- Time 1, access d: page fault, load d
- Time 2, access a: page fault, load a
- Time 3, access c: page fault, load c
- Time 4, access c: hit, window [1-4], evict e
- Time 5, access d: hit, window [2-5]
- Time 6, access b: page fault, window [3-6], evict a, load b
- Time 7, access c: hit, window [4-7]
- Time 8, access e: page fault, window [5-8], load e
- Time 9, access c: hit, window [6-9], evict d
- Time 10, access e: hit, window [7-10], evict b
- Time 11, access a: page fault, window [8-11], load a
- Time 12, access d: page fault, window [9-12], load d
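The trace above can be reproduced with a short simulation. Here the working set is approximated as the pages seen in the last s references, with s = 4 to match the sliding windows in the trace:

```python
# Working-set replacement over the reference string above: a page stays
# resident only while it appears in the last s references.
def working_set_trace(refs, s):
    resident = set()
    faults = []
    for t, page in enumerate(refs):
        window = set(refs[max(0, t - s + 1): t + 1])
        fault = page not in resident
        if fault:
            resident.add(page)        # page fault: load the page
        resident &= window            # evict pages outside the working set
        faults.append(fault)
    return faults

refs = ["e", "d", "a", "c", "c", "d", "b", "c", "e", "c", "e", "a", "d"]
print(sum(working_set_trace(refs, 4)))   # -> 8 page faults
```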
Working set clock (WSClock) page replacement

In the working set algorithm, a page fault requires scanning the entire page table to determine each page's state before a victim can be chosen, which is time-consuming. Hence the working set clock algorithm. Just as the clock algorithm improved FIFO, working set replacement plus the clock structure gives the WSClock algorithm, which avoids scanning the whole page table on every page fault.
## What is segmented memory management?

The segmentation we most often see is probably Linux executable code segments, data segments and so on. The best way to understand segmentation is through its history. Segmentation originated with the 8086 CPU. At that time, programs accessed memory by giving the physical address of the target unit directly. To let multiple programs run concurrently, individual programs had to be relocatable; without relocation, addresses are hard-coded and a program can only be loaded into one fixed range of physical memory. With the segmentation mechanism, a program only uses addresses relative to a segment, and changing the segment's base address relocates the whole program. Moreover, the 8086's address bus was 20 bits wide, giving a 1 MB addressing range, but its registers were all 16 bits: a single 16-bit register cannot address 1 MB of memory. So segments and segment registers were introduced: shifting a segment register left by 4 bits and adding the offset produces a 20-bit address, achieving the 1 MB addressing range.

At today's level of technology there is no longer any need for this shift-and-offset trick to stretch addresses; segmentation is more of a historical burden with little practical effect. The code segments and data segments we see in executables today exist mostly to give programs a clearer, more orderly structure. Linux in fact makes almost no use of segmentation, relying on paging alone, which is simpler; today's segmentation mainly serves to make the logic clearer. A company divides itself into many departments for ease of management; segmentation follows similar logic: little physical meaning, but clearer organization.
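The 8086 address formation described above is easy to sketch (the reset-vector example is a well-known real-mode address):

```python
# Real-mode 8086 address formation: physical = segment * 16 + offset,
# producing 20-bit addresses from two 16-bit registers.
def real_mode_addr(segment: int, offset: int) -> int:
    return ((segment << 4) + offset) & 0xFFFFF   # wrap at the 1 MB boundary

print(hex(real_mode_addr(0xF000, 0xFFF0)))  # 8086 reset vector -> 0xffff0
```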
That's it for operating system memory management. I hope it helps!
References:
- https://yuerer.com/ Operating system - Virtual storage page replacement algorithm /
- Modern Operating Systems
- Tsinghua University operating system course videos (Bilibili)
- Harbin Institute of Technology operating system course videos (Bilibili)

Some of the images in this article come from the Internet; if there is any infringement, please contact me for removal.
Welcome to follow my official account, "Five minutes to learn the algorithm".
This article was written by [Senior brother Wu, programmer]; please include a link to the original when reprinting. Thanks!