编程知识 cdmana.com

Comprehensive analysis of advanced core knowledge in Java -- redis

One 、5 Basic data structures

1.Redis brief introduction

Redis is an open source (BSD licensed), in-memory data structure store, used as a database, cache and message broker.” —— Redis It's an open source (BSD The license ) In memory data structure storage , Use as database , Caching and message broker . ( From website )

Redis It's open source , Advanced key value storage and a suitable solution , For building high performance , Extensible Web Applications .Redis Also called data structure server by the author , This means that the user can use some commands , Based on with TCP A simple server for sockets - client Protocol to access a group of Variable data structures .( stay Redis Key value pairs are used in , It's just that the corresponding data structure is different )

1)Redis The advantages of

Here are Redis Some of the advantages of :

  • Very fast - Redis Very fast , It can execute about per second 110000 Secondary settings (SET) operation , About executable per second 81000 Second reading / obtain (GET) operation .
  • Support rich data types - Redis Supports most of the data types commonly used by developers , For example, a list of , aggregate , Sort sets and hashes, etc . This makes Redis It's easy to use to solve all kinds of problems , Because we know which problems can be better used and which data types to deal with .
  • Atomic operation - all Redis Operations are all atomic operations , This ensures that if two clients access concurrently ,Redis The server can receive the updated value .
  • Multi utility - Redis It's a multi utility , It can be used in a variety of use cases , Such as : cache , Message queue (Redis Local support releases / subscribe ), Any short-term data in the application , for example ,web Sessions in the application , Page hit count, etc .

2)Redis Installation

This step is relatively simple , You can find many satisfactory tutorials on the Internet , No more details here .

For a rookie tutorial installation tutorial as a reference :https://www.runoob.com/redis/redis-install.html

3) Test local Redis performance

When you're done installing , You can do it first redis-server Give Way Redis Start it up , And then run the command redis- benchmark -n 100000 -q To detect local simultaneous execution 10 Performance at 10000 requests :

Of course, there will be performance gaps between different computers due to various reasons , You can think of this test as a kind of 「 fun 」 Just fine .

2.Redis Five basic data structures

Redis Yes 5 Basic data structures , They are :string( character string )list( list )hash( Dictionaries )set( aggregate ) and zset( Ordered set ). this 5 Species is Redis The most basic of relevant knowledge 、 The most important part , Now we combine the source code and some practice to explain to you respectively .

Be careful :

The underlying data structure has its own encoding , And it's multiple implementations , such Redis Will choose the right internal code in the right scene .
You can see that each data structure has more than two internal coding implementations , for example string The data structure contains raw、int and embstr Three internal codes .
meanwhile , Some internal codes can be used as internal implementations of multiple external data structures , for example ziplist Namely hash、list and zset Common internal code .

1) character string string

Redis The string in is a Dynamic string , This means that users can modify , Its underlying implementation is a bit like Java Medium ArrayList, There's an array of characters , From source sds.h/sdshdr file Can be seen in Redis The underlying definition of strings SDS, namely Simple Dynamic String structure :

/* Note: sdshdr5 is never used, we just access the flags byte directly. 
	* However is here to document the layout of type 5 SDS strings. */ 
struct __attribute__ ((__packed__)) sdshdr5 {
   
    
	unsigned char flags; /* 3 lsb of type, and 5 msb of string length */ 
	char buf[]; 
};
struct __attribute__ ((__packed__)) sdshdr8 {
   
    
	uint8_t len; /* used */ 
	uint8_t alloc; /* excluding the header and null terminator */ 
	unsigned char flags; /* 3 lsb of type, 5 unused bits */ 
	char buf[]; 
};
struct __attribute__ ((__packed__)) sdshdr16 {
   
    
	uint16_t len; /* used */ 
	uint16_t alloc; /* excluding the header and null terminator */ 
	unsigned char flags; /* 3 lsb of type, 5 unused bits */ 
	char buf[]; 
};
struct __attribute__ ((__packed__)) sdshdr32 {
   
    
	uint32_t len; /* used */ 
	uint32_t alloc; /* excluding the header and null terminator */ 
	unsigned char flags; /* 3 lsb of type, 5 unused bits */ 
	char buf[]; 
};
struct __attribute__ ((__packed__)) sdshdr64 {
   
    
	uint64_t len; /* used */ 
	uint64_t alloc; /* excluding the header and null terminator */ 
	unsigned char flags; /* 3 lsb of type, 5 unused bits */ 
	char buf[]; 
};

You'll find the same set of structures Redis Defined many times with generics , Why not use it directly int Type?

Because when the string is short ,len and alloc have access to byte and short To express ,Redis For the ultimate optimization of memory , Strings of different lengths are represented by different structs .

①、SDS And C The difference between strings

Why not consider direct use of C Language strings ? because C Language is a simple string representation Do not conform to the Redis On strings in security 、 Efficiency and functional requirements . We know ,C The language uses a length of N+1 To represent the length of N String , And the last element of the character array is always '\0' .( The picture below shows C The median value in language is “Redis” One An array of characters )

Such a simple data structure may cause the following problems :

  • Get the string length as O(N) Level of operation → because C Do not save the length of the array , You need to traverse the entire array every time ;
  • Can't put an end to out of buffer / Memory leak The problem of → The reason is the same as above , If splicing is performed or Shorten string operation , If the operation is improper, it is easy to cause the above problems ;
  • C character string Only text data can be saved → because C A string in a language must conform to some encoding ( such as ASCII), For example, in the middle ‘\0’ It may be judged as an early terminated string and cannot be recognized ;

Let's take the operation of appending strings as an example ,Redis Source code is as follows :

/* Append the specified binary-safe string pointed by 't' of 'len' bytes to the 
	* end of the specified sds string 's'. 
	*
	* After the call, the passed sds string is no longer valid and all the 
	* references must be substituted with the new pointer returned by the call. */ 
sds sdscatlen(sds s, const void *t, size_t len) {
   
    
	//  Get the length of the original string  
	size_t curlen = sdslen(s); 
	
	//  Adjust the space as needed , If the capacity is not enough to accommodate additional content , The byte array is reallocated and the contents of the original string are copied to the new array  
	s = sdsMakeRoomFor(s,len); 
	if (s == NULL) return NULL; //  Out of memory  
	memcpy(s+curlen, t, len); //  Append target string to byte array  
	sdssetlen(s, curlen+len); //  Set the added length  
	s[curlen+len] = '\0'; //  Let the string be  \0  ending , Easy to debug and print  
	return s; 
}
  • notes :Redis Specifies that the length of a string must not exceed 512 MB.

②、 Basic operations on strings

Install well Redis, We can use redis-cli Come on Redis Perform command line operations , Of course Redis The official also provides an online debugger , You can also type in commands to operate :http://try.redis.io/#run

③、 Set and get key value pairs

> SET key value 
OK
> GET key 
"value"

As you can see , We usually use SET and GET To set and get string values .

Values can be any kind of string ( Including binary data ), For example, you can save a picture under a key .jpeg picture , Just be careful not to exceed 512 MB To the maximum extent .

When key In existence , SET The command will override the value you set last time :

> SET key newValue 
OK
> GET key 
"newValue"

You can also use EXISTS and DEL Keyword to query the existence and deletion of key value pairs :

> EXISTS key 
(integer) 1 
> DEL key 
(integer) 1 
> GET key 
(nil)

④、 Set key value pairs in batch

> SET key1 value1 
OK
> SET key2 value2 
OK
> MGET key1 key2 key3 #  Return a list  
1) "value1" 
2) "value2" 
3) (nil) 
> MSET key1 value1 key2 value2 
> MGET key1 key2 
1) "value1" 
2) "value2"

⑤、 Expired and SET Command extension

It can be done to key Set expiration time , The time will be automatically deleted , This function is often used to control the cache expiration time .( Expiration can be Any data structure )

> SET key value1 
> GET key 
"value1" 
> EXPIRE name 5 # 5s  After expired  
... #  wait for  5s 
> GET key 
(nil)

Equivalent to SET + EXPIRE Of SETEX command :

> SETEX key 5 value1 
... #  wait for  5s  Post acquisition  
> GET key 
(nil) 

> SETNX key value1 		#  If  key  Nonexistence  SET  success  
(integer) 1 
> SETNX key value1 		#  If  key  There are  SET  Failure  
(integer) 0 
> GET key 
> "value" 				#  There is no change 

⑥、 Count

If value It's an integer , It can also be used with INCR Command to Atomicity Self - increment operation , This means that multiple customers are responding to the same key To operate , It will never lead to competitive situations :

> SET counter 100 
> INCR counter 
(integer) 101 
> INCRBY counter 50 
(integer) 151

⑦、 Returns the original value of GETSET command

The string , One more GETSET It's more interesting , Its function is the same as its name : by key Set a value and return the original value :

> SET key value 
> GETSET key value1 
"value"

This can be used for some statistics that need to be done at intervals key Easy to set up and view , for example : Every time a user enters the system, you use INCR Command to operate a key, When you need statistics, you put this key Use GETSET The command is reassigned to 0, In this way, the purpose of statistics is achieved .

2) list list

Redis The list of is equivalent to Java In language LinkedList, Notice that it's a linked list, not an array . It means list The insert and delete operations of are very fast , The time complexity is O(1), But index positioning is slow , The time complexity is O(n).

We can get it from the source code adlist.h/listNode To see the definition of it :


episode :
More Ali 、 tencent 、 Meituan 、 Jingdong and other first-line Internet companies Java The real question of the interview ; contain : Basics 、 Concurrent 、 lock 、JVM、 Design patterns 、 data structure 、 Reflection /IO、 database 、Redis、Spring、 Message queue 、 Distributed 、Zookeeper、Dubbo、Mybatis、Maven、 Facial Scripture, etc .
more Java Programmer technology advanced tips ; For example, efficient learning ( How to learn and read code 、 In the face of boring and large amount of knowledge ) Communicate effectively ( Communication methods and skills 、 Communication technology )
more Java Some career sharing documents shared by Daniel



Please click on Add here 》》》》》》》》》 community , Free access


Better opponents than you are learning , Your enemy is sharpening his knife , Your best friend is losing weight , Old Wang next door is practicing waist , We must keep learning , Otherwise, we will be surpassed by learners !
young , Work hard , Give future self an account !

/* 
Node, List, and Iterator are the only data structures used currently. */

typedef struct listNode {
   
       
	struct listNode *prev; 
	struct listNode *next; 
	void *value; 
} listNode; 

typedef struct listIter {
   
       
	listNode *next; 
	int direction; 
} listIter;

typedef struct list {
   
       
	listNode *head; 
	listNode *tail; 
	void *(*dup)(void *ptr); 
	void (*free)(void *ptr); 
	int (*match)(void *ptr, void *key); 
	unsigned long len; 
} list;

You can see , Multiple listNode Can pass prev and next The pointer forms a double linked list :

Although only multiple listNode Structure can form a linked list , But use adlist.h/list Structure to hold the linked list , It will be more convenient to operate :



①、 Basic operations of linked lists

  • LPUSH and RPUSH The difference can be made to list Left side ( Head ) And on the right ( The tail ) Add a new element ;
  • LRANGE Commands can be obtained from list Take out a certain range of elements ;
  • LINDEX Commands can be obtained from list The elements specified in the following table are taken out , amount to Java In linked list operation get(int index) operation ;

demonstration :

> rpush mylist A 
(integer) 1 
> rpush mylist B 
(integer) 2 
> lpush mylist first 
(integer) 3 
> lrange mylist 0 -1 # -1  Represents the last but one element ,  This means from the first element to the last element , That is all 
1) "first" 
2) "A" 
3) "B"

②、list Implementation queue

Queues are first in first out data structures , It is often used in message queuing and asynchronous logic processing , It ensures the order in which the elements are accessed :

> RPUSH books python java golang 
(integer) 3 
> LPOP books 
"python" 
> LPOP books 
"java" 
> LPOP books 
"golang" 
> LPOP books 
(nil)

③、list Implementation stack

A stack is a data structure that goes first and then out , It's the opposite of the queue :

> RPUSH books python java golang 
> RPOP books 
"golang" 
> RPOP books 
"java" 
> RPOP books 
"python" 
> RPOP books 
(nil)

3) Dictionaries hash

Redis The dictionary in is equivalent to Java Medium HashMap, The internal implementation is similar , It's all through " Array + Linked list " To address part of the chain Hash Collisions , At the same time, this structure also absorbs the advantages of two different data structures . The source code definition is as follows dict.h/dictht Definition :

typedef struct dictht {
   
       
	//  Hash table array  
	dictEntry **table; 
	//  Hash table size  
	unsigned long size; 
	//  Hash table size mask , Used to calculate index values , Always equal to size - 1 
	unsigned long sizemask; 
	//  The number of existing nodes in the hash table  
	unsigned long used;
} dictht;

typedef struct dict {
   
       
	dictType *type; 
	void *privdata; 
	//  There are two inside dictht structure  
	dictht ht[2]; 
	long rehashidx; /* rehashing not in progress if rehashidx == -1 */ 
	unsigned long iterators; /* number of iterators currently running */ 
} dict;

table Property is an array , Each element in the array is a point to dict.h/dictEntry Pointer to structure , And each dictEntry The structure holds a key value pair :

typedef struct dictEntry {
   
       
	//  key  
	void *key; 
	//  value  
	union {
   
       
		void *val; 
		uint64_t u64; 
		int64_t s64; 
		double d; 
	} v; 
	//  Point to next hash table node , Form a linked list  
	struct dictEntry *next; 
} dictEntry;

You can see from the source code above , In fact, the dictionary structure contains two hashtable, Usually there is only one hashtable There is a value , But when the dictionary expands and shrinks , Need to allocate new hashtable, Then proceed Gradual relocation ( Here's why ).

①、 Progressive type rehash

The expansion of large dictionaries is time-consuming , You need to re apply for a new array , Then all the elements in the linked list of the old dictionary will be reattached to the new array , This is a O(n) Level of operation , As a single thread Redis It's hard to bear such a time-consuming process , therefore Redis Use Progressive type rehash Move in small steps :

Progressive type rehash Will be in rehash At the same time , Keep the old and the new hash structure , As shown in the figure above , Two will be queried at the same time hash structure , And then on the following scheduled tasks and hash In operation instruction , Move the contents of the old dictionary into the new one step by step . When the move is done , Will use the new hash Structure takes the place of .

②、 Conditions for expansion and contraction

Under normal circumstances , When hash In the table When the number of elements is equal to the length of the first dimension array , Will start to expand , The new array for expansion is Original array size 2 times . But if Redis Being done bgsave( Persistent command ), In order to reduce the memory too much separation ,Redis Try not to expand , But if hash The watch is very full , Reaching the length of the first dimension array 5 The Times , This time it will be Forced expansion .

When hash When the table becomes more and more sparse because the elements are gradually deleted ,Redis Would be right hash The watch is shrunk to reduce hash The first dimension array space of the table occupies . The condition used is The number of elements is less than the length of the array 10%, Shrink will not consider Redis Are you doing bgsave.

③、 Basic operation of dictionary

hash There are also shortcomings. ,hash Structure consumes more storage than a single string , So it's time to use hash Or string , It needs to be weighed again and again according to the actual situation :

> HSET books java "think in java" 		#  If the string on the command line contains spaces, you need to wrap it in quotation marks  
(integer) 1 
> HSET books python "python cookbook" 
(integer) 1 
> HGETALL books # key  and  value  The interval appears  
1) "java" 
2) "think in java" 
3) "python" 
4) "python cookbook" 
> HGET books java 
"think in java" 
> HSET books java "head first java" 
(integer) 0 						#  Because it's the update operation , So back  0 
> HMSET books java "effetive java" python "learning python" 		#  The batch operation  OK

4) aggregate set

Redis The set of is equivalent to Java In language HashSet, Its internal key value pairs are out of order 、 Unique . Its internal implementation is equivalent to a special dictionary , All in the dictionary value It's all a value NULL.

①、 aggregate set Basic use of

Because the structure is simple , Let's go straight to see how it's used :

> SADD books java 
(integer) 1 
> SADD books java			 #  repeat  
(integer) 0 
> SADD books python golang 
(integer) 2 
> SMEMBERS books 			#  Order of attention ,set  Is chaotic  
1) "java" 
2) "python" 
3) "golang" 
> SISMEMBER books java 		#  Query a  value  Whether there is , amount to  contains 
(integer) 1 
> SCARD books 				#  To obtain the length of the  
(integer) 3 
> SPOP books 				#  Pop up a  
"java"

5) Ordered list zset

This may make Redis One of the most distinctive data structures , It is similar to Java in SortedSet and HashMap The combination of , On the one hand, it is a set, Guaranteed the interior value Uniqueness , On the other hand, it can be for every value Give one score value , The weight used to represent the sort .

Its internal implementation is called 「 Skip list 」 Data structure of , Because of the complexity , So just mention the principle here :


Imagine you're the boss of a startup company , At first, there were only a few people , We all sit on equal footing . Later, with the development of the company , More and more , The cost of team communication is increasing , Gradually, the system of group leader was introduced , Divide the team , So some people It's an employee and a team leader .

later , The size of the company has expanded further , The company needs to move to another level : department . Then a minister was elected from each department .

Jump tables are similar to this mechanism , All the elements in the bottom layer will be strung together , It's all employees , And then every few elements you pick out a representative , Then string these representatives together with another level of pointer . And then pick out the secondary representatives from these representatives , String it up again. . Finally, a pyramid structure was formed .

Think about your current location : Asia > China > A province > A city > …, It's such a structure !

①、 Ordered list zset Basic operation

> ZADD books 9.0 "think in java" 
> ZADD books 8.9 "java concurrency" 
> ZADD books 8.6 "java cookbook" 

> ZRANGE books 0 -1 #  Press  score  Sort list , Parameter range is rank range  
1) "java cookbook" 
2) "java concurrency" 
3) "think in java" 

> ZREVRANGE books 0 -1 #  Press  score  List in reverse order , Parameter range is rank range  
1) "think in java" 
2) "java concurrency" 
3) "java cookbook" 

> ZCARD books #  amount to  count() 
(integer) 3 

> ZSCORE books "java concurrency" #  Get specified  value  Of  score 
"8.9000000000000004" #  Inside  score  Use  double  Type to store , So there is the problem of decimal point accuracy 

> ZRANK books "java concurrency" #  ranking  
(integer) 1 

> ZRANGEBYSCORE books 0 8.91 #  Traverse according to the score interval  zset 
1) "java cookbook" 
2) "java concurrency" 

> ZRANGEBYSCORE books -inf 8.91 withscores #  According to the score range  (-, 8.91]  Traverse  zset, Also returns the score .inf  representative  infinite, Infinite meaning . 
1) "java cookbook" 
2) "8.5999999999999996" 
3) "java concurrency" 
4) "8.9000000000000004" 

> ZREM books "java concurrency" #  Delete  value 
(integer) 1 
> ZRANGE books 0 -1 
1) "java cookbook" 
2) "think in java"

Two 、 Skip list

1. Introduction to jump table

Skip list (skiplist) It's a kind of randomized data structure , from William Pugh In the paper 《Skip lists: a probabilistic alternative to balanced trees》 It is proposed that , It's as balanced as a tree —— lookup 、 Delete 、 Add and other operations can be completed in the logarithmic expected time , Here is a typical example of a jump table :


episode :
More Ali 、 tencent 、 Meituan 、 Jingdong and other first-line Internet companies Java The real question of the interview ; contain : Basics 、 Concurrent 、 lock 、JVM、 Design patterns 、 data structure 、 Reflection /IO、 database 、Redis、Spring、 Message queue 、 Distributed 、Zookeeper、Dubbo、Mybatis、Maven、 Facial Scripture, etc .
more Java Programmer technology advanced tips ; For example, efficient learning ( How to learn and read code 、 In the face of boring and large amount of knowledge ) Communicate effectively ( Communication methods and skills 、 Communication technology )
more Java Some career sharing documents shared by Daniel



Please click on Add here 》》》》》》》》》 community , Free access


Better opponents than you are learning , Your enemy is sharpening his knife , Your best friend is losing weight , Old Wang next door is practicing waist , We must keep learning , Otherwise, we will be surpassed by learners !
young , Work hard , Give future self an account !


We mentioned in the last article Redis Of the five basic structures of , There's one called Ordered list zset Data structure of , It is similar to Java Medium SortedSet and HashMap The combination of , On the one hand, it is a set Guaranteed the interior value Uniqueness , On the other hand, it can be given to everyone value Assign a sort weight value score, In order to achieve Sort Purpose .

Its internal implementation relies on a method called **「 Skip list 」** Data structure of .

1) Why use jump tables

First , because zset To support random insertion and deletion , So it It is not appropriate to use arrays to implement , On the sorting problem , It's easy for us to think of Red and black trees / Balance tree This kind of tree structure , Why? Redis Without such structures ?

  1. Performance considerations : In the case of high concurrency , The tree structure needs to perform something similar to rebalance This may involve the operation of the whole tree , Relatively speaking, the change of jump table only involves local ( Let's talk about it in detail );
  2. Implementation considerations : With the same complexity as a red black tree , Jump tables are easier to implement , It also looks more intuitive ;

Based on the above considerations ,Redis be based on William Pugh After making some improvements, we adopted Skip list Such a structure .

2) The essence is to solve the search problem

Let's first look at a common linked list structure :

We need this list according to score Value to sort , That means , When we need to add new elements , We need to locate the insertion point , In this way, we can continue to ensure that the list is in order , Usually we use Dichotomy search , But the binary search is an ordered array , The linked list can't locate the position , We go through the whole process and find the first node larger than the given data ( Time complexity O(n)) There seems to be no better way .

But if we add a pointer between every two adjacent nodes , Let the pointer point to the next node , Here's the picture :

In this way, all the newly added pointers form a new linked list , But it contains only half of the original data ( In the picture are 3,11).

Now suppose we want to find data , You can look up according to this new linked list , If you encounter a node larger than the data to be searched , Go back to the original linked list to find , such as , We want to find 7, The search path is along the direction indicated by the red pointer in the figure below :


This is a slightly extreme example , But we can still see , Find through the newly added pointer , We no longer need to compare with each node in the linked list one by one , After this improvement, the number of nodes to be compared is only about half of the original .

In the same way , We can put it on the new list , Continue to add a pointer to every two adjacent nodes , So as to generate the third level of linked list :

In this new three-tier linked list structure , We try to lookup 13, So the first comparison along the top link list is 11, Find out 11 Than 13 Small , So we know that we just need to get to 11 Continue to look for , So it's all over the place 11 All the nodes in the front .

As you can imagine , When the list is long enough , Such a multi-layer linked list structure can help us to skip many lower level nodes , In order to speed up the efficiency of the search .

3) Further jump tables

Skip list skiplist It is inspired by this multi-layer linked list structure that we designed . Follow the way to generate the linked list above , The number of nodes in each layer of the above list , It's half the number of nodes in the next layer , So the search process is very similar to a binary search , The time complexity of searching can be reduced to O(logn).

however , This method has big problems when inserting data . After a new node is inserted , It will disrupt the number of nodes on the two adjacent layers of the list 2:1 Correspondence of . If this correspondence is to be maintained , You have to put all the nodes after the newly inserted node ( It also includes newly inserted nodes ) Re adjust , This will allow time complexity to degenerate back into O(n). Deleting data has the same problem .

skiplist To avoid this problem , It does not require strict correspondence between the number of nodes between two adjacent layers of linked list , It is Give each node a random number of layers (level). such as , It's a random number of nodes 3, Then chain it into the first 1 Layer to tier 3 In the list of three layers . To express clearly , The following figure shows how to form a step-by-step insertion operation skiplist The process of :


episode :
More Ali 、 tencent 、 Meituan 、 Jingdong and other first-line Internet companies Java The real question of the interview ; contain : Basics 、 Concurrent 、 lock 、JVM、 Design patterns 、 data structure 、 Reflection /IO、 database 、Redis、Spring、 Message queue 、 Distributed 、Zookeeper、Dubbo、Mybatis、Maven、 Facial Scripture, etc .
more Java Programmer technology advanced tips ; For example, efficient learning ( How to learn and read code 、 In the face of boring and large amount of knowledge ) Communicate effectively ( Communication methods and skills 、 Communication technology )
more Java Some career sharing documents shared by Daniel



Please click on Add here 》》》》》》》》》 community , Free access


Better opponents than you are learning , Your enemy is sharpening his knife , Your best friend is losing weight , Old Wang next door is practicing waist , We must keep learning , Otherwise, we will be surpassed by learners !
young , Work hard , Give future self an account !


From the above creation and insertion process, we can see that , The number of layers per node (level) It's random , And the new insertion of a node does not affect the number of layers of other nodes , therefore , The insert operation only needs to modify the pointer before and after the node , You don't need to adjust multiple nodes , This reduces the complexity of the insert operation .

Now let's assume that we look for 23 This nonexistent number , Then the search path will look like the figure below :

2. The realization of jump table

Redis The jump table in is made by server.h/zskiplistNode and server.h/zskiplist Two structural definitions , The former is a hop table node , The latter keeps the information about the hop node , Same as before aggregate list The structure is similar to , In fact, only zskiplistNode It can be realized , But the latter is introduced for more convenient operation :

/* ZSETs use a specialized version of Skiplists */ 
typedef struct zskiplistNode {
   
             
	// value 
	sds ele; 
	//  The score is  
	double score; 
	//  Back pointer  
	struct zskiplistNode *backward; 
	//  layer  
	struct zskiplistLevel {
   
             
		//  Forward pointer  
		struct zskiplistNode *forward; 
		//  span  
		unsigned long span; 
	} level[]; 
} zskiplistNode; 

typedef struct zskiplist {
   
             
	//  Jump head pointer  
	struct zskiplistNode *header, *tail; 
	//  Number of nodes in the table  
	unsigned long length; 
	//  The number of layers of the node with the largest number of layers in the table  
	int level; 
} zskiplist;

Just like the standard jump chart at the beginning of the article .

1) Random number of layers

For each newly inserted node , We need to call a random algorithm to assign it a reasonable number of layers , The source code in t_zset.c/zslRandomLevel(void) Is defined in :

int zslRandomLevel(void) {
   
             
	int level = 1; 
	while ((random()&0xFFFF) < (ZSKIPLIST_P * 0xFFFF)) 
		level += 1; 
	return (level<ZSKIPLIST_MAXLEVEL) ? level : ZSKIPLIST_MAXLEVEL; 
}

The intuitively expected goal is 50% The probability of being assigned to Level 1 ,25% The probability of being assigned to Level 2 ,12.5% The probability of being assigned to Level 3, And so on … Yes 2-63 The probability of being assigned to the top level , Because the promotion rate of each level here is 50%.

Redis By default, the maximum number of layers allowed in jump table is 32, In the source code ZSKIPLIST_MAXLEVEL Definition , When Level[0] Yes 264 Element time , To achieve 32 layer , So define 32 It's enough .

2) Create a jump table

This process is relatively simple , In the source code t_zset.c/zslCreate Is defined in :

zskiplist *zslCreate(void) {
   
             
	int j; 
	zskiplist *zsl; 
	
	//  Apply for memory space  
	zsl = zmalloc(sizeof(*zsl)); 
	//  The number of initialization layers is  1 
	zsl->level = 1; 
	//  The initialization length is  0 
	zsl->length = 0; 
	//  Create a layer number of  32, The score is  0, No,  value  Value of the jump header node  
	zsl->header = zslCreateNode(ZSKIPLIST_MAXLEVEL,0,NULL); 
	
	//  Jump header node initialization  
	for (j = 0; j < ZSKIPLIST_MAXLEVEL; j++) {
   
             
		//  All forward pointers of the header node will be skipped  forward  Set to  NULL 
		zsl->header->level[j].forward = NULL; 
		//  All spans of the header node will be skipped  span  Set to  0 
		zsl->header->level[j].span = 0; 
	}
	//  Jump back pointer of header node  backward  Set as  NULL 
	zsl->header->backward = NULL; 
	//  The pointer of the header pointing to the node at the end of the jump table is set to  NULL 
	zsl->tail = NULL; 
	return zsl; 
}

That is, after the execution, the initialization jump table with the following structure is created :

3) Insert node implementation

This is almost the most important piece of code , But the general idea is clear and simple , If you understand the principle of the skip table mentioned above , Then it's easy to sort out several actions that occur when inserting a node ( It's almost like a linked list ):

  1. Find the current location I need to insert ( It includes the same score Time processing );
  2. Create a new node , The pointer before and after adjustment points to , Finish inserting ;

For ease of reading , I put the source code t_zset.c/zslInsert The defined insertion function is broken down into several parts

The first part : Declare the variables that need to be stored

//  Store search path  
zskiplistNode *update[ZSKIPLIST_MAXLEVEL], *x; 
//  Store the span of nodes that pass through  
unsigned int rank[ZSKIPLIST_MAXLEVEL]; 
int i, level;

The second part : Search the current node insertion position

serverAssert(!isnan(score)); 
x = zsl->header; 
//  Step by step search for the target node to degrade , obtain  " Search path " 
for (i = zsl->level-1; i >= 0; i--) {
   
             
	/* store rank that is crossed to reach the insert position */ 
	rank[i] = i == (zsl->level-1) ? 0 : rank[i+1]; 
	//  If  score  equal , We need to compare  value  value  
	while (x->level[i].forward && 
			(x->level[i].forward->score < score || 
				(x->level[i].forward->score == score && 
					sdscmp(x->level[i].forward->ele,ele) < 0))) 
	{
   
             
		rank[i] += x->level[i].span; 
		x = x->level[i].forward; 
	}
	//  Record  " Search path " 
	update[i] = x; 
}

Discuss : There is an extreme situation , It's all in the jump table score The values are all the same ,zset Will our search performance degrade to O(n) Well ?

From the source code above, we can find zset The sorting elements of are not just looking at score value , It will also compare value value ( String comparison )

The third part : Generate insert nodes

/* we assume the element is not already inside, since we allow duplicated 
	* scores, reinserting the same element should never happen since the 
	* caller of zslInsert() should test in the hash table if the element is 
	* already inside or not. */ 
level = zslRandomLevel(); 
//  If it's randomly generated  level  More than the current maximum  level  Need to update jump table information  
if (level > zsl->level) {
   
             
	for (i = zsl->level; i < level; i++) {
   
             
		rank[i] = 0; 
		update[i] = zsl->header; 
		update[i]->level[i].span = zsl->length; 
	}
	zsl->level = level; 
}
//  Create a new node  
x = zslCreateNode(level,score,ele);

The fourth part : Rearranging forward pointers

for (i = 0; i < level; i++) {
   
             
	x->level[i].forward = update[i]->level[i].forward; 
	update[i]->level[i].forward = x; 
	
	/* update span covered by update[i] as x is inserted here */ 
	x->level[i].span = update[i]->level[i].span - (rank[0] - rank[i]); 
	update[i]->level[i].span = (rank[0] - rank[i]) + 1; 
}

/* increment span for untouched levels */ 
for (i = level; i < zsl->level; i++) {
   
             
	update[i]->level[i].span++; 
}

The fifth part : Reorder the backward pointer and return

x->backward = (update[0] == zsl->header) ? NULL : update[0]; 
if (x->level[0].forward) 
	x->level[0].forward->backward = x; 
else
	zsl->tail = x; 
zsl->length++; 
return x;

4) Node deletion implementation

The deletion process is controlled by t_zset.c/zslDeleteNode Definition , Similar to the insertion process , You need to put this first " Search path " To find out , Then rearrange the forward and backward pointers for the relevant nodes of each layer , At the same time, pay attention to update the maximum number of layers maxLevel, Put the source code directly ( If you understand, it's easy to understand here ):

/* Internal function used by zslDelete, zslDeleteByScore and zslDeleteByRank */ 
void zslDeleteNode(zskiplist *zsl, zskiplistNode *x, zskiplistNode **update) {
   
             
	int i; 
	for (i = 0; i < zsl->level; i++) {
   
            
		if (update[i]->level[i].forward == x) {
   
             
			update[i]->level[i].span += x->level[i].span - 1; 
			update[i]->level[i].forward = x->level[i].forward; 
		} else {
   
             
			update[i]->level[i].span -= 1; 
		} 
	}
	if (x->level[0].forward) {
   
             
		x->level[0].forward->backward = x->backward; 
	} else {
   
             
		zsl->tail = x->backward; 
	}
	while(zsl->level > 1 && zsl->header->level[zsl->level-1].forward == NULL) zsl->level--; 
	zsl->length--; 
}

/* Delete an element with matching score/element from the skiplist. 
	* The function returns 1 if the node was found and deleted, otherwise 
	* 0 is returned. 
	*
	* If 'node' is NULL the deleted node is freed by zslFreeNode(), otherwise 
	* it is not freed (but just unlinked) and *node is set to the node pointer, 
	* so that it is possible for the caller to reuse the node (including the 
	* referenced SDS string at node->ele). */

int zslDelete(zskiplist *zsl, double score, sds ele, zskiplistNode **node) {
   
             
	zskiplistNode *update[ZSKIPLIST_MAXLEVEL], *x; 
	int i;
	
	x = zsl->header; 
	for (i = zsl->level-1; i >= 0; i--) {
   
             
		while (x->level[i].forward && 
				(x->level[i].forward->score < score || 
					(x->level[i].forward->score == score && 
						sdscmp(x->level[i].forward->ele,ele) < 0))) 
		{
   
             
			x = x->level[i].forward; 
		}
		update[i] = x; 
	}
	/* We may have multiple elements with the same score, what we need 		
	 * is to find the element with both the right score and object. */ 
	x = x->level[0].forward; 
	if (x && score == x->score && sdscmp(x->ele,ele) == 0) {
   
             
		zslDeleteNode(zsl, x, update); 
		if (!node) 
			zslFreeNode(x); 
		else
			*node = x; 
		return 1; 
	}
	return 0; /* not found */ 
}

5) Node update implementation

When we call ZADD When the method is used , If the corresponding value non-existent , That's the insertion process , If this value Already exist , Just adjust score Value , Then we need to go through an update process .

Suppose this new score Values don't change the sort , Then there's no need to adjust the position , Directly modify the score The value will do , But if the sort position changes , Then you need to adjust the position , How to adjust ?

From the source t_zset.c/zsetAdd function 1350 You can see ,Redis It's a very simple strategy :

/* Remove and re-insert when score changed. */ 
if (score != curscore) {
   
             
	zobj->ptr = zzlDelete(zobj->ptr,eptr); 
	zobj->ptr = zzlInsert(zobj->ptr,ele,score); 
	*flags |= ZADD_UPDATED; 
}

Delete this element and insert this , We need to go through two path searches , From this point of view ,Redis Of ZADD There seems to be room for further code optimization .

6) The implementation of element ranking

The jump table itself is ordered ,Redis stay skiplist Of forward The pointer is optimized , To every one. forward The pointer has been increased span attribute , be used for Represents the previous node along the current layer forward How many nodes will the pointer skip when it jumps to the current node . We can also see it in the source code above Redis In the insert 、 Delete operations will be carefully updated span Value size .

therefore , Along “ Search path ”, Put all the spans through the nodes span The final value of the current element can be calculated by adding the values rank The value of :

/* Find the rank for an element by both score and key. 
 * Returns 0 when the element cannot be found, rank otherwise. 
 * Note that the rank is 1-based due to the span of zsl->header to the 
 * first element. */ 
unsigned long zslGetRank(zskiplist *zsl, double score, sds ele) {
   
             
	zskiplistNode *x; 
	unsigned long rank = 0; 
	int i; 
	
	x = zsl->header; 
	for (i = zsl->level-1; i >= 0; i--) {
   
             
		while (x->level[i].forward && 
			(x->level[i].forward->score < score || 
				(x->level[i].forward->score == score && 
					sdscmp(x->level[i].forward->ele,ele) <= 0))) {
   
             
				// span  Add up  
				rank += x->level[i].span; 
				x = x->level[i].forward; 
			}
			
			/* x might be equal to zsl->header, so test if obj is non-NULL */ 
			if (x->ele && sdscmp(x->ele,ele) == 0) {
   
             
				return rank; 
			} 
		}
		return 0; 
	}

Reference material :《Java Comprehensive analysis of middle and advanced core knowledge 》 limited 100 Share , Some people have already obtained it from my previous articles !
The quota is limited on a first come, first served basis !!!
Students who want to get this learning material can Click here for free 》》》》》》》

版权声明
本文为[osc_ tko37abm]所创,转载请带上原文链接,感谢
https://cdmana.com/2020/12/20201224114549516m.html

Scroll to Top