ORACLE The main and backup are DATAGUARD, That is to say DG. The main library is called primary, Backup call standby
ORACLE Read only library of ：ADG,ACTIVE DATAGUARD.oracle 11g Start supporting , You can open the standby database as readonly state , It can be shared IO Reading business pressure . In other words, the standby database is not opened readonly be called DG, Open for readonly be called ADG.
oracle The synchronization of ： Physical backup and logical backup .
Physical backup database ： take redo It is directly applied to the standby database , take block Block to cover , yes block Physical replication , Follow sql irrelevant .
Logical backup ： take redo It can be interpreted as sql, adopt sql Repeat on the standby database to complete the synchronization of the standby database .
The advantages and disadvantages of backup logical library
The advantages of physical backup database ：
1. Efficiency . From the bottom block Sync , With the upper class sql irrelevant , Data synchronization is efficient . For example, the main database sql perform 10s, change 1000 Block , The physical standby database only needs to change this 1000 Just one block , Don't care sql Its own execution plan . about sql Longer execution , Finally change 1、2 The case of row data , The synchronization efficiency of physical standby database will be higher .
2. Data consistency . Physical backup database and upper layer sql irrelevant , As long as there's no delay , There will be no data inconsistency .redo It's the physical form itself , Generated by the master database , Spread to the standby database through the network , Middle pair redo There's no conversion , The pattern is simple and there is no pit . The table of the logical standby database must have a primary key to ensure data consistency , When the standby database is applied, it is necessary to put the physical redo It's logic sql, In theory, the data will be consistent , Even so, data consistency does not have the strong consistency advantage of physical backup . If the data of a row in the logical standby database is inconsistent for some reason , This little “ The stain ” Not easy to find , And it's likely to expand , There will be more and more inconsistent data , It's also very difficult to repair .
The advantages of logical standby database ：
1. flexibility . The logical standby library can be in the open state , Business allows to modify asynchronous data or even synchronous data （ Don't suggest ）, Flexible data customization .
2. Handling of bad blocks . The physical backup library is used directly redo Cover block, The physical bad block of the main database will also cause the bad block of the standby database . The main and backup physical layer of the logical backup library block The structure is different ,sql You can execute the instructions sql It is available in the standby database , It's a kind of protection for bad blocks .
oracle Selection of physical backup and logical backup ：
Simple and crude ： Physical backup database . There is almost no application of logical backup Library in traditional industries , The data consistency and reliability of physical backup database is very good for traditional industries ,oracle Physical backup database has been widely used , Physics DG The architecture is very mature , Industry certification . We're talking about oracle In general, it refers to dg Physical backup database . That is, the logical backup library may step into a pit , If you really want a logical backup Library , Why not mysql Open source libraries, etc , More flexible , and for free~
oracle Protection mode of
Maximum protection mode ： stay Maximum protection Next , It can ensure that the data of the slave database and the master database are exactly the same , Achieve zero data loss. The transaction is committed on both sides at the same time , It's the end of the business . If the slave database is down or there is a network problem , Master slave database cannot communicate , The main library also went down immediately . In this way , With the highest level of protection . But this mode has a great impact on the performance of the main library , High speed network connection is required .
Maximum available mode ： stay Maximum availability In mode , If the connection to the slave library is normal , It works in the same way Maximum protection Pattern , Transactions are also committed by the master and slave databases at the same time . If the slave database loses contact with the master database , Then the main library automatically switches to Maximum performance Run in mode , Ensure maximum availability of the main library .
Maximum performance mode ： stay Maximum performance, The main library files archived log adopt arch The process is passed to the slave library , In this way , The main database has the highest performance , But there is no guarantee that the data will not be lost , And the lost data is affected by redo log The size of the influence . stay redo log If it's too big , Maybe I didn't file a log for a day , You can manually switch logs to reduce data loss .
The choice of protection mode
In most cases, the traditional industry is the maximum performance mode , A few maximum available modes , The maximum protection mode has almost no , The main database is down and can't bear this pot .
oracle DG Build
In Physics DG During the construction process , The key environment is the generation of full data . The total data are 2 Ways of planting , One is to pull from the backup of the main database ,restore recover The process of , One is duplicate（11g Start supporting ）, Copy the full data directly from the primary database to the standby database . When the full data is done , The database is in a consistent state based on a point in time , At this time, synchronous tracing is turned on redolog Observe lag Etc ,DG Even if it's done .
oracle adg Building manual ：
mysql The synchronization mode of
Asynchronous replication （Asynchronous replication）
MySQL The default replication is asynchronous , The main database will return the result to the client immediately after executing the transaction submitted by the client , It doesn't matter if the slave library has been received and processed , There will be a problem , Lord if crash It fell off , At this time, the submitted transactions on the master may not be transferred to the slave , If at this time , Force will come first from ascension , May result in incomplete data on the new master .
Full synchronous replication （Fully synchronous replication）
When the master library completes a transaction , All the slave libraries execute the transaction before it is returned to the client . Because you need to wait for all the transactions from the library to complete before returning , So the performance of full synchronous replication will be seriously affected .
Semi-synchronous replication （Semisynchronous replication）
Between asynchronous replication and full synchronous replication , The main library does not return to the client immediately after executing a transaction committed by the client , Instead, wait for at least one of the slave libraries to receive and write relay log Is returned to the client . As opposed to asynchronous replication , Semi-synchronous replication improves data security , At the same time, it also causes a certain degree of delay , The delay is at least one TCP/IP Round-trip time . therefore , Semi synchronous replication is best used in low latency networks .
oracle and mysql The pattern difference of
It can be seen that oracle Maximum protection mode and mysql Full synchronization is similar to ,oracle Asynchronous replication and mysql The maximum performance pattern for is similar to . however oracle The maximum available mode and mysql The semi synchronization of is different . Semi synchronization is defined for the entire active and standby set , Only one standby database receives binlog And write relay, The main database can submit . The maximum available mode is for active and standby , It is the intermediate mode of maximum protection and maximum performance , Usually in maximum protection mode , When the standby database is not available , Switch to maximum performance mode .
mysql The synchronization of
mysql Of binlog
In explanation mysql Before the synchronization mode of , I need to know something about mysql Double write . Double writing is both redo and binlog Write at the same time ,redo It's a physical data page redolog,mysql There is no concept of archiving .binlog It's a logical data change log （oracle Only redo, No, binlog Logic writes ）.binlog Yes 3 Patterns , The default is row Pattern , It's not pure sql, It's not a physical block , It's a complete change of the line .
About mysql Thinking of double writing ,mysql Why is there a double writing ,oracle There is no ？
Many people say that double writing is to ensure the recoverability of data , What then? oracle Just need redo, No one questions its reliability ？ I think it has something to do with the design architecture ,oracle It is a whole from the beginning of design ,redo Write to the disk in advance to ensure the persistence of the transaction （ACID Medium D persistence ）. and mysql Of binlog yes server Layer of ,server Layer design needs to consider in the case of unpredictable storage engine architecture , How to ensure the persistence of transactions , Then we can only start from server Layer write file system ,mysql We need to apply the binlog（ Standby relay）.redo yes innodb Storage engine layer , The reliability of storage engine must be considered in the design of storage engine , How to recover the database when it crashes ？ When redo The log is written, but the data is still on the disk , database start Instance recovery is required , Instance recovery requires application redo Roll forward . about mysql Come on , It is necessary to double write .（ The reason for this double writing only represents personal thinking ）
binlog Of GTID
GTID yes binlog Things in ID, No, binlog There is no GTID.GTID Follow mysql Of lsn Number or oracle Of scn There is a difference .lsn and scn It can be understood as a timeline in a database ,GTID It exists in binlog Things in ID Number , The backup database can be based on GTID To track the log .GTID The default is off , Suggest opening .
mysql Synchronization of
mysql By passing binlog To synchronize ,binlog It is called after it is transferred to the standby database relay log, Backup library application relay. The database is not open GTID Under the circumstances , The backup library can only pass pos（position, Byte number ） Find the starting time to synchronize .pos There may be problems with synchronization , because binlog and relay It doesn't have to be one-to-one ,log The sign does not necessarily correspond to ,pos It doesn't necessarily correspond to . But if you turn on GTID Sync , The standby database interrupt only needs to find GTID that will do , It can be set when synchronization is turned on master_auto_postion=1, and pos Synchronization needs to specify binlog file and binlog pos, Such as ：master_log_file='master-bin.000002',master_log_pos=154.
mysql Master slave building
If open GTID,mysql Master slave synchronization is relatively simple , Use change master After the command starts synchronization, the master database passes binlog, The backup library accepts binlog And write relay, Backup library application relay. because mysql The slave library is the logical backup library , Full data pull is also more flexible , From the database layer ,mysql The total data are 2 Kind of way ： With three tools xtrabackup（ Physics ）,mysqldump（ Logic ）.xtrabackup Hot backup is available （ Lock only innodb Metadata , Data rows will not be locked ）, according to lsn Number scan up log, And will lsn Record the business after the , stay prepare When an recover（ similar oracle Of recover）,xtrabackup It will be recorded at the time of backup pos And put it on xtrabackup_binlog_pos_innodb, When the master and slave are turned on, they can be turned on GTID You can directly specify auto position, Did not open GTID You need to specify the xtrabackup_binlog_pos_innodb Medium pos value .mysqldump Lock table is needed to ensure data consistency .mysql You can open it directly from the library （ It is recommended to set from the library to readonly state ）.
mysql xtrabackup Build master slave ：
mysql xtrabackup Hot standby and oracle rman The difference between hot standby ：
mongodb It's a document database , Non relational database .mongodb Of Master-Slaver Master slave architecture is no longer recommended ,mongo The mainstream master-slave replica architecture is Replica Set（ Replica set ）.
Replica Set Replica set
mongodb The recommended minimum architecture is 1 Lord 2 from （ If you look at the electoral mechanism in detail, you can see why the smallest is 3 Nodes ）. The master node receives all write operations , The slave node replicates and applies the master node asynchronously oplog.oplog Record all changes on the database .mongodb There is no semi synchronous or protected mode to protect the strong reliability of data , Replication operations are asynchronous .
1 Lord 2 From replica set ：
（secondary There are also heardbeat Of , To determine the availability of the node ）
The election mechanism
heardbeat Will determine whether the node is available , When at least n/2+1 When all nodes detect that the primary node is unavailable , Then these nodes need to elect the highest leader primary, A new replica set service has been formed . Every secondary Will initiate to become primary Vote for , At this time, each affirmative vote may be +1, But each negative vote is a large negative number , such as -10000,（ A replica set can only have at most 50 Nodes and 7 Election nodes ） So as long as there is 1 A node thinks that this node cannot be its master node , Then this node cannot be the master node , For example, when the data update of the selected node is slower than that of other nodes , Then it cannot be elected as the master node , Other nodes will vote against . So in n/2+1 One of the nodes must be up to date , It will get n/2 A yes vote , This node is elected as the master node .
The concept of election node ：arbiter For election nodes , Only in charge of the election , Not responsible for copying storage data and providing external services
1 Lord 1 from 1 Election node replica set ：
1：1 The drawbacks of Architecture ： When building a replica set, use 1 Lord 1 Architecture from , So when 2 When the network between nodes is not available , If it were not for primary Continued provision of services , Do not be secondary Promotion provides services for the main node , So which one to choose ？ stay mongodb In the replica set architecture , When a breakdown occurs , There has to be n/2+1 When nodes are alive , To provide external services , So in 1：1 As long as the network is not available ,2 No node can be a master node . And so on , A replica set of even nodes , When they use 55 It is divided into 2 In a computer room , When 2 Network is not available in the computer room , Then this replica set is not available , Unable to provide external services .
At this point, if we integrate replication into an odd number , And then we can solve this problem , Singular if resources are not enough , There's no way to join secondary What do I do ？ It can be used at this time arbiter Election node . Election nodes don't store data , No, mongo server process . It eliminates the even hidden dangers of the election process . When the election node follows secondary A computer room , When the network is not available ,primary The node becomes the backup node ,secondary and arbier Become a new available replica set . It is recommended to use an odd number of nodes as much as possible , And all of them are data nodes .
ReplicaSet Synchronization of
rs.initiate The operation will scan except local All collections of all databases outside the library , And insert into secondary.secondary Members start replication immediately after initialization oplog, And Application oplog Change operations on .mongo4.4 Start ,sync Support stream replication oplog.
redis yes key-value Memory database .
redis The persistence of
redis Do not have ACID Medium durable persistence .redis It's a memory database , Data is written in memory and not written to disk , And there's no redolog, Once the host crashes and other unexpected situations occur ,redis There is a risk of data loss for . For some applications ,redis The data part allows the loss of some data .redis Don't give priority to redo Pre write and drop disk , That's why redis“ fast ” Important reasons . One of the important modules of memory data is persistence （ That is, the data is put on the disk ）,redis There is also persistence .redis Persistence has 2 Means ：NDB and AOF.
RDB and AOF
redis Yes 2 A persistent way ,rdb Persistence and aof Persistence and RDB+AOF The persistence of
RDB Baseline full persistence , Full persistence keeps baseline data ,rdb File store key Key value , Trigger at a specific point in time ,save/bgsave Manual trigger rdb Persistence .save Will block all operations ,bgsave Will not be , Both will write all the data in memory to RDB The file of .
AOF Command persistence , Incremental persistence preserves change commands , similar binlog.appendfsync It's only enabled when it's turned on aof Persistence , The default is off .appendfsync Yes 3 In this case ,always,everysec,no.
redis4 Start supporting RDB+AOF The persistence of , here aof Just do incremental persistence , Does not store full logs .redis Load first on restart rdb Reload aof
redis Of ACID persistence
If the power goes off , The lost data depends on the persistence strategy .
If it is RDB Persistence , So generally speaking, there must be data loss （ Unless everything follows SAVE or BGSAVE, It's basically impossible ）
If it is AOF Persistence ,appendfsync Not for always, There's bound to be data loss . But if it is always, Everything can be written , You lose some performance .RDB+AOF Persistence is the same thing .
redis Principle of principal and subordinate
1. Sent from the slave server to the master server SYNC command
2. Primary server execution BGSAVE command , Generate RDB file , And use a buffer to record from bgsave All the writing commands at the beginning
3. master server BGSAVE After the execution , take RDB Send to slave , Load... From the server RDB file , Update your status to the master server's BGSAVE State of
4. The master server sends the buffer write command to the slave server , Execute the write command from the server , Update the current state of the master server from the server
Copy buffer and interrupt reconnection ：
1 The copy buffer is the primary server fixed length （ Default 1mb） First in first out queue
2 The master library puts the write operation in the copy buffer , After disconnecting from the library , Send the offset from the slave to the master , Find if there is an offset in the copy buffer , Send it if you have continue Continue to write from this offset , Reinitialize if not .
redis Master slave building
slaveof Command can complete the initialization and command propagation operations , One command is enough .redis The master-slave is also asynchronous .
redis Master slave building ：
Distributed database OCEANBASE
ob The minimum building requirement is 3 Nodes .ob It's a quasi memory database , Distributed database
ob The durability of
ob It's a quasi memory database , When the transaction is completed, it is not written to disk immediately , Write to disk only when a dump occurs . that ob How to keep data persistent ？ stay paxos Agreement , Business can only be done when at least （n/2+1） When nodes complete the transaction , This business is really done . such as 3 Of nodes ob, At the very least 2 Nodes complete the transaction , This business is really finished . Even if a node goes down , Even if there is dirty data not written to disk , Things don't get lost ,ob Have transaction persistence
ob The persistence of —— Dump and merge
Dump is to store memory data on disk in order
Merging is the discrete storage of baseline and memory data on disk
The cost of merging changes , Merging reads the baseline data and merges it with the memory data , Then go back and write to the original location on the disk .
Dumps also read baseline data , Read all tables involved in incremental data and write them to disk sequentially .
Baseline data is stored sequentially on disk , Read the memory and form b+ Trees . Data is not directly modified in memory , Instead, change data is stored in the form of a transaction linked list , It may only change one column , But it doesn't read the entire block into memory , Only when a dump or merge occurs , To write to disk .
ob Not suitable for reading or writing large amounts of data , It is more suitable for high concurrency and small amount of data reading or writing .
mongodb Of ReplicaSet Follow ob Some similar , They all see the replica set as a whole , however mongodb No, paxos The protocol supports the consistency of master replica data .ob Data update does not drop the disk feature with redis And it's similar to , They all gave up like oracle and mysql Such a real-time disk drop operation , To dramatically improve performance , and redis It's not like that ob That way, the data will not be lost from the architecture . and ob In turn, dumping and merging results in ob Not suitable for large amount of data query and update ,ob For operations with high concurrency and results that only affect a small amount of data is very suitable .
Every database has its own characteristics , such as oracle The goal is to integrate all the functions in one , Very powerful , Of course, the cost is also very high .mysql It's also relational data , The structure of cluster index is also suitable for retrieving data , For less data , Low pressure applications are very suitable , Most of them are stored in mysql The application only needs “ Storage ” function ,mysql It's very suitable .redis and mongo More targeted . There is no particularly high requirement for data security , But data responsive scenarios , It is suitable for use redis, Maybe many developers will not use it as a database .mongodb It's a non relational database , Data storage is very flexible ,mongodb There's also the concept of bonds , A document can have multiple keys , It's better to use it than redis It's a lot more complicated .oceanbase In the application of double 11, I believe that no one doubts its strength ,ob It's very suitable for high concurrency and low result set scenarios . Its data mirroring architecture is different from the traditional database , Most nodes write on the premise that their data will not be lost .