Introduction to Elasticsearch basic concepts

1. Index (Index)
In Elasticsearch, an index is a collection of documents that share common characteristics. Every index contains multiple types (type), each type contains multiple documents (document), and each document contains multiple fields (field). An index is made up of multiple JSON documents, and a cluster can hold multiple indexes.
In ELK, when Logstash sends JSON documents to Elasticsearch, they go to the default index pattern “logstash-%{+YYYY.MM.dd}”. The data is therefore split into one index per day, which makes it easy to search and delete indexes when needed. This pattern can be changed in the Logstash output plugin configuration.
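
As a rough illustration of why daily indexes are easy to search and delete, the sketch below builds the per-day index name and selects the indexes older than a cutoff date. The helper names `daily_index_name` and `indexes_older_than` are hypothetical, and the code only manipulates names; it does not talk to a cluster.

```python
from datetime import date


def daily_index_name(day, prefix="logstash"):
    """Build the per-day index name, mirroring the logstash-YYYY.MM.dd pattern."""
    return f"{prefix}-{day:%Y.%m.%d}"


def indexes_older_than(names, cutoff, prefix="logstash"):
    """Return the index names whose date suffix is strictly before the cutoff."""
    limit = daily_index_name(cutoff, prefix)
    # Zero-padded dates sort lexicographically, so a string comparison is enough.
    return [n for n in names if n.startswith(prefix + "-") and n < limit]


if __name__ == "__main__":
    existing = ["logstash-2016.01.03", "logstash-2016.01.05", "logstash-2016.01.07"]
    print(daily_index_name(date(2016, 1, 5)))
    print(indexes_older_than(existing, date(2016, 1, 5)))
```

Because each day lives in its own index, expiring old data amounts to dropping whole indexes rather than deleting individual documents.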

2. Document (Document)
A document is a JSON document stored in an index. Each document has a type and a corresponding ID, which identify it uniquely.
For example:

			{
			  "_index" : "packtpub",
			  "_type" : "elk",
			  "_id" : "1",
			  "_version" : 1,
			  "found" : true,
			  "_source" : {
			    "book_name" : "learning elk"
			  }
			}
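
The index, type, and ID together address one document. In the type-based versions of Elasticsearch described here, that triple also forms the REST path of the document. A minimal sketch, with `document_path` as a hypothetical helper name:

```python
def document_path(doc):
    """Build the /index/type/id path that addresses a single document."""
    return f"/{doc['_index']}/{doc['_type']}/{doc['_id']}"


if __name__ == "__main__":
    doc = {"_index": "packtpub", "_type": "elk", "_id": "1", "_version": 1}
    print(document_path(doc))
```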

3. Field (Field)
A field is the basic unit within a document, expressed as a key-value pair (book_name : "learning elk").

4. Type (Type)
A type provides a logical partition inside an index. It basically represents a class of similar documents. An index can have multiple types, and we can define them according to the context.

5. Mapping (Mapping)
A mapping associates each field of a document with its data type, such as string, integer, float, double, or date. Elasticsearch automatically creates a mapping for the fields during index creation, and these mappings can easily be queried or modified to suit specific requirements.
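
To make the idea of an automatically created mapping concrete, here is a deliberately simplified sketch that guesses a type for each field from its JSON value. It is not the algorithm Elasticsearch uses; the function names, the type labels, and the single `YYYY-MM-DD` date format are all assumptions made for illustration.

```python
from datetime import datetime


def infer_type(value):
    """Guess a field type from a JSON value (simplified, illustrative rules)."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        try:
            datetime.strptime(value, "%Y-%m-%d")
            return "date"
        except ValueError:
            return "string"
    return "object"


def infer_mapping(document):
    """Map every top-level field of a document to its guessed type."""
    return {field: infer_type(value) for field, value in document.items()}


if __name__ == "__main__":
    print(infer_mapping({"book_name": "learning elk", "pages": 206}))
```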

6. Shard (Shard)
A shard is the actual physical entity that stores the data of an index. Each index can have a number of primary and replica shards. Shards are distributed among all the nodes in the cluster and can be moved from one node to another when a node fails or a new node is added.
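
One way to picture how documents end up on shards: Elasticsearch's documentation describes routing as a hash of the routing value (the document ID by default) taken modulo the number of primary shards. The sketch below follows that shape but swaps in `zlib.crc32` as a stand-in hash, so the shard numbers it prints will not match a real cluster; `shard_for` is a hypothetical name.

```python
import zlib


def shard_for(routing_key, number_of_primary_shards):
    """Pick a primary shard number for a routing key (stand-in hash, not ES's)."""
    digest = zlib.crc32(routing_key.encode("utf-8"))
    return digest % number_of_primary_shards


if __name__ == "__main__":
    for doc_id in ["1", "2", "3"]:
        print(doc_id, "->", shard_for(doc_id, 5))
```

The same key always lands on the same shard, which is what lets a later lookup by ID go straight to the right place.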

7. Primary shard (Primary shard) and replica shard (Replica shard)
A replica shard usually resides on a different node from its primary shard, so that requests can still be served after a failover and load can be balanced across the copies.
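
The placement rule above can be sketched with a toy allocator. It assumes one replica per primary and a simple round-robin layout, neither of which reflects Elasticsearch's real allocation logic; `allocate` and `promote_on_failure` are hypothetical helpers.

```python
def allocate(nodes, number_of_shards):
    """Place each primary and its single replica on two different nodes."""
    if len(nodes) < 2:
        raise ValueError("need at least two nodes to separate primary and replica")
    layout = {}
    for shard in range(number_of_shards):
        layout[shard] = {
            "primary": nodes[shard % len(nodes)],
            "replica": nodes[(shard + 1) % len(nodes)],  # always the next node
        }
    return layout


def promote_on_failure(layout, failed_node):
    """After a node fails, the surviving copy of each affected shard serves as primary."""
    result = {}
    for shard, copies in layout.items():
        if copies["primary"] == failed_node:
            result[shard] = {"primary": copies["replica"], "replica": None}
        elif copies["replica"] == failed_node:
            result[shard] = {"primary": copies["primary"], "replica": None}
        else:
            result[shard] = dict(copies)
    return result


if __name__ == "__main__":
    layout = allocate(["node-1", "node-2", "node-3"], 5)
    print(promote_on_failure(layout, "node-1"))
```

Because no shard keeps both copies on one node, losing any single node still leaves every shard with a usable copy.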

8. Cluster (Cluster)
A cluster is a collection of nodes that store the index data. Elasticsearch scales horizontally by spreading data across the nodes of a cluster. Each cluster is identified by a cluster name, and nodes that specify the same cluster name join together. The cluster name is set by the cluster.name property in elasticsearch.yml, and it defaults to “elasticsearch”.
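
As an illustration of nodes joining by cluster name, the sketch below groups a few made-up node configurations by their `cluster.name` value, falling back to the default when the property is absent. The node names and the `group_by_cluster` helper are hypothetical, and real cluster formation also involves network discovery, which is not modelled here.

```python
from collections import defaultdict

DEFAULT_CLUSTER_NAME = "elasticsearch"


def group_by_cluster(node_settings):
    """Group node names by the cluster name each node is configured with."""
    clusters = defaultdict(list)
    for node_name, settings in node_settings.items():
        cluster = settings.get("cluster.name", DEFAULT_CLUSTER_NAME)
        clusters[cluster].append(node_name)
    return dict(clusters)


if __name__ == "__main__":
    print(group_by_cluster({
        "node-1": {"cluster.name": "logs"},
        "node-2": {"cluster.name": "logs"},
        "node-3": {},  # no cluster.name set, so the default applies
    }))
```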

9. Node (Node)
A node is a single running instance of Elasticsearch that belongs to a cluster. By default, every node joins the cluster named “elasticsearch”. Each node has its own elasticsearch.yml, so nodes can have different settings for memory and resource allocation.

Nodes are divided into 3 classes:
Data node (Data Node)
A data node indexes documents and performs searches on the indexed documents. Adding more data nodes is the recommended way to improve performance or scale out the cluster. Setting these properties in elasticsearch.yml makes a node a data node:

node.master: false
node.data: true

Master node (Master Node)
The master node is responsible for managing the cluster. For large clusters, three dedicated master nodes are recommended (one active master and two backups); they act only as master nodes and do not store indexes or perform searches. Declare a node as a dedicated master in elasticsearch.yml:

node.master: true
node.data: false

Routing node, also called load balancer node (Routing Node or Load Balancer Node)
These nodes act as neither master nor data nodes. They only perform load balancing: they route search requests, or route documents to the appropriate nodes for indexing. This is very useful for high-volume search or index operations.

node.master: false
node.data: false
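
The three classes can be summarised as combinations of two boolean settings. The sketch below assumes the legacy `node.master` and `node.data` properties of elasticsearch.yml, both of which default to true in the versions that use them; `node_role` and its labels are hypothetical.

```python
def node_role(settings):
    """Classify a node from its node.master / node.data settings."""
    is_master_eligible = settings.get("node.master", True)
    holds_data = settings.get("node.data", True)
    if is_master_eligible and holds_data:
        return "master-eligible data node"  # the out-of-the-box default
    if is_master_eligible:
        return "dedicated master node"
    if holds_data:
        return "data node"
    return "routing node"


if __name__ == "__main__":
    print(node_role({"node.master": False, "node.data": True}))
    print(node_role({"node.master": True, "node.data": False}))
    print(node_role({"node.master": False, "node.data": False}))
```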