
Reverse proxy and load balancing: how does Nginx do it?

You have probably heard of Nginx. If you haven't, you have surely heard of its "colleague", Apache.

The birth of Nginx

Nginx, like Apache, is a web server. Following the REST architectural style, it uses a Uniform Resource Identifier (URI) or Uniform Resource Locator (URL) as the basis for communication and provides all kinds of network services over HTTP.

However, these servers were constrained by the environment in which they were first designed, such as the user scale, network bandwidth, and product characteristics of the time, and their positioning and development also differ. As a result, each web server has its own distinct characteristics.

Apache has a long history and was for many years the undisputed number one web server in the world. It has many advantages: it is stable, open source, cross-platform, and so on.

But it has been around for a long time. When it rose to prominence, the Internet industry was nowhere near its present scale, so it was designed as a heavyweight server.

It does not cope well with high concurrency. Running tens of thousands of concurrent connections on Apache causes the server to consume a large amount of memory.

The operating system's switching between those processes or threads also consumes a lot of CPU, which lowers the average response speed of HTTP requests.

All of this meant that Apache could not become a high-performance web server, and so the lightweight, high-concurrency server Nginx came into being.

Igor Sysoev, a Russian engineer, developed Nginx in C while working at Rambler Media.

As a web server, Nginx has provided Rambler Media with excellent, stable service. Igor Sysoev later open-sourced the Nginx code and released it under a free software license.

Nginx became popular for the following reasons:

  • Nginx uses an event-driven architecture, which lets it support TCP connections on the order of millions.

  • Its high degree of modularity and its free software license have led to an endless stream of third-party modules (this is the open-source era, after all).

  • Nginx is a cross-platform server that runs on Linux, Windows, FreeBSD, Solaris, AIX, and Mac OS.

  • These design strengths give it excellent stability.
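To make the "event-driven" idea in the first point more concrete, here is a minimal sketch in Python. It is not how Nginx itself is written (Nginx is C code built on OS facilities such as epoll and kqueue); it only illustrates the principle that one thread can serve many connections by reacting to readiness events instead of dedicating a process or thread to each connection. The function names and the uppercase "echo" behavior are made up for the illustration.

```python
import selectors
import socket


def serve_ready_sockets(server_sides, expected_requests):
    """Serve several connections from a single thread, driven by read events."""
    selector = selectors.DefaultSelector()
    for sock in server_sides:
        sock.setblocking(False)
        selector.register(sock, selectors.EVENT_READ)

    handled = 0
    while handled < expected_requests:
        events = selector.select(timeout=1.0)
        if not events:
            break  # nothing became ready in time; stop instead of spinning
        for key, _mask in events:
            data = key.fileobj.recv(1024)
            if data:
                # Hypothetical "work": answer with the uppercased request.
                key.fileobj.sendall(data.upper())
                handled += 1
            else:
                # The peer closed the connection.
                selector.unregister(key.fileobj)
    selector.close()
    return handled


def demo(connection_count=3):
    """Simulate clients with in-process socket pairs (no real network needed)."""
    pairs = [socket.socketpair() for _ in range(connection_count)]
    for index, (client, _server) in enumerate(pairs):
        client.sendall(f"req{index}".encode())

    handled = serve_ready_sockets(
        [server for _client, server in pairs], connection_count
    )
    replies = [client.recv(1024) for client, _server in pairs]

    for client, server in pairs:
        client.close()
        server.close()
    return handled, replies
```

Calling demo() handles all three simulated connections in one loop, without creating a thread per connection.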

Where Nginx comes in handy

Nginx is a free, open-source, high-performance HTTP server and reverse proxy server; it is also an IMAP, POP3, and SMTP proxy server.

As an HTTP server, Nginx can publish and serve websites; as a reverse proxy, it can also perform load balancing.

About proxies

Before talking about proxies, we need to pin down a concept: a proxy is a representative, a channel. Two roles are involved: the represented role and the target role.

The process in which the represented role reaches the target role through the proxy to complete some task is called the proxy process. It is like a brand store in everyday life: a customer buys a pair of shoes at an adidas store. The store is the proxy, the represented role is the adidas manufacturer, and the target role is the customer.

Forward proxy

Before discussing the reverse proxy, let's look at the forward proxy, which is also the proxy mode people encounter most often. We will explain what a forward proxy is from two angles: software and everyday life.

In today's network environment, if we need to visit certain foreign websites for technical reasons, we may find that we cannot reach them directly from a browser.

In that case you might use an approach commonly called FQ. FQ mainly works by finding a proxy server that can reach the foreign website: we send our request to the proxy server, the proxy server visits the foreign website, and then it passes the retrieved data back to us.

This proxy mode is called a forward proxy. Its defining feature is that the client knows exactly which server address it wants to reach, while the server only knows which proxy server the request came from, not which specific client. The forward proxy mode shields or hides the real client's information.

Let's look at a schematic diagram (I have boxed the client and the forward proxy together because they belong to the same environment; more on this later):

The client must be configured to use the forward proxy, which of course requires knowing the forward proxy server's IP address and the proxy's port.

As shown in the figure:

In summary: a forward proxy "proxies the client". It is a server that sits between the client and the origin server. To get content from the origin server, the client sends a request to the proxy and specifies the destination (the origin server).

The proxy then forwards the request to the origin server and returns the content it obtains to the client. The client must make some special settings to use a forward proxy.
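As a rough illustration of those "special settings" on the client side, the sketch below configures a Python HTTP client to send its requests through a forward proxy using the standard library's urllib.request.ProxyHandler. The proxy address 192.0.2.10 and port 3128 are placeholders invented for the example, not values from this article, and the helper names are hypothetical.

```python
import urllib.request


def proxy_settings(proxy_ip, proxy_port):
    """Build the scheme -> proxy URL mapping the client needs to know."""
    proxy_url = f"http://{proxy_ip}:{proxy_port}"
    return {"http": proxy_url, "https": proxy_url}


def build_client(proxy_ip, proxy_port):
    """Return an opener whose requests are routed through the forward proxy."""
    handler = urllib.request.ProxyHandler(proxy_settings(proxy_ip, proxy_port))
    return urllib.request.build_opener(handler)


# Placeholder address and port; a real client would use its own proxy's values.
client = build_client("192.0.2.10", 3128)
# client.open("http://example.com/") would now go to the proxy first,
# and the origin server would see the proxy's address, not the client's.
```

Note that nothing is configured on the origin server: the knowledge of the proxy lives entirely on the client side, which is exactly what distinguishes a forward proxy.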

The uses of a forward proxy:

  • Accessing resources that were previously unreachable, such as Google.

  • Caching, to speed up access to resources.

  • Authorizing client access, that is, authenticating users before they go online.

  • Recording users' access history (online behavior management) and hiding user information from the outside.

Reverse proxy

Now that we understand the forward proxy, let's look at how a reverse proxy works. Take a certain large shopping site in China: the number of visitors connected to it at the same time each day has exploded, and a single server is far from enough to satisfy people's ever-growing desire to shop.

This is where a familiar term appears: distributed deployment, that is, deploying multiple servers to overcome the limit on how many visitors can be served.

Much of that site's functionality is implemented directly with Nginx as a reverse proxy, and after packaging Nginx together with other components it was given an impressive-sounding name: Tengine.

Interested readers can visit the Tengine site for details:

http://tengine.taobao.org/

So how does a reverse proxy operate a distributed cluster? Let's first look at a schematic diagram (I have boxed the server and the reverse proxy together because they belong to the same environment; more on this later):

As the illustration shows, multiple clients send requests to the server; after the Nginx server receives them, it distributes them according to certain rules to the back-end business servers for processing.

Here the source of the request, the client, is clear, but it is not clear which server handles the request. Nginx is playing the role of a reverse proxy.

The client is unaware that a proxy exists. The reverse proxy is transparent to the outside: visitors do not know they are accessing a proxy, because the client needs no configuration to use it.

A reverse proxy "proxies the server". It is mainly used when a server cluster is deployed in a distributed manner, and it hides the servers' information.

The roles of a reverse proxy:

  • Keeping the intranet secure: the reverse proxy usually serves as the public access address, while the web servers stay on the intranet.

  • Load balancing: the reverse proxy server is used to spread the website's load.
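The two roles can be sketched with a toy model in Python. This is only a simulation under stated assumptions, not Nginx code: the class name ReverseProxy, the public address www.example.com, and the intranet addresses are all invented for the illustration, and back-end servers are modeled as plain functions.

```python
import itertools


class ReverseProxy:
    """Toy reverse proxy: one public address in front of hidden back ends."""

    def __init__(self, public_address, backends):
        # backends maps an intranet address to a function that handles a path.
        self.public_address = public_address
        self._rotation = itertools.cycle(backends.items())
        self.internal_log = []  # visible to the operator, never to clients

    def handle(self, path):
        address, handler = next(self._rotation)
        self.internal_log.append(address)
        # The reply names only the public address; the back end stays hidden.
        return {"served_by": self.public_address, "body": handler(path)}


proxy = ReverseProxy(
    "www.example.com",
    {
        "10.0.0.1": lambda path: f"content of {path}",
        "10.0.0.2": lambda path: f"content of {path}",
    },
)
replies = [proxy.handle("/index.html") for _ in range(3)]
```

Every reply carries the same public address and the same body, while the internal log shows that the requests were actually spread over the two intranet servers in turn.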

Project scenario

In practice, when we work on real projects, forward and reverse proxies often exist in the same scenario: a forward proxy relays the client's requests to the target server, the target server is itself a reverse proxy server, and behind that reverse proxy sit many real business servers.

The specific topology is as follows :

The following figure illustrates the difference between a forward proxy and a reverse proxy:

Explanation of the figure:

  • In a forward proxy, the Proxy and the Client belong to the same LAN (the box in the figure), and the client's information is hidden.

  • In a reverse proxy, the Proxy and the Server belong to the same LAN (the box in the figure), and the server's information is hidden.

In fact, what the Proxy does in both cases is send and receive requests and responses on behalf of another party. Structurally, though, the left and right sides are simply swapped, which is why the second mode is called a reverse proxy.

Load balancing

We have now defined what a proxy server is. So when Nginx acts as a reverse proxy server, what rules does it follow to distribute requests? And can the distribution rules be controlled for different project scenarios?

The number of requests sent by clients and received by the Nginx reverse proxy is what we call the load.

The rule by which those requests are distributed to different servers for processing is a balancing rule.

So the process of distributing the requests received by the server according to such rules is called load balancing.

In real projects, load balancing comes in two forms: hardware load balancing and software load balancing. Hardware load balancing, also called hard load balancing, uses dedicated devices such as F5 load balancers and is relatively expensive.

In return, data stability and security are well guaranteed, which is why companies such as China Mobile and China Unicom choose hardware load balancing.

For cost reasons, more companies choose software load balancing, which is a message-queue distribution mechanism implemented with existing technology on ordinary host hardware.

Nginx supports the following load-balancing scheduling algorithms:

① Weighted round robin (default): incoming requests are assigned to the back-end servers one by one in turn. If one of the back-end servers goes down while in use, Nginx automatically removes it from the queue, and the handling of requests is not affected.

In this mode you can set a weight value (weight) for each back-end server to adjust the share of requests each server receives.

The larger the weight, the more likely a server is to be assigned a request. Weights are mainly tuned to match the differing hardware configurations of the back-end servers in the actual working environment.
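The weighted round-robin idea can be sketched in Python as follows. This is a simplified illustration of the "smooth" weighted round-robin technique, not Nginx's source code; the server names a, b, c and the weights 5, 1, 1 are invented for the example, and a down server is modeled simply as a name in a set.

```python
def smooth_weighted_round_robin(weights, count, down=frozenset()):
    """Return `count` picks from `weights` (server name -> positive weight).

    Servers listed in `down` are skipped, mimicking removal from the queue.
    """
    live = {name: weight for name, weight in weights.items() if name not in down}
    total = sum(live.values())
    current = {name: 0 for name in live}
    picks = []
    for _ in range(count):
        # Every live server earns credit proportional to its weight.
        for name, weight in live.items():
            current[name] += weight
        # The server with the most credit handles the request
        # (ties go to the server listed first) ...
        chosen = max(current, key=current.get)
        # ... and pays back the total, so heavier servers are picked more
        # often but not in one long uninterrupted run.
        current[chosen] -= total
        picks.append(chosen)
    return picks


print(smooth_weighted_round_robin({"a": 5, "b": 1, "c": 1}, 7))
# → ['a', 'a', 'b', 'a', 'c', 'a', 'a']
```

Over seven requests, server a receives five of them and b and c one each, which matches the 5:1:1 weights; with a marked as down, the remaining requests simply alternate between b and c.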

② ip_hash: each request is matched by the hash of the client's IP address. Under this algorithm, a client with a fixed IP address always reaches the same back-end server, which to some extent also solves the session-sharing problem in cluster deployments.
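A minimal sketch of the idea in Python: hash the client address and use the result to index into the server list. CRC32 is used here only as a convenient stand-in; it is an assumption of this example, not the hash function Nginx uses, and the server names and client addresses are made up.

```python
import zlib


def pick_by_ip(client_ip, servers):
    """Map a client IP to a fixed back-end server via a hash of the address."""
    digest = zlib.crc32(client_ip.encode("utf-8"))
    return servers[digest % len(servers)]


servers = ["app-1", "app-2", "app-3"]  # hypothetical back ends
first_visit = pick_by_ip("203.0.113.7", servers)
second_visit = pick_by_ip("203.0.113.7", servers)
```

Because the mapping depends only on the address, first_visit and second_visit are always the same server, which is why a session stored on that server keeps working. The flip side is that changing the number of servers changes the mapping for many clients.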

③ fair: an intelligent scheduling algorithm that balances allocation dynamically according to the time each back-end server takes from handling a request to responding.

Servers with short response times and high processing efficiency are more likely to be assigned requests, while servers with long response times and low efficiency are assigned fewer. It is a scheduling algorithm that combines the advantages of the first two.

Note, however, that Nginx does not support the fair algorithm by default. To use it, install the upstream_fair module.
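The principle behind fair scheduling can be sketched like this in Python. It is only an illustration of "prefer the server that has been responding fastest" and makes no claim about how the upstream_fair module is actually implemented; the class name, the moving-average formula, and the timings are all assumptions of this example.

```python
class FairBalancer:
    """Toy balancer that prefers the back end with the lowest average latency."""

    def __init__(self, servers):
        # None means "no measurement yet"; such servers are tried first.
        self.average_ms = {server: None for server in servers}

    def pick(self):
        return min(self.average_ms, key=lambda s: self.average_ms[s] or 0.0)

    def record(self, server, elapsed_ms, alpha=0.5):
        """Fold a new response time into the server's moving average."""
        previous = self.average_ms[server]
        if previous is None:
            self.average_ms[server] = float(elapsed_ms)
        else:
            self.average_ms[server] = (1 - alpha) * previous + alpha * elapsed_ms


balancer = FairBalancer(["fast", "slow"])
balancer.record("fast", 10)
balancer.record("slow", 200)
choice_before = balancer.pick()   # "fast" has the lower average

balancer.record("fast", 500)      # "fast" suddenly becomes sluggish
choice_after = balancer.pick()    # its average (255 ms) now exceeds 200 ms
```

The point of the moving average is that the preference adapts: a server that slows down gradually loses traffic, and regains it once its response times recover.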

④ url_hash: requests are allocated by the hash of the requested URL, so each URL always points to the same back-end server. This can improve cache efficiency when Nginx is used as a static server.

Note too that Nginx did not support this scheduling algorithm by default, and using it required installing Nginx's hash package. Newer versions (from 1.7.2 on) include a built-in hash directive in the upstream module.
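The caching benefit is easy to see in a small Python sketch: if the server is chosen from the hash of the URL, every URL ends up cached on exactly one back end, however often it is requested. As in the ip_hash sketch, CRC32 is a stand-in chosen for the example rather than the hash Nginx uses, and the server names and URLs are hypothetical.

```python
import zlib


def pick_by_url(url, servers):
    """Map a URL to a fixed back-end server via a hash of the URL."""
    return servers[zlib.crc32(url.encode("utf-8")) % len(servers)]


def route_all(urls, servers):
    """Record which URLs each server ends up handling (and hence caching)."""
    placement = {server: set() for server in servers}
    for url in urls:
        placement[pick_by_url(url, servers)].add(url)
    return placement


cache_servers = ["cache-1", "cache-2", "cache-3"]
requests = ["/a.css", "/b.js", "/a.css", "/logo.png", "/b.js"]
placement = route_all(requests, cache_servers)
```

Although five requests were made, only three distinct URLs exist, and each of them appears in the set of exactly one server, so no file is cached twice across the cluster.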

Web server comparison

A comparison of several commonly used web servers is as follows:


Copyright notice
This article was written by [Programmer Bai Nannan]. Please include a link to the original when reposting. Thank you.
https://cdmana.com/2020/12/20201224225605031C.html
