## Data structure and algorithm

### BP Neural network algorithm

#### 1. Algorithm principle

##### 1.1 summary

Artificial neural network There is no need to determine in advance between inputs and outputs The mapping relationship Mathematical equation of , Only through their own training , Learn some rules , When the input value is given, the result closest to the expected output value is obtained . As an intelligent information processing system , The core of artificial neural network to realize its function is algorithm .BP Neural network is a kind of neural network Error back propagation ( It's called error back propagation ) Trained multilayer feedforward networks , The algorithm is called BP Algorithm , The basic idea is this Gradient descent method , Using gradient search technology , In order to minimize the error mean square deviation between the actual output value and the expected output value of the network .
BP The calculation process of neural network from Forward calculation process and Reverse calculation process form . Positive propagation process , The input mode is processed layer by layer from the input layer through the hidden unit layer , And turn to the output layer , The state of each layer of neurons only affects the state of the next layer of neurons . If the desired output cannot be obtained at the output layer , Then turn to back propagation , Return the error signal along the original connection path , By modifying the weights of each neuron , Minimize the error signal .

##### 1.2 Algorithm analysis

Multilayer neural network structure Usually a multilayer neural network consists of $L$​​ Layer neurons , The first floor is called == Input layer ==, The last layer is called == Output layer ==, The middle layer is == Hidden layer ==.

The basic element of multilayer neural network is neuron , The model of a single neuron is as follows ： Input layer input vector :$X=(x_1,x_2,...,x_i,...,x_m);$​

The first $$l$$​​​​​ Hidden layer vector of layer ：$$H^l=(h_1^l,h_2^l,...,h_j^l,...,h_{s_l}^l) (l=2,3,...,L-1,j=1,2,...,s_l);$$​​​​

Output layer output vector ：$$Y=(y_1,y_2,...,y_k,...,y_n);$$

set up $$w_{ij}^l$$ For from $$l-1$$ Layer of the first $$i$$ The first neuron is connected with the second neuron $$l$$ Layer of the first $$j$$​ The connection weight between neurons ,$$b_j^l$$ For the first time $$l$$ Layer $$j$$​ The bias of neurons .

So get ：

$h_j^l=f(net_j^l) \\ net_j^l=\sum_{j=1}^{s_{l-1}}{w_{ij}^l+b_j^l}$

among $$net_j^l$$ For the first time $$l$$ Layer $$j$$ The input of neurons ,$$f(\cdot)$$​ Is the activation function .

Activation function

effect ： Introduce nonlinear factors , The model can better approximate the nonlinear function .

BP The activation function commonly used in neural network algorithms ：

• Sigmod function ：

$f(x)=\frac{1}{1+e^x}$ - Tanh function （ Hyperbolic tangent function ） $$f(x)=\frac{e^x-e^{-x}}{e^x+e^{-x}}$$ ** bias **：

effect ： It can be understood as adding a and input $$X$$ Independent constant term , Make the effect of approximation better . If you use $y=x$​​​ To approach , The effect is not satisfactory , Instead, if you add a constant term , bring $y=x+2$​, The effect will be much better . ** error function **:

effect ： Measure the difference between the output and the expected output

Suppose there is $$p$$​ Training samples $$\{(x(1),y(1)),(x(2),y(2)),...,(x(p),y(p))\}$$​,$$d(i)$$​ Corresponding $$x(i)$$​ Expected output , Suppose a single training sample has $$n$$​ Outputs . Define the error function ：

$E=\frac{1}{p}\sum_{i=1}^p{E(i)}$

among $$E(i)$$ Is the training error of a single sample ：

$E(i)=\frac{1}{2}\sum_{k=1}^n(d_k(i)-y_k(i))^2$

So the global error function ：

$E=\frac{1}{2p}\sum_{i=1}^p\sum_{k=1}^n{(d_k(i)-y_k(i))^2}$

How to update weights and offsets

Error back propagation updates weights and offsets

The gradient descent method is generally used to update the weight and offset ：

$w_{ij}^l=w_{ij}^l-\alpha \frac{\partial E}{\partial w_{ij}^l} \\ b_{j}^l=b_j^l-\alpha \frac{\partial E}{\partial b_j^l}$

among $\alpha $$Is the learning rate ,$$\alpha\in(0,1)$$.BP The key of neural network algorithm is how to solve the above two partial derivatives , The specific derivation is complicated , There is no narration here , Relevant references will be attached at the end of the text$$^{}$.

##### 1.3 review

Finally, let's go through a schematic diagram , review BP The whole process of neural network algorithm . It is mainly used in the following four aspects ：

• Function approximation
• pattern recognition
• classification
• data compression

Inferiority

• Slow learning speed , It takes a lot of learning to converge
• Gradient descent method , Easy to fall into local minimum
• Network layers 、 There is no theoretical guidance for the selection of the number of neurons , Mainly based on experience
• Limited network promotion ability

#### 2. Matlab Realization

##### 2.1 Algorithm implementation steps

(1) Data preprocessing

(2) establish BP Neural network model

(3) Use samples for training

##### 2.2 Case study

​ In establishment BP Neural network model and training （ That is, update the weight and offset ）Matlab It has its own functions , In the realization of BP When neural network algorithm , We can call these functions directly .

​ In order to understand the implementation process of the algorithm more clearly , Here, select the relatively simple data for demonstration .

Case a ： Curve fitting

subject ： establish BP neural network

Input vector $$P=[0,1,2,3,4,5,6,7,8,9,10];$$

Expected output $$T=[0,1,2,3,4,3,2,1,2,3,4];$$

The scatter diagram is as follows ： The trial BP The above figure is fitted by neural network algorithm , And the fitting effect is displayed in a drawing .

Matlab Code

close all; clearvars; clear; % Empty the work environment
P = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
T = [0, 1, 2, 3, 4, 3, 2, 1, 2, 3, 4];
% because feedforwardnet The function automatically normalizes and divides the samples 、 verification 、 Test set , Therefore, there is no need to normalize the data manually , But I don't know if it's out of order
net = feedforwardnet(5, 'traingd'); % yes '5' It means that the hidden layer has 5 Neurons , There is only one hidden layer , The number of neurons in multiple hidden layers is set to [5,3,...]
net.trainParam.lr = 0.01; % Learning rate
net.trainParam.epochs = 10000; % Maximum training times
net.trainParam.goal = 1e-6; % Minimum error , Achieve this accuracy , Stop training
net.trainParam.show = 50; % Every time 50 Show training results for the first time
net = train(net, P, T); % Training
Y = net(P); % Output
perf = perform(net, Y, T);% error
plot(P, T, P, Y, 'r-')



Some pictures with good results    Because there are too few training samples , So the result is not very satisfactory .

Case 2 ： Classification of midges

subject ： Based on the length of antennae and wings , It has been measured 9 the Af and 6 the Apf The data are as follows ：
Af: (1.24,1.72),(1.36,1.74),(1.38,1.64),(1.38,1.82),(1.38,1.90),(1.40,1.70),
(1.48,1.82),(1.54,1.82),(1.56,2.08).
Apf: (1.14,1.78),(1.18,1.96),(1.20,1.86),(1.26,2.00),(1.28,2.00),(1.30,1.96).

The tentacle and wing length of the test pair are (1.24,1.80),(1.28,1.84) And (1.40,2.04) Of 3 Three specimens were identified .

Matlab Code

clearvars; close all; % Empty the work environment
% Import data , The first column is the antenna length , The second column is the wing length
x_1 = [1.24, 1.72; 1.36, 1.74; 1.38, 1.64; 1.38, 1.82;
1.38, 1.90; 1.40, 1.70; 1.48, 1.82; 1.54, 1.82; 1.56, 2.08]; %Af a kind of midge
x_2 = [1.14, 1.78; 1.18, 1.96; 1.20, 1.86; 1.26, 2.00; 1.28, 2.00;
1.30, 1.96]; %Apf a kind of midge
x = [x_1; x_2]'; % Merge transpose , because feedforwardnet Function takes a column as a single sample

goal = [ones(1, 9), zeros(1, 6); zeros(1, 9), ones(1, 6)]; %(1,0) Expressed as Af a kind of midge ,(0,1) Express Apf a kind of midge
x_recognize = [1.24, 1.80; 1.28, 1.84; 1.40, 2.04]'; % Identified samples

plot(x_1(:, 1), x_1(:, 2), 'ro', 'DisplayName', 'Af'); % draw Af The scatter diagram of
hold on;
plot(x_2(:, 1), x_2(:, 2), 'bo', 'DisplayName', 'Apf'); % draw Apf The scatter diagram of
plot(x_recognize(1, :), x_recognize(2, :), 'yo', 'DisplayName', ' distinguish ' ); % Draw the scatter diagram of the identification sample
xlabel(' Antenna length ');
ylabel(' Wing length ');
legend;

net = feedforwardnet([3, 2], 'trainlm'); % Two hidden layers , The number of corresponding neurons is 3 and 2, use L-M optimization algorithm , It works better
net.trainParam.max_fail = 1000;
net.trainParam.lr = 0.05; % Learning rate
net.trainParam.epochs = 10000; % Maximum training times
net.trainParam.goal = 1e-15; % Minimum error , Achieve this accuracy , Stop training
net.trainParam.show = 50; % Every time 50 Show training results for the first time
net = train(net, x, goal); % Training
y0 = sim(net, x) % Output
perf = perform(net, goal, y0)% error
ym = sim(net, x_recognize) % distinguish



The following figure is a scatter diagram of midges , It can be seen that these three samples are still difficult to classify , It's almost hard for the naked eye to judge . utilize BP The results obtained by neural network algorithm sometimes have great differences , It's also normal , It's really not easy to distinguish only by the length of antennae and wings . This is an output when the training error is relatively low , Displays the first of the identification samples 、 The second is Af Types of midges , The third is Apf Types of midges . #### 3. Reference source

https://cdmana.com/2021/08/20210804224623220T.html

Scroll to Top