## Data structure and algorithm

### BP Neural network algorithm

#### 1. Algorithm principle

##### 1.1 Summary

**Artificial neural networks** do not require the mapping between inputs and outputs to be specified in advance as a mathematical equation; through training they learn rules on their own, so that when an input is given they produce a result as close as possible to the expected output. As an intelligent information-processing system, the core of a neural network is its algorithm. A BP neural network is a multilayer feedforward network trained by error back-propagation (hence the name), and its training algorithm is called the BP algorithm. The basic idea is gradient descent: gradient search is used to minimize the mean squared error between the network's actual output and the expected output.

**The computation of a BP neural network** consists of a forward pass and a backward pass. In the forward pass, the input pattern is processed layer by layer from the input layer through the hidden layers to the output layer; the state of the neurons in each layer affects only the state of the next layer. If the desired output is not obtained at the output layer, the algorithm switches to back-propagation: the error signal is propagated back along the original connections, and the weights of the neurons are modified so as to minimize the error signal.

##### 1.2 Algorithm analysis

**Structure of a multilayer neural network**:

The basic element of a multilayer neural network is the neuron; the model of a single neuron is shown below.

Hidden-layer vector of layer \(l\): \(H^l=(h_1^l,h_2^l,\ldots,h_j^l,\ldots,h_{s_l}^l)\) \((l=2,3,\ldots,L-1,\ j=1,2,\ldots,s_l)\), where layer \(1\) is the input layer and layer \(L\) is the output layer;

Output-layer output vector: \(Y=(y_1,y_2,\ldots,y_k,\ldots,y_n)\);

Let \(w_{ij}^l\) be the connection weight from the \(i\)-th neuron in layer \(l-1\) to the \(j\)-th neuron in layer \(l\), and \(b_j^l\) the bias of the \(j\)-th neuron in layer \(l\).

Then:

\[net_j^l=\sum_{i=1}^{s_{l-1}} w_{ij}^l\,h_i^{l-1}+b_j^l,\qquad h_j^l=f(net_j^l),\]

where \(net_j^l\) is the input of the \(j\)-th neuron in layer \(l\) and \(f(\cdot)\) is the activation function.
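As an illustration, the layer-by-layer forward computation can be sketched in a few lines of Python (`layer_forward` is a hypothetical helper written for this sketch, and all weights and inputs below are made-up numbers):

```python
import math

def sigmoid(x):
    # a common activation function: f(net) = 1 / (1 + e^(-net))
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(h_prev, w, b, f=sigmoid):
    """Forward computation of one layer.

    h_prev : outputs h_i^{l-1} of the previous layer
    w      : w[i][j] is the weight from neuron i (layer l-1) to neuron j (layer l)
    b      : biases b_j^l of the current layer
    Returns (net, h) with net_j = sum_i w[i][j] * h_prev[i] + b[j] and h_j = f(net_j).
    """
    n_out = len(b)
    net = [sum(w[i][j] * h_prev[i] for i in range(len(h_prev))) + b[j]
           for j in range(n_out)]
    return net, [f(z) for z in net]

# tiny example with made-up numbers: 2 inputs feeding 3 neurons
net, h = layer_forward([1.0, 0.5],
                       [[0.1, 0.2, 0.3], [0.4, -0.1, 0.0]],
                       [0.0, 0.1, -0.2])
```

Chaining `layer_forward` once per layer, feeding each layer's `h` into the next, gives the full forward pass.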

**Activation function**:

Purpose: introduces nonlinearity, so that the model can better approximate nonlinear functions.

An activation function commonly used in BP neural networks:

- Sigmoid function:
\[f(x)=\frac{1}{1+e^{-x}} \]
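As a quick numerical sanity check (a pure-Python sketch), the sigmoid maps any real input into \((0,1)\), and its derivative satisfies \(f'(x)=f(x)(1-f(x))\), the identity used when propagating errors backwards:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# the sigmoid squashes any real input into (0, 1), with f(0) = 0.5
print(sigmoid(0.0))   # 0.5

# numeric check of f'(x) = f(x) * (1 - f(x)) via a central difference
x, eps = 0.7, 1e-6
numeric = (sigmoid(x + eps) - sigmoid(x - eps)) / (2 * eps)
analytic = sigmoid(x) * (1 - sigmoid(x))
```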

**Bias**: it can be understood as adding a constant term independent of the input \(X\), which makes the approximation better.

**Error function**: measures the difference between the actual output and the expected output.

Suppose there are \(p\) training samples \(\{(x(1),y(1)),(x(2),y(2)),\ldots,(x(p),y(p))\}\), where \(d(i)\) is the expected output corresponding to \(x(i)\), and each training sample has \(n\) outputs. Define the error function:

\[E=\frac{1}{p}\sum_{i=1}^{p}E(i),\]

where \(E(i)\) is the training error of a single sample:

\[E(i)=\frac{1}{2}\sum_{k=1}^{n}\big(d_k(i)-y_k(i)\big)^2.\]

So the global error function is:

\[E=\frac{1}{2p}\sum_{i=1}^{p}\sum_{k=1}^{n}\big(d_k(i)-y_k(i)\big)^2.\]
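In code, the per-sample and global errors amount to a couple of one-liners (a pure-Python sketch; the sample values below are made up):

```python
def sample_error(d, y):
    # E(i) = 1/2 * sum_k (d_k - y_k)^2 for one sample
    return 0.5 * sum((dk - yk) ** 2 for dk, yk in zip(d, y))

def global_error(D, Y):
    # E = (1/p) * sum_i E(i) averaged over the p samples
    return sum(sample_error(d, y) for d, y in zip(D, Y)) / len(D)

# two made-up samples with two outputs each
D = [[1.0, 0.0], [0.0, 1.0]]   # expected outputs d(i)
Y = [[0.8, 0.1], [0.2, 0.7]]   # actual outputs y(i)
E = global_error(D, Y)
```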

**Updating the weights and biases**:

Error back-propagation is used to update the weights and biases.

Gradient descent is generally used for the update:

\[w_{ij}^l \leftarrow w_{ij}^l-\alpha\frac{\partial E}{\partial w_{ij}^l},\qquad b_j^l \leftarrow b_j^l-\alpha\frac{\partial E}{\partial b_j^l},\]

where \(\alpha\) is the learning rate, \(\alpha\in(0,1)\). The key to the BP algorithm is computing these two partial derivatives; the derivation is somewhat involved and is not reproduced here, see the reference at the end of the text\(^{[2]}\).
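The update rule can be illustrated on the smallest possible case: a single sigmoid neuron \(y=f(wx+b)\) trained on one sample with \(E=\frac{1}{2}(d-y)^2\). This is a pure-Python sketch with made-up numbers, not the general multilayer derivation from the reference:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gradient_step(w, b, x, d, alpha):
    """One gradient-descent update for a single sigmoid neuron y = f(w*x + b)."""
    y = sigmoid(w * x + b)
    # dE/dnet = -(d - y) * f'(net), with f'(net) = y * (1 - y) for the sigmoid
    delta = -(d - y) * y * (1.0 - y)
    w -= alpha * delta * x   # w <- w - alpha * dE/dw
    b -= alpha * delta       # b <- b - alpha * dE/db
    return w, b

# repeated updates drive the output toward the target d (made-up numbers)
w, b, x, d = 0.5, 0.0, 1.0, 0.9
e0 = 0.5 * (d - sigmoid(w * x + b)) ** 2
for _ in range(200):
    w, b = gradient_step(w, b, x, d, alpha=0.5)
e1 = 0.5 * (d - sigmoid(w * x + b)) ** 2   # error after training
```

Back-propagation generalizes this delta rule layer by layer: each layer's deltas are obtained from the next layer's deltas via the chain rule.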

##### 1.3 Review

Finally, let us review the whole process of the BP neural network algorithm with a schematic diagram.

**Advantages and applications**:

BP neural networks are mainly applied in the following four areas:

- Function approximation
- Pattern recognition
- Classification
- Data compression

**Disadvantages**:

- Learning is slow; many iterations are needed to converge
- Gradient descent easily gets trapped in local minima
- There is no theoretical guidance for choosing the number of layers or neurons; the choice is mainly empirical
- The network's ability to generalize is limited

#### 2. Matlab implementation

##### 2.1 Algorithm implementation steps

(1) Preprocess the data

(2) Build the BP neural network model

(3) Train the model with the samples

(4) Return the model when training ends

##### 2.2 Case studies

Matlab provides built-in functions for building a BP neural network model and for training it (that is, updating the weights and biases), so when implementing the BP algorithm we can call these functions directly.

To make the implementation easier to follow, relatively simple data are used for the demonstrations below.

**Case 1**: Curve fitting

Task: build a BP neural network with

Input vector \(P=[0,1,2,3,4,5,6,7,8,9,10]\);

Expected output \(T=[0,1,2,3,4,3,2,1,2,3,4]\);

The scatter plot of the data is as follows:

**Matlab Code **：

```
close all; clearvars; % clear the workspace
P = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
T = [0, 1, 2, 3, 4, 3, 2, 1, 2, 3, 4];
% feedforwardnet normalizes the data and splits it into training, validation
% and test sets automatically, so no manual normalization is needed
% (whether the split shuffles the samples is not specified here).
net = feedforwardnet(5, 'traingd'); % one hidden layer with 5 neurons;
                                    % for several hidden layers use e.g. [5, 3, ...]
net.trainParam.lr = 0.01;      % learning rate
net.trainParam.epochs = 10000; % maximum number of training epochs
net.trainParam.goal = 1e-6;    % error goal; training stops once it is reached
net.trainParam.show = 50;      % show progress every 50 iterations
net = train(net, P, T);        % train
Y = net(P);                    % network output
perf = perform(net, T, Y);     % error (targets first, then outputs)
plot(P, T, 'o', P, Y, 'r-')    % data points and network fit
```
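For comparison, the same fit can be sketched from scratch in pure Python: one hidden layer of 5 sigmoid neurons, a linear output neuron, and plain per-sample gradient descent. The normalization, learning rate, and epoch count are made-up choices, and a run may still underfit, as the text notes:

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

P = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
T = [0, 1, 2, 3, 4, 3, 2, 1, 2, 3, 4]
xs = [p / 10.0 for p in P]   # normalize inputs to [0, 1]
ts = [t / 4.0 for t in T]    # normalize targets to [0, 1]

random.seed(0)
H = 5                        # hidden neurons, as in the Matlab example
w1 = [random.uniform(-1, 1) for _ in range(H)]  # input -> hidden weights
b1 = [0.0] * H                                  # hidden biases
w2 = [random.uniform(-1, 1) for _ in range(H)]  # hidden -> output weights
b2 = 0.0                                        # output bias

def forward(x):
    h = [sigmoid(w1[j] * x + b1[j]) for j in range(H)]
    return h, sum(w2[j] * h[j] for j in range(H)) + b2

def mse():
    return sum((t - forward(x)[1]) ** 2 for x, t in zip(xs, ts)) / len(xs)

e0 = mse()
alpha = 0.1
for _ in range(2000):                # epochs
    for x, t in zip(xs, ts):         # one gradient step per sample
        h, y = forward(x)
        dy = -(t - y)                # dE/dy for E = 1/2 (t - y)^2
        for j in range(H):
            dh = dy * w2[j] * h[j] * (1 - h[j])  # delta for hidden unit j
            w2[j] -= alpha * dy * h[j]
            w1[j] -= alpha * dh * x
            b1[j] -= alpha * dh
        b2 -= alpha * dy
e1 = mse()                           # training error after the run
```

This mirrors what `train` does only in spirit; the toolbox adds more machinery (batch algorithms, automatic input/output normalization, and early stopping on a validation split).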

**Plots from a run with good results**:

Because there are so few training samples, the result is not fully satisfactory.

**Case 2**: Classifying midges

Task: two species of midge, Af and Apf, are to be distinguished by antenna length and wing length. Measurements for 9 Af and 6 Apf midges are:

Af: (1.24,1.72), (1.36,1.74), (1.38,1.64), (1.38,1.82), (1.38,1.90), (1.40,1.70), (1.48,1.82), (1.54,1.82), (1.56,2.08).

Apf: (1.14,1.78), (1.18,1.96), (1.20,1.86), (1.26,2.00), (1.28,2.00), (1.30,1.96).

Using antenna and wing length, classify the 3 specimens (1.24,1.80), (1.28,1.84), and (1.40,2.04).

**Matlab Code **：

```
clearvars; close all; % clear the workspace
% Data: the first column is antenna length, the second column is wing length
x_1 = [1.24, 1.72; 1.36, 1.74; 1.38, 1.64; 1.38, 1.82;
       1.38, 1.90; 1.40, 1.70; 1.48, 1.82; 1.54, 1.82; 1.56, 2.08]; % Af midges
x_2 = [1.14, 1.78; 1.18, 1.96; 1.20, 1.86; 1.26, 2.00; 1.28, 2.00;
       1.30, 1.96]; % Apf midges
x = [x_1; x_2]'; % merge and transpose: feedforwardnet expects one sample per column
goal = [ones(1, 9), zeros(1, 6); zeros(1, 9), ones(1, 6)]; % (1,0) = Af, (0,1) = Apf
x_recognize = [1.24, 1.80; 1.28, 1.84; 1.40, 2.04]'; % specimens to classify
plot(x_1(:, 1), x_1(:, 2), 'ro', 'DisplayName', 'Af');  % scatter plot of Af
hold on;
plot(x_2(:, 1), x_2(:, 2), 'bo', 'DisplayName', 'Apf'); % scatter plot of Apf
plot(x_recognize(1, :), x_recognize(2, :), 'yo', 'DisplayName', 'to classify');
xlabel('Antenna length');
ylabel('Wing length');
legend;
net = feedforwardnet([3, 2], 'trainlm'); % two hidden layers with 3 and 2 neurons;
                                         % Levenberg-Marquardt optimization works well here
net.trainParam.max_fail = 1000; % maximum validation failures
net.trainParam.lr = 0.05;       % learning rate
net.trainParam.epochs = 10000;  % maximum number of training epochs
net.trainParam.goal = 1e-15;    % error goal; training stops once it is reached
net.trainParam.show = 50;       % show progress every 50 iterations
net = train(net, x, goal);      % train
y0 = sim(net, x)                % output on the training samples
perf = perform(net, goal, y0)   % error
ym = sim(net, x_recognize)      % classify the three specimens
```

The figure below shows the scatter plot of the midges. The three specimens are genuinely hard to classify; it is almost impossible to separate them by eye. The results obtained with the BP algorithm can differ considerably from run to run, which is normal here: antenna length and wing length alone are simply not very discriminating features.

[1] BP neural network, Baidu Encyclopedia (baidu.com)

[2] Detailed derivation of the BP neural network, Alex Yu, Blog Garden (cnblogs.com)