BP Neural Network Algorithm

1. Algorithm principle

1.1 Overview

An artificial neural network does not require a mathematical equation describing the input-output mapping to be specified in advance; through training it learns the underlying rules on its own, so that for a given input it produces the result closest to the expected output. As an intelligent information-processing system, the core of an artificial neural network is its algorithm. A BP neural network is a multilayer feedforward network trained by error back propagation (hence the name), and the training algorithm is called the BP algorithm. Its basic idea is gradient descent: gradient search is used to minimize the mean squared error between the actual outputs of the network and the expected outputs.
The computation of a BP neural network consists of a forward pass and a backward pass. In the forward pass, the input pattern is processed layer by layer, from the input layer through the hidden layers to the output layer; the state of each layer of neurons affects only the state of the next layer. If the desired output is not obtained at the output layer, the algorithm switches to back propagation: the error signal is sent back along the original connection paths, and the weights of the neurons are modified so as to minimize it.

1.2 Algorithm analysis

Multilayer neural network structure

A multilayer neural network usually consists of \(L\) layers of neurons. The first layer is called the **input layer**, the last layer the **output layer**, and the layers in between are **hidden layers**.

The basic building block of a multilayer neural network is the neuron; the model of a single neuron is as follows:

[Figure: model of a single neuron]
Input vector of the input layer: \(X=(x_1,x_2,\dots,x_i,\dots,x_m)\);

Hidden-layer vector of the \(l\)-th layer: \(H^l=(h_1^l,h_2^l,\dots,h_j^l,\dots,h_{s_l}^l)\) \((l=2,3,\dots,L-1;\ j=1,2,\dots,s_l)\);

Output vector of the output layer: \(Y=(y_1,y_2,\dots,y_k,\dots,y_n)\);

Let \(w_{ij}^l\) be the connection weight from the \(i\)-th neuron in layer \(l-1\) to the \(j\)-th neuron in layer \(l\), and let \(b_j^l\) be the bias of the \(j\)-th neuron in layer \(l\).

We then have:

\[h_j^l=f(net_j^l), \qquad net_j^l=\sum_{i=1}^{s_{l-1}}{w_{ij}^l\,h_i^{l-1}+b_j^l} \]

where \(net_j^l\) is the net input of the \(j\)-th neuron in layer \(l\), \(h_i^{l-1}\) is the output of the \(i\)-th neuron in layer \(l-1\), and \(f(\cdot)\) is the activation function.
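As a concrete illustration, here is a minimal MATLAB sketch of the forward computation for a single layer. All sizes and values are made up for the example, and the sigmoid activation is just one possible choice of \(f\):

h_prev = [0.5; -0.2; 0.1];   % outputs of layer l-1 (s_{l-1} = 3 neurons)
W = rand(3, 2) - 0.5;        % W(i,j) is w_ij^l: weight from neuron i of layer l-1 to neuron j of layer l
b = rand(2, 1) - 0.5;        % biases b_j^l of layer l (s_l = 2 neurons)
f = @(s) 1 ./ (1 + exp(-s)); % activation function (sigmoid, as an example)
net_l = W' * h_prev + b;     % net inputs net_j^l
h_l = f(net_l)               % layer outputs h_j^l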

Activation function

Role: introduces nonlinear factors, so that the model can better approximate nonlinear functions.

Activation functions commonly used in the BP neural network algorithm:

  • Sigmoid function:

    \[f(x)=\frac{1}{1+e^{-x}} \]

[Figure: plot of the Sigmoid function]
  • Tanh function (hyperbolic tangent):

    \[f(x)=\frac{e^x-e^{-x}}{e^x+e^{-x}} \]

[Figure: plot of the Tanh function]
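As a quick sketch, both activations can be written as MATLAB anonymous functions and compared on one plot (the plotting range here is an arbitrary choice; tanh is also built into MATLAB):

sigmoid = @(x) 1 ./ (1 + exp(-x));                        % output range (0, 1)
tanh_f  = @(x) (exp(x) - exp(-x)) ./ (exp(x) + exp(-x));  % output range (-1, 1), same as built-in tanh
x = linspace(-5, 5, 200);
plot(x, sigmoid(x), 'b-', x, tanh_f(x), 'r-');
legend('Sigmoid', 'Tanh');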
**Bias**:

Role: it can be understood as adding a constant term independent of the input \(X\), which makes the approximation better.

If we use \(y=x\) to approximate the data, the result is unsatisfactory; adding a constant term, so that \(y=x+2\), gives a much better fit.

[Figures: fitting the data with \(y=x\) versus \(y=x+2\)]
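The same point can be checked numerically; a minimal sketch with made-up sample points lying on \(y=x+2\):

x = 0:5;
t = x + 2;                       % target points lie on y = x + 2
e_no_bias = mean((t - x).^2)     % y = x misses every point by 2, MSE = 4
e_bias = mean((t - (x + 2)).^2)  % y = x + 2 fits exactly, MSE = 0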
**Error function**:

Role: measures the difference between the actual output and the expected output.

Suppose there are \(p\) training samples \(\{x(1),x(2),\dots,x(p)\}\), let \(d(i)\) be the expected output corresponding to \(x(i)\) and \(y(i)\) the actual network output, and suppose each training sample has \(n\) outputs. Define the error function:

\[E=\frac{1}{p}\sum_{i=1}^p{E(i)} \]

where \(E(i)\) is the training error of a single sample:

\[E(i)=\frac{1}{2}\sum_{k=1}^n(d_k(i)-y_k(i))^2 \]

So the global error function is:

\[E=\frac{1}{2p}\sum_{i=1}^p\sum_{k=1}^n{(d_k(i)-y_k(i))^2} \]
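For instance, the global error can be computed directly from this definition; a small MATLAB sketch with made-up expected and actual outputs (\(n=2\) outputs, \(p=3\) samples):

d = [1 0 1; 0 1 1];             % expected outputs d_k(i), one column per sample
y = [0.9 0.1 0.7; 0.2 0.8 0.9]; % actual outputs y_k(i), one column per sample
p = size(d, 2);
E = sum(sum((d - y).^2)) / (2 * p) % global error E = (1/2p) * sum over i and k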

Updating the weights and biases

The weights and biases are updated by error back propagation, generally with the gradient descent method:

\[w_{ij}^l=w_{ij}^l-\alpha \frac{\partial E}{\partial w_{ij}^l} \\ b_{j}^l=b_j^l-\alpha \frac{\partial E}{\partial b_j^l} \]

where \(\alpha\) is the learning rate, \(\alpha\in(0,1)\). The key of the BP neural network algorithm is how to compute the two partial derivatives above; the derivation is somewhat involved and is not presented here, but relevant references are attached at the end of the text \(^{[2]}\).
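As an illustration of one such update, here is a minimal sketch for a single sigmoid output neuron with squared error, where the two partial derivatives have a simple closed form (the data and hyperparameters are made up; for the general layered derivation see reference [2]):

x = [0.5; 1.0];                % input to the neuron (made-up sample)
d = 1;                         % expected output
w = [0.1; -0.2]; b = 0;        % initial weight vector and bias
alpha = 0.5;                   % learning rate, alpha in (0, 1)
f = @(s) 1 ./ (1 + exp(-s));   % sigmoid activation
y = f(w' * x + b);             % forward pass
delta = (y - d) * y * (1 - y); % dE/dnet for squared error + sigmoid
w = w - alpha * delta * x;     % w = w - alpha * dE/dw
b = b - alpha * delta;         % b = b - alpha * dE/db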

1.3 Review

Finally, let us review the whole process of the BP neural network algorithm with a schematic diagram.

[Figure: schematic of the complete BP neural network algorithm]
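To make the review concrete, here is a minimal from-scratch MATLAB sketch of the whole procedure for one hidden layer: forward pass, back propagation of the deltas, and gradient-descent updates. The XOR data, layer sizes, and hyperparameters are all illustrative assumptions (uses implicit expansion, MATLAB R2016b+):

rng(0);
X = [0 0 1 1; 0 1 0 1];              % inputs, one column per sample
D = [0 1 1 0];                       % expected outputs (XOR as a toy task)
nh = 4; alpha = 0.5;                 % hidden-layer size and learning rate
W1 = rand(nh, 2) - 0.5; b1 = zeros(nh, 1);
W2 = rand(1, nh) - 0.5; b2 = 0;
f = @(s) 1 ./ (1 + exp(-s));         % sigmoid activation
for epoch = 1:20000
    H = f(W1 * X + b1);              % forward pass: hidden layer
    Y = f(W2 * H + b2);              % forward pass: output layer
    d2 = (Y - D) .* Y .* (1 - Y);    % backward pass: output-layer deltas
    d1 = (W2' * d2) .* H .* (1 - H); % backward pass: hidden-layer deltas
    W2 = W2 - alpha * d2 * H';       % gradient-descent updates
    b2 = b2 - alpha * sum(d2, 2);
    W1 = W1 - alpha * d1 * X';
    b1 = b1 - alpha * sum(d1, 2);
end
E = sum(sum((D - Y).^2)) / (2 * size(X, 2)) % final global error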
1.4 Advantages and disadvantages

Advantages

BP neural networks are mainly applied in the following four areas:

  • Function approximation
  • Pattern recognition
  • Classification
  • Data compression

Disadvantages

  • Learning is slow: many iterations are needed before convergence
  • Gradient descent easily falls into local minima
  • There is no theoretical guidance for choosing the number of layers or neurons; the choice is mainly empirical
  • Limited generalization ability

2. Matlab Implementation

2.1 Algorithm implementation steps

(1) Data preprocessing

(2) establish BP Neural network model

(3) Use samples for training

(4) Return the trained model at the end of training
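These four steps map directly onto toolbox calls; a minimal skeleton (with placeholder data) might look like the following, and the two cases below flesh it out:

P = 0:10; T = sin(0:10);            % (1) prepare data; mapminmax(P) could normalize it manually
net = feedforwardnet(5, 'traingd'); % (2) build the BP network model
net = train(net, P, T);             % (3) train on the samples
Y = net(P);                         % (4) the trained model is returned and can be evaluated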

2.2 Case study

Matlab provides built-in functions for building a BP neural network model and for training it (that is, updating the weights and biases), so when implementing the BP neural network algorithm we can call these functions directly.

To make the implementation process clearer, relatively simple data are chosen for the demonstrations below.

Case 1: Curve fitting

Problem: build a BP neural network with

input vector \(P=[0,1,2,3,4,5,6,7,8,9,10]\);

expected output \(T=[0,1,2,3,4,3,2,1,2,3,4]\);

The scatter plot is as follows:

[Figure: scatter plot of the sample points]

Fit the points above with the BP neural network algorithm, and display the fitting result in the same figure.

Matlab Code

close all; clearvars; % clear the workspace
P = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
T = [0, 1, 2, 3, 4, 3, 2, 1, 2, 3, 4];
% feedforwardnet automatically normalizes the data and splits it into
% training/validation/test sets, so no manual normalization is needed
% (whether the split shuffles the samples is not checked here)
net = feedforwardnet(5, 'traingd'); % '5' means one hidden layer with 5 neurons; use a vector such as [5, 3] for several hidden layers
net.trainParam.lr = 0.01;      % learning rate
net.trainParam.epochs = 10000; % maximum number of training epochs
net.trainParam.goal = 1e-6;    % error goal: training stops once this accuracy is reached
net.trainParam.show = 50;      % show training progress every 50 iterations
net = train(net, P, T);        % train
Y = net(P);                    % network output
perf = perform(net, T, Y);     % error (note: perform expects targets before outputs)
plot(P, T, P, Y, 'r-')         % original points and fitted curve

[Figures: some runs with good fitting results]

Because there are so few training samples, the result is not always satisfactory.

Case 2: Classification of midges

Problem: midges are classified by antenna length and wing length. Measurements of 9 Af midges and 6 Apf midges are given below:
Af: (1.24,1.72),(1.36,1.74),(1.38,1.64),(1.38,1.82),(1.38,1.90),(1.40,1.70),
(1.48,1.82),(1.54,1.82),(1.56,2.08).
Apf: (1.14,1.78),(1.18,1.96),(1.20,1.86),(1.26,2.00),(1.28,2.00),(1.30,1.96).

Identify 3 specimens whose antenna and wing lengths are (1.24,1.80), (1.28,1.84), and (1.40,2.04).

Matlab Code

clearvars; close all; % clear the workspace
% Import the data: the first column is antenna length, the second is wing length
x_1 = [1.24, 1.72; 1.36, 1.74; 1.38, 1.64; 1.38, 1.82;
    1.38, 1.90; 1.40, 1.70; 1.48, 1.82; 1.54, 1.82; 1.56, 2.08]; % Af midges
x_2 = [1.14, 1.78; 1.18, 1.96; 1.20, 1.86; 1.26, 2.00; 1.28, 2.00;
    1.30, 1.96]; % Apf midges
x = [x_1; x_2]'; % merge and transpose, because feedforwardnet treats each column as one sample

goal = [ones(1, 9), zeros(1, 6); zeros(1, 9), ones(1, 6)]; % (1,0) denotes an Af midge, (0,1) an Apf midge
x_recognize = [1.24, 1.80; 1.28, 1.84; 1.40, 2.04]'; % samples to be identified

plot(x_1(:, 1), x_1(:, 2), 'ro', 'DisplayName', 'Af'); % scatter plot of Af
hold on;
plot(x_2(:, 1), x_2(:, 2), 'bo', 'DisplayName', 'Apf'); % scatter plot of Apf
plot(x_recognize(1, :), x_recognize(2, :), 'yo', 'DisplayName', 'To identify'); % scatter plot of the samples to be identified
xlabel('Antenna length');
ylabel('Wing length');
legend;

net = feedforwardnet([3, 2], 'trainlm'); % two hidden layers with 3 and 2 neurons; the Levenberg-Marquardt algorithm works better here
net.trainParam.max_fail = 1000; % maximum number of validation failures
net.trainParam.lr = 0.05;       % learning rate
net.trainParam.epochs = 10000;  % maximum number of training epochs
net.trainParam.goal = 1e-15;    % error goal: training stops once this accuracy is reached
net.trainParam.show = 50;       % show training progress every 50 iterations
net = train(net, x, goal);      % train
y0 = sim(net, x)                % outputs on the training samples (no semicolon, so the values are printed)
perf = perform(net, goal, y0)   % error
ym = sim(net, x_recognize)      % identify the three specimens

The scatter plot of the midge data shows that the three unknown samples are genuinely hard to classify; it is almost impossible to tell by eye. The results obtained with the BP neural network algorithm can therefore differ considerably between runs, which is normal, since antenna and wing length alone do not separate the two classes easily.

Here is one output obtained when the training error was relatively low: the first and second identification samples are classified as Af midges, and the third as an Apf midge.
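To turn the soft network outputs into hard class labels, one can simply take the larger of the two components; a small post-processing sketch (not part of the original script):

[~, cls] = max(ym);    % larger component wins: row 1 -> Af, row 2 -> Apf
labels = {'Af', 'Apf'};
disp(labels(cls))      % predicted class of each identified sample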
3. References

[1] BP neural network, Baidu Encyclopedia (baidu.com)

[2] A detailed derivation of the BP neural network, Alex Yu's blog (cnblogs.com)

[3] Master BP neural network modeling through four Matlab cases, bilibili

[4] The easiest-to-understand BP neural network tutorial, a beginner's first choice, bilibili

Copyright notice: this article was written by [Time flies]; please include the original link when reposting:
https://cdmana.com/2021/08/20210804224623220T.html
