The previous few sections covered various aspects of neural networks. Because neural networks are so important in machine learning, it is worth summarizing what we have learned in a separate section.
Determining the structure of a neural network
The structure of a neural network is just an input layer, an output layer, and some hidden layers. How many hidden layers should there be? How many neurons in each hidden layer? How many units in the input and output layers?
These questions must be answered before the network can be trained.
First, the number of units in the input layer is determined by the dimension of the independent variable (the number of input features);
second, the number of units in the output layer is determined by how many classes the problem has.
Therefore, choosing a network structure essentially comes down to choosing the number of hidden layers and the number of units in each hidden layer.
Taking a network with 3 input units and 4 output units as an example, common hidden-layer settings are shown in the figure below.
In terms of classification performance, more hidden units is generally better, but too many neurons make training quite slow, so a balance is needed. A common rule of thumb is to set the number of hidden units to 2–4 times the number of input units, and to use 1, 2, or 3 hidden layers.
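As a concrete illustration of this rule of thumb (the variable names are mine, not from the original), sizing the 3-input, 4-output example network might look like:

```python
# Illustrative sizing for the 3-input, 4-output example network.
n_input = 3              # dimension of the independent variable
n_output = 4             # number of classes
n_hidden = 2 * n_input   # heuristic: 2-4x the input units; here 2x = 6

# With one hidden layer, the layer sizes are:
layer_sizes = [n_input, n_hidden, n_output]
print(layer_sizes)  # [3, 6, 4]
```

With two or three hidden layers, a common choice is to give every hidden layer the same number of units.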
The general steps of neural network training
Step 1: Randomly initialize the weights;
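A minimal sketch of Step 1, assuming a NumPy implementation and, for illustration, a 3-6-4 network (the function name and epsilon value are my choices, not from the original):

```python
import numpy as np

def random_init(n_in, n_out, epsilon=0.12):
    """Initialize a weight matrix (including the bias column) to small
    random values in [-epsilon, epsilon]. This breaks symmetry: if all
    weights started at zero, every hidden unit would compute the same
    function and receive the same gradient update."""
    return np.random.uniform(-epsilon, epsilon, size=(n_out, n_in + 1))

# One weight matrix per layer transition of an illustrative 3-6-4 network.
Theta1 = random_init(3, 6)  # input layer (3 units + bias) -> hidden (6)
Theta2 = random_init(6, 4)  # hidden layer (6 units + bias) -> output (4)
print(Theta1.shape, Theta2.shape)  # (6, 4) (4, 7)
```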
Step 2: Implement the forward propagation algorithm to compute the activations (and the hypothesis output) for each input;
Step 3: Write code to compute the cost function;
Step 4: Implement backpropagation to compute the partial derivatives of the cost function with respect to the weights.
Take a look at the pseudocode (shown as a figure in the original article); in the code, m is the number of training samples.
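Since the pseudocode figure is not reproduced here, the following is a minimal NumPy sketch of Steps 2–4 for a one-hidden-layer classifier, where m is the number of training samples. All names and sizes are illustrative, not from the original:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_and_gradients(Theta1, Theta2, X, Y, lam=0.0):
    """Forward propagation, cost, and backpropagation for a network with
    one hidden layer. X is (m, n) inputs; Y is (m, k) one-hot labels."""
    m = X.shape[0]                               # m = number of training samples

    # --- Step 2: forward propagation ---
    a1 = np.hstack([np.ones((m, 1)), X])         # add bias unit to inputs
    z2 = a1 @ Theta1.T
    a2 = np.hstack([np.ones((m, 1)), sigmoid(z2)])
    z3 = a2 @ Theta2.T
    a3 = sigmoid(z3)                             # hypothesis h(x)

    # --- Step 3: cross-entropy cost, with optional L2 regularization ---
    J = -np.sum(Y * np.log(a3) + (1 - Y) * np.log(1 - a3)) / m
    J += lam / (2 * m) * (np.sum(Theta1[:, 1:] ** 2) + np.sum(Theta2[:, 1:] ** 2))

    # --- Step 4: backpropagation of the error terms ---
    delta3 = a3 - Y                              # output-layer error
    delta2 = (delta3 @ Theta2[:, 1:]) * sigmoid(z2) * (1 - sigmoid(z2))
    Theta2_grad = delta3.T @ a2 / m
    Theta1_grad = delta2.T @ a1 / m
    Theta2_grad[:, 1:] += lam / m * Theta2[:, 1:]  # don't regularize bias terms
    Theta1_grad[:, 1:] += lam / m * Theta1[:, 1:]
    return J, Theta1_grad, Theta2_grad

# Tiny illustrative run: 5 samples, 3 inputs, 6 hidden units, 4 classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
Y = np.eye(4)[rng.integers(0, 4, size=5)]
Theta1 = rng.uniform(-0.12, 0.12, size=(6, 4))
Theta2 = rng.uniform(-0.12, 0.12, size=(4, 7))
J, g1, g2 = cost_and_gradients(Theta1, Theta2, X, Y)
print(J, g1.shape, g2.shape)
```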
Step 5: Use gradient checking to verify that the partial derivatives computed by backpropagation are correct; once verified, disable the gradient-checking code, since it is far too slow to leave on during training.
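Step 5 can be sketched as a finite-difference check: perturb each parameter by a small epsilon and compare the numerical slope against the analytic gradient. For self-containedness this sketch uses a hypothetical stand-in cost with a known gradient, rather than the network cost:

```python
import numpy as np

def numerical_gradient(cost, theta, eps=1e-4):
    """Approximate dJ/dtheta_i with the central difference
    (J(theta + eps*e_i) - J(theta - eps*e_i)) / (2*eps)."""
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        t_plus, t_minus = theta.copy(), theta.copy()
        t_plus[i] += eps
        t_minus[i] -= eps
        grad[i] = (cost(t_plus) - cost(t_minus)) / (2 * eps)
    return grad

# Hypothetical stand-in cost J(theta) = sum(theta^2 + theta),
# whose analytic gradient is 2*theta + 1.
cost = lambda t: np.sum(t ** 2 + t)
theta = np.array([0.5, -1.0, 2.0])
analytic = 2 * theta + 1
numeric = numerical_gradient(cost, theta)

# If backprop is correct, this relative difference is tiny (roughly the
# order of eps^2); trust the analytic gradients and turn the check off.
diff = np.linalg.norm(numeric - analytic) / np.linalg.norm(numeric + analytic)
print(diff)
```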
Step 6: Use gradient descent or a more advanced optimization algorithm, together with the backpropagation gradients, to find the parameters that minimize the cost function.
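In practice one hands the cost and gradient functions to an off-the-shelf optimizer (for example `scipy.optimize.minimize` with `method='L-BFGS-B'`); a bare-bones gradient-descent loop, shown here on a hypothetical convex cost rather than the network cost, illustrates the idea:

```python
import numpy as np

def gradient_descent(grad, theta0, lr=0.1, n_iters=200):
    """Repeatedly step against the gradient to minimize the cost."""
    theta = theta0.astype(float)
    for _ in range(n_iters):
        theta -= lr * grad(theta)
    return theta

# Hypothetical cost J(theta) = ||theta - [1, 2]||^2 with gradient
# 2*(theta - [1, 2]); its minimum is at theta = [1, 2].
target = np.array([1.0, 2.0])
grad = lambda t: 2 * (t - target)
theta = gradient_descent(grad, np.zeros(2))
print(theta)  # converges to approximately [1, 2]
```

For a real network, `grad` would be the backpropagation gradients from Step 4, flattened into a single parameter vector.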
This article is from the WeChat official account "Teacher Gao who talks about programming" (codegao); author: Middle-aged Muddleheaded Stone.
Originally published: 2020-11-10.