
How do face recognition algorithms work?

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Over the past decade, deep learning has produced many new algorithms and research breakthroughs, and with them a new generation of computer vision algorithms."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"It all began with AlexNet in 2012. AlexNet is a deep (convolutional) neural network that achieved high accuracy on the ImageNet dataset (a dataset of more than 14 million images)."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"How do humans recognize faces?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Perhaps neurons in the human brain first pick out the face in a scene (separating it from the person's body and the background), then extract facial features, and finally classify the person by those features. We have, in effect, been trained on an infinite dataset with our own "},{"type":"link","attrs":{"href":"https:\/\/www.engati.com\/glossary\/neural-networks","title":null,"type":null},"content":[{"type":"text","text":"neural networks"}]},{"type":"text","text":"."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Machine "},{"type":"link","attrs":{"href":"https:\/\/www.engati.com\/glossary\/facial-recognition","title":null,"type":null},"content":[{"type":"text","text":"face recognition"}]},{"type":"text","text":" works in the same way.
First, a face detection algorithm locates the faces in the scene; then facial features are extracted from each detected face; finally, a classification algorithm identifies the person from those features."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/39\/11\/3983b31c3e2edf64f66352186d8b9311.png","alt":null,"title":"Workflow of a face recognition system","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"1. Face detection"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Face detection is a specialized form of object detection in which only one class of object is detected: the face. Just as computer science trades off time against space, machine learning algorithms trade off inference speed against accuracy.
Many object detection algorithms are available today, and each makes a different trade-off between speed and accuracy."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This article evaluates the following state-of-the-art object detection algorithms:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"OpenCV (Haar cascade)"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"MTCNN"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"YOLOv3 and Yolo-Tiny"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"SSD"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"BlazeFace"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"ShuffleNet and Faceboxes"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To build a powerful face detection system, we need algorithms that are both accurate and fast enough to run in real time on GPUs and
on mobile devices."}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Accuracy"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"During real-time inference on streaming video, faces appear in varying poses, with occlusion, and under different lighting. It is therefore essential that the algorithm detect faces accurately across lighting conditions and poses."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":""}]},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/ac\/5b\/ac93bc1bc84bacda8b1426402e91d35b.png","alt":null,"title":"Face detection under different pose and illumination conditions","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"OpenCV (Haar cascade)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We start with the Haar cascade implementation in OpenCV, an open-source image processing library written in C/C++."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Advantages:"},{"type":"text","text":" Because the library is written in C/C++, its inference speed in real-time systems is very fast."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Disadvantages
:"},{"type":"text","text":" This implementation cannot detect faces in profile, and it performs poorly under varying pose and lighting conditions."}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"MTCNN"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This algorithm is based on deep learning. It uses deep cascaded convolutional neural networks to detect faces."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Advantages:"},{"type":"text","text":" It is more accurate than OpenCV's Haar cascade method."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Disadvantages:"},{"type":"text","text":" It takes longer to run."}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"YOLOv3"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"YOLO (“You Only Look Once”) is a state-of-the-art deep learning algorithm for object detection.
It is built from many convolutional layers that together form a deep CNN model (“deep” meaning that the model architecture is very complex)."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The original YOLO model can detect 80 different object classes with high accuracy, but we need it to detect only one: the face. We trained the algorithm on WiderFace, an image dataset containing "},{"type":"text","marks":[{"type":"strong"}],"text":"393,703 face labels"},{"type":"text","text":"."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"YOLO also has a smaller version, Yolo-Tiny. Yolo-Tiny needs less computation time, at the cost of some accuracy. We trained a Yolo-Tiny model on the same dataset, but its bounding-box results were inconsistent."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Advantages:"},{"type":"text","text":" Very accurate, with no apparent flaws. Faster than MTCNN."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Disadvantages:"},{"type":"text","text":" Because of its many deep neural network layers, it requires more computing resources, so it runs very slowly on a CPU or on mobile devices.
On a GPU, its large architecture also consumes more VRAM."}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"SSD"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"SSD (Single Shot Detector) is another deep convolutional neural network model, similar to YOLO."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Advantages:"},{"type":"text","text":" Good accuracy. It detects faces across a range of poses, lighting conditions, and occlusion. Good inference speed."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Disadvantages:"},{"type":"text","text":" Inferior to the YOLO model. Although its inference speed is good, it is still not fast enough to run on CPUs, low-end GPUs, or mobile devices."}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"BlazeFace"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As its name suggests, it is a very fast face detection algorithm released by Google. It accepts 128x128 image input, runs inference in sub-millisecond time, and is optimized for use on mobile phones.
It is so fast for two reasons:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"YOLO and SSD are designed to detect a large number of classes, whereas BlazeFace is a dedicated face detector. Its deep convolutional neural network architecture is therefore smaller than those of YOLO and SSD."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"It uses depthwise separable convolutions instead of standard convolution layers, which reduces the amount of computation."}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Advantages:"},{"type":"text","text":" Very good inference speed and high face detection accuracy."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Disadvantages:"},{"type":"text","text":" The model is optimized for detecting faces in images from a phone camera, so it expects the face to cover most of the image; when the face is small, detection is poor.
As a result, it does not perform well on face detection in images captured by closed-circuit television (CCTV) cameras."}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Faceboxes"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Faceboxes is the most recent face detection algorithm we have used. Like BlazeFace, it is a small deep convolutional neural network designed to detect a single class: human faces. Its inference is fast enough for real-time detection on a CPU. Its accuracy is comparable to that of the YOLO face detector, and it detects faces accurately whether they appear large or small in the image."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Advantages:"},{"type":"text","text":" Fast inference and good accuracy."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Disadvantages:"},{"type":"text","text":" Evaluation is still in progress."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"2. Feature extraction"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"After detecting a face in the image, we crop it and feed it to a feature extraction algorithm, which creates a face embedding: a multidimensional vector (typically 128 or 512 dimensions) that represents the facial features.
We use the FaceNet algorithm to create face embeddings."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"An embedding vector represents a person's facial features. The embeddings of two different images of the same person are therefore close together, while the embeddings of different people are far apart. Distance here means the Euclidean distance between the two vectors."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"3. Face classification"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Once we have the face embedding vectors, we train a "},{"type":"link","attrs":{"href":"https:\/\/www.engati.com\/glossary\/classification-algorithm","title":null,"type":null},"content":[{"type":"text","text":"classification algorithm"}]},{"type":"text","text":", namely k-nearest neighbors (KNN), to classify a person from their embedding vector."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Suppose an organization has 1,000 employees. We create face embeddings for all employees and use the embedding vectors to train the classification algorithm.
The algorithm takes a face embedding vector as input and returns the person's name as output."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Before putting pictures online, users can use filters to modify specific pixels in the picture. The human eye cannot detect these changes, but they confuse face recognition algorithms. Source: "},{"type":"link","attrs":{"href":"https:\/\/www.thalesgroup.com\/en\/markets\/digital-identity-and-security\/government\/biometrics\/facial-recognition","title":null,"type":null},"content":[{"type":"text","text":"ThalesGroup"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Face recognition algorithms have made great progress, but this is only the beginning of a technological revolution. Imagine how powerful the combination of face recognition and chatbot technology could become."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Original English article:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https:\/\/www.pimonk.com\/post\/how-do-facial-recognition-systems-algorithms-work-in-2021","title":null,"type":null},"content":[{"type":"text","text":"https:\/\/www.pimonk.com\/post\/how-do-facial-recognition-systems-algorithms-work-in-2021"}]}]}]}
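
The feature extraction and classification steps described in the article can be illustrated with a small, self-contained sketch. It is not the authors' implementation: the names `euclidean_distance`, `knn_classify`, and `gallery`, the toy 4-dimensional vectors, and the labels are all assumptions made for illustration, and a real system would obtain 128- or 512-dimensional embeddings from a model such as FaceNet. The sketch only shows how Euclidean distance between embeddings and a k-nearest-neighbor vote could turn an embedding into a name.

```python
import math
from collections import Counter


def euclidean_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.dist(a, b)


def knn_classify(query, gallery, k=3):
    """Return the majority label among the k gallery embeddings closest to query.

    gallery is a list of (embedding, name) pairs (a hypothetical layout).
    """
    nearest = sorted(
        gallery, key=lambda item: euclidean_distance(query, item[0])
    )[:k]
    votes = Counter(name for _, name in nearest)
    return votes.most_common(1)[0][0]


# Toy 4-dimensional "embeddings"; these values are made up for illustration.
gallery = [
    ([0.10, 0.20, 0.10, 0.90], "alice"),
    ([0.12, 0.18, 0.11, 0.88], "alice"),
    ([0.09, 0.22, 0.12, 0.91], "alice"),
    ([0.80, 0.70, 0.60, 0.10], "bob"),
    ([0.82, 0.69, 0.58, 0.12], "bob"),
    ([0.79, 0.71, 0.61, 0.09], "bob"),
]

# Embedding of a new, unlabeled face crop (also made up).
query = [0.11, 0.19, 0.10, 0.89]
predicted = knn_classify(query, gallery, k=3)
print(predicted)
```

Because images of the same person are expected to produce nearby embeddings, the three nearest neighbors of `query` here all carry the same label. With many employees, a brute-force sort like this one would probably be replaced by an indexed nearest-neighbor search.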

Copyright notice
This article was created by [InfoQ]. Please include a link to the original when reposting. Thank you.
https://cdmana.com/2021/10/20211002145414396X.html
