An Efficient Face Recognition Method Based On CNN
(1) Data preprocessing
The face recognition method proposed in this work uses the deep learning algorithm CNN to recognize the contour features of faces and has low requirements for color. Therefore, color images are converted into grayscale images by

V(i, j) = 0.299 R_{i,j} + 0.587 G_{i,j} + 0.114 B_{i,j},

where R_{i,j}, G_{i,j}, and B_{i,j} are the pixel values of the three color channels at position (i, j), respectively.

(2) Face detection
In this stage, we employ the deep learning MTCNN method for face detection. In this algorithm, the images are input into the face detection module, and when a face is detected, the coordinate position of the corresponding face frame is output. Note that the upper left corner of the image is the coordinate origin, and the height and width of the image lie along the i-axis and j-axis of the plane rectangular coordinate system, respectively.

(3) CNN model
CNN is widely applied in the field of image recognition. The "weight sharing" property of a CNN keeps the network structure very simple. A deep CNN usually contains several kinds of layers, such as input, convolutional, nonlinear mapping, and pooling layers. Finally, the two-dimensional feature maps are flattened into a vector and fed to the final classifier through the fully connected layer.
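A minimal NumPy sketch of the grayscale conversion in step (1); the function name and the H*W*3 array layout with channels in RGB order are our illustrative assumptions, not part of the paper:

```python
import numpy as np

def to_grayscale(rgb):
    """Convert an H*W*3 RGB image to an H*W grayscale image using the
    luminance weights from the paper: V = 0.299 R + 0.587 G + 0.114 B."""
    rgb = np.asarray(rgb, dtype=np.float64)
    return 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]

# A pure-red pixel (255, 0, 0) maps to 0.299 * 255, i.e. about 76.245.
img = np.zeros((2, 2, 3))
img[..., 0] = 255.0
print(to_grayscale(img)[0, 0])  # ≈ 76.245
```

Because the three weights sum to 1, a pixel with equal channel values keeps its intensity unchanged.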
As shown in Fig. 2, the 32*32 input face image is first convolved with six trainable convolution kernels and bias terms, generating six feature maps in the H1 convolution layer. Then every 2*2 block of four pixels in each feature map undergoes pooling, weighting, and bias operations, and the feature maps of the H2 layer are obtained through an activation function. These feature maps are further filtered to obtain the H3 layer, and the same hierarchical structure as H2 produces H4. Finally, H4 is converted to a vector and output to the fully connected layer, from which the classification results are obtained.

IV. RESULTS AND DISCUSSION

Face recognition is essentially a classification problem, and we evaluate the performance of the proposed method by precision: the higher the precision, the better the performance. It is defined as

Precision = TP / (TP + FP),

where TP and FP are the numbers of true positives and false positives, respectively.
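The precision metric above is simple to compute; a small helper (the function name and the zero-denominator guard are our additions, not the paper's):

```python
def precision(tp, fp):
    """Precision = TP / (TP + FP), guarding against an empty denominator
    (no positive predictions at all)."""
    return tp / (tp + fp) if (tp + fp) > 0 else 0.0

# 90 true positives and 10 false positives give a precision of 0.9.
print(precision(90, 10))  # 0.9
```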
Authorized licensed use limited to: Zhejiang University. Downloaded on February 22,2024 at 03:18:27 UTC from IEEE Xplore. Restrictions apply.
The data set in the experiment consists of real-time face images captured by a computer camera. It contains photos of 10 people, with 400 face images per person. We divide the data set into a training set (30%) and a test set (70%). The training set is used to train the parameters of the CNN, and we verify the accuracy of face recognition on the remaining test set. Note that input face images detected by MTCNN must first be converted into grayscale images and then scaled to a size of 64*64, so the size of an input image is 64*64*1. When the CNN accepts N images, the input layer dimension is N*64*64*1.

In our study, the proposed CNN contains 3 convolutional layers and 1 fully connected layer. The main framework is as follows:

(1) First convolutional layer: This layer uses 32 3*3*1 convolution kernels to convolve the input layer (32*32*1); the last number 1 indicates that the input image is single-channel, and the convolution kernel moves with a stride of 1. Then the ReLU function is employed to enhance the fitting ability of the network model, and each 2*2 area is max pooled.

(2) Second convolutional layer: 64 3*3*32 convolution kernels are used to reconvolve the feature map obtained in the previous convolutional layer, where 32 refers to the 32 convolution kernels of the first convolutional layer, so the generated feature map is also 32-dimensional.

(3) Third convolutional layer: This layer further uses 64 convolution kernels to convolve the feature maps generated by the previous convolutional layer, where the size of each convolution kernel is 3*3*64.

(4) Fully connected layer: This layer connects the features learned by the convolutional layers with the final output layer. The fully connected layer passes the probability value of the output sample belonging to each category through the softmax function, and the final classification result is the category with the largest probability value. The loss function is defined as the cross-entropy function, and the error is propagated back by gradient descent.

As depicted in Fig. 3, at the beginning of training, the training loss and test loss continue to decline while the training set accuracy and test set accuracy increase, which indicates that the network is learning. As training continues, the network slowly becomes stable, and the training result maintains a stable and high recognition rate.
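The classification step described in (4), a softmax over the class scores followed by a cross-entropy loss, follows the standard definitions and can be sketched in NumPy (a sketch, not the authors' code; the example logits are invented):

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    z = logits - np.max(logits, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=-1, keepdims=True)

def cross_entropy(probs, label):
    """Cross-entropy loss for one sample, given its true class index."""
    return -np.log(probs[label])

logits = np.array([2.0, 1.0, 0.1])  # raw fully connected outputs for 3 classes
probs = softmax(logits)             # probabilities summing to 1
pred = int(np.argmax(probs))        # category with the largest probability
print(pred)  # 0
```

During training, the gradient of this loss with respect to the logits is simply `probs - one_hot(label)`, which is what makes the softmax/cross-entropy pairing convenient for gradient descent.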