


Automatic Mask Generating for Electronics Component Inspection

Hao Wu*, Wenbin Gao, Xiangrong Xu, and Sixiang Xu

Abstract—This paper proposes a new method for automatically generating the image masks required for training Mask R-CNN. Since manually labeling images to obtain masks is very time-consuming and laborious, we propose a very simple and fast Graph cut-based method to obtain image masks, which consists of two stages: the first stage implements Graph cut-based image segmentation, which outputs pixel-level segmentation results and then obtains the mask of the input image through an image transform; in the second stage, the Mask R-CNN-based surface defect detection method is implemented, which includes three branches with different tasks, namely the bounding box regression and positioning branch, the bounding box classification branch, and the segmentation branch, to locate, classify and segment defects at the same time. The experimental results verify the effectiveness of our proposed method. On the premise of ensuring the detection accuracy of the Mask R-CNN method, we can quickly and easily obtain the masks required to train Mask R-CNN.

Index Terms—Mask R-CNN, Deep learning, Defect detection, Solder joint inspection

I. INTRODUCTION

In the manufacturing process of electronic components, defect detection based on automatic optical inspection is very important. The defects caused by assembling electronic components onto the circuit board often affect the function of the product and reduce its economic value, so machine vision-based defect detection technology is becoming more and more widely used in the SMT industry. In recent years, with the rise of deep learning, more and more new techniques based on convolutional neural networks have been added to machine vision inspection [1-4].

The common defects of electronic components that may occur during installation on the PCB are as follows: wrong component, component shift, tombstone, and missing component. A wrong component means that a component is installed in a mismatched position. A component shift means that the component has moved or rotated in the horizontal welding area. A tombstone indicates that the component is separated from the mother board in the height direction, so there is no contact with the welding area, and a missing component refers to the absence of the component in the welding area. Figure 1 shows the comparison between these types of defects (a)-(d) and a good component (e).

Figure 1. Different types of electronic components.

In the past years, there has been a lot of research on vision inspection. These methods can be divided into traditional image processing methods, machine learning methods, and deep learning methods [5]. Traditional image processing methods use wavelet transform [6], Fourier transform [7], principal component analysis [8] and template matching. Machine learning methods generally first extract features such as grayscale, color, and geometry of components, and then input the features into a classifier for classification. Commonly used machine learning classifiers include support vector machines [9] and various traditional neural networks (self-organizing map [10], back-propagation, learning vector quantization [11], etc.).

In recent years, deep learning techniques based on deep neural networks have achieved excellent results in computer vision tasks such as face recognition, image classification and recognition, and object detection. A large number of CNN-based studies address image recognition and object detection, and there is also some research using CNNs in the field of component inspection [12]. Cai et al. [13] proposed a novel cascaded convolutional neural network in which three CNNs with different network parameters form the cascade: one CNN is used to adaptively learn the region of interest (ROI) of SMT solder joint images, then the learned ROI and the entire solder joint image are fed to the other two CNNs respectively, and finally the inspection results are obtained through the learned cascaded CNN.

This work was supported in part by the National Natural Science Foundation of China (No. 51505002, 51605004) and the Anhui Provincial Natural Science Foundation (1808085QE162). Hao Wu is with the Anhui Province Key Laboratory of Special Heavy Load Robot, Anhui University of Technology, China, 243032 (e-mail: hao.wu@ahut.edu.cn). *Corresponding author. Wenbin Gao is with the Anhui University of Technology, China, 243032 (e-mail: wenbingao@126.com). Xiangrong Xu is with the Anhui University of Technology, China, 243032 (e-mail: xuxr88@qq.com). Sixiang Xu is with the Anhui University of Technology, China, 243032 (e-mail: 857393428@qq.com).

In recent years, the more advanced object detection method Faster R-CNN [14] and the instance segmentation method Mask R-CNN [15] have been proposed. The key component of both is the RPN (Region Proposal Network). The core idea of RPN is to densely sample the entire input image with many overlapping bounding boxes of different shapes and sizes, and then train the network to generate multiple object proposals, also called Regions of Interest (RoIs). This design allows RPN to smoothly search for features at different scales. RPN consists of a convolutional neural network that takes feature maps as input and outputs bounding boxes together with the probability that they contain objects. Faster R-CNN refines and classifies the RoIs through additional fully connected layers on top of RPN, while Mask R-CNN further improves Faster R-CNN by segmenting instances with additional convolutional layers in each RoI. The Mask R-CNN structure has two main parts: the first part is the Region Proposal Network (RPN); after ROIAlign alignment, the proposals all enter the second part, namely the object detection and segmentation mask prediction network.

Because Mask R-CNN has achieved good results in the classification of object instances (it received the best paper award of the International Conference on Computer Vision (ICCV)), there is also research using Mask R-CNN to classify, segment and locate components in solder component defect detection. Wu et al. [16] proposed a component defect detection method based on Mask R-CNN. The proposed Mask R-CNN contains three branches with different tasks, namely the bounding box regression and positioning branch, the bounding box classification branch, and the segmentation branch, to complete the tasks of positioning, classifying and segmenting components. The region proposal network used in the framework can accurately align the object boundary, locate and classify the object, and can accurately segment the defects through multiple convolution and deconvolution layers and segmentation masks in the segmentation branch. Experimental results prove that the method can accurately classify, locate and segment solder components.

However, when using the above-mentioned Mask R-CNN-based component detection method, it is very time-consuming and laborious to label and obtain the target image masks required for training (for example, with the LabelMe labeling tool, http://LabelMe.csail.mit.edu/Release3.0/ [17]). We therefore propose a new simple and fast method for automatically generating masks to train Mask R-CNN. The proposed method consists of two stages: the first stage generates the mask based on Graph Cut, and the second stage is the Mask R-CNN object detection network; that is, the masks output by Graph Cut, the original input images and the attached label information are used for training, and test samples are then evaluated with the trained model.

This paper is organized as follows: Section II discusses the proposed method in detail, followed by the implementation and experimental results as well as the discussion and analysis.
II. SYSTEM OVERVIEW

The proposed method includes two parts: the mask segmentation module and the Mask R-CNN classification module.

The pipeline of our proposed automatic mask generating method for electronic component inspection is shown in Figure 2. The original images are acquired by automated optical inspection [9] with a size of 200×100×3 pixels. The original input image is segmented by the graph cut method and then transformed into masks. Secondly, the classification module uses the Mask R-CNN method to position, classify and segment the original image given the label information and masks.

Figure 2. The pipeline of the proposed detection method for component inspection. (a) Original image, (b) Component segment, (c) Segmented Mask, (d) Detection result.

A. Mask Segmentation Module

In graph cut, image segmentation can be described as a binary (foreground and background) label combination optimization process over every pixel in the image. The core idea of the Graph cut model [18] is to construct an energy function and then, through a weighted graph mapping and the application of network flow theory, transform the global optimization of the labeling problem into the corresponding maximum flow/minimum cut problem. Image segmentation can thus be regarded as a pixel labeling problem: the label of the target is set to 1 and the label of the background is set to 0, and the labeling that minimizes the energy function is found by a minimum graph cut. The cut we want occurs exactly at the boundary between the target and the background (it severs the places where background and target are connected in the image, thereby separating them), and at the same time the energy should be minimal there. Suppose the label of the entire image (the label of each pixel) is L = {l1, l2, ..., lp}, where p is the total number of pixels in the image and each li is 0 (background) or 1 (target). For a segmentation L, the image energy can be expressed as:

E(L) = aR(L) + B(L)    (1)

Here R(L) is the regional term, B(L) is the boundary term, and a is an importance factor between the regional term and the boundary term, which determines their relative impact on the energy; if a is 0, only the boundary factors are considered. E(L) represents the total weight, that is, the loss function, also called the energy function. The goal of graph cut is to optimize the energy function to minimize its value.
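To make the max-flow/min-cut mapping concrete, the following is a minimal sketch of minimizing Eq. (1) with the PyMaxflow library; the quadratic regional term and constant boundary weight are illustrative assumptions, not the paper's implementation (a contrast-sensitive boundary term would replace the constant n-link capacity):

```python
import numpy as np
import maxflow  # PyMaxflow: pip install PyMaxflow

def graph_cut_segment(img, fg_mean, bg_mean, a=1.0, smooth=50.0):
    """Minimize E(L) = a*R(L) + B(L) over binary pixel labels via a min cut."""
    g = maxflow.Graph[float]()
    nodes = g.add_grid_nodes(img.shape)
    # Boundary term B(L): constant-capacity n-links between 4-neighbors
    # (a simple Potts smoothing term).
    g.add_grid_edges(nodes, smooth)
    # Regional term a*R(L): a pixel that ends up in the source segment has its
    # sink t-link cut, so the sink capacity carries the cost of the source
    # (target) label, and vice versa.
    fg_cost = a * (img.astype(float) - fg_mean) ** 2  # cost of label 1 (target)
    bg_cost = a * (img.astype(float) - bg_mean) ** 2  # cost of label 0 (background)
    g.add_grid_tedges(nodes, bg_cost, fg_cost)
    g.maxflow()
    # get_grid_segments is True for sink-segment nodes, i.e. the background here.
    return ~g.get_grid_segments(nodes)  # True = target pixels
```

Here fg_mean and bg_mean stand in for statistics of the user-marked "S"/"T" seed regions described next.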
In addition, because Graph Cuts needs two terminal nodes "S" and "T", representing the initial target area and background area respectively, the initial s-vertices and t-vertices must be specified manually. Through the human-computer interface, we can directly click these two areas with the mouse: one click marks a pixel position on the desired segmentation target, and the other marks a pixel position on the background. With the definition of the energy function and the initial terminal vertices, we can use graph cut theory to iteratively segment the target.
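As a hedged illustration of this interactive step, the snippet below uses OpenCV's GrabCut, an iterative graph-cut variant, to turn a rough user rectangle (or scribbles) into the binary training mask of Figure 2(c); it is a stand-in sketch, not the authors' exact tool, and the file names and rectangle coordinates are hypothetical:

```python
import cv2
import numpy as np

img = cv2.imread("component.png")          # hypothetical 200x100x3 AOI image
mask = np.zeros(img.shape[:2], np.uint8)
bgd_model = np.zeros((1, 65), np.float64)  # GrabCut's internal GMM buffers
fgd_model = np.zeros((1, 65), np.float64)

# A rough rectangle around the component seeds the "S"/"T" terminals; mouse
# scribbles could instead set cv2.GC_FGD / cv2.GC_BGD pixels in `mask`.
rect = (20, 10, 160, 80)                   # (x, y, w, h), hypothetical
cv2.grabCut(img, mask, rect, bgd_model, fgd_model, 5, cv2.GC_INIT_WITH_RECT)

# Collapse GrabCut's four states into the binary mask used to train Mask R-CNN:
# target pixels -> 1, background pixels -> 0.
binary = np.where((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD), 1, 0)
cv2.imwrite("component_mask.png", (binary * 255).astype(np.uint8))
```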

B. Mask R-CNN Classification Module

Mask R-CNN can effectively detect the objects in an image and generate high-quality instance segmentation results. Compared with plain detection, instance segmentation is a more elaborate task. At the same time, because the model is provided with pixel-level annotations during training, the target object and the background are distinguished with more information, so the results of the algorithm can be effectively improved.

Mask R-CNN is an improvement of Faster R-CNN [19] and belongs to the two-stage detectors. As shown in Figure 3, the first stage is the Region Proposal Network (RPN), whose purpose is to propose regions of interest. The second stage extracts the features of the regions of interest proposed in the first stage and processes them in three branches: one for classification, one for bounding box regression, and one for generating the binary segmentation mask. The features of the two stages are extracted by the underlying backbone network and can be shared.

Figure 3. Mask R-CNN structure diagram.

The detailed structure of the Mask R-CNN method is as follows:

Backbone network: the ResNet network [20] extracts the features of the input image to get the feature map of the whole picture.

Region Proposal Network (RPN): using a sliding window strategy, the original image is divided into anchors; all anchors are classified and regressed with bounding boxes, and the regions of interest (RoIs) with higher scores are kept after non-maximum suppression and passed to the ROI Align layer for feature extraction.

Region of interest alignment (ROI Align layer): the ROI Align layer extracts, from the backbone network's whole-image features, the features of each ROI selected by the RPN and produces a fixed-size feature map per ROI.

Classification branch and regression branch: further classification and regression of the bounding box ROIs selected by the RPN; the classification branch uses the Softmax activation function.

Segmentation mask branch: a fully convolutional network predicts the segmentation mask.

The loss function L of the entire network is defined as the sum of the classification, regression, and segmentation mask branch losses:

L = Lcls + Lbox + Lmask    (2)

The classification branch loss Lcls is the cross-entropy function, the bounding box regression branch loss Lbox is the SmoothL1 loss function, and the mask branch loss Lmask is the average binary cross-entropy loss.
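A minimal sketch of this three-branch loss using standard tf.keras losses (the experiments in Section III run on TensorFlow); SmoothL1 is written as the Huber loss with delta = 1, its usual Keras form, and the tensor arguments are illustrative:

```python
import tensorflow as tf

cls_loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()  # Lcls
box_loss_fn = tf.keras.losses.Huber(delta=1.0)                 # Lbox (SmoothL1)
mask_loss_fn = tf.keras.losses.BinaryCrossentropy()            # Lmask (averaged)

def mask_rcnn_loss(cls_true, cls_pred, box_true, box_pred, mask_true, mask_pred):
    """Eq. (2): the unweighted sum L = Lcls + Lbox + Lmask."""
    return (cls_loss_fn(cls_true, cls_pred)
            + box_loss_fn(box_true, box_pred)
            + mask_loss_fn(mask_true, mask_pred))
```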
III. EXPERIMENTS

In this section, our proposed method is evaluated on images of electronic components on PCBs. The dataset and experimental setup are described first, then the segmentation results using the graph cut method are presented. Finally, the classification results with Mask R-CNN are reported against comparison methods.

A. Experimental Setup

The dataset of electronic component images was collected from a factory using automated optical inspection equipment. All of the components were evaluated by an expert and labeled with region and category. For the segmentation task, every sample has its own label image (mask) of the same size as the original image. As shown in Figure 2(b), the black pixels with value 0 represent the background region, and the central color map represents the electronic component object.

The experimental system was developed on Python 3.6 and the TensorFlow deep learning platform with a GPU. For the training of Mask R-CNN, the network was initialized with weights trained on the Microsoft COCO dataset; transfer learning then uses the knowledge learned on COCO to improve the accuracy on the electronic component dataset. The Mask R-CNN head layers are trained first for 10 epochs with a learning rate of 0.0001, and then all layers are trained for 30 epochs with a learning rate of 0.00001.
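As a sketch of this two-phase schedule, the calls below follow the widely used Matterport Mask_RCNN API (an assumption; the paper does not name its implementation), with `config`, `dataset_train` and `dataset_val` assumed to be defined by the user:

```python
import mrcnn.model as modellib

model = modellib.MaskRCNN(mode="training", config=config, model_dir="logs/")
# Initialize from COCO weights, skipping the heads that depend on class count.
model.load_weights("mask_rcnn_coco.h5", by_name=True,
                   exclude=["mrcnn_class_logits", "mrcnn_bbox_fc",
                            "mrcnn_bbox", "mrcnn_mask"])
# Phase 1: train only the head layers, 10 epochs at lr = 1e-4.
model.train(dataset_train, dataset_val,
            learning_rate=1e-4, epochs=10, layers="heads")
# Phase 2: fine-tune all layers for 30 more epochs at lr = 1e-5
# (this library counts epochs cumulatively, hence epochs=40).
model.train(dataset_train, dataset_val,
            learning_rate=1e-5, epochs=40, layers="all")
```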
To evaluate the performance of our proposed method on electronic component detection, we use the mean of average precision (mAP) [21] as a metric, together with the intersection over union (IoU) metric, which measures the overlap ratio between the detection results and the ground truth. IoU is defined as:

IoU = (DetectionResult ∩ GroundTruth) / (DetectionResult ∪ GroundTruth)
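The same definition in a few lines of NumPy, computed on binary masks (boxes rasterized to masks work identically); the names are illustrative:

```python
import numpy as np

def iou(detection: np.ndarray, ground_truth: np.ndarray) -> float:
    """Intersection over union of two same-shaped binary masks."""
    inter = np.logical_and(detection, ground_truth).sum()
    union = np.logical_or(detection, ground_truth).sum()
    return float(inter) / float(union) if union else 0.0
```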

B. Performance of Graph Cut

To evaluate the performance of the segmentation results based on Graph cut, we compare our method with other traditional segmentation methods, namely the Otsu threshold [22] and the FCN (fully convolutional network) method [23]. The component region is the object area; the samples consist of different types of components mounted on the PCB.

Figure 4 shows the segmentation results. These include a good component sample (Figure 4 (a1)) and defective component samples: wrong component (Figure 4 (a2)), component shift (Figure 4 (a3)), tombstone (Figure 4 (a4)), and missing component (Figure 4 (a5)).

Figure 4. Segmentation results. (a1) a good component sample; (a2)-(a5) different types of defective component samples; (b1)-(b5) segmentation results with the Otsu threshold method; (c1)-(c5) results with the FCN method; (d1)-(d5) results with the Graph Cut method.
It can be seen from Figure 4 that the Otsu threshold method is not very suitable for the segmentation of components. Due to the highlights in the solder pad area, the Otsu method always picks them up during segmentation, as shown in Figure 4 (b1)-(b5): when the electronic component image contains many highlight pixels in the solder pad area (the white pixels after segmentation), the Otsu method fails at this task.
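For reference, the Otsu baseline of Figure 4 (b1)-(b5) reduces to a single OpenCV call, which makes the failure mode described above easy to reproduce (the file name is hypothetical):

```python
import cv2

gray = cv2.imread("component.png", cv2.IMREAD_GRAYSCALE)
# Otsu picks one global threshold from the gray-level histogram [22]; bright
# solder-pad pixels land in the foreground, the failure seen in Figure 4(b).
_, otsu_mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
```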
The FCN method and our proposed Graph Cut method give relatively close results on most samples. However, FCN first needs to be trained with labeled component images to obtain the segmentation results shown in Figure 4 (c1-c5), while Graph Cut only requires the user to draw a simple line and a few other operations to segment the component part, as shown in Figure 4 (d1-d5). Therefore, the Graph Cut method is the simplest and most convenient way to obtain the image mask.
C. Performance of Classification

To quantitatively evaluate our proposed Mask R-CNN-based classification method, we compare it with traditional classification methods, including the support vector machine, principal component analysis and random forest ensemble learning, and with the aforementioned related research using the Mask R-CNN method:

(1) Eginsoder [24]: a classic method based on principal component analysis, which includes feature value extraction, selection and classification.

(2) Support vector machine [9]: a linear classification model whose basic form is a linear classifier with the largest margin in the feature space.

(3) Ensemble learning [25]: based on the random forest algorithm, the geometry and color features extracted from the components are classified.

(4) Mask R-CNN with LabelMe [15]: first label the images with LabelMe, then train Mask R-CNN to obtain the model, and test the samples.

(5) Our proposed method: first use the Graph Cut approach proposed in this article to label and segment the images, then train Mask R-CNN to obtain the model, and test the samples.

According to the above methods, five experiments for component detection were carried out. Table I shows the experimental results. Our method not only achieves the same highest accuracy as the Mask R-CNN method of [15], but also a higher segmentation mean of average precision (mAP) of 98.9%. The reason may be that the annotation images used in [15] are labeled purely by hand (with the LabelMe software), whereas the method proposed in this article is an interactive labeling method that combines the features of the image itself with the human labeling operation, so it can obtain better segmentation results.

TABLE I
COMPARING RESULT OF THE RECOGNITION ACCURACY AND OBJECT DETECTION WITH DIFFERENT PREVIOUS METHODS

Method | Accuracy | mAPbbox
Eginsoder [24] | 96.43% | ---
Support vector machine [9] | 96.43% | ---
Ensemble learning [25] | 89.29% | ---
Mask R-CNN with LabelMe [15] | 100% | 97.4%
Our proposed method | 100% | 98.9%

For the component shift defect, we compare the results of our proposed method and of Mask R-CNN with LabelMe. Figure 5(a) shows the original component shift defect sample. Figure 5(b) shows that the Mask R-CNN with LabelMe detection method of [15] classifies the component ambiguously with two overlapping bounding boxes, the red dotted-line box with a mAPbbox score of 0.934 and the green dotted-line box with a mAPbbox score of 0.738. Figure 5(c) shows the detection result with our proposed method, which detects the component correctly with a mAPbbox score of 0.995.

Figure 5. Component segmentation results for the component shift defect. (a) original component shift sample, (b) ambiguous segmentation results, (c) correct segmentation results using our proposed method.

IV. CONCLUSIONS

When the Mask R-CNN method is used to locate, classify and segment electronic components, it is very time-consuming and laborious to label and obtain the image masks required for Mask R-CNN training, so we propose a new method of automatically generating image masks to train Mask R-CNN, based on the Graph cut method.

Experimental results verify the effectiveness of our proposed method. Not only can we quickly and easily obtain the masks required for Mask R-CNN training, but the positioning, classification and segmentation performance of Mask R-CNN is also preserved. The proposed method can be applied to automatic optical inspection equipment and can also be easily extended to other fields that require image annotation for training.

ACKNOWLEDGMENT

This research was supported in part by the National Natural Science Foundation of China (No. 51605004) and the Anhui Provincial Natural Science Foundation (1808085QE162). The authors would also like to thank Dr Jan Paul Siebert, School of Computing Science, University of Glasgow, for his friendly hosting and instruction during the authors' visiting research in the UK. This support is gratefully acknowledged.

REFERENCES

[1] He D, Xu K, Wang D. Design of multi-scale receptive field convolutional neural network for surface inspection of hot rolled steels. Image and Vision Computing, 2019, 89: 12-20.
[2] Wen S, Chen Z, Li C. Vision-based surface inspection system for bearing rollers using convolutional neural networks. Applied Sciences, 2018, 8(12): 2565.
[3] Sun J, Wang P, Luo Y K, et al. Surface defects detection based on adaptive multiscale image collection and convolutional neural networks. IEEE Transactions on Instrumentation and Measurement, 2019, 68(12): 4787-4797.
[4] Wang Y, Liu M, Zheng P, et al. A smart surface inspection system using faster R-CNN in cloud-edge computing environment. Advanced Engineering Informatics, 2020, 43: 101037.
[5] Tao X, Zhang D, Ma W, et al. Automatic metallic surface defect detection and recognition with convolutional neural networks. Applied Sciences, 2018, 8(9): 1575.
[6] Tsai D M, Hsiao B. Automatic surface inspection using wavelet reconstruction. Pattern Recognition, 2001, 34(6): 1285-1305.
[7] Tsai D M, Hsieh C Y. Automated surface inspection for directional textures. Image and Vision Computing, 1999, 18(1): 49-62.
[8] Chen S H, Perng D B. Directional textures auto-inspection using principal component analysis. The International Journal of Advanced Manufacturing Technology, 2011, 55(9-12): 1099-1110.
[9] Wu H, Zhang X, Xie H, et al. Classification of solder joint using feature selection based on Bayes and support vector machine. IEEE Transactions on Components, Packaging and Manufacturing Technology, 2013, 3(3): 516-522.
[10] Tsai T N. Development of a soldering quality classifier system using a hybrid data mining approach. Expert Systems with Applications, 2012, 39(5): 5727-5738.
[11] Acciani G, Brunetti G, Fornarelli G. Application of neural networks in optical inspection and classification of solder joints in surface mount technology. IEEE Transactions on Industrial Informatics, 2006, 2(3): 200-209.
[12] Wei P, Liu C, Liu M, et al. CNN-based reference comparison method for classifying bare PCB defects. The Journal of Engineering, 2018, 2018(16): 1528-1533.
[13] Cai N, Cen G, Wu J, et al. SMT solder joint inspection via a novel cascaded convolutional neural network. IEEE Transactions on Components, Packaging and Manufacturing Technology, 2018, 8(4): 670-677.
[14] Ren S, He K, Girshick R, et al. Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2017(6): 1137-1149.
[15] He K, Gkioxari G, Dollár P, et al. Mask R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017: 2980-2988.
[16] Wu H, Gao W, Xu X. Solder joint recognition using Mask R-CNN method. IEEE Transactions on Components, Packaging and Manufacturing Technology, 2020, 10(3): 525-530.
[17] Russell B C, Torralba A, Murphy K P, et al. LabelMe: a database and web-based tool for image annotation. International Journal of Computer Vision, 2008, 77(1-3).
[18] Boykov Y Y, Jolly M P. Interactive graph cuts for optimal boundary & region segmentation of objects in N-D images. In: Proceedings of the Eighth IEEE International Conference on Computer Vision (ICCV), 2001, 1: 105-112.
[19] Girshick R. Fast R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision, 2015: 1440-1448.
[20] He K, Zhang X, Ren S, et al. Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016: 770-778.
[21] Manning C, Raghavan P, Schütze H. Introduction to Information Retrieval, vol. 39. Cambridge University Press.
[22] Otsu N. A threshold selection method from gray-level histograms. IEEE Transactions on Systems, Man, and Cybernetics, 1979, 9(1): 62-66.
[23] Long J, Shelhamer E, Darrell T. Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015: 3431-3440.
[24] Wu H, Xu X. Solder joint inspection using eigensolder features. Soldering & Surface Mount Technology, 2018, 30(4): 227-232.
[25] Wu H. Solder joint defect classification based on ensemble learning. Soldering & Surface Mount Technology, 2017, 29(3): 164-170.

Hao Wu received the B.S. degree in mechanical engineering from Central South University, Changsha, China, in 2008, and the Ph.D. degree in mechanical engineering and automation from South China University of Technology, Guangzhou, China, in 2013. From 2013 to 2014, he was an assistant professor at the Guangzhou Institute of Energy Conversion, Chinese Academy of Sciences, where his work focused on machine vision research and development. Currently, he is a Lecturer of mechanical engineering at the Anhui University of Technology, China. His research interests include automated visual inspection, machine vision and pattern recognition.

Wenbin Gao, Ph.D., is currently an associate professor in the School of Mechanical Engineering, Anhui University of Technology, Ma'anshan, Anhui, 243002, China. His main research direction is robotics.

Xiangrong Xu, Ph.D., is currently a professor in the School of Mechanical Engineering, Anhui University of Technology, Ma'anshan, Anhui, 243002, China. His main research direction is the principles and methods of robot dynamics and trajectory planning control.

Sixiang Xu, Ph.D., is currently a professor in the School of Mechanical Engineering, Anhui University of Technology, Ma'anshan, Anhui, 243002, China. His main research direction is robot vision and control.
