GROOVER - Machine Vision
MACHINE VISION
Machine vision can be defined as the acquisition of image data, followed by the process-
ing and interpretation of these data by computer for some useful application. Machine vi-
sion (also called computer vision, since a digital computer is required to process the image
data) is a rapidly growing technology, with its principal applications in industrial inspection.
In this section, we examine how machine vision works and its applications in QC inspec-
tion and other areas.
Vision systems are classified as being either 2-D or 3-D. Two-dimensional systems
view the scene as a 2-D image. This is quite adequate for most industrial applications, since
many situations involve a 2-D scene. Examples include dimensional measuring and gag-
ing, verifying the presence of components, and checking for features on a flat (or semiflat)
surface. Other applications require 3-D analysis of the scene, and 3-D vision systems are
required for this purpose. Sales of 2-D vision systems outnumber those of 3-D systems by
more than ten to one [7]. Our discussion will emphasize the simpler 2-D systems, although
many of the techniques used for 2-D are also applicable in 3-D vision work.
The operation of a machine vision system can be divided into the following three
functions: (1) image acquisition and digitization, (2) image processing and analysis, and
(3) interpretation. These functions and their relationships are illustrated schematically in
Figure 23.10.

Figure 23.10 Basic functions of a machine vision system.
23.6.1 Image Acquisition and Digitization

Image acquisition and digitization is accomplished using a video camera and a digitizing sys-
tem to store the image data for subsequent analysis. The camera is focused on the subject
of interest, and an image is obtained by dividing the viewing area into a matrix of discrete
picture elements (called pixels), in which each element has a value that is proportional to
the light intensity of that portion of the scene. The intensity value for each pixel is converted
into its equivalent digital value by an ADC (Section 5.3). The operation of viewing a scene
consisting of a simple object that contrasts substantially with its background, and dividing
the scene into a corresponding matrix of picture elements, is depicted in Figure 23.11.
The figure illustrates the likely image obtained from the simplest type of vision sys-
tem, called a binary vision system. In binary vision, the light intensity of each pixel is ulti-
mately reduced to either of two values, white or black, depending on whether the light
intensity exceeds a given threshold level. A more sophisticated vision system is capable of
distinguishing and storing different shades of gray in the image. This is called a gray-scale
system. This type of system can determine not only an object's outline and area charac-
teristics, but also its surface characteristics such as texture and color. Gray-scale vision sys-
tems typically use 4, 6, or 8 bits of memory. Eight bits corresponds to 2^8 = 256 intensity
levels, which is generally more levels than the video camera can really distinguish and cer-
tainly more than the human eye can discern.
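The distinction between binary and gray-scale digitization can be sketched in a few lines of Python. This is only an illustration of the quantization idea; the 8-bit depth and the threshold value are assumptions, not values from the text:

```python
# Digitize analog light intensities (0.0-1.0) into pixel values.

def grayscale_pixel(intensity, bits=8):
    """Quantize an analog intensity into one of 2^bits gray levels."""
    levels = 2 ** bits              # 8 bits -> 256 levels (0..255)
    return min(int(intensity * levels), levels - 1)

def binary_pixel(intensity, threshold=0.5):
    """Reduce an intensity to black (0) or white (1) by thresholding."""
    return 1 if intensity > threshold else 0

# A dim and a bright sample point in the scene:
print(grayscale_pixel(0.25))   # -> 64
print(grayscale_pixel(0.90))   # -> 230
print(binary_pixel(0.25))      # -> 0 (black)
print(binary_pixel(0.90))      # -> 1 (white)
```

Note that the gray-scale version retains 256 distinguishable levels per pixel, while the binary version discards everything except which side of the threshold the intensity falls on.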
Each set of digitized pixel values is referred to as a frame. Each frame is stored in a
computer memory device called a frame buffer. The process of reading all the pixel values
in a frame is performed with a frequency of 30 times per second (typical in the United
States; 25 times per second in European vision systems).

Figure 23.11 Dividing the image into a matrix of picture elements,
where each element has a light intensity value corresponding to that
portion of the image: (a) the scene; (b) 12 x 12 matrix superimposed
on the scene; and (c) pixel intensity values, either black or white, for
the scene.
Types of Cameras. Two types of cameras are used in machine vision applications:
vidicon cameras (the type used for television) and solid-state cameras. Vidicon cameras
operate by focusing the image onto a photoconductive surface and scanning the surface with
an electron beam to obtain the relative pixel values. Different areas on the photoconduc-
tive surface have different voltage levels corresponding to the light intensities striking the
areas. The electron beam follows a well-defined scanning pattern, in effect dividing the
surface into a large number of horizontal lines, and reading the lines from top-to-bottom.
Each line is in turn divided into a series of points. The number of points on each line, mul-
tiplied by the number of lines, gives the dimensions of the pixel matrix shown in Figure
23.11. During the scanning process, the electron beam reads the voltage level of each pixel.
Solid-state cameras operate by focusing the image onto a 2-D array of very small,
finely spaced photosensitive elements. The photosensitive elements form the matrix of
pixels shown in Figure 23.11. An electrical charge is generated by each element according to
the intensity of light striking the element. The charge is accumulated in a storage device con-
sisting of an array of storage elements corresponding one-to-one with the photosensitive
picture elements. These charge values are read sequentially in the data processing and
analysis function of machine vision.
Comparing the vidicon camera and solid-state camera, the latter possesses several ad-
vantages in industrial applications. It is physically smaller and more rugged, and the image
produced is more stable. The vidicon camera suffers from distortion that occurs in the
image of a fast-moving object because of the time lapse associated with the scanning elec-
tron beam as it reads the pixel levels on the photoconductive surface. The relative advan-
tages of the solid-state cameras have resulted in the growing dominance of their use in
machine vision systems. Types of solid-state cameras include: (1) charge-coupled-device
(CCD), (2) charge-injected device (CID), and (3) charge-priming device (CPD). These
types are compared in [8].
Typical square pixel arrays are 256 x 256, 512 x 512, and 1024 x 1024 picture elements.
Other arrays include 240 x 320, 500 x 582, and 1035 x 1320 pixels [24]. The
resolution of the vision system is its ability to sense fine details and features in the image.
Resolution depends on the number of picture elements used; the more pixels designed
into the vision system, the higher its resolution. However, the cost of the camera increas-
es as the number of pixels is increased. Even more important, the time required to se-
quentially read the picture elements and process the data increases as the number of pixels
grows. The following example illustrates the problem.
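The scale of the problem can be sketched with a quick calculation. The 512 x 512 array size and 30 frames/sec scan rate come from the text; the assumption of one byte (8 bits) per pixel is added here for illustration:

```python
# Estimate the pixel data rate a vision system must sustain.
pixels_per_frame = 512 * 512        # one frame of a 512 x 512 array
frames_per_sec = 30                 # typical US scan rate
pixel_rate = pixels_per_frame * frames_per_sec

print(pixels_per_frame)             # -> 262144 pixels per frame
print(pixel_rate)                   # -> 7864320 pixels per second

# At an assumed 8 bits (1 byte) per pixel, that is roughly 7.9 MB of raw
# image data per second, all analyzed within the 1/30 sec frame time.
time_per_pixel = 1 / pixel_rate
print(f"{time_per_pixel * 1e9:.0f} ns per pixel")  # -> about 127 ns
```

Doubling the array to 1024 x 1024 quadruples the pixel count, which is why resolution, cost, and processing time must be traded off together.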
23.6.2 Image Processing and Analysis

The second function in the operation of a machine vision system is image processing and
analysis. As indicated by Example 23.4, the amount of data that must be processed is sig-
nificant. The data for each frame must be analyzed within the time required to complete
one scan (1/30 sec). A number of techniques have been developed for analyzing the image
data in a machine vision system. One category of techniques in image processing and analy-
sis is called segmentation. Segmentation techniques are intended to define and separate re-
gions of interest within the image. Two of the common segmentation techniques are
thresholding and edge detection. Thresholding involves the conversion of each pixel in-
tensity level into a binary value, representing either white or black. This is done by com-
paring the intensity value of each pixel with a defined threshold value. If the pixel value is
greater than the threshold, it is given the binary bit value of white, say 1; if less than the de-
fined threshold, then it is given the bit value of black, say 0. Reducing the image to binary
form by means of thresholding usually simplifies the subsequent problem of defining and
identifying objects in the image. Edge detection is concerned with determining the location
of boundaries between an object and its surroundings in an image. This is accomplished by
identifying the contrast in light intensity that exists between adjacent pixels at the borders
of the object. A number of software algorithms have been developed for following the bor-
der around the object.
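Both segmentation techniques can be sketched in a few lines of Python. The tiny gray-scale image and the threshold level are invented for illustration; the edge detector here simply flags adjacent-pixel contrast, a much simpler scheme than the border-following algorithms the text mentions:

```python
# Segmentation on a small gray-scale image (rows of intensity values 0-255).
IMAGE = [
    [ 10,  12,  11, 200, 210],
    [  9,  14, 198, 205, 202],
    [ 11, 195, 201, 199, 204],
]

def threshold(image, level=128):
    """Reduce each pixel to binary: 1 (white) above the level, else 0 (black)."""
    return [[1 if p > level else 0 for p in row] for row in image]

def detect_edges(binary):
    """Mark pixels whose right-hand neighbor differs: a horizontal edge point."""
    edges = []
    for r, row in enumerate(binary):
        for c in range(len(row) - 1):
            if row[c] != row[c + 1]:
                edges.append((r, c))
    return edges

binary = threshold(IMAGE)
print(binary[0])             # -> [0, 0, 0, 1, 1]
print(detect_edges(binary))  # -> [(0, 2), (1, 1), (2, 0)]
```

The edge points trace the slanted boundary between the dark background and the bright object, which is exactly the contrast-based boundary location the text describes.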
Another set of techniques in image processing and analysis that normally follows
segmentation is feature extraction. Most machine vision systems characterize an object in
the image by means of the object's features. Some of the features of an object include the
object's area, length, width, diameter, perimeter, center of gravity, and aspect ratio. Fea-
ture extraction methods are designed to determine these features based on the area and
boundaries of the object (using thresholding, edge detection, and other segmentation tech-
niques). For example, the area of the object can be determined by counting the number of
white (or black) pixels that make up the object. Its length can be found by measuring the
distance (in terms of pixels) between the two extreme opposite edges of the part.
23.6.3 Interpretation
For any given application, the image must be interpreted based on the extracted features.
The interpretation function is usually concerned with recognizing the object, a task termed
object recognition or pattern recognition. The objective in these tasks is to identify the ob-
ject in the image by comparing it with predefined models or standard values. Two com-
monly used interpretation techniques are template matching and feature weighting.
Template matching is the name given to various methods that attempt to compare one or
more features of an image with the corresponding features of a model or template stored
in computer memory. The most basic template matching technique is one in which the
image is compared, pixel by pixel, with a corresponding computer model. Within certain sta-
tistical tolerances, the computer determines whether the image matches the template. One
of the technical difficulties with this method is the problem of aligning the part in the same
position and orientation in front of the camera, to allow the comparison to be made with-
out complications in image processing.
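The basic pixel-by-pixel comparison can be sketched as follows. The binary images and the mismatch tolerance are illustrative assumptions, and the sketch presumes the part has already been aligned with the template, which is exactly the difficulty the text notes:

```python
# Pixel-by-pixel template matching on binary images of equal size.
TEMPLATE = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]
IMAGE = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],   # one pixel differs from the template
]

def matches(image, template, tolerance=0.10):
    """True if the fraction of mismatched pixels is within the tolerance."""
    total = sum(len(row) for row in template)
    mismatches = sum(
        1
        for img_row, tmpl_row in zip(image, template)
        for a, b in zip(img_row, tmpl_row)
        if a != b
    )
    return mismatches / total <= tolerance

print(matches(IMAGE, TEMPLATE, tolerance=0.15))  # -> True  (1/8 = 12.5% mismatch)
print(matches(IMAGE, TEMPLATE, tolerance=0.10))  # -> False
```

The tolerance plays the role of the statistical tolerance mentioned in the text: it decides how many mismatched pixels still count as the same object.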
Feature weighting is a technique in which several features (e.g., area, length, and
perimeter) are combined into a single measure by assigning a weight to each feature ac-
cording to its relative importance in identifying the object. The score of the object in the
image is compared with the score of an ideal object residing in computer memory to achieve
proper identification.
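Feature weighting reduces to computing one weighted score and comparing it against the score of the stored ideal object. The feature values, weights, and acceptance band below are invented for illustration:

```python
# Feature weighting: combine several features into a single measure and
# compare against the score of an ideal object stored in memory.
WEIGHTS = {"area": 0.5, "length": 0.3, "perimeter": 0.2}
IDEAL   = {"area": 100.0, "length": 20.0, "perimeter": 44.0}

def score(features, weights):
    """Single weighted measure combining the object's features."""
    return sum(weights[name] * features[name] for name in weights)

def identify(measured, ideal=IDEAL, weights=WEIGHTS, rel_tol=0.05):
    """Accept the object if its score is within 5% of the ideal score."""
    s_ideal = score(ideal, weights)
    s_meas = score(measured, weights)
    return abs(s_meas - s_ideal) / s_ideal <= rel_tol

measured = {"area": 98.0, "length": 21.0, "perimeter": 43.0}
print(identify(measured))  # -> True: close enough to the ideal object
```

The weights express each feature's relative importance in identifying the object, so a feature that discriminates well between part types would be given a larger weight.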
23.6.4 Machine Vision Applications

The reason for interpreting the image is to accomplish some practical objective in an application.
Machine vision applications in manufacturing divide into three categories: (1) inspection,
(2) identification, and (3) visual guidance and control.
Inspection. By far, quality control inspection is the biggest category. Estimates are
that inspection constitutes about 80% of machine vision applications [22]. Machine vision
installations in industry perform a variety of automated inspection tasks, most of which are
either on-line/in-process or on-line/post-process. The applications are almost always in mass
production where the time required to program and set up the vision system can be spread
over many thousands of units. Typical industrial inspection tasks include the following:
All of the preceding inspection applications can be accomplished using 2-D vision systems.
Certain applications require 3-D vision, such as scanning the contour of a surface, inspecting
cutting tools to check for breakage and wear, and checking solder paste deposits on sur-
face mount circuit boards. Three-dimensional systems are being used increasingly in the au-
tomotive industry to inspect surface contours of parts such as body panels and dashboards.
Vision inspection can be accomplished at much higher speeds than the traditional method
of inspecting these components, which involves the use of CMMs.