MACHINE VISION

Machine vision can be defined as the acquisition of image data, followed by the process-
ing and interpretation of these data by computer for some useful application. Machine vi-
sion (also called computer vision, since a digital computer is required to process the image
data) is a rapidly growing technology, with its principal applications in industrial inspection.
In this section, we examine how machine vision works and its applications in QC inspec-
tion and other areas.
Vision systems are classified as being either 2-D or 3-D. Two-dimensional systems
view the scene as a 2-D image. This is quite adequate for most industrial applications, since
many situations involve a 2-D scene. Examples include dimensional measuring and gag-
ing, verifying the presence of components, and checking for features on a flat (or semiflat)
surface. Other applications require 3-D analysis of the scene, and 3-D vision systems are
required for this purpose. Sales of 2-D vision systems outnumber those of 3-D systems by more than ten to one [7]. Our discussion will emphasize the simpler 2-D systems, although many of the techniques used for 2-D are also applicable in 3-D vision work.
The operation of a machine vision system can be divided into the following three functions: (1) image acquisition and digitization, (2) image processing and analysis, and (3) interpretation. These functions and their relationships are illustrated schematically in Figure 23.10.

Figure 23.10 Basic functions of a machine vision system.

23.6.1 Image Acquisition and Digitization

Image acquisition and digitization is accomplished using a video camera and a digitizing sys-
tem to store the image data for subsequent analysis. The camera is focused on the subject
of interest, and an image is obtained by dividing the viewing area into a matrix of discrete
picture elements (called pixels), in which each element has a value that is proportional to
the light intensity of that portion of the scene. The intensity value for each pixel is converted
into its equivalent digital value by an ADC (Section 5.3). The operation of viewing a scene
consisting of a simple object that contrasts substantially with its background, and dividing
the scene into a corresponding matrix of picture elements, is depicted in Figure 23.11.
The figure illustrates the likely image obtained from the simplest type of vision sys-
tem, called a binary vision system. In binary vision, the light intensity of each pixel is ulti-
mately reduced to either of two values, white or black, depending on whether the light
intensity exceeds a given threshold level. A more sophisticated vision system is capable of
distinguishing and storing different shades of gray in the image. This is called a gray-scale
system. This type of system can determine not only an object's outline and area charac-
teristics, but also its surface characteristics such as texture and color. Gray-scale vision sys-
tems typically use 4, 6, or 8 bits of memory. Eight bits corresponds to 2⁸ = 256 intensity
levels, which is generally more levels than the video camera can really distinguish and cer-
tainly more than the human eye can discern.
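To make the digitization concrete, the following short Python sketch quantizes normalized analog intensities to 8-bit gray levels and then reduces them to binary values, as a binary vision system would. The intensity values, bit depth, and threshold here are illustrative assumptions, not values from the text.

```python
# Sketch of pixel digitization for gray-scale and binary vision systems.
# Illustrative only: intensities, bit depth, and threshold are assumed values.

def digitize(intensity, bits=8):
    """Quantize a normalized analog intensity (0.0-1.0) to a digital level,
    as an ADC would. Eight bits gives 2**8 = 256 levels (0-255)."""
    levels = 2 ** bits
    return min(int(intensity * levels), levels - 1)

def binarize(level, threshold=128):
    """Reduce a gray level to binary: 1 (white) above the threshold,
    0 (black) at or below it."""
    return 1 if level > threshold else 0

# A hypothetical scan line of analog pixel intensities
scan_line = [0.05, 0.10, 0.85, 0.90, 0.88, 0.12]
gray = [digitize(v) for v in scan_line]    # [12, 25, 217, 230, 225, 30]
binary = [binarize(g) for g in gray]       # [0, 0, 1, 1, 1, 0]
print(gray, binary)
```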
Each set of digitized pixel values is referred to as a frame. Each frame is stored in a
computer memory device called a frame buffer. The process of reading all the pixel values in a frame is performed with a frequency of 30 times per second (typical in the United States; 25 times per second in European vision systems).

Figure 23.11 Dividing the image into a matrix of picture elements, where each element has a light intensity value corresponding to that portion of the image: (a) the scene; (b) 12 × 12 matrix superimposed on the scene; and (c) pixel intensity values, either black or white, for the scene.

Types of Cameras. Two types of cameras are used in machine vision applications:
vidicon cameras (the type used for television) and solid-state cameras. Vidicon cameras
operate by focusing the image onto a photoconductive surface and scanning the surface with
an electron beam to obtain the relative pixel values. Different areas on the photoconduc-
tive surface have different voltage levels corresponding to the light intensities striking the
areas. The electron beam follows a well-defined scanning pattern, in effect dividing the
surface into a large number of horizontal lines, and reading the lines from top-to-bottom.
Each line is in turn divided into a series of points. The number of points on each line, mul-
tiplied by the number of lines, gives the dimensions of the pixel matrix shown in Figure
23.11. During the scanning process, the electron beam reads the voltage level of each pixel.
Solid-state cameras operate by focusing the image onto a 2-D array of very small,
finely spaced photosensitive elements. The photosensitive elements form the matrix of pix-
els shown in Figure 23.11. An electrical charge is generated by each element according to
the intensity of light striking the element. The charge is accumulated in a storage device con-
sisting of an array of storage elements corresponding one-to-one with the photosensitive
picture elements. These charge values are read sequentially in the data processing and
analysis function of machine vision.
Comparing the vidicon camera and solid-state camera, the latter possesses several ad-
vantages in industrial applications. It is physically smaller and more rugged, and the image
produced is more stable. The vidicon camera suffers from distortion that occurs in the
image of a fast-moving object because of the time lapse associated with the scanning elec-
tron beam as it reads the pixel levels on the photoconductive surface. The relative advan-
tages of the solid-state cameras have resulted in the growing dominance of their use in
machine vision systems. Types of solid-state cameras include: (1) charge-coupled-device
(CCD), (2) charge-injected device (CID), and (3) charge-priming device (CPD). These
types are compared in [8].
Typical square pixel arrays are 256 × 256, 512 × 512, and 1024 × 1024 picture elements. Other arrays include 240 × 320, 500 × 582, and 1035 × 1320 pixels [24]. The
resolution of the vision system is its ability to sense fine details and features in the image.
Resolution depends on the number of picture elements used; the more pixels designed
into the vision system, the higher its resolution. However, the cost of the camera increas-
es as the number of pixels is increased. Even more important, the time required to se-
quentially read the picture elements and process the data increases as the number of pixels
grows. The following example illustrates the problem.

EXAMPLE 23.4 Machine Vision


A video camera has a 512 × 512 pixel matrix. Each pixel must be converted from an analog signal to the corresponding digital signal by an ADC. The analog-to-digital conversion process takes 0.1 microsecond (0.1 × 10⁻⁶ sec) to complete, including the time to move between pixels. How long will it take to collect
the image data for one frame, and is this time compatible with processing at the
rate of 30 frames per second?
Solution: There are 512 × 512 = 262,144 pixels to be scanned and converted. The total time to complete the analog-to-digital conversion process is
(262,144 pixels)(0.1 × 10⁻⁶ sec) = 0.0262 sec
At a processing rate of 30 frames per second, the processing time for each frame
is 0.0333 sec, which is significantly longer than the 0.0262 sec required to per-
form the 262,144 analog-to-digital conversions.
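The arithmetic of this example can be verified with a few lines of Python (the variable names are our own):

```python
# Numerical check of Example 23.4: total ADC time for one 512 x 512 frame
# versus the 1/30-second frame period.

pixels = 512 * 512            # 262,144 pixels per frame
t_adc = 0.1e-6                # 0.1 microsecond per conversion, incl. pixel-to-pixel move

scan_time = pixels * t_adc    # 0.0262 s to digitize one frame
frame_period = 1 / 30         # 0.0333 s available at 30 frames per second

print(f"scan time    = {scan_time:.4f} s")
print(f"frame period = {frame_period:.4f} s")
print("compatible" if scan_time < frame_period else "too slow")
```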

Illumination. Another important aspect of machine vision is illumination. The scene viewed by the vision camera must be well illuminated, and the illumination must be constant over time. This almost always requires that special lighting be installed for a machine vision application rather than relying on ambient lighting in the facility.
Five categories of lighting can be distinguished for machine vision applications, as
depicted in Figure 23.12: (a) front lighting, (b) back lighting, (c) side lighting, (d) structured
lighting, and (e) strobe lighting. These categories represent differences in the positions of
the light source relative to the camera as much as they do differences in lighting tech-
nologies. The lighting technologies include incandescent lamps, fluorescent lamps, sodium
vapor lamps, and lasers.
Figure 23.12 Types of illumination in machine vision: (a) front light-
ing, (b) back lighting, (c) side lighting, (d) structured lighting using
a planar sheet of light, and (e) strobe lighting.
In front lighting, the light source is located on the same side of the object as the cam-
era.This produces a reflected light from the object that allows inspection of surface features
such as printing on a label and surface patterns such as solder lines on a printed circuit
board. In back lighting, the light source is placed behind the object being viewed by the cam-
era. This creates a dark silhouette of the object that contrasts sharply with the light back-
ground. This type of lighting can be used for binary vision systems to inspect for part
dimensions and to distinguish between different part outlines. Side lighting causes irregu-
larities in an otherwise plane smooth surface to cast shadows that can be identified by the
vision system. This can be used to inspect for defects and flaws in the surface of an object.
Structured lighting involves the projection of a special light pattern onto the object
to enhance certain geometric features. Probably the most common structured light pat-
tern is a planar sheet of highly focused light directed against the surface of the object at a
certain known angle, as in Figure 23.12(d). The sheet of light forms a bright line where the
beam intersects the surface. In our sketch, the vision camera is positioned with its line of
sight perpendicular to the surface of the object, so that any variations from the general
plane of the part appear as deviations from a straight line. The distance of the deviation can
be determined by optical measurement, and the corresponding elevation differences can
be calculated using trigonometry.
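The following Python sketch illustrates this triangulation calculation. The geometry assumed here (the text does not give numbers) is a light sheet striking the reference surface at angle theta with the camera viewing perpendicular to the surface, along with a hypothetical pixel-to-millimeter scale factor.

```python
import math

# Sketch of the structured-light triangulation described above.
# Assumptions (not from the text): the light sheet strikes the reference
# surface at angle theta, the camera looks perpendicular to the surface,
# and a known scale factor converts pixel deviation to millimeters.

def elevation_from_deviation(deviation_px, mm_per_px, theta_deg):
    """Elevation of a surface point from the lateral shift of the light line.

    A feature raised by height h shifts the bright line laterally by
    d = h / tan(theta), so h = d * tan(theta)."""
    d_mm = deviation_px * mm_per_px
    return d_mm * math.tan(math.radians(theta_deg))

# Example: a 14-pixel line shift at 0.2 mm/pixel with a 30-degree light sheet
h = elevation_from_deviation(14, 0.2, 30.0)
print(f"elevation = {h:.2f} mm")   # about 1.62 mm
```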
In strobe lighting, the scene is illuminated by a short pulse of high-intensity light,
which causes a moving object to appear stationary. The moving object might be a part mov-
ing past the vision camera on a conveyor. The pulse of light can last 5-500 microseconds [8]. This is sufficient time for the camera to capture the scene, although the camera actua-
tion must be synchronized with that of the strobe light.

23.6.2 Image Processing and Analysis

The second function in the operation of a machine vision system is image processing and
analysis. As indicated by Example 23.4, the amount of data that must be processed is sig-
nificant. The data for each frame must be analyzed within the time required to complete
one scan (1/30 sec). A number of techniques have been developed for analyzing the image
data in a machine vision system. One category of techniques in image processing and analy-
sis is called segmentation. Segmentation techniques are intended to define and separate re-
gions of interest within the image. Two of the common segmentation techniques are
thresholding and edge detection. Thresholding involves the conversion of each pixel in-
tensity level into a binary value, representing either white or black. This is done by com-
paring the intensity value of each pixel with a defined threshold value. If the pixel value is
greater than the threshold, it is given the binary bit value of white, say 1; if less than the de-
fined threshold, then it is given the bit value of black, say 0. Reducing the image to binary
form by means of thresholding usually simplifies the subsequent problem of defining and
identifying objects in the image. Edge detection is concerned with determining the location
of boundaries between an object and its surroundings in an image. This is accomplished by
identifying the contrast in light intensity that exists between adjacent pixels at the borders
of the object. A number of software algorithms have been developed for following the bor-
der around the object.
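The following Python sketch illustrates both segmentation techniques on a small assumed gray-scale frame; the pixel values and the threshold are hypothetical.

```python
# Minimal sketches of thresholding and simple edge detection,
# operating on a small assumed gray-scale frame (lists of pixel levels).

frame = [
    [ 10,  12,  11, 200, 210,  13],
    [  9, 205, 215, 220, 208,  10],
    [ 11, 210, 212, 218, 205,  12],
    [ 10,  11, 209, 211,  12,  11],
]

THRESHOLD = 128

# Thresholding: reduce each pixel to a binary value (1 = white, 0 = black)
binary = [[1 if p > THRESHOLD else 0 for p in row] for row in frame]

# Edge detection (a simple form): mark pixels where the intensity contrast
# with the right or lower neighbor exceeds the threshold
edges = set()
for i in range(len(frame)):
    for j in range(len(frame[0])):
        for di, dj in ((0, 1), (1, 0)):
            ni, nj = i + di, j + dj
            if ni < len(frame) and nj < len(frame[0]):
                if abs(frame[i][j] - frame[ni][nj]) > THRESHOLD:
                    edges.add((i, j))

print(binary)
print(sorted(edges))
```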
Another set of techniques in image processing and analysis that normally follows
segmentation is feature extraction. Most machine vision systems characterize an object in
the image by means of the object's features. Some of the features of an object include the
object's area, length, width, diameter, perimeter, center of gravity, and aspect ratio. Fea-
ture extraction methods are designed to determine these features based on the area and
boundaries of the object (using thresholding, edge detection, and other segmentation tech-
niques). For example, the area of the object can be determined by counting the number of
white (or black) pixels that make up the object. Its length can be found by measuring the
distance (in terms of pixels) between the two extreme opposite edges of the part.
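A minimal Python sketch of this kind of feature extraction follows, assuming a small binary image in which 1 represents an object (white) pixel.

```python
# Sketch of feature extraction from a binary image: area by counting
# white pixels, length and width from the extreme opposite edges.
# The binary matrix is an assumed example.

binary = [
    [0, 0, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 1],
    [0, 0, 1, 1, 0],
]

object_pixels = [(i, j) for i, row in enumerate(binary)
                 for j, p in enumerate(row) if p == 1]

area = len(object_pixels)                   # pixel count as a measure of area
rows = [i for i, _ in object_pixels]
cols = [j for _, j in object_pixels]
length = max(cols) - min(cols) + 1          # extent in pixels, x direction
width = max(rows) - min(rows) + 1           # extent in pixels, y direction

print(f"area = {area} px, length = {length} px, width = {width} px")
```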

23.6.3 Interpretation

For any given application, the image must be interpreted based on the extracted features.
The interpretation function is usually concerned with recognizing the object, a task termed
object recognition or pattern recognition.The objective in these tasks is to identify the ob-
ject in the image by comparing it with predefined models or standard values. Two com-
monly used interpretation techniques are template matching and feature weighting.
Template matching is the name given to various methods that attempt to compare one or
more features of an image with the corresponding features of a model or template stored
in computer memory. The most basic template matching technique is one in which the
image is compared, pixel by pixel, with a corresponding computer model. Within certain sta-
tistical tolerances, the computer determines whether the image matches the template. One
of the technical difficulties with this method is the problem of aligning the part in the same
position and orientation in front of the camera, to allow the comparison to be made with-
out complications in image processing.
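A minimal sketch of this basic pixel-by-pixel comparison is shown below. The images and the mismatch tolerance are assumed values; the text does not specify how the statistical tolerance is defined.

```python
# Sketch of basic template matching: pixel-by-pixel comparison of a
# binary image against a stored template, accepting a match if the
# fraction of mismatched pixels is within an assumed tolerance.

def matches_template(image, template, tolerance=0.05):
    """True if the fraction of mismatched pixels is within tolerance.
    Assumes image and template are same-size binary matrices and that
    the part has already been aligned in position and orientation."""
    total = len(image) * len(image[0])
    mismatches = sum(1 for irow, trow in zip(image, template)
                       for ip, tp in zip(irow, trow) if ip != tp)
    return mismatches / total <= tolerance

template = [[0, 1, 1, 0],
            [0, 1, 1, 0],
            [0, 1, 1, 0]]
image    = [[0, 1, 1, 0],
            [0, 1, 1, 1],   # one stray white pixel
            [0, 1, 1, 0]]

print(matches_template(image, template, tolerance=0.10))  # 1/12 ~ 8% -> True
```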
Feature weighting is a technique in which several features (e.g., area, length, and
perimeter) are combined into a single measure by assigning a weight to each feature ac-
cording to its relative importance in identifying the object. The score of the object in the
image is compared with the score of an ideal object residing in computer memory to achieve
proper identification.
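A sketch of how such a weighted score might be computed and compared is given below; the features, weights, and acceptance threshold are all assumed for illustration.

```python
# Sketch of feature weighting: several extracted features are combined
# into one weighted score and compared against the score of the ideal
# object stored in memory. Feature values and weights are assumed.

FEATURES = ("area", "length", "perimeter")
WEIGHTS = {"area": 0.5, "length": 0.3, "perimeter": 0.2}  # relative importance

def weighted_score(features):
    return sum(WEIGHTS[name] * features[name] for name in FEATURES)

ideal = {"area": 1200.0, "length": 85.0, "perimeter": 240.0}
measured = {"area": 1185.0, "length": 86.0, "perimeter": 236.0}

score_ideal = weighted_score(ideal)
score_measured = weighted_score(measured)

# Identify the object if its score is close enough to the ideal score
relative_error = abs(score_measured - score_ideal) / score_ideal
print(f"relative error = {relative_error:.3%}")
print("identified" if relative_error < 0.02 else "rejected")
```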

23.6.4 Machine Vision Applications

The reason for interpreting the image is to accomplish some practical objective in an ap-
plication. Machine vision applications in manufacturing divide into three categories: (1) in-
spection, (2) identification, and (3) visual guidance and control.

Inspection. By far, quality control inspection is the biggest category. Estimates are
that inspection constitutes about 80% of machine vision applications [22]. Machine vision
installations in industry perform a variety of automated inspection tasks, most of which are
either on-line/in-process or on-line/post-process. The applications are almost always in mass
production where the time required to program and set up the vision system can be spread
over many thousands of units. Typical industrial inspection tasks include the following:

• Dimensional measurement. These applications involve determining the size of certain dimensional features of parts or products, usually moving at relatively high speed on a conveyor. The machine vision system must compare the features (dimensions) with the corresponding features of a computer-stored model and determine the size value.
• Dimensional gaging. This is similar to the preceding except that a gaging function
rather than a measurement is performed.
• Verification of the presence of components in an assembled product. Machine vision
has proved to be an important element in flexible automated assembly systems.
• Verification of hole location and number of holes in a part. Operationally, this task
is similar to dimensional measurement and verification of components.
• Detection of surface flaws and defects. Flaws and defects on the surface of a part or
material often reveal themselves as a change in reflected light. The vision system can
identify the deviation from an ideal model of the surface.
• Detection of flaws in a printed label. The defect can be in the form of a poorly lo-
cated label or poorly printed text, numbering, or graphics on the label.

All of the preceding inspection applications can be accomplished using 2-D vision systems.
Certain applications require 3-D vision, such as scanning the contour of a surface, inspecting
cutting tools to check for breakage and wear, and checking solder paste deposits on sur-
face mount circuit boards. Three-dimensional systems are being used increasingly in the au-
tomotive industry to inspect surface contours of parts such as body panels and dashboards.
Vision inspection can be accomplished at much higher speeds than the traditional method
of inspecting these components, which involves the use of CMMs.

Other Machine Vision Applications. Part identification applications are those in which the vision system is used to recognize and perhaps distinguish parts or other objects so that some action can be taken. The applications include part sorting, counting different types of parts flowing along a conveyor, and inventory monitoring. Part identification can usually be accomplished by 2-D vision systems. Reading of 2-D bar codes and charac-
ter recognition (Sections 12.3.3 and 12.3.4) represent additional identification applications
performed by 2-D vision systems.
Visual guidance and control involves applications in which a vision system is teamed
with a robot or similar machine to control the movement of the machine. Examples of
these applications include seam tracking in continuous arc welding, part positioning and/or
reorientation, bin picking, collision avoidance, machining operations, and assembly tasks.
Most of these applications require 3-D vision.
