
Acknowledgements

We are highly indebted to Dr. A. S. V. Sarma, scientist-in-charge at the Central Electronics Engineering Research Institute (CEERI) Centre, Chennai. His valuable guidance and support as project guide and project manager for the on-plant training helped us immensely to learn and to remain motivated during the training. We would also like to thank Dr. P. K. Chatley, head of the training and placement centre at Dr. B. R. Ambedkar National Institute of Technology, Jalandhar, for approving and organizing the training. Our special thanks go to Dr. Arun Khosla, head of the department of Electronics and Communication Engineering at the institute, for his consent and recommendations for the training. We are grateful to Mrs. Thilaka Mohandoss, senior technician (I) at CEERI Centre, Chennai, for her kind support with the registration formalities. Lastly, we would like to thank the scientists at the various labs and the central library at CEERI Centre, Chennai for their guidance and consultancy.

INTRODUCTION
The on-plant industrial training was on the Study of Inspection Systems and Implementation of ART Neural Networks as a Re-trainable Classifier. Machine vision based inspection is a widely accepted technology used in the manufacturing industry for applications including quality assurance and control, and machine vision has become a production line staple in most industries. In 2010 the machine vision market grew by more than 50% across North America and worldwide, continuing a trend that began in the 1980s. Advances in electronics have been the main driver of this expansion since the 1980s, when processing chips made smart cameras possible, faster communication bus technologies emerged and, most importantly, digital image processing with robust soft-computing algorithms evolved. The heart of such an inspection system is the processing technology, which must provide quality assurance and control in real-world problems involving imprecise and uncertain situations.

Presently, most industrial environments require the inspection system to be rugged, robust and flexible enough to cope with constantly changing real-time situations. A large variety of situations require the system to learn new parameters or classes of situational variables as and when a new breed of situations becomes apparent. This can, at least at present, be fulfilled to a very large extent by a re-trainable classifier. Adaptive Resonance Theory (ART) neural networks impart to the system the so-called stability to remember previously learned patterns as well as the plasticity to learn new situations, giving the classifier its required re-trainability. There is a clear requirement for such a system in today's industrial environment. The outline of the project included the study of these artificial neural networks, from the basics up to ART networks, from textbooks coherent with pattern recognition and classification and, occasionally, from internet articles.

From a historical perspective, many researchers have been influenced and inspired by the complex structure, working and versatility of the human brain. In particular, the ART networks were inspired by the brain's ability to learn new concepts while retaining those learnt before. The theory of ART networks was formulated and proposed by Stephen Grossberg and Gail Carpenter, both pioneers of the fields of cognitive and neural sciences and proficient mathematicians. Since the 1980s both, in conjunction with their students and colleagues, have developed a large variety of ART networks. Therefore we, as electronics engineering students, undertook on-plant training research on the implementation of ART neural networks as a re-trainable classifier in inspection systems. This required thorough background knowledge of the basics of machine vision systems and of the importance of a classifier in the system's processing cycle. Thus, a number of references from a variety of machine vision books, journals and documents were consulted, along with interaction with the scientists working in the labs at CEERI Centre, Chennai, in order to become sufficiently familiar with the practical industrial factors affecting the accuracy, stability and performance of the system being developed, some of which included illumination techniques and sources, the importance of imaging sensors, imaging and image processing techniques, hardware technology, etc.
This also highlighted the importance of inspection systems in the industrial environment.

MACHINE VISION BASED INSPECTION SYSTEM


1. Basics of Machine Vision Systems
I. What is machine vision?
A machine vision system comprises a group of devices that receive the image of a real scene and analyze it for the objects of interest, interpreting the scene to arrive at a decision based on predefined criteria set according to the application. Machine vision provides cost and quality benefits by replacing human vision on tasks that are fast, repetitive and require exact measurements. Machine vision has three general processes: 1) Location or search finds the position of the objects of interest; when machine vision is used to guide a robot this task is called alignment, and when used to follow a moving object it is called tracking. 2) Identification is selecting a particular object from a set of possible objects; when Optical Character Recognition (OCR) or bar codes are used to identify an object, identification is called reading. 3) Inspection checks whether the object has the proper dimensions, meets quality standards, is free of some class of defects, etc. A very important feature of a machine vision system is knowledge about the world, or predefined prototypes/models of real-world objects, their features, geometrical properties and the relationships among them, which adds to the robustness, flexibility and adaptability of machine vision systems.
II. Difference between machine and computer vision
Machine Vision: Machine vision is basically the application of computer vision to factory automation. Machine vision tends to focus on applications, mainly in manufacturing, e.g., vision based autonomous robots and systems for vision based inspection or measurement. It also implies that external conditions such as lighting can be, and often are, more controlled in machine vision than in general computer vision, which can enable the use of different algorithms.

Computer Vision: Computer vision is concerned with the theory behind artificial systems that extract information from images that is necessary to solve some task. Computer vision tends to focus on the 3D scene projected onto one or several images, e.g., how to reconstruct structure or other information about the 3D scene from one or several images. Computer vision often relies on more or less complex assumptions about the scene depicted in an image.

III. Differences between machine and human vision
Machine vision provides cost and quality benefits by replacing human vision on tasks that are fast, repetitive and require exact measurements. Human vision suffers from fatigue and oversight on such tasks, which leads to errors.
Machine vision is the application of computer vision to factory automation. Human vision is biological vision, based on the binocular eyes and the brain.
Machine vision systems use digital cameras and image processing software to visually inspect parts and judge the quality of workmanship. Human inspectors work on assembly lines to perform similar inspections.
Machine vision systems have less intelligence and learning capability than a human; machine vision is yet to replace human vision.
Machine vision can work in the UV and IR ranges, whereas ultraviolet and infrared light are invisible to human vision.

IV. Components of machine vision based inspection systems
A simple machine vision system will consist of the following:
1. Illumination system
2. Optics
3. A camera
4. Camera interface card for the computer, known as a "frame grabber"
5. Computer software to process images
6. Digital signal hardware or a network connection to report results
Figure 1.1: Components of machine vision system

V. Importance of the components and their basic features
1. Illumination system
The purpose of the illumination system is to control the lighting on the products to be inspected so as to improve how the object appears to the camera. It therefore determines the quality of the image produced for processing, and illumination is one of the critical aspects of machine vision systems whose optimization can minimize effort, time and resources. There are different sources of illumination, including LEDs, fluorescent lamps, halogen lamps, xenon, mercury lamps and high-pressure sodium lamps, which are used according to the needs of the application.
2. Optics
Cameras used in the domains of industry, medicine and science are usually shipped without a lens. Therefore, analysis and adjustment of the optics according to the requirements of the machine vision application is needed; on the basis of this analysis a lens for the system is ordered and mounted. Due to their standardized mount, the so-called C-mount lenses are widely used in machine vision. The calculation of these lenses requires only simple arithmetic: an addition, a multiplication and a division.
3. Camera
The camera transforms optical signals (light) into electrical ones (voltage) and digitizes them into a raw digital image. The camera contains the image sensor that converts photons of light into electrical signals. Today, digital cameras use either a CCD (charge coupled device) or a CMOS (complementary metal oxide semiconductor) imaging sensor. Each discrete value coming out of the A/D converter used to digitize the image is a pixel, the smallest distinguishable area in an image. CCD cameras are becoming smaller, lighter and less expensive; images are sharper and more accurate, and the new dual-output cameras produce images twice as fast as previous models. CMOS cameras are preferred for applications that need high-speed performance at low cost.
4. Frame grabbers
Frame grabbers are specialized A/D converters which change video or still images into digital information. Most frame grabbers are printed circuit boards compatible with the most common types of bus structures, including peripheral component interconnect (PCI). A frame grabber can be of two types, analog or digital: an analog frame grabber is used with an analog camera and a digital frame grabber with a digital camera. Today's frame grabbers offer greater stability and accuracy than earlier models, and some can even handle image processing and enhancement using digital signal-processing techniques.
5. Computer software
The software will typically take several steps to process an image. Often the image is first manipulated to reduce noise or to convert many shades of gray to a simple combination of black and white. Following the initial simplification, the software will count, measure and/or identify objects in the image. As a final step, the software passes or fails the part according to programmed criteria. If a part fails, the software signals a robotic device to reject the part; alternatively, the system may warn a human worker to fix the production problem that caused the failure.

6. Computer system
Machine vision in the domains of industry, medicine and science is dominated by PCs running Windows and using modern interfaces such as USB and FireWire. Efficient visualization requires graphics hardware with on-board memory. If image sequences are to be recorded, the computer configuration should be similar to that of video editing systems (fast processor, fast separate hard disk). For simple applications with one camera and a slow sequence of images a simple low-end computer may be sufficient; however, increasing complexity, number of cameras and number of frames may lead to a processing load that has to be distributed across several PCs.
VI. Basic steps in machine vision based inspection
A machine vision system carries out the basic steps given below; a minimal code sketch of this cycle is given after the list of advantages that follows.
1. Object presentation: Depending upon the camera employed in the machine vision system, it is often necessary to analyse how the object should be presented to the camera for imaging, in addition to optimizing the illumination. Many cameras can acquire good images when the objects are still, but large objects may need to be rotated, and in some other cases two or more cameras are employed to acquire a 3D image of the object. This is an important step, as it greatly helps the identification and analysis of the targeted objects.
2. Imaging: The object presented to the camera under optimized illumination is sensed by the imaging sensor (CCD or CMOS) in the camera and converted into an electrical signal (voltage). This analog signal is converted into digital form using a specialized analog-to-digital converter called a frame grabber. The frame grabber is compatible with the most common bus structures (USB or FireWire), including peripheral component interconnect (PCI), and sends the digital data to the computer system for image analysis and processing.
3. Analysis: Using the computer software, the acquired digital image is first processed to remove noise or otherwise simplify it. The image is then segmented, using a suitable algorithm chosen from many methods, to separate the objects from the background and to count, measure and identify them as required. Based on these image processing algorithms the software produces the appropriate digital output.
4. Control action: Based upon the digital control output from the software, the computer signals a robotic device to reject the part if it fails; alternatively, the system may warn a human worker to fix the production problem that caused the failure. The cause of the failure is recorded for the further processing of the rest of the products.
VII. Advantages of machine vision based systems
Some of the advantages of using machine vision:

1) Increased productivity: It increases the productivity of the company's products by reducing direct and indirect labor and reducing the burden rate.
2) Increased machine utilization: It becomes possible to locate the position of an object, to measure dimensions to within thousandths-of-an-inch accuracy, to count small items and to identify objects.
3) Increased flexibility of production: Production becomes more flexible, as machines are easily controlled and reconfigured.
4) Reduced operations cost: It reduces overall costs, including work in progress, scrap, set-up times, lead times and material handling costs.
5) Increased quality assurance: It increases the quality of products by verifying whether an object meets standards.
6) Reduced errors: Errors caused by fatigue, human judgment and oversight can be eliminated using a machine vision system.
7) Increased customer satisfaction: Better quality products increase customer satisfaction, which increases the net profit of the company.
8) High speed and repeatability: Manufacturers favor machine vision systems for visual inspections that require high speed, high magnification, 24-hour operation and repeatability of measurements.
9) Wide-range vision inspection: Machine vision systems can work in UV and IR light, which gives them an advantage over human vision.
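To make the inspection cycle of section VI concrete, the sketch below outlines one pass through the four steps in Python. It is only an illustrative outline under assumed interfaces: acquire_frame, preprocess, segment_and_measure, reject_part and warn_operator are hypothetical placeholders for the real camera, image-processing and actuator functions, not part of any specific library.

# Minimal sketch of the inspect-and-act cycle described in section VI above.
# All callables passed in are hypothetical placeholders for the real
# camera/frame-grabber, image-processing and actuator interfaces.
def inspection_cycle(acquire_frame, preprocess, segment_and_measure,
                     reject_part, warn_operator, tolerance):
    while True:
        raw = acquire_frame()                        # imaging (camera + frame grabber)
        clean = preprocess(raw)                      # noise removal / simplification
        value, nominal = segment_and_measure(clean)  # analysis: segment, measure, identify
        if abs(value - nominal) > tolerance:
            reject_part()                            # control action: reject the failed part
            warn_operator(value, nominal)            # and record the cause of the failure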

2. Illumination Sources
I. Importance
The purpose of machine vision illumination is to control how the object appears to the camera. Image quality depends heavily on the lighting, irrespective of camera and frame grabber parameters, because a camera is far less versatile than the human eye under uncontrolled conditions; the quality of the lighting therefore largely determines the robustness of the machine vision system. Designing and following a rigorous lighting analysis sequence will minimize time, effort and resources, and optimum lighting eliminates or reduces the need for filters and processing after image acquisition. Illumination is thus one of the most critical parts of an imaging system. Specialist illumination sources offer several benefits over a customized design: cost effectiveness, proven reliability, repeatability and variety.

II. Types of illumination sources
1. Fluorescent sources: These are common in household use but less popular in the machine vision industry. Fluorescent lamps are more powerful than LEDs but less powerful than metal halide bulbs; they are of moderate intensity, with a service life to match. Fluorescent tubes are AC devices, and a high-frequency supply (well above the 50 Hz mains frequency) is essential to avoid flicker. Standard fluorescent tubes do not have a very uniform colour balance, being predominantly blue with little red; however, daylight varieties have a higher colour temperature and are generally more suited to machine vision.

Figure 2.1: Fluorescent Sources

2. LED illumination: LEDs are most appropriate for low-speed machine vision applications, as they have adequate light intensity, low cost, long service life, a low DC supply requirement, and more flexible cabling than bulky metal halide bulbs with their inflexible fibre optic light guides. Modern high-intensity LEDs give high illumination and can match metal halide bulb intensity when used with a strobe controller. They are commonly used in spotlights, linelights, backlights and diffuse lights, and are available in a variety of colours, hence their extensive application. Their advantages include flexibility in application and output stability; their main disadvantage is limited cost-effectiveness for large-area lighting.
3. Metal halide (mercury): Metal halide sources, also known as mercury sources, are often used in microscopy because they have many discrete wavelength peaks, which complements the use of filters for fluorescence studies. The very high intensity of a halogen or metal vapour bulb, housed within a light source box, is delivered through an optical fibre light guide into a light 'adapter' positioned close to the object to be illuminated. These commonly take the form of line lights and ring lights of various sizes and lengths. They are not practical to use beyond about 5 m because of losses.

Figure 2.2: Fiber optics illumination

The more sophisticated halogen light sources have RS232 control of intensity. The most powerful light sources available use nickel metal halide and mercury vapour bulbs and can produce approximately five times greater light intensity than a halogen source. They also provide the ultimate intensity for very high speed applications such as line-scan cameras. For fast-moving applications, xenon strobe light sources provide bright white light of short duration, ideal for freezing motion.
4. Xenon: Xenon is useful in bright strobe lighting applications. A specific advantage of this source is its high light output from a nominal power of only 35 W. The xenon source is also well suited to line-scan cameras and avoids the need for costly halogen-based DC supplies.

Figure 2.3: A comparison of different sources on various parameters

III. Different types of Lamp geometries
Lights and lamps are available in different geometries and colours, chosen by the engineer according to the needs of the application. More than one type of lighting may be required to illuminate the different components viewed by the machine vision system. A dual-circular fluorescent illuminator uses two independently controlled circular lamps to provide 360 degrees of uniform illumination.


Figure 2.4: Dual-Circular Lamp

General types include spot, rectangular, linear and ring formats. These are also customizable for the customer and are widely used in medical and pharmaceutical applications, among others.

Figure 2.5: Various lamp lighting geometries

IV. Choice of Illumination Source for a given Application

Figure 2.6: Before and after correct illumination

Ring lights and array lights are available for occasions when surfaces are flat and diffuse. For cases where outside dimensions must be measured or openings viewed, backlights work best. The wavelength of the illumination is an important factor: viewing with light from the opposite end of the spectrum to the observed colour is useful in many situations, and UV or IR is more useful in situations where the image is otherwise difficult to capture. Strobed LED lighting is also useful in the inspection of moving parts.

Figure 2.7: Example of different wavelengths lighting differences


3. Imaging Sensors
I. Importance of imaging sensors in machine vision systems
A machine vision system requires an appropriate and accurate image sensing device for robust and correct identification, analysis and interpretation; hence it is essential to use the most suitable imaging sensor available. Broadly speaking, if the camera is the eye of a machine vision system, then the image sensor is the heart of the camera. The choice of sensor is made according to the accuracy, throughput, sensitivity and cost requirements of the machine vision system. A basic understanding of sensor attributes and application needs will narrow the search for the right sensor, as shown by the figure below.

Figure 3.1: When selecting a sensor for a machine vision application it is important to have a thorough understanding of the application needs for dynamic range, speed, and responsivity.

The sensor is made up of millions of "buckets" that essentially count the number of photons that strike the sensor. This means that the brighter the image at a given point on the sensor, the larger the value that is read for that pixel. The number of resulting pixels in the image determines its "pixel count". For example, a 640x480 image has 307,200 pixels, or approximately 307 kilopixels; a 3872x2592 image has 10,036,224 pixels, or approximately 10 megapixels. This illustrates the importance of image sensor technology.
II. Different types of cameras and their characteristics
Depending upon the parameters to be considered and the conditions of use, there are different types of cameras.
A. Based upon the type of image sensor employed:
CCD Camera:

A CCD camera uses a CCD imaging sensor, which consists of photodiodes that detect the light photons. The CCD shift registers themselves are optically shielded and are used only for readout. The collected charge is transferred simultaneously to the vertical CCDs at the end of the integration time (a new integration period can begin right after the transfer); this charge transfer to the vertical CCDs simultaneously resets the photodiodes.
Figure 3.2: CCD imaging sensor

The charges accumulated in the vertical CCDs are transferred to the horizontal CCD, which shifts the charge to the output amplifier. CCD cameras have the advantages of high quality images from optimized photodetectors with high quantum efficiency (QE), low dark current and very low noise (no noise is introduced during shifting). Their disadvantages are that they are largely non-programmable, require high power because the entire array is switching all the time, and have a limited frame rate (for large sensors) due to the required increase in transfer speed while maintaining acceptable transfer efficiency. CMOS Camera:

A CMOS camera is a digital camera built around a CMOS image sensor, an integrated circuit that records the image. The complementary metal-oxide semiconductor (CMOS) sensor consists of millions of pixel sensors, each of which includes a photodetector. As light enters the camera through the lens, it strikes the CMOS sensor, causing each photodetector to accumulate an electric charge based on the amount of light that strikes it.

Figure 3.3: CMOS Sensor Array

The digital camera then converts the charge to the pixels that make up the photo. Unlike a CCD, each pixel in a CMOS imager has its own individual amplifier integrated inside; since each pixel has its own amplifier, the pixel is referred to as an "active pixel". In addition, each pixel in a CMOS imager can be read directly on an x-y coordinate system, rather than through the "bucket-brigade" process of a CCD. This means that while a CCD pixel always transfers a charge, a CMOS pixel detects a photon directly, converts it to a voltage and transfers the information directly to the output. This fundamental difference in how information is read out of the imager, coupled with the manufacturing process, gives CMOS imagers several advantages over CCDs: high speed, on-chip system integration and low manufacturing cost.

B. Based on the motion of the object: If the object is moving along a conveyor, the camera will need to be either a progressive area scan camera or a line scan camera. Progressive Area Scan-

This type of camera is used in applications where the image is to be read as a whole, unlike an interlaced camera, which reads two distinct fields (odd and even lines) separated by a 40 ms time interval before the resultant image is read out as a complete frame. Progressive scan cameras read all lines within the same scan and therefore no image blur is visible.

Line Scan cameras-

A line scan camera has a linear image sensor, generally a single row of up to about 8000 pixels. Line scan cameras read data at many thousands of lines per second and can therefore handle defect detection on very fast moving objects; hence they are used in applications that demand very fast data capture. Area scan cameras do not have the speed to capture data from such moving objects, for example paper or textiles, which may travel at many tens of metres per second.
C. Based upon colour detection capability:
Monochrome Cameras:

These cameras contain monochrome sensors without a colour filter matrix, so every pixel responds to the full spectrum of light reaching the sensor. This makes them more sensitive than colour cameras, whose filter arrays typically degrade the sensor sensitivity by around 30 per cent. Color Cameras:

Colour cameras have a single sensor with an array of colour filters printed over the pixels. Adjacent pixels use different colours, so the resolution at each colour is less than that of a monochrome camera. Some high-performance colour cameras employ a colour-separation prism along with three chips, one per primary colour, to give full resolution in each colour. Colour cameras have certain disadvantages compared with monochrome cameras: 1. They are less sensitive than monochrome cameras. 2. Assuming the same number of pixels, the effective resolution of a colour camera is lower than that of a monochrome camera. 3. A green, blue and red value is needed for every pixel, so the raw digital image has to undergo colour interpolation; this interpolation requires extra processing power and bandwidth during data transfer. Colour cameras are therefore only used if the different colours of an image "carry" information.
D. Based upon interfacing criteria:
Analog Cameras: In this type of camera the signal from the sensor is converted to an analog voltage and then fed to a frame grabber board in the machine vision computer.
Digital Cameras: The signal from the pixels is digitized and the digitized data is fed directly to the computer. Most new machine vision systems employ digital cameras.

III. Typical camera parameters and their significance
Typical camera parameters include the following.
1. Shutter (Exposure Time) The shutter determines the CCD's exposure time. It may be adjusted manually or automatically. The first three sample images show a key ring (the LED is initially off) with a correct exposure time, one that is too long and one that is too short.

Figure 3.5: Correct Exposure time

Figure 3.6: Exposure time too long

Figure 3.7: Exposure time too short

When the LED is switched on, the image is overexposed in such a way that it shows only a big white spot. The LED is correctly represented if we decrease the exposure time. There is, however, a vertical line which disturbs the image; this is a typical CCD problem known as "smear" (Figure 3.9). To avoid this, we close the diaphragm and increase the exposure time:

Figure 3.8: Extremely overexposed

Figure 3.9: Smear

Figure 3.10: Correct representation

2. Gain (Contrast) Gain determines the amplification of the CCD's output signal. This parameter may be adjusted manually or automatically. The amplification increases the contrast. A high gain, however, leads to noisy images.

Figure 3.11: Source

Figure 3.12: Contrast increase

Figure 3.13: Gain too high

3. Offset (Brightness) The offset is added to the camera's output signal; it increases all gray levels, resulting in a brighter image. This parameter may be adjusted manually or automatically. (A small code sketch combining gain and offset is given after the figures below.)


Figure 3.14: Source image

Figure 3.15: Slight brightness increase

Figure 3.16: Overdone brightness increase
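As a small illustration of the two parameters just described, the sketch below applies gain and offset to an 8-bit image with NumPy; the particular gain and offset values are arbitrary examples, not recommendations for any specific camera.

import numpy as np

# Gain scales the camera signal (contrast), offset shifts it (brightness):
#   output = gain * input + offset, clipped to the 8-bit range.
def apply_gain_offset(image_u8, gain=1.2, offset=20):
    out = gain * image_u8.astype(np.float64) + offset
    return np.clip(out, 0, 255).astype(np.uint8)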

4. Auto Exposure and Exposure Reference Auto Exposure determines whether the exposure time and the gain are adjusted manually or automatically. It compares the mean gray level of the current image with the Exposure Reference; if these values differ, the exposure time and the gain are varied accordingly. 5. Sharpness This mechanism may be used to enhance blurred images. Overdoing its application, however, leads to distortions.

Figure 3.17: Source image

Figure 3.18: Sharpness improvement

Figure 3.19: Overdone sharpness

6. Gamma Gamma increases or decreases the middle gray levels; in other words, it is a way to compensate for the non-linear behavior of picture tubes (a small code sketch follows the figures below):

Figure 3.20: Source image

Figure 3.21: Increased middle graylevels

Figure 3.22: Decreased middle graylevels
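The effect of gamma on the middle gray levels can be sketched as a simple lookup table; this is a generic gamma mapping, not the exact curve implemented by any particular camera.

import numpy as np

# Gamma correction as a lookup table. Black (0) and white (255) stay fixed;
# gamma < 1 raises the middle gray levels, gamma > 1 lowers them.
def apply_gamma(image_u8, gamma=0.7):
    lut = (255.0 * (np.arange(256) / 255.0) ** gamma).astype(np.uint8)
    return lut[image_u8]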

7. Saturation This parameter is used to adjust the colour saturation, from monochrome up to highly saturated colour values:

Figure 3.23: Source image

Figure 3.24: Saturation = 0

Figure 3.25: Maximum Saturation


8. Hue This parameter is used to shift color values. Nevertheless, the relation between the colors remains (in contrast to the parameter White Balance):

Figure 3.26: Source image

Figure 3.27: Color shift

9. White Balance This parameter is used to vary the degree of red and blue in the image to achieve a lifelike color representation. The values can be controlled manually or automatically. The automatic white balance feature offers two operating modes: Auto, in which the balancing algorithm is applied to the video stream continuously, and One push, in which the balancing is controlled by triggers. Simple multimedia cameras provide only one white balance parameter, in which increasing the degree of one colour decreases the degree of the other and vice versa, whereas high quality cameras offer two parameters which can be changed independently. (An illustrative white balance sketch is given after the figures below.)

Figure 3.28: Source image

Figure 3.29: Degree of blue too low

Figure 3.30: Degree of red too low
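One simple way to realise the red/blue adjustment described above is the "gray world" rule, which scales the red and blue channels so that their means match the green channel's mean. This is only an illustrative balancing rule; real cameras implement white balance in firmware.

import numpy as np

# "Gray world" white balance: scale red and blue so their channel means
# match the green channel's mean, then clip back to the 8-bit range.
def gray_world_balance(rgb_u8):               # rgb_u8: H x W x 3 array
    img = rgb_u8.astype(np.float64)
    means = img.reshape(-1, 3).mean(axis=0)   # mean of R, G and B
    gains = means[1] / means                  # normalise each channel to green
    return np.clip(img * gains, 0, 255).astype(np.uint8)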

Camera calibration is often performed as an early stage in machine vision, and especially in the field of augmented reality; it yields the intrinsic and extrinsic parameters of the camera.
IV. Choice of camera for a given application
Choosing the right camera from among the many available, each with different features, is an important task, and it is essential to decide which technology to prefer. When making the technology choice it is important to be clear about exactly what the application is going to be. Depending upon the requirements, the different parameters to be considered are:
Resolution requirement: Generally the camera should be chosen with the lowest resolution that will meet the requirements. This is important because the higher the resolution, the more image processing must be done by the host computer.

Colour or monochrome application: The use of color adds a level of complexity that should be avoided unless the application truly needs color. Color cameras produce larger amounts of data than monochrome cameras, meaning greater image processing burdens. Color also negatively affects camera sensitivity and image resolution.
Image processing speed: For applications where high-speed objects are inspected, it is better to use low-resolution monochrome cameras, which reduce the complexity and help to maintain the image processing speed.
Space limitation: In applications where space is limited or the plant layout is small, small cameras must be used.
Cable limitations: Depending upon the availability of cables, a third-party software package can be used, and cameras can be chosen that provide interfacing standards such as FireWire's DCAM (IIDC) and Gigabit Ethernet's GigE Vision.

V. Frame Grabbers and their functions
A frame grabber is an important component of a machine vision system: it captures video frames in digital form so that they can be displayed, stored or transmitted in raw or compressed digital form. Historically, frame grabbers were the predominant way to interface cameras to PCs. This has substantially changed in recent years as direct camera connections via USB, Ethernet and IEEE 1394 ("FireWire") interfaces have become practical and prevalent. Functioning: A frame grabber captures individual, digital still frames from an analog video signal or a digital video stream. The incoming signal from the vision camera is sampled at a rate specified by a fixed-frequency pulse, which can be generated in the frame grabber itself or received from the camera. If the signal is not already digital, it passes through an analogue-to-digital converter and is stored in a buffer until a full image has been converted.

Figure 3.31: Schematic of Frame Grabber

Early frame grabbers had only enough memory to acquire (i.e., "grab") and store a single digitized video frame, hence the name. Modern frame grabbers are typically able to store multiple frames and compress them in real time using algorithms such as MPEG2 and JPEG. Frame grabbers that perform compression on the video frames are referred to as "active frame grabbers"; frame grabbers that simply capture the raw video data are referred to as "passive frame grabbers".

Applications where frame grabbers are employed include radar acquisition, manufacturing and remote guidance, which require capturing images at high frame rates and resolutions. Frame grabbers can be classified as follows. Analog frame grabbers, which accept and process analog video signals, include these circuits:

An input signal conditioner to buffer the analog video input signal and protect downstream circuitry.
A circuit to recover the horizontal and vertical synchronization pulses from the input signal.
An analog-to-digital converter.
An NTSC/SECAM/PAL decoder, a function that can also be implemented in software.

Digital frame grabbers, which accept and process digital video streams, include these circuits:

Physical interface to the digital video source, such as Camera Link, DVI, GigE Vision, LVDS or RS-422.

Circuitry common to both analog and digital frame grabbers:


Memory for storing the acquired image (i.e., a frame buffer).
A bus interface through which a processor can control the acquisition and access the data.
General purpose I/O for triggering image acquisition or controlling external equipment.

Based upon the different applications there is a huge range of frame grabbers available, and they can basically be split into three main categories: Standard, Advanced and Intelligent.
Standard Frame Grabbers These are low cost devices with a sufficient level of intelligence and software support for inspection applications. They can only be used with standard analogue interlaced video sources, whereas non-standard video sources (i.e., progressive scan, megapixel, non-standard cameras and digital sources) are not supported. This type of grabber often does not include on-board memory to buffer images, so the video data is sent to the CPU via the PCI bus line by line, which is processor intensive. Standard grabbers can be triggered to grab the next image, although their response is not instantaneous and there will be a random delay of up to one frame, which remains satisfactory for most applications. They also contain a multiplexer allowing more than one camera to be connected and used in turn.
Advanced Frame Grabbers Advanced frame grabbers are high performance frame grabbers which support non-standard cameras and are therefore dominant in most machine vision applications. Apart from increased accuracy, the distinct feature of an advanced frame grabber which sets it apart from standard grabbers is the ability for asynchronous image capture. This is achieved via synchronization mechanisms between the grabber and the camera, resulting in instantaneous capture, also known as an asynchronous reset

operation. This operation interrupts the sampling clock and resets the exposure and readout cycle so that a full image can be generated at any time.
Intelligent Frame Grabbers Intelligent frame grabbers effectively contain advanced grabbers but also include additional on-board processing hardware that provides the grabber with a form of intelligence, escalating the grabber from merely a messenger to a device with processing capabilities built in. Intelligent frame grabbers can be split into three predominant types:
Intelligent capture: These remove the need for host interaction during the acquisition cycle, which matters for time-critical applications. Only the actual data transfer requires the host processor; the grabber notifies the host that there is new data residing in the grabber's memory.
Pre-processing engines: These grabbers free up even more host processing time by performing some of the functions the host would normally do before the data is ready to process. These functions include flat field correction, image arithmetic, convolution filters and data reduction.
Expandable processing engines: These can combine up to 30 processors into one computing engine, increasing the processing power available to the host. They excel in applications where the sampling rate is high or where the processing cannot be accomplished on a single or dual processor host.


4. Imaging Techniques
I. Strobed and Steady Imaging
Imaging can be either strobed or steady (continuous wave), according to the needs of the application in machine vision inspection systems. Strobe Imaging:

In strobe imaging, a switched current source is coupled to the LED emitter element to enable strobe operation synchronous with the periodic operation of a line imaging camera. In applications where the exposure time is short, such as imaging fast-moving objects, strobing LED lights is a popular way to take advantage of the LED's ability to deliver short cycles with high light output. The increase in light output comes from the LED's ability to be driven for short periods with currents exceeding normal steady-state values, followed by a relatively long cool-off time. The ratio of on-time to off-time is typically 1 to 100, and pulse durations are typically below 100 microseconds (a timing sketch is given after Figure 4.1). The heart of the strobe source is the trigger detector, which connects to an image acquisition device or an external sensor and reads a trigger when the strobe event occurs. Once the trigger detector detects a signal, a strobe event begins (refer to Figure 4.1).

Figure 4.1: Trigger timing diagram in Strobe imaging
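The timing figures quoted above (pulse widths below about 100 microseconds and an on:off ratio of roughly 1:100) translate directly into a maximum safe flash rate. The short calculation below is only a sketch using those nominal numbers, not the specification of any particular strobe controller.

# Strobe timing sketch using the nominal figures quoted above:
# pulses below ~100 us and an on:off ratio of about 1:100.
def strobe_limits(pulse_us=100.0, duty_ratio=1.0 / 100.0):
    period_us = pulse_us / duty_ratio      # pulse plus cool-off time
    max_rate_hz = 1e6 / period_us          # fastest safe trigger rate
    return period_us, max_rate_hz

period, rate = strobe_limits()
print(f"Minimum strobe period {period:.0f} us -> at most {rate:.0f} flashes per second")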

An example of strobe imaging: sample particles are dropped between a video camera and a synchronized strobe light by a vibrating feeder. When the strobe flashes, the camera takes an image of the particles. This image is then digitized by a computer frame grabber. For different products, the user can select either a camera with high magnification or a camera with a wide field of view, according to the particle size in the sample.

Figure 4.2: Schematic of the strobe imaging example
Figure 4.3: Digital image prototype with uniform pixels

Steady Imaging
In steady imaging, the camera continuously obtains images under a continuous-wave illumination source. Continuous-wave xenon lamps are used in steady-imaging machine-vision applications where continuous light and a good colour balance are required; such systems can produce in excess of 250,000 candelas.

Figure 4.4: A Xenon light source

For many years, continuous-wave xenon illumination systems have been used in machine-vision systems to provide high intensity and a wide spectrum of radiation. Such features improve the contrast of captured images, increase the depth of field of the imaging scene and shorten camera integration times. In steady imaging the machine vision system continuously images the viewing area and determines through image processing when a part is present. In these cases, image acquisition must be commanded by software: the software triggers the event. After image acquisition, some image processing is performed to determine whether a part is present for viewing. In some implementations of this technique, the same image that is used for detection (triggering) is retained and analyzed as the primary image; in other cases, a new image is acquired by software command and its contents analyzed. The latency for part detection through imaging is the sum of the latencies for exposure, transfer of image data through the system and initial processing. If the same image can be used for analysis, there is no further latency for image acquisition; therefore steady imaging is preferred for applications that require very high speed, and for this purpose of continuous inspection with low power dissipation a CMOS camera is used. For discrete parts, part detection using an imaging algorithm is generally more complex, less reliable and has longer latency than using an external sensor to trigger the process (a minimal sketch of such a software-triggered detection loop is given below).
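A software-triggered detection loop of the kind described above can be sketched as follows. grab_image is a hypothetical camera call, and the region of interest, background level and threshold are arbitrary example values rather than values from the report.

import numpy as np

# Software triggering for steady imaging: keep grabbing frames and declare a
# part present when the mean gray level in a region of interest deviates
# sufficiently from the empty-scene background. grab_image() is hypothetical.
def wait_for_part(grab_image, roi=(slice(100, 200), slice(100, 200)),
                  background_mean=40.0, delta=25.0):
    while True:
        frame = grab_image()
        if abs(float(np.mean(frame[roi])) - background_mean) > delta:
            return frame        # the same image can be retained for analysis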

II. Imaging Configurations The principle for how information is transferred from an object i.e. device under inspection to a detector i.e. CCD camera is based on how photons interact with the material within the object. If the device under inspection modifies the incoming light in such a way that the outgoing rays are different from the incoming rays, then we say that the object has created contrast. This is the basic principle of all machine vision applications. If the object cannot modify the incoming beam in some discernable fashion, then the device cannot be visible to either a camera or the human eye. The goal for the machine vision lighting is to provide incoming illumination in such a fashion that the naturally occurring features and properties of the device under test can be exploited to maximize contrast. Bright Field Imaging: A bright field image is formed using light (or electrons) transmitted through the object. Regions of the object that are thicker or denser will scatter photons more strongly and will appear darker in the image. When no object is present then a bright background will be seen. Since the background will tend to be bright for a majority of materials, lighting modes which consist primarily of structure which emanates from this part of the lighting hemisphere is called Brightfield. Brightfield lighting modes come in a variety of styles, sizes and shapes and provide varying degrees of contrast enhancement depending on the nature of the part under inspection. The purest and most interesting of the Brightfield modes is that Figure 4.5 : Brightfield produced by what is commonly called a Coaxial Illumination mode. Coaxial modes reflect back to the camera Illuminators produce light which appears to emanate from the detector, bounce off the part, and then return back upon itself to the detector. To accomplish this lighting mode, a beamsplitter is oriented at 45 degrees so as to allow half the light impinging on it to pass through, and the other half to be specular reflected (see Figure). For reflective surfaces, the background signal is very high and uniform over the field of view.


Any variation on the specular surface (reflective, transmissive or absorptive) will result in a reduction in the amount of light making it back into the sensor, causing the image of this area to appear darker than that of the surrounding bright background. This is an excellent lighting mode for flat parts that have a mixture of highly reflective areas surrounded by non-specular absorptive areas. Flat gold pads on a fiberglass circuit board provide an excellent example of the high contrast capability of the Coaxial Illumination mode.

Figure 4.6: Coaxial Illumination is one of the most common forms of brightfield illumination. It is extremely useful for flat objects with both reflective and absorbing features

Dark Field Imaging: A dark field image is formed using light (or electrons) scattered from the object. In the absence of an object, therefore, the image appears dark. If the part under inspection is flat and has a non-zero reflectance value, all light emanating from points below the brightfield angle will be reflected off the part away from the detector. Since the background tends to be dark for a majority of materials, lighting modes which consist primarily of structure emanating from this part of the lighting hemisphere are called darkfield modes. Dark field illuminators provide varying degrees of contrast enhancement depending on the nature of the part under inspection. For reflective surfaces, the background signal is generally very low and uniform over the field of view. Any variation on the specular surface (predominantly reflective) will result in an increase in the amount of light making it back into the sensor, causing the image of this area to appear lighter than that of the surrounding dark background (see figure). This is an excellent lighting mode for flat parts that have surface variations or imperfections that deviate from the perfectly flat background. Application examples include surface flaw detection (scratches, pits, etc.) as well as OCR on stamped characters.

Figure 4.7: Darkfield Illumination provides low angle incident lighting that highlights any deviations from a perfectly flat surface. Most of the light generated never makes it to the camera.

Rounded objects: For a rounded part, the normal remains perpendicular to the surface, but because the surface is no longer flat, the direction of the normal varies across the field of view and is no longer parallel at all points. For a surface with a slight convex curvature, this phenomenon effectively increases the brightfield portion of the hemisphere and reduces the darkfield portion (see figure).

Figure 4.8: Curved surfaces alter the effective Brightfield and Darkfield regions because the normal vectors are no longer parallel.

As the curvature continues to increase towards a spherical ball, the brightfield region continues to grow until the entire hemisphere provides only the brightfield lighting mode.
Spherical Brightfield Illuminators
To properly inspect features on a curved surface (spherical or cylindrical) using brightfield illumination, a special device called a Spherical Brightfield Illuminator (SBI) is used. The goal of the SBI is to provide light in such a fashion that all incident rays impinge upon the surface parallel to the normal vector. For this, a collimated coaxial lighting device may be combined with a convex spherical lens to create a convex spherical illuminator. Schematically, light travels from the collimator to the beam splitter, where 50% of the energy reflects toward the part under inspection (see figure). In a similar manner, reflective surfaces that are inwardly domed or depressed may also be imaged using the same spherical brightfield technique with a slight modification: the same basic setup is utilized, but the convex spherical lens is replaced with a concave spherical lens. The concave lens is aligned such that the focal point of the lens and the concave surface are congruent (see figure on next page).
Figure 4.9: In an SBI, collimated coaxial light passes through a lens element that projects incident light normal to the curved surface.


Figure 4.10: In the concave SBI, collimated coaxial light passes through a concave lens element projecting incident light normal to the concave surface

Applications: A common application of this technique is the imaging of soda and beer can bottoms, where the date and lot code must be read from the bottom of the cans. These codes are generally printed onto the concave bottom of the reflective aluminum can, creating a difficult illumination problem. Since the inks used tend to be dark in color and fairly absorbing, brightfield imaging is applied. The curved surface introduces an additional challenge because normal brightfield techniques fail to produce a uniform illumination field over the entire dome. As can be seen in the figure, dark field illumination provides a uniform background but fails to create high contrast with the date and lot codes. The concave SBI technique increases the uniform brightfield zone such that the black absorbing ink can now be imaged with high contrast against the reflective background.
Figure 4.11: Concave SBI provides high contrast brightfield illumination for objects that have concave domed or depressed regions.


5. Image Processing Techniques

I. Image types A digital image is an array, or a matrix, of square pixels (picture elements) arranged in columns and rows.

Figure 5.1: An image is an array, or a matrix, of pixels arranged in columns and rows

In an 8-bit grayscale image each picture element has an assigned intensity that ranges from 0 to 255. A grayscale image is what people normally call a black-and-white image, but the name emphasizes that such an image also includes many shades of grey.

Figure 5.2: Each pixel has a value from 0 (black) to 255 (white). The possible range of the pixel values depends on the colour depth of the image, here 8 bit = 256 tones or grayscales

Figure 5.3: A true-colour image assembled from three grayscale images coloured red, green and blue. Such an image may contain up to 16 million different colours.

A normal grayscale image has 8-bit colour depth = 256 grayscales. A true colour image has 24-bit colour depth = 8 x 8 x 8 bits = 256 x 256 x 256 colours = ~16 million colours. Some grayscale images have more grayscales, for instance 16 bit = 65536 grayscales. In principle three such grayscale images can be combined to form an image with 281,474,976,710,656 grayscales. There are two general groups of images: vector graphics (or line art) and bitmaps (pixel-based images). Some of the most common file formats are: GIF, an 8-bit (256 colour), non-destructively compressed bitmap format, mostly used for the web; it has several sub-standards, one of which is the animated GIF. JPEG, a very efficient (i.e. much information per byte) destructively compressed 24-bit (16 million colours) bitmap format; it is widely used, especially for the web and Internet (bandwidth-limited).

TIFF is the standard 24-bit publication bitmap format; it compresses images non-destructively. PS is PostScript, a standard vector format; it has numerous sub-standards and can be difficult to transport across platforms and operating systems. PSD is a dedicated Photoshop format that keeps all the information in an image, including all the layers.

We may loosely classify images according to the way in which the interaction occurs, understanding that the division is sometimes unclear, and that images may be of multiple types. Figure (5.4) depicts these various image types.

Figure 5.4: Reflection, Emitted and Altered image formations

Reflection images sense radiation that has been reflected from the surfaces of objects. The radiation itself may be ambient or artificial, and it may be from a localized source. Emitted Radiation Images sense radiation that has been emitted from the source objects. The radiation itself may be ambient or artificial, and it may be from a localized source. Altered Images sense radiation that has been altered by the transparent or translucent objects.

Several standard types of images are as follows: Grayscale Images - These are coded using one number per pixel, representing one of 256 different gray tones ranging from black to white (figure 5.5).

Figure 5.5: Grayscale Image

Figure 5.6: Palette Image

Figure 5.7: RGB image

Palette Images - These are images coded using one number per pixel, where the number specifies which color in a palette of up to 256 different colors should be displayed for that pixel (as shown in figure 5.6). The colors in the palette can be True Color RGB colors. Palette images save space at the cost of a reduced total number of colors available for use in the image; the image shown above uses only 16 colors. RGB Images - These images use three numbers for each pixel, allowing the possible use of millions of colors within the image at the cost of requiring three times as much space as grayscale or palette images. They are often called True Color RGB images (shown in figure 5.7). RGBa Images - These images are RGB images with a fourth number added for each pixel that specifies the transparency of that pixel in the range 0 to 255. When seen in an image window, grayscale, palette and RGB images are shown on a background of solid color (white by default). RGBa images are shown on a background of an alternating white and light gray checkerboard pattern so that differences in transparency are more visible. RGBa images are used when combining multiple images in maps for elaborate graphics composition or the creation of special visual effects. For example, the RGBa image illustrated alongside is shown in a layer above a grid of lines that become visible to an increasing degree as transparency increases towards the bottom of the image.

Figure 5.8: RGBa image

Compressed Images - Compressed images use sophisticated wavelet compression technology not only to compress the amount of data an image requires but also to reconstitute the image dynamically on demand. At any given zoom level the desired view of the image is reconstituted from the compressed data store. Compressed images can be viewed, but not edited or otherwise manipulated. Compressed images are used to display very large images that would require too much time for display, and possibly too much room for storage, if they were not compressed.

Figure 5.9: Compressed image

II. Image Enhancement
Images may suffer from the following degradations:
Poor contrast due to poor illumination or the finite sensitivity of the imaging device
Electronic sensor noise or atmospheric disturbances leading to broadband noise
Aliasing effects due to inadequate sampling

Enhancement techniques transform an image into a better image by sharpening the image features for display and analysis. Image enhancement is the process of applying these techniques to facilitate the development of a solution to a computer/machine imaging problem.

28

Figure 5.10: Schematic of Image Enhancement

The figure above illustrates the importance of the feedback loop from the output image back to the start of the enhancement process and models the experimental nature of the development. The range of applications includes using enhancement techniques as preprocessing steps to ease the next processing step, or as postprocessing steps to improve the visual perception of a processed image; image enhancement may also be an end in itself. Enhancement methods operate in the spatial domain by manipulating the pixel data or in the frequency domain by modifying the spectral components. Image enhancement techniques can therefore be divided into two broad categories: 1. spatial domain methods, which operate directly on pixels, and 2. frequency domain methods, which operate on the Fourier transform of an image.
1. Spatial domain methods
The value of a pixel with coordinates (x, y) in the enhanced image F is the result of performing some operation on the pixels in a rectangular neighbourhood of (x, y) in the input image f. The types include:
A) Grey scale manipulation: An operator T acts only on a 1 x 1 pixel neighbourhood in the input image, that is, F(x, y) depends on the value of f only at (x, y). The simplest case is thresholding, where the intensity profile is replaced by a step function active at a chosen threshold value. In this case any pixel with a grey level below the threshold in the input image gets mapped to 0 in the output image; other pixels are mapped to 255 (a short code sketch is given after Figure 5.11). Other grey scale transformations are outlined in the figure below:


Figure 5.11: Depending upon different step functions different Gray Scale Mappings can be observed.
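The thresholding step function described under grey scale manipulation can be written in a couple of lines; the threshold value 120 below is an arbitrary example.

import numpy as np

# Thresholding: pixels below the threshold map to 0, all others to 255.
def threshold(image_u8, t=120):
    return np.where(image_u8 < t, 0, 255).astype(np.uint8)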

B) Histogram Equalization Histogram equalization is a common technique for enhancing the appearance of images. Suppose we have an image which is predominantly dark; its histogram would then be skewed towards the lower end of the grey scale, and all the image detail would be compressed into the dark end of the histogram. If we could "stretch out" the grey levels at the dark end to produce a more uniformly distributed histogram, the image (see figure) would become much clearer (a short code sketch is given after Figure 5.12).

Figure 5.12: The original image and its histogram and its equalized versions.

C) Image Smoothing The aim of image smoothing is to diminish the effects of camera noise, spurious pixel values, missing pixel values etc. There are many different techniques for image smoothing;

30 Neighborhood averaging: Each point in the smoothed image, F(x; y) is obtained from the average pixel value in a neighborhood of (x; y) in the input image but smoothing will tend to blur edges because the high frequencies in the image are attenuated. Edge-preserving smoothing: It is also called median filtering since we set then grey level to be the median of the pixel values in the neighborhood of that pixel. The outcome of median filtering is that pixels with outlying values are forced to become more like their neighbors, but at the same time edges are preserved.

Figure 5.13 : Original image; with noise; the result of averaging; and the result of median filtering

D) Image sharpening The main aim in image sharpening is to highlight fine detail in the image, or to enhance detail that has been blurred (perhaps due to noise or other effects, such as motion). With image sharpening, we want to enhance the high-frequency components. E) High boost filtering High pass filtering can be defined in terms of subtracting a low pass image from the original image, that is, High pass = Original- Low pass However, in many cases where a high pass image is required, we also want to retain some of the low frequency components to aid in the interpretation of the image. Thus, if we multiply the original image by an amplification factor A before subtracting the low pass image, we will get a high boost or high frequency emphasis filter. Thus, High boost = A. Original -Low pass = (A-1). (Original) + Original - Low pass = (A -1). Original + High pass 2. Frequency domain methods: In this we compute the Fourier transform of the image to be enhanced, multiply the result by a filter, and take the inverse transform to produce the enhanced image. Low pass filtering involves the elimination of the high frequency components in the image. It results in blurring of the image (and thus a reduction in sharp transitions associated with noise). An ideal low pass filter (see Figure) would retain all the low frequency components, and eliminate all the high frequency components. However, ideal filters suffer from two problems: blurring and ringing. These problems are caused by the shape of the associated spatial domain filter, which has a large number of undulations.

31 Smoother transitions in the frequency domain filter, such as the Butterworth filter, achieve much better results.

Figure 5.14: Transfer function for ideal low pass filter.

III. Image Segmentation A segmentation of an image is a partition of the image that reveals some of its content. The goal of segmentation is to simplify and/or change the representation of an image into something that is more meaningful and easier to analyze. The result of image segmentation is a set of segments that collectively cover the entire image, or a set of contours extracted from the image. Applications of image segmentation include: 1) Identifying objects in a scene for object-based measurements such as size and shape. 2) Identifying objects in a moving scene for object-based video compression (MPEG4) 3) Identifying objects which are at different distances from a sensor using depth measurement from a laser range finder enabling path planning for mobile robots. Different types of image segmentation are: 1) Segmentation based on greyscale: Original greyscale image leads to inaccuracies in labelling and in distinguishing of components. Hence image segmentation is required.

Figure 5.15: Segmentation based on greyscale.

2) Segmentation Based upon Texture: This type of segmentation enables object surfaces with varying patterns of grey to be segmented.

Figure 5.16: Segmentation based upon texture

32 3) Segmentation based on range: In this type of segmentation a range image is obtained with a laser range finder . A segmentation based on the range (the object distance from the sensor) is useful in guiding mobile robots.

Figure 5.17: Segmentation based upon range

4) Segmentation based on motion: Here the objective of segmenting the image is to observe the motion parameter accurately. The main difficulty of motion segmentation is that an intermediate step is required to (either implicitly or explicitly) estimate an optical flow field.

Figure 5.18: Segmentation based upon Motion

Segmentation techniques: 1) Edge-based technique of segmentation: In this technique the image is analysed and is segmented by boundary detection methods. After that further classification an analysis are observed for the area of interest.

Figure 5.19: Schematic of Edge-based technique

33

Figure 5.20: Edge based technique of segmentation, here the area of interest i.e. text is been extracted from background.

2) Thresholding The simplest method of image segmentation is called the thresholding method. This method is based on a threshold value to turn a gray-scale image into a binary image.The key of this method is to select the threshold value and segment the image according to that value. 3) Clustering method Clustering method is based upon the K-means algorithm in which an iterative technique is used to partition an image into K clusters. The basic algorithm is: 1. Pick K cluster centers, either randomly or based on some heuristic 2. Assign each pixel in the image to the cluster that minimizes the distance between the pixel and the cluster centre 3. Re-compute the cluster centers by averaging all of the pixels in the cluster 4. Repeat steps 2 and 3 until convergence is attained (e.g. no pixels change clusters) In this case, distance is the squared or absolute difference between a pixel and a cluster centre. The difference is typically based on pixel color, intensity, texture, and location, or a weighted combination of these factors. K can be selected manually, randomly, or by a heuristic.

Figure 5.21: Schematic of K-Means algorithm

Figure 5.21: General representation how clusters are decided.

Figure 5.22: Different types of clusters in K-means Algorithm method of segmentation

34

4) Compression-based methods Compression based methods postulate that the optimal segmentation is the one that minimizes, over all possible segmentations, the coding length of the data. In other words segmentation tries to find patterns in an image and any regularity in the image can be used to compress it. The method describes each segment by its texture and boundary shape. 5) Histogram-based methods In this technique, a histogram is computed from all of the pixels in the image, and the peaks and valleys in the histogram are used to locate the clusters in the image. Color or intensity can be used as the measure. A refinement of this technique is to recursively apply the histogram-seeking method to clusters in the image in order to divide them into smaller clusters. This is repeated with smaller and smaller clusters until no more clusters are formed. One disadvantage of the histogram-seeking method is that it may be difficult to identify significant peaks and valleys in the image. In this technique of image classification distance metric and integrated region matching are familiar.

Figure 5.23: Image Histogram

6) Region growing methods The region growing method or the seeded region growing method takes a set of seeds as input along with the image. The seeds mark each of the objects to be segmented. The regions are iteratively grown by comparing all unallocated neighbouring pixels to the regions. The difference between a pixel's intensity value and the region's mean, , is used as a measure of similarity. The pixel with the smallest difference measured this way is allocated to the respective region. This process continues until all pixels are allocated to a region.

35

Figure 5.24: (a) Image showing defective welds (b) Seedspoints (c) Result of region growing. (d) Boundaries of segmented defective welds

Seeded region growing requires seeds as additional input. The segmentation results are dependent on the choice of seeds. Noise in the image can cause the seeds to be poorly placed. 7) Split-and-merge methods Split-and-merge segmentation is based on a quadtree partition of an image. It is sometimes called quadtree segmentation. This method starts at the root of the tree that represents the whole image. If it is found non-uniform (not homogeneous), then it is split into four son-squares (the splitting process), and so on so forth. The node in the tree is a segmented node. This process continues recursively until no further splits or merges are possible. 8) Graph partitioning methods Graph partitioning methods can effectively be used for image segmentation. In these methods, the image is modeled as a weighted, undirected graph. Usually a pixel or a group of pixels are associated with nodes and edge weights define the similarities between the neighborhood pixels. The graph (image) is then partitioned according to a criterion designed to model clusters of interest. Each partition of the nodes (pixels) output from these algorithms are considered an object segment in the image. 9) Watershed transformation The watershed transformation considers the gradient magnitude of an image as a topographic surface. Pixels having the highest gradient magnitude intensities (GMIs) correspond to watershed lines, which represent the region boundaries. Water placed on any pixel enclosed by a common watershed line flows downhill to a common local intensity minimum (LIM). Pixels draining to a common minimum form a catch basin, which represents a segment.

36

Figure 5.25: Watershed Transformation.

10) Model based segmentation The central assumption of such an approach is that structures of interest have a repetitive form of geometry. Therefore, one can seek for a probabilistic model towards explaining the variation of the shape of the area of interest and then when segmenting an image impose constraints using this model as prior. Such a task involves (i) (ii) (iii) Classification of the training examples, Probabilistic representation of the variation of the registered samples, Statistical inference between the model and the image.

11) Semi-automatic segmentation In this kind of segmentation, the user outlines the region of interest with the mouse clicks and algorithms are applied so that the path that best fits the edge of the image is shown. 12) Neural networks segmentation Neural Network segmentation relies on processing small areas of an image using an artificial neural network or a set of neural networks. After such processing the decision-making mechanism marks the areas of an image accordingly to the category recognized by the neural network. A type of network designed especially for this is the Kohonen map. Compared with conventional image processing means, Neural networks segmentation methods have several significant merits, including robustness against noise, independence of geometric variations in input patterns, capability of bridging minor intensity variations in input patterns, etc.

IV. Feature Extraction Feature extraction is to simplify and reduce the dimensionality in the image processing when the input data is too large and contains the redundant information for the faster and better processing. This involves transforming the input data into the set of features. If the features extracted are carefully chosen

37 it is expected that the features set will extract the relevant information from the input data in order to perform the desired task using this reduced representation instead of the full size input. In other words feature extraction is a method of capturing visual content of images for indexing & retrieval. Feature extraction includes methods for training learning machines with millions of low level features. Identifying relevant features leads to better, faster, and easier to understand learning machines. Because of perception subjectivity, there cannot be a single best representation for a feature.

Figure 5.26: Feature extraction process and Image retrieval

Above figure 5.26 shows the architecture of a typical image retrieval system. For each image in the image database, its features are extracted and the obtained feature space (or vector) is stored in the feature database. When a query image comes in, its feature space will be compared with those in the feature database one by one and the similar images with the smallest feature distance will be retrieved. Feature extraction is followed by some kind of preprocessing in order to extract the features, which describe its contents. The processing involves filtering, normalization, segmentation, and object identification. The output of this stage is a set of significant regions and objects. It should be observed that image extraction should be guided by the following concerns: The features should carry enough information about the image and should not require any domain-specific knowledge for their extraction. They should be easy to compute in order for the approach to be feasible for a large image collection and rapid retrieval. They should relate well with the human perceptual characteristics since users will finally determine the suitability of the retrieved images. Image Properties Properties of the image can refer to the following: Global properties of an image: i.e. average gray level, shape of intensity histogram etc. Local properties of an image: We can refer to some local features as image primitives: circles, lines, texels (elements composing a textured region) Other local features: shape of contours etc.

38 Image features The feature of an object is defined as a function which is computed such that it quantifies some significant characteristics of the object. We classify the various features currently employed as follows: General or Local features: In these features no specific shapes or higher spatial information are detected. Application independent features such as color, texture, and shape are the local features. According to the abstraction level, they can be further divided into: 1. Pixel-level features: These are the features calculated at each pixel, e.g. color, location. 2. Local features: These are the features calculated over the results of subdivision of the image band on image segmentation or edge detection. 3. Global features: These are the features calculated over the entire image or just regular sub-area of an image. Domain-specific features: These features are domain specific and application dependent features such as human faces, fingerprints, and conceptual features. These features are often a synthesis of low-level features for a specific domain.

On the other hand, all features can be classified into Low-level features: Low- level features can be extracted directed from the original images. Some low-level shape features may also include: i. Edge Detection ii. Circle Detection iii. Line Detection iv. Corner Detection High-level features: High-level feature extraction must be based on low-level features.

Image features are local, meaningful, detectable parts of an image: Meaningful: Features are associated to scene elements that of interests to the user in the image formation process They should be invariant to some variations in the image formation process (i.e. invariance to viewpoint and illumination for images captured with digital cameras) Detectable: They can be located/ detected from images via algorithms .They are described by a feature vector, representing the useful information out of the data.

Different features in an image include: 1. Colour features The color feature is one of the most widely used visual features in image retrieval. Images characterized by color features have many advantages: Robustness. The color histogram is invariant to rotation of the image on the view axis, and changes in small steps when rotated otherwise or scaled. It is also insensitive to changes in image and histogram resolution and occlusion.

39 Effectiveness. There is high percentage of relevance between the query image and the extracted matching images. Implementation simplicity. The construction of the color histogram is a straightforward process, including scanning the image, assigning color values to the resolution of the histogram, and building the histogram using color components as indices. Computational simplicity. The histogram computation has O(X, Y) complexity for images of size X Y. The complexity for a single image match is linear; O (n), where n represents the number of different colors, or resolution of the histogram. Low storage requirements. The color histogram size is significantly smaller than the image itself, assuming color quantization.

2. Texture Texture is another important property of images. Texture is a powerful regional descriptor that helps in the retrieval process. Texture, on its own does not have the capability of finding similar images, but it can be used to classify textured images from non-textured ones and then be combined with another visual attribute like color to make the retrieval more effective. Texture has been one of the most important characteristic which has been used to classify and recognize objects and have been used in finding similarities between images in multimedia databases. Generally we capture patterns in the image data (or lack of them), e.g. repetitiveness and granularity 3. Shape Shape based image retrieval is the measuring of similarity between shapes represented by their features. Shape is an important visual feature and it is one of the primitive features for image content description. Shape content description is difficult to define because measuring the similarity between shapes is difficult. Therefore, two steps are essential in shape based image retrieval, they are: feature extraction and similarity measurement between the extracted features. Shape descriptors can be divided into two main categories: region based and contour-based methods. Region-based methods use the whole area of an object for shape description, while contour-based methods use only the information present in the contour of an object. The shape descriptors are as follows: features calculated from objects contour: Circularity aspect ratio discontinuity angle irregularity length irregularity complexity right-angleness sharpness

Those are translation, rotation (except angle), and scale invariant shape descriptors. It is possible to extract image contours from the detected edges. From the object contour the shape information is derived. We extract and store a set of shape features from the contour image and for each individual contour

40

6. Soft Computing Techniques


The real world problems are mostly imprecise and uncertain. Conventional computing techniques, also called as hard computing techniques, are based upon precise analytical models which arrive to an ideal output at the cost of processing time. Thus hard computing are susceptible to imprecision, uncertainty, partial truth and approximations. Soft computing techniques overcome these problems as they are built to become tolerant to solve the non-ideal environment problems. Soft computing techniques are not based upon the precise analytical models to exploit the imprecision and uncertainty and achieve tractability, robustness and low cost.The principle techniques in soft computing are artificial neural network, fuzzy logic and genetic algorithm which are described below. I. Artificial Neural Network In its most general form, an artificial neural network is a machine that is designed to model the way in which the brain performs a particular task or function of interest; the network is usually implemented using electronic components or simulated in software on a digital computer. Our interest will be confined largely to neural networks that perform useful computations through a process of learning. To achieve good performance, neural networks employ a massive interconnection of simple computing cells referred to as neurons or processing units. We may thus offer the following definition of a neural network viewed as an adaptive machine. A neural network is a massively parallel distributed processor that has a natural propensity for storing experiential knowledge and making it available for use. It resembles the brain in two respects: 1. Knowledge is acquired by the network through a learning process. 2. Interneuron connection strengths known as synaptic weights are used to store the knowledge. The procedure used to perform the learning process is called a learning algorithm. Artificial neural networks are also referred to as Neuro-Computers, Connectionist Networks, Parallel Distributed Processors, etc. Differences between neural networks and digital computers are as follows: Neural Networks Inductive Reasoning. Given input and output data (training examples), we construct the rules. Computation is parallel, asynchronous, and collective. Memory is distributed, internalized, short term and content addressable. Fault tolerant, redundancy, and sharing of responsibilities. They have dynamic connectivity. Digital Computers Deductive Reasoning. We apply known rules to input data to produce output. Computation is serial, synchronous, and centralized. Memory is in packet, literally stored, and location addressable. Not fault tolerant. One transistor goes and it no longer works. They have static connectivity.

41 Models of a neuron A neuron has a set of n synapses associated to the inputs. Each of them is characterized by a weight. A signal xi, i=1,2n at the ith input is multiplied (weighted) by the weight wi, i=1,2n. The weighted input signals are summed. Thus, a linear combination of the input signals w1x1++wnxn is obtained. A "free weight" (or bias) w0, which does not correspond to any input, is added to this linear combination and this forms a weighted sum z=w0+w1x1++wnxn. A nonlinear activation function is applied to the weighted sum. A value of the activation function y= (z) is the neuron's output.

w0 w0 x1 w1
...

x1
Z=

w1 x1

w x
i

(Z)
Output

. . .
xn

f ( x1 ,..., xn ) (z)

( z ) f ( x1 ,..., x n ) wn x n

xn

wn

z w0 w1 x1 ... wn xn

f ( x1 ,..., xn ) F ( w0 w1 x1 ... wn xn )
Where f is a function to be earned x1,..xn are the inputs is the activation function Activation function: An activation function is for limiting the amplitude of the output of a neuron. The activation function is generally non-linear. Linear functions are limited because the output is simply proportional to the input. Learning process

This definition of the learning process implies the following sequence of events: 1. The neural network is stimulated by an environment. 2. The neural network undergoes changes as a result of this stimulation. 3. The neural network responds in a new way to the environment, because of the changes that have occurred in its internal structure. Let wkj(n) denote the value of the synaptic weight wkj at time n. At time n an adjustment wkj(n) is applied to the synaptic weight wkj(n), yielding the updated value

42

wkj (n +1) = wkj (n) + wkj (n) A prescribed set of well-defined rules for the solution of a learning problem is called a learning algorithm. Training types in ANN a) Fixed: There is no learning required for the fixed-weight networks, so a learning mode is supervised or unsupervised. b) Supervised learning: In supervised training, both the inputs and the outputs are provided. The network then processes the inputs and compares its resulting outputs against the desired outputs. Errors are then propagated back through the system, causing the system to adjust the weights which control the network. c) Unsupervised learning: In unsupervised training, the network is provided with inputs but not with desired outputs. The system itself must then decide what features it will use to group the input data. This is often referred to as self-organization or adaption. Example architectures : Kohonen Networks, ART Benefits of ANN

The use of neural networks offers the following useful properties and capabilities: i) Nonlinearity. A neuron is basically a nonlinear device. Consequently, a neural network, made up of an interconnection of neurons, is itself nonlinear. Moreover, the nonlinearity is of a special kind in the sense that it is distributed throughout the network. ii) Input-output mapping. A popular paradigm of learning called supervised learning involves the modification of the synaptic weights of a neural network by applying a set of training samples so as to minimize the difference between the desired response and the actual response of the network produced by the input signal in accordance with an appropriate criterion. Thus the network learns from the samples by constructing an input-output mapping for the problem at hand. iii) Adaptivity. Neural networks have a built-in capability to adapt their synaptic weights to changes in the surrounding environment. Moreover, when it is operating in a non-stationary environment a neural network can be designed to change its synaptic weights in real time. The natural architecture of a neural network for pattern classification, signal processing, and control applications, coupled with the adaptive capability of the network, makes it an ideal tool for use in adaptive pattern classification, adaptive signal processing, and adaptive control. iv) Contextual information. Knowledge is represented by the very structure and activation state of a neural network. Every neuron in the network is potentially affected by the global activity of all other neurons in the network. Consequently, contextual information is dealt with naturally by a neural network. v) Fault tolerance.

43 A neural network, implemented in hardware form, has the potential to be inherently fault tolerant in the sense that its performance is degraded gracefully under adverse operating. For example, if a neuron or its connecting links are damaged, recall of a stored pattern is impaired in quality. However, owing to the distributed nature of information in the network, the damage has to be extensive before the overall response of the network is degraded seriously. vi) VLSI implementability. The massively parallel nature of a neural network makes it potentially fast for the computation of certain tasks. This same feature makes a neural network ideally suited for implementation using verylarge-scale-integrated (VLS1) technology. vii) Uniformity of analysis and design. Basically, neural networks enjoy universality as information processors in as the same notation is used in all the domains involving the application of neural networks. This feature manifests itself in different ways:Neurons, in one form or another, represent an ingredient common to all neural networks which commonality makes it possible to share theories and learning algorithms in different applications of neural networks. Modular networks can be built through a seamless integration of modules. viii) Neurobiological analogy. The design of a neural network is motivated by analogy with the brain, which is a living proof that fault-tolerant parallel processing is not only physically possible but also fast and powerful. Neurobiologists look to (artificial) neural networks as a research tool for the interpretation of neurobiological phenomena. On the other hand, engineers look to neurobiology for new ideas to solve problems more complex than those based on conventional hard-wired design techniques.

Applications of ANN classification A. in marketing: consumer spending pattern classification B. In defence: radar and sonar image classification C. In agriculture & fishing: fruit and catch grading D. In medicine: ultrasound and electrocardiogram image classification, EEGs, medical diagnosis recognition and identification A. In computing and telecommunications: speech, vision and handwriting recognition B. In finance: signature verification and bank note verification assessment A. In engineering: product inspection monitoring and control B. In defence: target tracking C. In security: motion detection, surveillance image analysis and fingerprint matching forecasting and prediction A. In finance: foreign exchange rate and stock market forecasting B. In agriculture: crop yield forecasting C. In marketing: sales forecasting D. In meteorology: weather prediction

44 II. Fuzzy Logic Fuzzy logic is a superset of conventional (Boolean) logic that has been extended to handle the concept of partial truth -- truth values between "completely true" and "completely false". It was introduced by Dr. Lotfi Zadeh of UC/Berkeley in the 1960's as a means to model the uncertainty of natural language. Following is the base on which fuzzy logic is built: As the complexity of a system increases, it becomes more difficult and eventually impossible to make a precise statement about its behavior, eventually arriving at a point of complexity where the fuzzy logic method born in humans is the only way to get at the problem. Characteristics

In fuzzy logic, exact reasoning is viewed as a limiting case of approximate reasoning. In fuzzy logic everything is a matter of degree. Any logical system can be fuzzified. In fuzzy logic, knowledge is interpreted as a collection of elastic or, equivalently , fuzzy constraint on a collection of variables

Fuzzy Perception A fuzzy perception is an assessment of a physical condition that is not measured with precision, but is assigned an intuitive value. Measured, non-fuzzy data is the input for the fuzzy logic method, the assessment of of the physical condition from this data is called Fuzzy perception.Examples: temperature measured by a temperature transducer, motor speed, economic data, financial markets data, etc. Linguistic/Fuzzy Variables Professor Lotfi Zadeh proposed the concept of linguistic or "fuzzy" variables. These are the general linguistic objects or words, rather than numbers. The sensor input is a noun, e.g. "temperature", "displacement", "velocity", "flow", "pressure", etc. Since error is just the difference, it can be thought of the same way. The fuzzy variables themselves are adjectives that modify the variable (e.g. "large positive" error, "small positive" error, "zero" error, "small negative" error, and "large negative" error). As a minimum, one could simply have "positive", "zero", and "negative" variables for each of the parameters. Additional ranges such as "very large" and "very small" could also be added to extend the responsiveness to exceptional or very nonlinear conditions.
Figure 6.1: An example of linguistic/fuzzy variables

45 Crisp sets Crisp sets are the binary sets in which an element either belongs to the set or doesn't.

Figure 6.2: Representation of a Crisp Sets

i.e. only two outcomes possible {True, false} or {1, 0} Fuzzy Subsets As there is a strong relationship between Boolean logic and the concept of a subset, there is a similar strong relationship between fuzzy logic and fuzzy subset theory. In classical set theory, a subset U of a set S can be defined as a mapping from the elements of S to the elements of the set {0, 1}, U: S {0, 1} This mapping may be represented as a set of ordered pairs, with exactly one ordered pair present for each element of S. The first element of the ordered pair is an element of the set S, and the second element is an element of the set {0, 1}. The value zero is used to represent non-membership, and the value one is used to represent membership. The truth or falsity of the statement x is in U is determined by finding the ordered pair whose first element is x. The statement is true if the second element of the ordered pair is 1, and the statement is false if it is 0. Similarly, a fuzzy subset F of a set S can be defined as a set of ordered pairs, each with the first element from S, and the second element from the interval [0,1], with exactly one ordered pair present for each element of S. This defines a mapping between elements of the set S and values in the interval [0,1]. The value zero is used to represent complete non-membership, the value one is used to represent complete membership, and values in between are used to represent intermediate degrees of membership. The set S is referred to as the Universe Of Discourse for the fuzzy subset F. Frequently, the mapping is described as a function, the Membership Function of F. The degree to which the statement x is in F is true is determined by finding the ordered pair whose first element is x. The Degree of Truth of the statement is the second element of the ordered pair. Example: Let's talk about people and "tallness". In this case the set S (the universe of discourse) is the set of people. Let's define a fuzzy subset TALL, which will answer the question "to what degree is person x tall?" Zadeh describes TALL as a LINGUISTIC VARIABLE, which represents our cognitive category of "tallness". To each person in the universe of discourse, we have to assign a degree of membership in the fuzzy subset TALL. The easiest way to do this is with a membership function based on the person's height.

46 tall (x) = { 0, (height (x)-5ft.)/2ft., 1, if height(x) < 5 ft, if 5 ft. <= height (x) <= 7 ft, if height(x) > 7 ft

A graph of this looks like:


1.0 + +------------------| / | / 0.5 + / | / | / 0.0 +-------------+-----+------------------| | 5.0 7.0

Height( ft) For this definition, some example values:

Person Billy Yoke Drew Erik Mark Kareem

Height 3' 2" 5' 5" 5' 9" 5' 10" 6' 1" 7' 2"

Degree of Tallness 0.00 0.21 0.38 0.42 0.54 1.00

-------------------------------------------------------------

Expressions like "A is X" can be interpreted as degrees of truth, e.g., "Drew is TALL" = 0.38. Fuzzy Logic Operations Complex equations in fuzzy logic like: X is LOW and Y is HIGH or (not Z is MEDIUM) are represented by using fuzzy logic operators. The standard operators in Fuzzy Logic are: 1) Complement Operation: It can be defined as: Negate (negation criterion): truth (not x) = 1.0 - truth (x)

Figure 6.3: Complement Operation

2) Intersection Operation: It is also called Fuzzy AND operation. This operation can be defined as: Intersection (minimum criterion): truth (x and y) = minimum (truth(x), truth(y))

Figure 6.4: Intersection Operation

47

3) Union Operation: It is also called as Fuzzy OR operation and can be defined as: Union(maximum criterion): truth (x or y) = maximum (truth(x), truth(y))

Figure 6.5: Union Operation

In order to clarify this, a few examples are given. Let A be a fuzzy interval between 5 and 8 and B be a fuzzy number about 4. The corresponding figures are shown below.

Figure 6.6 graphical representation of A

Figure 6.7: graphical representation of A

The figure below gives an example for a negation. The blue line is the Negation of the fuzzy set A.

Figure 6.8: Negation function applied on A and B

48 The following figure shows the fuzzy set between 5 and 8 AND about 4 (blue line). This time the minimum criterion is used.

Figure 6.9: AND operation on A and B.

Finally, the Fuzzy set between 5 and 8 OR about 4 is shown in the next figure (blue line). This time the maximum criterion is used.

Figure 6.10: OR operation on A and B

The Method:

Figure 6.11: Fuzzy controller block diagram

49 A typical Fuzzy Control algorithm would proceed as follows: 1. Obtaining information: Collect measurements of all relevant variables. 2. Fuzzification: Convert the obtained measurements into appropriate fuzzy sets to capture the uncertainties in the measurements. 3. Running the Inference Engine: Use the fuzzified measurements to evaluate the control rules in the rule base and select the set of possible actions. 4. Defuzzification: Convert the set of possible actions into a single numerical value. 5. The Loop: Go to step one. 6. Averaging and weighting the resulting outputs from all the individual rules into one single output decision or signal which decides what to do or tells a controlled system what to do. The output signal eventually arrived at is a precise appearing, defuzzified, "crisp" value. Comparison to probability Fuzzy logic and probability are different ways of expressing uncertainty. While both fuzzy logic and probability theory can be used to represent subjective belief, fuzzy set theory uses the concept of fuzzy set membership (i.e., how much a variable is in a set), and probability theory uses the concept of subjective probability (i.e., how probable do I think that a variable is in a set). Advantages of Fuzzy Logic (FL) FL offers several unique features that make it a particularly good choice for many control problems. 1) It is inherently robust since it does not require precise, noise-free inputs and can be programmed to fail safely if a feedback sensor quits or is destroyed. The output control is a smooth control function despite a wide range of input variations. 2) Since the FL controller processes user-defined rules governing the target control system, it can be modified and tweaked easily to improve or drastically alter system performance. New sensors can easily be incorporated into the system simply by generating appropriate governing rules. 3) FL is not limited to a few feedback inputs and one or two control outputs, nor is it necessary to measure or compute rate-of-change parameters in order for it to be implemented. Any sensor data that provides some indication of a system's actions and reactions is sufficient. This allows the sensors to be inexpensive and imprecise thus keeping the overall system cost and complexity low. 4) Because of the rule-based operation, any reasonable number of inputs can be processed (1-8 or more) and numerous outputs (1-4 or more) generated, although defining the rulebase quickly becomes complex if too many inputs and outputs are chosen for a single implementation since rules defining their interrelations must also be defined. It would be better to break the control system into smaller chunks and use several smaller FL controllers distributed on the system, each with more limited responsibilities. 5) FL can control nonlinear systems that would be difficult or impossible to model mathematically. This opens doors for control systems that would normally be deemed unfeasible for automation.

50 Drawbacks of Fuzzy logic Requires tuning of membership functions Fuzzy Logic control may not scale well to large or complex problems Deals with imprecision, and vagueness, but not uncertainty Applications:

Control (Robotics, Automation, Tracking, Consumer Electronics) Information Systems (DBMS, Info. Retrieval) Pattern Recognition (Image Processing, Machine Vision) Decision Support (Adaptive HMI, Sensor Fusion)

III. Genetic Algorithms


A genetic algorithm (GA) is a search heuristic (experience based technique) that mimics the process of natural evolution. This is done by the creation within a machine of a population of individuals represented by chromosomes, in essence a set of character strings that are analogous to the chromosomes that we see in our own DNA. The individuals in the population then go through a process of evolution. As such they represent an intelligent exploitation of a random search within a defined search space to solve a problem.

Figure 6.12: Darwinian Evolution Paradigm

Genetic Algorithms can be seen as programs that simulate the logic of Darwinian selection, if you understand how populations accumulate differences over time due to the environmental conditions acting as a selective breeding mechanism then you understand GAs. Put another way, understanding a GA means understanding the simple, iterative processes that underpin evolutionary change. The genetic algorithm is a probabilistic search algorithm that iteratively transforms a set (called a population) of mathematical objects (typically fixed-length binary character strings), each with an associated fitness value, into a new population of offspring objects using the Darwinian principle of natural selection and using operations that are patterned after naturally occurring genetic operations, such as crossover (sexual recombination) and mutation. Process Outline For a given problem, by taking a population of possible answers and evaluating them against the best possible solution, the fittest individuals of the population are determined. This population of possible answers comprises of all possible solutions to the problem, which in the case of a standard problem, are randomly specified. This population may comprise of solutions already known to work, with the aim of the GA being to improve them. The fitness criteria, depends upon the reproduction capabilities of the individual such that those individuals which provide a better solution to the target problem is likely to be given more reproduction opportunities.

51 These promising candidates are kept and allowed to reproduce. Multiple copies are made of them, but the copies are not perfect; random changes are introduced during the copying process. These digital offspring then go on to the next generation, forming a new pool of candidate solutions, and are subjected to a second round of fitness evaluation. Those candidate solutions which were worsened, or made no better, by the changes to their code are again deleted; but again, the random variations introduced into the population may have improved some individuals, making them into better, more complete or more efficient solutions to the problem at hand. Again these winning individuals are selected and copied over into the next generation with random changes, and the process repeats. The expectation is that the average fitness of the population will increase each round, and so by repeating this process for hundreds or thousands of rounds, very good solutions to the problem can be discovered. The iterative process is shown below:

Figure 6.13: Genetic Algorithm Process Outline

The fitness measures or selection measures may not always be based on simple low-to-high property dependent criteria. Better individuals are preferred Best is not always picked Worst is not necessarily excluded Nothing is guaranteed These procedures quickly converge to optimal solutions after examining only a small fraction of the search space and have been successfully applied to complex engineering optimization problems. Outline of the Basic Genetic Algorithm 1. [Start] Generate random population of n chromosomes (suitable solutions for the problem) 2. [Fitness] Evaluate the fitness f(x) of each chromosome x in the population

52 3. [New population] Create a new population by repeating following steps until the new population is complete : 1. [Selection] Select two parent chromosomes from a population according to their fitness (the better fitness, the bigger chance to be selected) 2. [Crossover] With a crossover probability, cross over the parents to form a new offspring (children). If no crossover was performed, offspring is an exact copy of parents. 3. [Mutation] With a mutation probability mutate new offspring at each locus (position in chromosome). 4. [Accepting] Place new offspring in a new population 4. [Replace] Use new generated population for a further run of algorithm 5. [Test] If the end condition is satisfied, stop, and return the best solution in current population 6. [Loop] Go to step 2 The notion of evaluation and fitness are sometimes used interchangeably. However, it is useful to distinguish between the evaluation function and the fitness function used by the genetic algorithm. In essence, the evaluation function provides a measure of the performance of the individual with respect to a parameters. On the other hand, the fitness function transforms that measure into an allocation of reproductive oppportunities. The evaluation of a string representing a set of parameters is independent of the evaluation of any other string. The fitness of a string, however, is always defined with respect to other members of the current population. Reproduction and Crossover play a very important role in the artificial genetic algorithm search. Reproduction (duplication) emphasizes highly fit strings or good solutions, and Crossover recombines these selected solutions for new, potentially better solutions. Mutation plays an important secondary role in the search by providing diversity in the population. Strengths of GAs The first and most important point is that genetic algorithms are intrinsically parallel. Most other algorithms are serial and can only explore the solution space to a problem in one direction at a time, and if the solution they discover turns out to be suboptimal, there is nothing to do but abandon all work previously completed and start over. However, since GAs have multiple offspring, they can explore the solution space in multiple directions at once. If one path turns out to be a dead end, they can easily eliminate it and continue work on more promising avenues, giving them a greater chance each run of finding the optimal solution. Due to the parallelism that allows them to implicitly evaluate many schemas at once, genetic algorithms are particularly well-suited to solving problems where the space of all potential solutions is truly huge - too vast to search exhaustively in any reasonable amount of time. Most problems that fall into this category are known as "nonlinear". In a linear problem, the fitness of each component is independent, so any improvement to any one part will result in an improvement of the system as a whole. Needless to say, few real-world problems are like this. Nonlinearity is the norm, where changing one component may have ripple effects on the entire system, and where multiple changes that individually are detrimental may lead to much greater improvements in fitness when combined. Genetic Algorithms cater to such requirements and problems quite easily.

They perform well in problems for which the fitness landscape is complex - ones where the fitness function is discontinuous, noisy, changes over time, or has many local optima. Most practical problems have a vast solution space, impossible to search exhaustively; the challenge then becomes how to avoid the local optima - solutions that are better than all the others that are similar to them, but that are not as good as different ones elsewhere in the solution space. Many search algorithms can become trapped by local optima: if they reach the top of a hill on the fitness

53 landscape, they will discover that no better solutions exist nearby and conclude that they have reached the best one, even though higher peaks exist elsewhere on the map. Genetic algorithms, on the other hand, have proven to be effective at escaping local optima and discovering the global optimum in even a very rugged and complex fitness landscape. All four of GA's major components: parallelism, selection, mutation, and crossover, work together to accomplish this. In the beginning, the GA generates a diverse initial population, casting a "net" over the fitness landscape. Small mutations enable each individual to explore its immediate neighborhood, while selection focuses progress, guiding the algorithm's offspring uphill to more promising parts of the solution space. However, crossover is the key element that distinguishes genetic algorithms from other methods . Without crossover, each individual solution is on its own, exploring the search space in its immediate vicinity without reference to what other individuals may have discovered. However, with crossover in place, there is a transfer of information between successful candidates individuals can benefit from what others have learned, and schemata can be mixed and combined, with the potential to produce an offspring that has the strengths of both its parents and the weaknesses of neither. Limitations of GAs

The first, and most important, consideration in creating a genetic algorithm is defining a representation for the problem. The language used to specify candidate solutions must be robust; i.e., it must be able to tolerate random changes such that fatal errors or nonsense do not consistently result. There are two main ways of achieving this. The first, which is used by most genetic algorithms, is to define individuals as lists of numbers - binary-valued, integer-valued, or real-valued - where each number represents some aspect of a candidate solution. If the individuals are binary strings, 0 or 1 could stand for the absence or presence of a given feature. If they are lists of numbers, these numbers could represent many different things. Mutation then entails changing these numbers, flipping bits or adding or subtracting random values. In this case, the actual program code does not change; the code is what manages the simulation and keeps track of the individuals, evaluating their fitness and perhaps ensuring that only values realistic and possible for the given problem result. The problem of how to write the fitness function must be carefully considered so that higher fitness is attainable and actually does equate to a better solution for the given problem. If the fitness function is chosen poorly or defined imprecisely, the genetic algorithm may be unable to find a solution to the problem, or may end up solving the wrong problem. In addition to making a good choice of fitness function, the other parameters of a GA - the size of the population, the rate of mutation and crossover, the type and strength of selection - must be also chosen with care. If the population size is too small, the genetic algorithm may not explore enough of the solution space to consistently find good solutions. If the rate of genetic change is too high or the selection scheme is chosen poorly, beneficial schema may be disrupted and the population may enter error catastrophe, changing too fast for selection to ever bring about convergence. One well-known problem that can occur with a GA is known as premature convergence. If an individual that is more fit than most of its competitors emerges early on in the course of the run, it

54 may reproduce so abundantly that it drives down the population's diversity too soon, leading the algorithm to converge on the local optimum being represented by that individual rather than searching the fitness landscape thoroughly enough to find the global optimum. To deal with this problem, control the strength of selection, so as not to give excessively fit individuals too great of an advantage.

It is advised against using genetic algorithms on analytically solvable problems. It is not that genetic algorithms cannot find good solutions to such problems; it is merely that traditional analytic methods take much less time and computational effort than GAs and, unlike GAs, are usually mathematically guaranteed to deliver the one exact solution.

55

7. Computing Hardware
PC Hardware
A computer is a necessary part of a machine-vision system. The computer, during the entire process, has to communicate with motion and process control systems to be effective. Physically, this communication goes through digital inputs and outputs, RS-232 lines, or Ethernet. When communicating with motioncontrol hardware, the computer typically uses standard protocols. The basic steps are: Computer loads variables in the motion-control hardware with the coordinates of a part to pick up. Computer signals a change of state to the motion-control hardware by setting a flag in another variable. Motion-control hardware instructs the robot to move and signals success by setting a flag variable in the computer. This coordination between the vision system and the process or motion controller can range from defective part removal or adjusting some aspect of a process, to sophisticated interactions between these component systems In general, the faster the PC, the less time the vision system will need to process each image. The vibration, dust, and heat often found in manufacturing environments frequently require the use of an industrial-grade or ruggedized PC. Software processes incoming image data and makes pass/fail decisions. Usually, a digital I/O interface board and/or network card makes up the interfacing through which the machine-vision system communicates with other machines, systems, and databases to pass on pass/fail decisions and control action decisions. Latency Latency is the time between an activitys initiation and its result. While a system has an overall activity, from trigger event to output signal, the system itself is comprised of a number of internal events usually occurring serially. Each latency in the system has some degree of uncertainty ranging, in our technology, from a few nanoseconds to several seconds. A systems latency is comprised of the latencies of the individual processes within the system. Thus, all these points should be kept in mind while deciding the hardware for the system. Processing Architecture The majority of vision systems use a single processor; usually a PC, but many high-speed applications are addressed with multiple processors; operating in either a SIMD (Single Instruction Multiple Data) or a MIMD (Multiple Instruction Multiple Data) mode. Processor Speed Processor speed is very important. While the CPUs clock speed is a major determinant of speed, the architecture of the circuits surrounding the CPU can have significant effect on the actual processing speed as does the clock speed. The amount and speed of memory will have an effect on the processors speed of execution. The speed of processors is often judged by the time it takes to complete one or a series of benchmark algorithms. For this to be meaningful, the system developer needs to use benchmarks that have similar processing characteristics to those the application requires. An application environment that uses extensive correlation or convolution algorithms and only basic I/O will be well served by a DSP. One requirement of a high-speed or a real-time vision system using a computer is that concurrent tasks

56 performed by the computer do not add to the latency of the vision task. In most implementations, this is accomplished simply by having the processor not perform any other task than the primary vision task. In more sophisticated implementations using real-time multi-tasking operating systems, there can be concurrent tasks running at a priority lower than the vision task. Instruction Set A deterministic instruction is one that executes in the same time regardless of the data. Such an instruction set is favorable for a predictable system. The most frequently used instructions in a real-time system need to be deterministic to insure a minimum variation in latency. Achieving a deterministic instruction set is only practical in embedded processors designed specifically to provide this characteristic. In addition, operating system will insert delays in the execution of the program to handle task scheduling, interrupt processing, and virtual memory management. An additional consideration in instruction sets is any provision for parallel calculations. Single Processor

Figure 7.1: Single Processor System Architecture

The most prevalent processor used for machine vision and scientific imaging is the PC. This is due to its very high performance to cost factor, availability of peripherals, availability of programming and engineering talent that is familiar with the hardware and software environment. However, even with its power, the PC does not have the processing power to handle many high-speed applications. Multi-Processor

Figure 7.1: Multi-Processor System Architecture

A multi-processor design may be used either to gain more raw processing power or to use different processors, each for their individual strengths. Perhaps the most common approach to multi-processor image processing is to use an embedded processor; a DSP, Intel Pentium, or Power PC being the most common, within a PC. The embedded

57 processor, which does not have the system overhead of the main CPU, performs the data intensive image processing like image transforms, correlation, and convolution. The CPU may be used to make a final decision and to control the outputs. The embedded processor may be on a separate board inside the PC and rely on image data being transferred over the computers bus. It may be on a separate board with a private data interconnect to the frame grabber, or it may be integrated onto the frame grabber itself. Another common approach to gaining processing power for image processing is to use ASICs. These integrated circuits process image data in hardware, and are very fast. They can be designed to provide any number of image processing functions, but they are usually not programmable by the user; the available functions are determined by the design of the ASIC or its programming at the factory. The most common functions available in ASIC processing are look-up tables (LUT) and certain convolutions. Still another common approach to multi-processor image processing is to use a PC with multiple CPUs; typically either two or four. This multiplies the available processing power almost by the number of CPUs available. The operating system still requires resources from one processor, but a portion of that processor and the resources of the others are available for image processing. The successful use of such a multiprocessor environment requires an appropriate operating system that supports multi-processors, application software written to run in a multi-processor environment, and careful attention to competition for common resources. Operating System The operating system or kernel manages system resources. Most computers use a powerful, generalpurpose operating system which provides a very rich array of capabilities to simplify the development of applications, but is not particularly high-speed, and is definitely not real-time. By installing only the absolute minimum part of the operating system and carefully avoiding any task, like disk accesses, which can take up significant portions of the processors resources, the operating system can perform ade quately for some high-speed or real-time applications. Some applications use a real time operating system (RTOS). Real-time versions or extensions are becoming available for the general-purpose operating systems. What distinguishes a real-time operating system or kernel is its facilities to support timeliness in applications running under them. These features include preemptive interrupt processing with guaranteed latencies, and the ability to prioritize software tasks or threads so that time critical tasks get executed before more subsidiary or secondary tasks. Device drivers for special devices like the frame grabber come either from the frame grabber supplier or from a supplier of an application program or library. II. Embedded Hardware It generally implies use of embedded computers in place of standard PCs in machine vision systems. Embedded computers are originally interfaced with the world through push buttons and light-emitting diodes (LEDs). The traditional embedded device processes simple inputoutput (I/O), whereas the modern embedded device processes multimedia, including still images, video, and audio, with concurrent interfaces. The same case applies to machine vision systems also. A general machine vision system incorporating embedded hardware can be shown as:


Figure 7.3: Embedded Hardware based Machine Vision System.

Embedded Computer Architecture

The embedded computer system consists of five nodes, which are fully connected by ten dual-port memories DPij as shown in Figure 7.4. A high-performance interconnection network provides efficient inter-processor communication. There are three compute processor cores for computationally intensive tasks and two I/O processor cores (IPs) for real-time control. The IP processors support fault tolerance and recovery from failures in addition to task scheduling and load balancing. Both IP processors are capable of serving as system controller; however, at a given time, one of them is designated as the controller for system monitoring. The other IP monitors the designated controller and takes over the role of controller when the designated controller fails. The system nodes are self-checking and, in the case of a failure, they isolate themselves. A watchdog timer detects a node failure, while failure of DP memories and their interfaces is detected during message transfer. The processor interconnection network provides alternate routes for inter-processor communication. A number of fault-tolerant strategies are used for fault detection, containment and system recovery. The failure of node components is handled as a single fault. The interrupting capabilities of a faulty processor are disabled and its access to DP memories is also inhibited. The system controller broadcasts the failure to healthy nodes and invokes a diagnostic process. The failed node is put back into service after a transient fault; otherwise, it is kept out of the system and its tasks are re-scheduled. A node with a faulty program memory is utilized in a degraded mode by executing its critical tasks from DP memory blocks. The processor architecture supports various software and hardware fault-tolerance configurations. The three compute processors can be connected in TMR (triple modular redundant) mode while one of the IP processors serves as a voter.

Figure 7.4: Embedded Processor Architecture
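As a rough illustration of the TMR voting idea described above (this is our own minimal sketch in Python, not the CEERI design; the function name and data types are assumptions), a majority voter compares the three compute results and flags any disagreeing node:

```python
from collections import Counter

def tmr_vote(results):
    """Majority-vote over three redundant compute results.

    results: list of three values produced by the three compute processors.
    Returns (voted_value, faulty_indices), where faulty_indices lists the
    positions that disagreed with the majority (empty if all agree).
    """
    counts = Counter(results)
    voted_value, votes = counts.most_common(1)[0]
    if votes < 2:
        # No majority: the fault cannot be masked; escalate to the system controller.
        raise RuntimeError("TMR voter: all three results disagree")
    faulty = [i for i, r in enumerate(results) if r != voted_value]
    return voted_value, faulty

# Example: processor 2 returns a corrupted value and is outvoted.
value, faulty = tmr_vote([42, 42, 17])
print(value, faulty)   # -> 42 [2]
```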


III. Smart Cameras


A smart camera is defined as a vision system which, in addition to image capture circuitry, is capable of extracting application-specific information from the captured images, along with generating event descriptions or making decisions that are used in an intelligent and automated system. So to say, a smart camera is a camera that not only captures but also understands images. Three common features are essential for a camera to be considered smart:

- Integration of some key functions into the device (e.g., optics, illumination, image capture, and image processing).
- Use of a processor and software in order to achieve computational intelligence at some level.
- The ability to perform multiple applications without requiring manual actions.

The idea behind smart cameras is to make something small enough to be positioned near rapidly moving machinery or on assemblies that themselves move within an automated system. Moreover, smart cameras carry enough software to serve as a complete vision system, so they can be deployed throughout a production line to detect problems and defects as they arise.

Components

The components of a typical smart camera include:

1. Sensors: Image detection equipment, such as a charge-coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) sensor, that converts lens projections into a voltage sequence, which can then be digitized or stored in memory.
2. Digitization circuit: A conversion device that maps a set of points onto an image and translates them into pixels to create a digital representation.
3. Central processing unit: A CPU, or in some cases a digital signal processor (DSP), that executes algorithmic programs for interpreting a digital image code.
4. Storage hardware: Primary and secondary memory, such as RAM or Flash, used to run CPU programs, or to record and store images for future use.
5. Communication technology: A method for connecting cameras to external devices. An Ethernet or RS-232 signal transmits encoded images to a computer for analysis, or delivers instructions to reactive equipment.
6. Lighting device/LED: An illumination apparatus, internal to the camera, for clearer image capture.

Some types may incorporate all of the listed components, while others retain only the sensors, digitization circuitry, and communication interface necessary for supporting a larger machine vision system. Some types may even include components not present in this list, as per specialized requirements.

Figure 7.5: Typical Smart Camera Internal Structure

Comparing Approaches

It is interesting to compare the smart-camera approach to industrial stand-alone cameras used in conjunction with PCs. In the latter case, cameras typically connect to frame grabbers on PCs through FireWire, Camera Link, USB, or some other broadband connection standard. But bandwidth limitations of these connections can pose problems in high-speed applications. The speed of the connection may be such that the camera must first compress an image before sending it back to the frame grabber. At the PC, the frame grabber decompresses and manages the image. The compress/decompress cycle necessary for each image limits the reaction time available for controlling processes or machines. The key point is that smart cameras bundle an image processor with a frame grabber, which eliminates this potential bottleneck. Close proximity of camera and frame grabber also makes possible other operations that come in handy for special imaging tasks. For example, smart cameras can set shutter time for individual frames if necessary. Similarly, close proximity between the camera and the DSP handling the image lets Fast Fourier Transforms and other calculations common in image analysis proceed quickly.

PC vs. Smart Camera

The smart camera system is a relatively new innovation in the machine vision industry, which has mostly relied on PC-based image processing in automated operations. Initially, smart cameras had a limited capacity for interpreting images and were mainly assigned to basic projects, such as reading barcodes. But, as technological advances and processing power have progressed, smart cameras can now manage a wider range of duties and compete with PC-based systems in the industrial machine vision market. There are a number of differences between PC-based and smart camera systems. There are complex distinctions in design, capability, and cost between PCs and smart cameras, as well as between the various models in each group, but a basic comparison may be helpful in making the decision.

Figure 7.6: Size vs. Processing Capability comparison of different approaches (left) & Practical Smart Camera Specimen (right)

A brief comparison between the smart camera and PC-based approaches is given below:

Smart Camera
- Costs less, due to fewer moving parts and less temperature management; compact; easier to maintain since it requires less specialist knowledge; easier to integrate with an existing automation system.
- Usually has less processing capability than a PC-based system, which limits the range and number of tasks it can handle.
- Because the raw video stream no longer needs to comply with the camera's output bandwidth, sensors with higher spatial or temporal resolutions can be used: owing to the very close spatial proximity between sensor and processing means, higher bandwidth can be achieved more easily.
- Often uses proprietary hardware, making replacement or modification of parts a challenging process. Coupled with the compact design, this severely limits the degree to which smart cameras can be upgraded.

PC Based
- The system developed around it can become bulky or overly complex, and usually requires interfaces for each component in the system. Integrating such a network into a manufacturing process can be a challenging task, and may require advanced computer knowledge to install. Less durable, with more wear-and-tear maintenance.
- Generally has greater processing power and is capable of handling complex operations at relatively high speed.
- A conventional vision system (camera + PC) has the major drawback that transmitting the raw video stream in full spatial resolution at full frame rate to the external PC, for doing the whole processing there, can easily exceed today's networking bandwidths.
- Most PCs are upgradeable and can have components swapped with relative ease. This versatility makes a PC system highly customizable, as it can have newer or more application-specific hardware installed to specialize on a certain task, or have its general range of functions expanded.

Smart Camera Applications

Industrial Quality Control

In industrial production, manufacturers often use smart cameras for inspection and quality assurance purposes. A smart camera can be programmed to detect structural or component flaws, missing parts, defective or deformed pieces, and other deviations from an intended design. Smart cameras are also used for industrial measuring: using its sensors, the camera can determine and record a component's physical dimensions without making direct contact.

Code Reading and Identification

Smart cameras can read low-contrast, direct-part-marked codes and two-dimensional codes, whereas laser scanners cannot. Code reading and authentication require less processing capacity than product inspection, so relatively simple smart camera models can perform such operations. Optical character recognition is a more complex form of code reading that requires smart cameras to identify typewritten text. The rate of authentication may be slower than that of barcode reading, but with adequate processing power, a smart camera can analyze text to a high degree of accuracy.

A smart camera can provide movement correction and repositioning data when working in conjunction with an automated tool.

Other Applications

Some machine vision systems form a visual sensor network, which uses multiple smart cameras positioned at specific locations to capture images of a single object or area from several angles. This method is applied under circumstances in which numerous images fused together are more useful than the individual image each camera obtains. Sensor networks can effectively monitor environmental conditions, track objects in motion, or simulate three-dimensional representations of images. Other applications include:

- unattended surveillance (detection of intruders, fire or smoke detection)
- biometric recognition and access control (face, fingerprint, iris recognition)
- visual sensor networks
- robot guidance


8. Applications of Machine Vision Based Inspection Systems


I. Machine Vision-based System for Inspecting Painted Slates

Slate manufacturing is a highly automated process. The aims of using automated machine vision inspection are to classify products for quality so that defective units may be rejected, to measure some properties of the product with a view to controlling the production process, and to gather statistics on the efficiency of the production process. Diffuse or collimated lighting is usually used in combination with a colour camera to detect cracks or holes in the slate surface. The second major component of the inspection system is the image processing algorithm that is applied to detect the visual defects. Grey-level intensity histograms can be used to identify whether the unit being inspected is defective. More complex algorithms are required to inspect multi-coloured and textured surfaces, and this is a limiting factor in the use of this technique for slate inspection.

Defects in Slates

Slates have a rectangular shape; their top surfaces are painted black or grey and have a high-gloss finish. The painted surface is nominally flat. The slate surface defects are broadly classified into two categories: a) substrate defects, which include incomplete slate, lumps, depressions and template marks; b) paint faults, which include no paint, insufficient paint, paint droplets, efflorescence, paint debris and orange peel. These defects can have arbitrary shapes and their sizes range from sub-millimetre to hundreds of square millimetres. A selection of representative defects is shown in Figure 8.1, where part (a) shows a defect-free (reference) image section.

Figure 8.1: Representative defects found on the slate surface

The Inspection System

The long belt conveyor transports the slates to the inspection line at speeds in the range 15-50 m/min. The slate is illuminated using a 762 mm wide fibre-optic line light where the light is collimated by a cylindrical lens. The sensing device is a 2k-pixel line-scan camera fitted with a lens. The line-scan camera operates at a scan frequency of 2.5 kHz. A micro-positioner is attached to the camera to facilitate fine adjustment of the camera view line. A frame grabber is used to interface the camera to the PC.

The slate is aligned by a guide placed on one side of the conveyor, and an optical proximity sensor triggers the image capture immediately prior to the arrival of the slate at the inspection line. When the image transfer is complete, the slate image is processed using the image-processing routines.

Figure 8.2: Slate Inspection System

Illumination set-up

The strategy used to image the slate is based on the strong reflecting properties of the slate's surface. Light incident on the slate will be reflected, and the angle of reflection is equal to the angle of incidence if the slate surface is acceptable. Paint defects have reduced gloss levels, except for paint droplets, which may have a higher gloss level than slate areas of acceptable quality. Substrate defects are associated with uneven slate surfaces that are generated by an incorrect drying/formation process. For this implementation, a collimated light source is used so that changes in the ambient light have a negligible influence on the light reflected by the slate back to the camera. As a mechanical solution to force the slate into a uniform flat position is not feasible, the lens that is used to collimate the light is defocused. This produces a wider band of light and reduces collimation. The resulting reduction in light intensity is compensated by using the spare capacity in the lamp controller. The illumination set-up comprises two lamp controllers, a fibre-optic light line and a cylindrical lens.

Segmentation of slate image from background
Figure 8.3: Illumination for slate surface

The first step of the image processing procedure involves the identification of slate boundaries facilitated by cutting slots in the conveyor base and ensuring the belt width is less than that of the slate. The light intensity signal arriving at the sensing device from these slots is close to the camera black level when no slate is present and a sharp signal transition is sensed when a slate arrives in the field of view. Therefore, a simple threshold operation is sufficient to identify the slate edges. Corners are located by tracking the horizontal and vertical edge lines to their end positions. To verify if the slate is defect free, initially, a straight line between the detected corner positions is drawn and the inspection start locations are set relative to this line. Rotation is accounted for using the equation of a straight line. After this operation, the slate is segmented from the background and the algorithm can be applied.

Image processing algorithm

The mean grey level of successive reference slates can vary by up to 20 grey levels. These variations are not generated by imperfections in the optical or sensing equipment but rather by acceptable variation in slate surface colour. As such, the image processing algorithm has to accommodate these variations. In order to identify the computational components of the inspection algorithm, the grey-level histograms of slate images containing paint and substrate defects were investigated. This enabled a number of conclusions to be drawn: a) the grey-level mean is different for each reference slate; b) the limit of this variation can be determined by experimentation; c) the average grey-level value of the defect pixels is generally lower than the average grey-level value of the background pixels. The devised algorithm is shown in Figure 8.4.

The algorithm processes the full slate image sub-section by sub-section: each sub-section is passed through the image processing algorithm; if a defect is found the slate is rejected, otherwise the next sub-section is processed until the final sub-section is reached, at which point the slate is accepted.

Figure 8.4: Slate defect detection block diagram
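A minimal sketch of this per-sub-section check, under the assumption that a sub-section is flagged as defective when its mean grey level falls sufficiently below the slate's own reference mean (the NumPy implementation, sub-section size and threshold value are illustrative, not the exact published routine):

```python
import numpy as np

def inspect_slate(slate_img, sub_size=(64, 64), drop_threshold=20):
    """Scan a grey-level slate image sub-section by sub-section.

    A sub-section is treated as defective when its mean grey level is more
    than `drop_threshold` grey levels below the whole-slate mean, reflecting
    the observation that defect pixels are generally darker than background.
    Returns "Reject" on the first defective sub-section, else "Accept".
    """
    slate_mean = slate_img.mean()          # adapts to slate-to-slate colour variation
    rows, cols = slate_img.shape
    sh, sw = sub_size
    for r in range(0, rows - sh + 1, sh):
        for c in range(0, cols - sw + 1, sw):
            sub = slate_img[r:r + sh, c:c + sw]
            if sub.mean() < slate_mean - drop_threshold:
                return "Reject"
    return "Accept"

# Example with a synthetic slate containing one dark (defective) patch.
slate = np.full((256, 256), 180, dtype=np.uint8)
slate[64:128, 64:128] = 100
print(inspect_slate(slate))   # -> Reject
```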

Challenges introduced by factory conditions

a) Effect of depth profile variation and vibration on edge detection

If the slate is affected by depth profile variations, the captured image follows the curve of the slate and false rejections occur when there are large differences between the assumed straight edge and the imaged curved edge (Figure 8.5). An example of false defects caused by the edge checking method is shown in Figure 8.6 below.

Figure 8.5: False triggers generated by the slate's depth profile variation

Figure 8.6: False defects caused by edge checking method as indicated by bold boxes and true positive defects by the faint ones.

In order to address this problem, the slate's edge is divided into shorter segments of 30 mm length and the location of these short edge segments is verified. The image sub-sections that will be analyzed by the inspection algorithm are positioned adjacent to these detected short segments. This avoids the inclusion of background information in the image sub-section that may cause false triggers when the image data is analyzed by the inspection algorithm.

b) Effect of variation in speed profile

The constancy of conveyor speed is measured by imaging the same slate several times and by imaging slates in succession. The change in image resolution in the moving direction is very small (approx. 0.8 per cent) and does not have any negative effect on the inspection results.

PC Platform

The computer is a key element of the system. For inspection-type applications, in general, the faster the PC, the less time the system will need to process each image. Due to the vibration, dust, and heat often found in manufacturing environments, an industrial-grade or rugged PC is often required.

Conclusion

The success rate for correct identification of acceptable and defective slates is 99.32 per cent for defect-free slates, based on 148 samples, and 96.91 per cent for defective slates, based on 162 samples.

II. Machine Vision in Solar Cell Manufacturing

Automated optical inspection (AOI) systems, because of their aid in error detection and process control, are growing rapidly as the solar industry reaches a stage of maturity in which machine vision undergoes mass adoption on the production floor.

Figure 8.7: Wafer inspector (left) uses NIR backlighting to detect and measure micro-cracks. The system incorporates line-scan cameras, which interface via Camera Link to frame grabbers installed in high-speed image-processing PCs. The PCs run material-handling and image-processing software. Standard encoder-based image acquisition along with continuous light and speed control ensures consistent image quality. Parallel processing is required to achieve one-wafer-per-second throughput (right).

An incoming wafer inspection system looks at the surface conditions and geometry of the wafer, checking for distortion, chips, saw marks, contamination, and obvious cracks.

It also looks for micro-cracks, which are especially problematic for solar cell production because they can result in a wafer shattering during the following steps. Shattering not only results in yield reduction but can shut down the production line for the time needed to remove the debris. Other AOI opportunities include inspecting antireflective coatings, printed metallization layers, and a final operational inspection of the finished cells that takes advantage of the solar cell's electro-optical characteristics.

Illumination setup

The inspection unit mounts into a light shield shaped like a truncated pyramid mounted over the conveyor line carrying wafers (Figure 8.7). The camera mounts in the upper portion, and a row of LED lights on either side of the conveyor illuminates wafers as they come by. The conveyor runs at a constant speed, with strobed LEDs being used. The inspection unit is mounted 1 inch above the conveyor. This allows the light shield to exclude most ambient light while allowing wafers to move freely on the conveyor. The LED light lines are positioned so that specular reflections from the wafer surfaces miss the camera aperture and are absorbed by the light shield's inner surface (Figure 8.8). It is advantageous that the system use a Gigabit Ethernet camera connected to the vision controller.

Figure 8.8: Placing lines of LEDs within a light shield illuminates conductive ink traces while avoiding specular reflections.

Camera and Imaging

a) Visible imaging

Either line-scan or area-scan cameras can be used depending on the system design. In the geometry and surface inspection stages, for instance, visible light and a monochrome line-scan or TDI camera will suffice. These types of cameras allow the inspection to occur in line, as the wafer moves along the production flow. In addition, anti-blooming or auto-exposure features are desirable to avoid image saturation. Inspection of the antireflection coatings can also use visible light, but color camera capability is required. Under white light, the antireflective coating on the solar cell will appear blue because longer wavelengths transmit more efficiently while the shorter wavelengths still experience reflection. The exact hue and color saturation of the reflected light depends on the coating thickness. Two-camera systems can be used, providing front-side and backside inspection simultaneously.

b) Electroluminescence

This step takes advantage of the solar cell's electro-optical characteristics, which allow the cell to generate luminescence for imaging. This step can pinpoint micro-cracks and other production-induced defects in the finished cell/panel that can cause early failure but might not be detectable via conventional electrical testing. Eliminating defective cells at final inspection can ensure that solar panels fabricated from the remaining cells have a product lifetime exceeding 20 years. The EL process has the additional attribute that the amount of light a cell generates for a given applied current can serve as a measure of the solar cell's conversion efficiency. This means the final inspection step can not only detect defects, it can help sort and grade finished cells by their output characteristics. The results can also assist in process control as well as in matching cells for compatibility in a solar panel assembly.

c) Near-infrared imaging using backlight

One of the most demanding AOI challenges is during the incoming inspection stage: checking for micro-cracks that can occur during crystal growth and wafer sawing. However, the micro-cracks can be too small (less than 5 µm wide) to be seen during a typical surface inspection. Furthermore, cracks are difficult to detect using surface illumination because there is almost no contrast between the reflections of the crack and the surrounding silicon. Instead, this inspection requires the use of backlighting. Because the wafer is only 150-200 µm thick and silicon is semitransparent in the near-infrared (NIR) spectrum, LED backlighting using 850-950 nm wavelengths will provide enough illumination to obtain useful images. Under these backlighting conditions, a crack will scatter light and create a dark line against a light background that is readily detectable (Figure 8.9). The challenge is that CCD image sensors lose quantum efficiency (QE) in NIR wavelengths, resulting in a relatively weak signal. Moreover, camera systems vary considerably in their NIR sensitivity: some cameras exhibit as much as 30-40% QE at 900 nm while others can be much lower. Therefore, sensor and camera selection are critical in developing an AOI system for backlit micro-crack inspection. The shorter the exposure time that the NIR backlight allows, the faster the production line can run at this step.

Figure 8.9: Micro-cracks in polysilicon wafers can be captured using NIR backlighting with sufficient sensitivity at those wavelengths.
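A minimal sketch of detecting such dark crack lines in a backlit NIR image (the OpenCV-based approach, block size and length thresholds here are illustrative assumptions, not the vendor's actual algorithm):

```python
import cv2

def find_cracks(nir_img, min_length_px=30):
    """Locate dark, line-like cracks in a bright NIR backlit wafer image.

    nir_img: 8-bit greyscale image in which the intact wafer appears bright
    and a crack scatters light away from the sensor, appearing as a dark line.
    Returns a list of bounding boxes (x, y, w, h) of candidate cracks.
    """
    # Invert so cracks become bright, then threshold adaptively to cope
    # with mild illumination non-uniformity across the wafer.
    inverted = cv2.bitwise_not(nir_img)
    binary = cv2.adaptiveThreshold(inverted, 255,
                                   cv2.ADAPTIVE_THRESH_MEAN_C,
                                   cv2.THRESH_BINARY, 31, -10)
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    cracks = []
    for cnt in contours:
        x, y, w, h = cv2.boundingRect(cnt)
        # A crack is long in at least one direction but remains thin,
        # which separates it from compact noise blobs.
        if max(w, h) >= min_length_px and min(w, h) < max(w, h) / 3:
            cracks.append((x, y, w, h))
    return cracks
```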

Machine Vision Based Automated Inspection

The machine-vision-based automated inspection system and the robotic wafer handler sit between the screen-printing station and bake oven on the electrode-printing production line (see Figure 8.11). The final step, adding the electrodes, is needed to channel electricity out of the wafer. Once wafers leave the oven, they need only a clear conformal coating to protect the active layers from environmental effects before shipping to customers.

Figure 8.10: Robotic wafer-handling station uses grippers such as suction cups for sorting.

Figure 8.11: Combining machine vision and robotics for automated inspection before baking electrodes onto wafers improves process yield.


Wafer losses at this point are maximally expensive, so high yield is critical. The unit is a PC-based vision controller running the company's Automation Control Environment (ACE) and machine-vision software. ACE is an integrated, point-and-click development environment with configuration tools and basic programming features for robotics applications. The twin requirements of automated inspection and robot guidance call for different analysis algorithms applied to the same image. As the wafer moves on the conveyor, the first step is to acquire an image from the camera and upload it over the Gigabit Ethernet connection to the vision controller for processing.

Figure 8.12: The machine-vision software locates a wafer with a refined center.

Thresholding and edge finding bring out the wafer outline as well as highlighting the screen-printed electrodes. Inspection tasks are broken into three areas:

a) Print inspection: ensures the electrode edges are crisp, form the correct pattern, and are properly registered with the wafer edges.

b) Inspection for chips: ensures that the wafer outline does not vary from the correct size and shape: all edges must be straight, dimensions must be correct, and the corner chamfers must also be correct.

c) Pattern recognition: pattern-recognition algorithms search for anything out of the ordinary such as cracks or broken ink lines, while linear measurement algorithms verify correct placement of the ink patterns that are supposed to be there.
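A rough sketch of the thresholding and edge-finding step just described, checking edge straightness by comparing the raw wafer outline to a fitted polygon (the OpenCV/NumPy usage and deviation tolerance are our illustrative assumptions, not the ACE implementation):

```python
import cv2

def wafer_outline_and_straightness(gray, max_dev_px=2.0):
    """Threshold a wafer image (assumed bright wafer on dark background),
    extract its outline, and measure how far it deviates from straight sides.

    Returns (polygon, max_deviation) where max_deviation is the largest
    distance in pixels between the raw contour and its polygonal fit;
    large deviations indicate chips or waviness along the edges.
    """
    _, binary = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_NONE)
    wafer = max(contours, key=cv2.contourArea)          # largest blob = wafer
    # Approximate the outline with a polygon; a good wafer needs only a few
    # vertices (a square with chamfered corners).
    polygon = cv2.approxPolyDP(wafer, max_dev_px, True)
    deviations = []
    for pt in wafer[::10]:                              # sample every 10th point
        p = (float(pt[0][0]), float(pt[0][1]))
        deviations.append(abs(cv2.pointPolygonTest(polygon, p, True)))
    return polygon, max(deviations)
```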

Wafer handling

The wafer-handling station following the inspection uses a robot with an unusual robotic gripper and coordinates with the vision system to automate both the inspection and handling of the wafers.

Figure 8.13: Robots pick wafers for inspection in the pre-etch inspection, sorting, and packaging machine. Vision-guided robotic systems inspect and package solar cell wafers.

System design

On the front end, the system is required to pick wafers out of boxes and move them to a camera station to inspect the wafers for imperfections such as cosmetic defects, edge defects, and cracks. It also needs to determine the x-y and theta position of the wafers when placing them on the conveyor (see Figure 8.14).


Figure 8.14: To inspect solar cell wafers, two robots unload boxes, place wafers on a conveyor after inspection by a smart camera for sorting and acid wash, and then load them into boxes after a final inspection

The robot lifts an individual wafer from a box, moves it over the camera, which is located below the conveyor; LED lights on the robot provide additional backlighting for the camera. The wafers are visually inspected; rejects are set aside. The software records the processing data for statistical process control (SPC) recall and evaluation. Reasons for rejects are noted. If they pass inspection, wafers are placed on the conveyor to be moved through the acid wet bench. The system uses a smart camera with a processor and has 64 Mbytes of RAM and 16 Mbytes of flash memory. The CCD camera operates at 75 frames/second and supports partial frame acquisition. The smart camera communicates through Ethernet and RS-232. Vision tool support includes identification (modeling), readers, flaw detection, and application-specific scripting. The camera can analyze for odd shapes and inspect for multiple and overlapping parts in a range of light conditions, which means it is suitable for robotic guidance.

Figure 8.15: The operator interface is a customized GUI that is standard on all plant equipment.

Conclusion

The operator interface should be standard on all machines, so that any operator can go to any machine in the plant and know how to use it, as shown in Figure 8.15.


9. Projects on Machine Vision Based Inspection Systems at CEERI


I. MV Based Biscuit Inspection System

Like any other manufacturing process, quality evaluation and sorting are two essential operations performed routinely in biscuit production. Among the many tests that need to be carried out on biscuits is the measurement of colour, as colour indicates both quality and defects. Colour is also an important guide, since biscuits appear more appetizing when their appearance is optimized. Obtaining identical biscuits is difficult even during a short baking period. This is due to the complexity of the baking process, where biochemical reactions and physical transformations give rise to biscuits with different shades of colour.

Figure 9.1: Example of non-touching and touching biscuits. In both cases, top left is under-baked, top right is moderately baked, bottom left is over-baked and bottom right is substantially over-baked.

Materials and methods

Biscuit colour grading

Colour quality control in most biscuit producers involves human inspectors, while some rely on a more objective measurement using a colorimeter. However, this equipment is not suitable when the sample has colour variations. The biscuits are categorized into four distinct groups reflecting four degrees of baking: under-baked, moderately baked, over-baked and substantially over-baked. It can be seen in Figure 9.1 that the colour is not uniformly distributed. This is due to the temperature variation inside the oven, causing colours of biscuits to appear darker in some regions. Therefore, the challenge for the image processing software is to use this information as the basis for colour inspection.

Machine vision system

Figure 9.2: Schematic diagram of biscuit machine vision inspection system

Figure 9.3: Biscuit Inspection system

The hardware can be a workstation equipped with a colour frame grabber, an illumination system, cabling, a charge-coupled device camera, and a conveyor. The frame grabber is mounted into the workstation. A high-quality 3-CCD camera was used as the image capturing device with a sustainable speed of 25 frames per second, captured at a spatial resolution of 640 x 480 pixels. The camera comes with a standard C-mount type optical lens connected to a frame grabber. The station is illuminated using a white ultra-high-frequency fluorescent ring light. The light bulb was fitted with a continuous light intensity control which allowed 10-100% intensity adjustment. A standard conveyor belt with adjustable speed was used in this study to simulate moving object detection. The schematic diagram of the biscuit inspection system showing all essential elements is depicted in Figure 9.2. The proposed colour inspection system is divided into two main steps: a) Pre-processing: the pre-processing step includes calibration of the machine vision system and the image processing part. In this work the machine vision was calibrated using four colour samples, corresponding to red, green, blue, and yellow colour standards respectively. The image processing part involves image acquisition and smoothing, RGB to HLS colour transformation, and image segmentation. b) Post-processing: the post-processing step includes dimensionality reduction of the segmented objects and classification of the objects.

Colour space transformation

The images of biscuits were taken by the CCD camera and represented in the three-dimensional RGB colour space. Unfortunately, the RGB colour space used in computer graphics is device dependent, i.e. it is designed for specific devices such as the cathode-ray tube (CRT) display. Hence, RGB colour has no accurate definition for a human observer. Therefore, an effective approach is to transform the RGB information sensed by the machine vision system to the Hue-Lightness-Saturation (HLS) colour space. The HLS space was selected since it defines colour not only in the sense of perceptual uniformity but, more significantly, matches the way that humans perceive colour.
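As a small illustration of this transformation (Python's standard colorsys module is used here purely as an example; the report does not specify the library actually used):

```python
import colorsys
import numpy as np

def rgb_image_to_hls(rgb_img):
    """Convert an 8-bit RGB image (H x W x 3) to HLS, each channel in [0, 1].

    The hue channel then indicates the shade of baking (e.g. pale vs. brown
    regions) largely independently of overall brightness.
    """
    rgb = rgb_img.astype(np.float32) / 255.0
    hls = np.empty_like(rgb)
    for row in range(rgb.shape[0]):          # simple per-pixel loop for clarity
        for col in range(rgb.shape[1]):
            r, g, b = rgb[row, col]
            hls[row, col] = colorsys.rgb_to_hls(r, g, b)   # (hue, lightness, saturation)
    return hls

# Example: a single golden-brown (well-baked) pixel.
print(colorsys.rgb_to_hls(200 / 255, 150 / 255, 80 / 255))
```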

Image segmentation and object detection

To separate an image of a biscuit from the background, a combination of auto-thresholding and watershed transformation techniques is proposed. The auto-thresholding method locates the two peaks in the histogram corresponding to the background and object of the image. In this way the biscuit images can be separated from the background. The area of the segmented image is calculated in order to distinguish between touching and non-touching biscuits. It was heuristically discovered that the surface area of a biscuit lies in the range between 39,870 and 39,990 square pixels. Therefore, it was possible to distinguish touching from non-touching cases using direct thresholding: if the area was smaller than the threshold value, the biscuits were considered non-touching. The watershed transformation considers the gradient magnitude of an image as a topographic surface. If the objects have well-defined edges, edge detection will produce a maximum along the edges of each object. These maxima will define each object as a catchment basin since they produce a minimum in each object. A watershed transformation will then label the catchment basins, effectively segmenting the image. After that, morphological erosion was applied to smooth the image and remove artifacts. Then, watershed and the distance transformation were implemented in order to separate touching objects and reduce noise due to overlapping. Blob analysis was then used to compute the centre of gravity of each object. Finally, image cropping was applied to the original image in order to obtain the region of interest.
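A compact sketch of this segmentation chain for touching biscuits, using OpenCV's Otsu thresholding, distance transform and watershed (this is an illustrative assumption; the exact implementation used in the project is not given in this report):

```python
import cv2
import numpy as np

def segment_biscuits(gray):
    """Separate (possibly touching) biscuits from the conveyor background.

    gray: 8-bit greyscale image. Returns a label image in which each
    biscuit carries its own integer label (boundary pixels are -1).
    """
    # Automatic (Otsu) threshold separates biscuits from background.
    _, binary = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    # Distance-transform peaks mark biscuit centres; thresholding them
    # gives seed markers even when biscuits touch.
    dist = cv2.distanceTransform(binary, cv2.DIST_L2, 5)
    _, seeds = cv2.threshold(dist, 0.6 * dist.max(), 255, cv2.THRESH_BINARY)
    seeds = seeds.astype(np.uint8)
    _, markers = cv2.connectedComponents(seeds)
    markers = markers + 1                      # reserve 0 for the 'unknown' region
    unknown = cv2.subtract(binary, seeds)
    markers[unknown == 255] = 0
    # Watershed floods from the seeds and draws boundaries between
    # touching biscuits.
    color = cv2.cvtColor(gray, cv2.COLOR_GRAY2BGR)
    markers = cv2.watershed(color, markers)
    return markers
```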

73 MV application software

The custom-designed front end of the system's software incorporates a display showing average size measurements over the last 100 biscuits in each row. To satisfy the customer's traceability demands, these statistics are tagged with the time and date and are stored for four years. The operator can easily select a set of saved system parameters from a drop-down menu to ease product changeover. All this information is available for monitoring by an engineer on a duplicate display screen located 100 meters away. Because it can deliver data via cables of up to 100 meters in length, Gigabit Ethernet is the ideal interface for applications such as this.

Figure 9.4: GUI of the software application for the image processing

Control Action

If the area of the biscuit under direct thresholding is greater than the heuristically derived value, the decision is that the biscuits are touching, and based upon this decision the robotic arm is directed to remove the piece. Based on the colour inspection algorithm output, biscuits classified as over-baked, substantially over-baked or under-baked are removed.

Conclusion

Quality evaluation of bakery products plays a major role in the food industry. Hence, the use of an automated quality inspection system in biscuit production yields not only higher production but also quality products that meet food standards.

II. MV Based Fruit Inspection System

A machine vision system inspecting fruit quality comprises two subsystems: a computer vision system and a fruit handling system. The computer vision system has two modules, namely the image processing module and the pattern recognition module. The basic steps that take place to determine the quality of a fruit are:

1. An electromechanical fruit handler places the fruit on the conveyor belt to carry it through the computer vision system to the sorting bins.
2. The computer vision system captures the image of the underlying fruit and transmits it to an image processor.
3. The processor, after processing the image, presents it to a pattern recognizer.
4. The recognizer performs the quality assessment, classifies the underlying fruit into pre-specified quality classes, and directs the sorter to route the fruit to the appropriate bin.


Figure 9.5: Layout of a computer mediated Fruit sorting system

Components

1. Image capturing chamber: The image capturing chamber is a wooden box that was painted black inside to reduce light reflection. The ceiling of the chamber was coated with reflective material to reduce the shading effect.

2. Image capturing device: Two cameras are mounted facing each other in the chamber. The cameras are mounted right under the light source for the best imaging. The size of the captured image is generally kept at 320 x 240 pixels. It is kept small for fast feature extraction and processing. A sample image is shown in Figure 9.6.

Figure 9.6: An image of a date fruit

3. Preprocessing module: A binarization threshold is estimated from the image intensity histogram. The threshold is used to convert the underlying image into a binary image. The figure below shows the binarized image of the original image along with the edges that surround the binarized regions. These edges are extracted by applying the Sobel edge operator.

Figure 9.7: Segmented image and its edges

4. Feature definition and extraction: External quality factors, i.e. features, are defined. These features are flabbiness, size, shape, intensity and defects. The quality of a fruit is determined by considering the properties, usefulness and extraction mechanism of these features.

Flabbiness

The flabbiness parameter is used to determine the date quality. The flabbiest date is considered of the best quality. The colour intensity distribution in the image is used as an estimate of flabbiness. It is observed that the image of the least flabby date is darker than that of a flabbier date.

Figure 9.8: Flabbier fruit is brighter

Size

The bigger fruit is considered of better quality. The size is estimated by calculating the area covered by the fruit image. To compute the area, the fruit image is first binarized to separate the fruit from its background. The number of pixels that cover the fruit image is counted and taken as an estimate of size.

Shape

Shape irregularity is used as a quality measure. Fruits having irregular shapes are considered of better quality. It is estimated from the outer profile of the fruit image.

Intensity

Generally, it is observed that better quality dates yield higher intensity images. The intensity is estimated in terms of the number of wrinkles, with the number of edges taken as the number of wrinkles. To determine the intensity, the image is binarized, and edges are extracted using the Sobel operator and labelled.

Defects

Bruises and bird flicks are common defects in date fruits. The bird flicks appear brighter in the image, so they are determined from the colour intensity. An estimate of the average brightness and variations in intensity of the bird-flicked area are obtained. The average brightness is thoroughly examined and the size of the bird-flicked area can be tracked and estimated.

Figure 9.9: Bird Flicks

The bruises are estimated from the shape as they generally deform the shape by tearing the fruit.
Figure 9.10: Bruises
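The features above reduce, in essence, to simple measurements on a binarized image. A minimal sketch follows (NumPy/OpenCV are chosen here for illustration; the exact thresholds and formulas are assumptions, not values from the project):

```python
import cv2
import numpy as np

def extract_date_features(gray):
    """Extract simple quality features from a greyscale image of one date fruit.

    Returns a dict with:
      size       - number of fruit pixels (area)
      flabbiness - mean intensity inside the fruit (brighter = flabbier)
      wrinkles   - count of strong edge pixels inside the fruit (intensity feature)
    """
    # Binarize to separate fruit from background (Otsu picks the threshold
    # from the intensity histogram, as described in the preprocessing step).
    _, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    fruit_pixels = gray[mask == 255]
    # Sobel edge magnitude inside the fruit approximates the wrinkle count.
    sobel_x = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
    sobel_y = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
    edges = (np.hypot(sobel_x, sobel_y) > 100) & (mask == 255)
    return {
        "size": int((mask == 255).sum()),
        "flabbiness": float(fruit_pixels.mean()) if fruit_pixels.size else 0.0,
        "wrinkles": int(edges.sum()),
    }
```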

III. MV Based Paper Dirt Analyzer

Dirt counting and dirt particle characterisation of pulp samples is an important part of quality control in pulp and paper production. Nowadays, dirt counting methods include both visual and automatic inspection. Traditional visual inspection compares dirt particles in sample sheets with example dots on a transparent sheet. Automatic inspection is based on scanner or camera image analysis methods in which dirt particles are counted from pulp sample images or a full moving paper web.
Figure 9.11: A Paper Dirt Analyzer employed in an industry

The main steps followed are:

Figure 9.12: Basic steps involved in paper dirt analyser

1. Image acquisition: The images are captured using a scanner or camera. In this step, the low-resolution images acquired by the camera or scanner are analysed and the dirt counting standards are defined.

Imaging with a digital camera

Digital cameras use a solid-state device called an image sensor. In these cameras, the image sensor is a charge-coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) sensor. On the surface of the image sensor are millions of photosensitive diodes that function as light buckets; each of them captures a single pixel in the image.

Imaging with a scanner

The basic principle of a scanner is to analyse an image: it allows a computer to convert an object into digital code to display an image. Inside the scanner is a linear CCD array composed of millions of photosensitive cells. To scan an object, a light bar moves across the object and the light is reflected to the CCD by a system of mirrors. Each cell produces an electrical signal based on the strength of the reflected light; each signal represents one pixel of the image, which is converted into a binary number.

2. Correction of non-uniformity of illumination: A non-uniform illumination field affects image contrast, dirt particle characterisation and dirt counting. Therefore, illumination correction can be utilised to obtain a uniform illumination field.

Correcting imaging illumination

Removal of non-uniform illumination is very important for later processing stages, such as image restoration based on correlation and segmentation based on intensity thresholding. As thresholding plays the main role in segmentation in this case, the uniformity of the image illumination field directly affects how well the thresholding operation performs.

Figure 9.13: The illumination correction of pulp samples. (a) Original camera images of pulp samples, (b) estimated illumination field, (c) illumination corrected images.
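One common way to realise this correction (offered here only as an illustrative sketch; the report does not state which estimator was used) is to approximate the slowly varying illumination field with a large Gaussian blur and then normalise the image by it:

```python
import cv2
import numpy as np

def correct_illumination(gray, kernel_size=201):
    """Flatten a non-uniform illumination field in a pulp-sheet image.

    The background illumination is approximated by heavily blurring the
    image (dirt particles are small, so they vanish in the blur); the
    original image is then divided by this estimate and rescaled to 8 bits.
    """
    illumination = cv2.GaussianBlur(gray.astype(np.float32),
                                    (kernel_size, kernel_size), 0)
    corrected = gray.astype(np.float32) / (illumination + 1e-6)
    corrected = cv2.normalize(corrected, None, 0, 255, cv2.NORM_MINMAX)
    return corrected.astype(np.uint8), illumination
```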

3. Multi-level thresholding methods in dirt counting: A different background colour of the pulp sample sheets affects the dirt counting result, especially when utilising a single threshold. Therefore, multi-level thresholding and cluster-based thresholding can improve the result of dirt counting.

Segmentation

Segmentation is described, by analogy to visual processes, as a foreground/background separation. It is a process of reducing the information of an image by dividing it into two regions which correspond to the scene or the object. Selecting a feature within an image is an important prerequisite to segmenting desired objects from the scene. Thresholding is applied by selecting a range of brightness values in the image; the pixels within this range belong to the objects and the other pixels to the background. The output is a binary image distinguishing the two regions.

4. Reorganization of overlapped dirt particles: In cases where dirt particles overlap, the system counts them as the same dirt particle. Therefore, morphological processing can be utilised as a post-processing approach to extract overlapped particles as two separate particles.

5. Feature extraction for dirt particles: After the images have been segmented into objects and background regions, the segmented objects have to be described in a suitable form for further computer processing. A region can be represented either on the basis of its external characteristics or its internal characteristics. An external representation focuses on shape characteristics such as length, and an internal representation is based on regional properties such as colour. The size of a dirt particle is the only feature utilised to categorize the dirt particles relative to standards. Geometry and colour features can be extracted in image analysis systems to obtain more information about the characteristics of the dirt particles.

Colour and intensity features

The RGB (red, green, blue) colour and intensity values of the dirt particles involve basic characteristics which are utilised visually to separate fibre bundles from bark. Therefore, these two features are also extracted from the dirt particles in the pulp sample sheets.

Geometrical features

The geometrical features describe shape and boundary characteristics. Simple geometric features can be understood by human vision, but most of them can be extracted by computer vision, which can strongly affect the classification results. The geometrical features utilised in this study are defined as follows:

1. Area is the most basic measure of the size of the features in an image. The area is defined as the number of pixels located within the boundary of a segmented dirt particle.
2. Major axis of an ellipse is its longest diameter, which crosses through the centre and whose ends are located at the widest points of the shape. The minor axis crosses the major axis at the centre and its ends are located at the narrowest points of the ellipse.
3. Eccentricity of the ellipse is the ratio of the distance between the foci of the ellipse and its major axis length.

6. Classification of dirt particles into fibre bundles and bark: Classification is concerned with identifying or distinguishing different populations of objects that may appear in an image based on their features. Dirt particles can be classified into at least two main groups: uncooked wood materials (such as knots, shives and fibre bundles) and bark. By utilising the newly extracted features, which include more information about the dirt particle, it is possible to perform this categorization in an automated image analysis system.
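A brief sketch of how such per-particle features can be measured from a labelled binary image (scikit-image's regionprops is used here as an illustrative choice; the report does not state which toolkit was used):

```python
from skimage import measure

def dirt_particle_features(binary_dirt_mask):
    """Measure area, major/minor axis and eccentricity for each dirt particle.

    binary_dirt_mask: 2-D boolean (or 0/1) array, True where a dirt particle is.
    Returns a list of dicts, one per connected particle.
    """
    labels = measure.label(binary_dirt_mask)
    features = []
    for region in measure.regionprops(labels):
        features.append({
            "area": region.area,                      # pixels inside the particle
            "major_axis": region.major_axis_length,   # longest diameter of fitted ellipse
            "minor_axis": region.minor_axis_length,
            "eccentricity": region.eccentricity,      # 0 = circle, approaching 1 = elongated
        })
    return features
```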

IV. Standardizing Defect Detection for the Surface Inspection of Large Web Steel

Introduction

Escalating pressures towards zero-defect rolled material have forced steel manufacturers to provide higher levels of product quality. Of primary interest to these end users is the surface quality of the rolled strip, which has a significant impact on the outcome of their finishing operations. Simple surface defects like pits, bumps, scratches and holes create problems for finishing operations, but more problematic is the fact that many times these defects do not become visibly noticeable until the operation is complete. This adds complex and expensive rework steps for the end user that reduce efficiency and drive up production costs. Other, more complex surface defects like laminations and oxidation can cause even greater problems. These surfaces look proper in the finished product, but can suffer rapid corrosion failure once the product is in the field.

Defects of Interest

The key to the successful implementation of any machine vision application relies on a thorough and complete knowledge and quantification of all the defects (or features) of interest. Cold rolled steel develops a wide range of important surface defects which must be detected, recognized and classified.

DEFECT: DESCRIPTION
Pit: A small roll-mark type depression
Pinhead: A small roll-mark causing pock-like swelling or bump
Hole: A deep protrusion caused by material imperfection
Sticker: Sub-surface separation from layer adhesion after annealing
Lamination: Sub-surface defect caused by re-rolled separation
Rust: Red or white surface oxidation
Scale: Embedded oxides rolled into material
Scratches: Small gouges in the surface, especially in the direction of travel
Rubs: Group of soft omni-directional scratches from layer sliding
Sliver: Embedded defects, closed or tight, from re-rolled lamination
Pincher: A roll mark caused by uneven surface shear over the width of the roll

The Interaction of Light with Steel

An incident ray of light impinging on an arbitrary material can either enter the material, be reflected, or be broken into some combination of transmitted and reflected components (see Figure 9.14). Since steel is a conductor, a large percentage of the light will be reflected from the surface, a small component will be absorbed, and virtually no light will be transmitted through the sheet.

Figure 9.14: Interaction of Light at Steel Surface

If the surface is perfectly smooth and shiny, a very high percentage of the light will be reflected. If the surface is machined, rolled or granular, a portion of the reflected light will be reflected in directions other than the specular direction. If the scattered light leaves the surface on the same side of the normal as the specular component, but is not purely specular, then we call this diffuse reflection. A camera viewing the surface from this side of the normal is said to be viewing in bright-field mode because the image of the surface will tend to be bright from the abundance of reflected light that enters the camera. If there is any form of defect on the surface which prohibits the perfect reflection of the bright-field illumination, then the camera will detect this defect as a dark area within a light background. If the scattered light leaves the surface towards the same side of the normal as the incident radiation, we call this back-scattered radiation. A camera viewing the surface from this side of the normal is said to be viewing in dark-field mode because the image of the surface will tend to be dark since the majority of the reflected light will not enter the camera. If there is a defect on the surface that causes light to be back-scattered, then the camera will detect this defect as a light area within a dark background.

The camera and lighting devices were mounted on a large protractor type fixture, with the samples placed at the focal point. This geometry allows the projected line of light to remain fixed at the sample plane as it is rotated throughout the testing angles. In a similar manner, the camera remains in focus as it is rotated throughout the various testing angles.

Figure 9.15: Line Scan Test Set-up

The data is recorded once the camera and light line are fixed into position. The data is extracted from an 8-bit digital image using a histogram across each defect, and for each possible combination of angles. Each angle is varied (one at a time) by 10 degrees and a new reading is acquired until all combinations are exhausted. The procedure is then repeated for each of the defects listed. The line histogram provides a quantitative measure of the signal-to-noise ratio for each defect and geometrical configuration.

Data

Bright-field mode defects (see Figure 9.16) typically appear as dark defects in a light background, although the background creates noise due to the less-than-perfect reflecting surface caused by the granularity and crystalline nature of the material. The amount of light is not as serious a concern for these defects because a great deal of the emitted light is reflected directly into the camera.


Figure 9.16: Typical Bright Field Defect (hole)

Figure 9.17: Typical Dark Field Defect (scratch)

Dark-field defects (see Figure 9.17) typically appear as lighter defects within a dark background. Here the background noise is generally much less than in the bright-field case, which helps improve the signal-to-noise ratio. However, the amount of light required is of major concern because a great deal of the emitted light is reflected away from the camera and is never used.

Results

The signal-to-noise ratio (S/N) was defined as the maximum pixel value in the histogram minus the minimum pixel value, divided by the maximum noise spike minus the minimum pixel value. In all cases this yields a fairly conservative estimate of the S/N for a given histogram.
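Expressed directly from that definition (a small NumPy sketch of our own, assuming the line histogram is simply the row of pixel values sampled across the defect and that the "noise spike" is the largest background excursion):

```python
import numpy as np

def line_signal_to_noise(line_profile, defect_slice):
    """Compute the conservative S/N defined above for one line histogram.

    line_profile : 1-D array of 8-bit pixel values sampled across the defect.
    defect_slice : slice covering the defect pixels; the rest is background.
    """
    defect_idx = np.arange(*defect_slice.indices(len(line_profile)))
    background = np.delete(line_profile, defect_idx)
    p_max = line_profile.max()
    p_min = line_profile.min()
    noise_spike = background.max()          # largest background excursion
    return (p_max - p_min) / (noise_spike - p_min)

# Example: a bright dark-field defect on a dim, noisy background.
profile = np.array([12, 15, 13, 14, 120, 180, 175, 110, 16, 13, 18, 12])
print(round(line_signal_to_noise(profile, slice(4, 8)), 2))   # -> 28.0
```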

Analysis

First, there is no single lighting/camera configuration that will deliver all the defect images to the camera with desirable S/N ratios. However, all defects may be detected if they are divided into two basic groups: bright-field and dark-field. While some of the bright-field defects can be seen for almost any set of bright-field angles (pits and holes), all bright-field defects are most visible for camera and lighting angles between 20 and 30 degrees, as measured from opposite sides of the normal. The only two defects that do not appear within either of these groups are scratches and rubs.

Independent lighting and camera modes must be implemented if all of the above defects are to be inspected with peak detection probability, thereby ensuring system reliability. While one lighting system may be used, two banks of cameras must be used in order to catch all bright field and dark-field defects. It must also be understood that the dark-field defects require almost 8 times the amount of light as the bright-field defects. There are also interesting effects that occur for some of the bright-field defects. Slight variations from the true specular angle (1 or 2 degrees) can create considerable improvement in the signal to noise ratio for some of these defects. This slight adjustment from the true bright-field angle is sometimes referred to as the Twilight Angle.


NIR BASED INSPECTION SYSTEMS


1. Introduction
NIR based inspection systems are also called contactless inspection systems. Inspection of sliced wafers, processed layers, and complete photovoltaic cells is possible with NIR imaging. This inspection technique illuminates the objects of interest with high optical power at one wavelength, and photons are absorbed in the bulk material. Some energy is lost as heat in the interaction with the molecular structure and the remaining energy causes a photon to be re-emitted at a longer wavelength. This energy is plotted as spectra and is studied quantitatively and qualitatively.

Figure 1.1: NIR region of electromagnetic spectrum

The infrared region is composed of:
- Near Infrared (NIR) band: 780 to 3,000 nm
- Mid Infrared (MIR) band: 3,000 to 30,000 nm
- Far Infrared (FIR) band: 30,000 to 300,000 nm

NIR absorptions originate in molecular vibrations:
- Fundamental absorptions are found in the MIR band (Raman spectroscopy).
- Overtones and combinations are found in the NIR band (NIR spectroscopy).

NIR spectroscopy may be divided into:
- Transmission spectroscopy of liquids: 750 to 1100 nm band.
- Reflection spectroscopy of powdered materials: 1100 to 2500 nm band.


2. Principles of NIR Imaging and Spectroscopy


A general process outline for the spectral phenomenon observed in NIR spectroscopy is as follows. A molecule has certain discrete energy levels. In mid- and near-infrared spectroscopy, a molecule absorbs energy when a photon matches the frequency of vibration and the dipole moment changes. The transition from the ground state to the second excited level is called an overtone, and is observed in the NIR spectral region.

A basic quantum mechanical harmonic oscillator energy diagram is shown below. For the harmonic oscillator model, the potential energy well is symmetric. According to quantum-mechanical principles, molecular vibrations can only occur at discrete, equally spaced vibrational levels, where the energy of the vibration is given by:

Ev = (v + 1/2) hν,  v = 0, 1, 2, 3, ...

When absorption occurs, the molecule acquires a clearly defined amount of energy, ΔE = hν, from the radiation and moves up to the next vibrational level (Δv = +1). For a harmonic oscillator, the only transitions permitted by quantum mechanics are up or down to the next vibrational level (Δv = ±1). If the molecule moves down to the next vibrational level (Δv = -1), a certain amount of energy is emitted in the form of radiation. This is called emission. The transition from the ground state to the second excited level is called an overtone, and is observed in the NIR spectral region. Finally, a quantitative analysis is done by observing differences between spectra. The NIR spectra thus obtained contain information on the physical form of the material and its chemical structure. Particle size and density differences will be seen as baseline offsets, and broad, overlapping absorbance bands from multiple components and/or functional groups are obtained.

Figure 2.1: Excitation of vibrations
Figure 2.2: Potential energy of the harmonic oscillator
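As an illustrative worked example (typical textbook values, not measurements from this training): a C-H stretching fundamental near 2960 cm⁻¹ lies in the MIR (λ ≈ 10⁷/2960 nm ≈ 3380 nm), while its first overtone appears at roughly twice that wavenumber, slightly less in practice because of anharmonicity: ν ≈ 2 × 2960 cm⁻¹ = 5920 cm⁻¹, i.e. λ ≈ 10⁷/5920 nm ≈ 1690 nm, which falls squarely in the NIR band discussed above.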

Some important conclusions pertaining to NIR spectroscopy as well as general atomic and molecular structures are:

- Polyatomic linear molecules have 3N - 5 modes of vibration; non-linear molecules have 3N - 6.
- A transition that occurs simultaneously for two of these modes is called a combination band.
- The absorption bands seen in this spectral range arise from overtones and combination bands of O-H, N-H, C-H and S-H stretching and bending vibrations.

Figure 2.3: Vibrational modes of a molecule


- Absorption is one to two orders of magnitude smaller in the NIR compared to the MIR; this eliminates the need for extensive sample preparation.
- The lineshapes for overtone and combination bands tend to be much broader and more overlapped than for the fundamental bands seen in the MIR. Often, multivariate methods are used to separate the spectral signatures of sample components.

Figure 2.4: Overlapping NIR Components


3. Components of NIR Inspection Systems


Illumination Sources
Common incandescent or quartz halogen light bulbs are most often used as broadband sources of near-infrared radiation for analytical applications. Light-emitting diodes (LEDs) are also used; they offer greater lifetime and spectral stability and reduced power requirements.

A Spectrally Selective and Dispersive Element
This includes a prism or a diffraction grating to allow the intensity at different wavelengths to be recorded. Fourier-transform NIR instruments using an interferometer are also common, especially for wavelengths above ~1000 nm. Instruments intended for chemical imaging in the NIR may use a 2D array detector with an acousto-optic tunable filter; multiple images may be recorded sequentially at different narrow wavelength bands.

NIR Imaging Detectors/Sensors
On the hardware side, a large number of focal plane arrays (FPAs) offering great flexibility in terms of wavelength range and noise are now available. In the near-infrared (NIR), several choices are available to the experimenter: FPAs in PbS, Ge, InGaAs and HgCdTe can be purchased. All of these detectors except HgCdTe can work at room temperature. The type of detector used depends primarily on the range of wavelengths to be measured. Silicon-based CCDs are suitable for the shorter end of the NIR range, but are not sufficiently sensitive over most of the range (above 1000 nm). InGaAs and PbS devices are more suitable, though less sensitive than CCDs.

System Requirement
The critical computational issue connected with spectral imaging is the size of the measured data sets. There is therefore a need for fast computers with large mass-memory devices and enough RAM to avoid, as far as possible, swapping to the hard disk. The software suite being used should also include all the main multivariate techniques for calibration and classification, as well as their extension to imaging problems. Spectral imaging is a very broad field covering many distinct subjects.

Instrumentation
All NIR measurements are based on exposing material to incident NIR light radiation and measuring the attenuation of the emerging (transmitted, scattered, or reflected) light. Several spectrophotometers are available; they are based on different operating principles, for example: filters, grating-based dispersive, acousto-optical tunable filter (AOTF), Fourier-transform NIR (FT-NIR), and liquid crystal tunable filter (LCTF). The selection of specific NIR instrumentation and sampling accessories should be based on the intended application, and particular attention should be paid to the suitability of the sampling interface for the type of sample that will be analyzed.


4. Types of NIR Imaging


NIR imaging can be carried out in one of the following modes: transmission (inversely absorption), absorption reflection (RAIRS) and diffuse reflection. A key element in acquiring spectra is that the radiation must somehow be energy selected either before or after interacting with the sample. Wavelength selection can be accomplished with a fixed filter, tunable filter, spectrograph, an interferometer, or other devices. Spectra are typically measured with an imaging spectrometer, based on a Focal Plane Array.

Figure 4.1: Reflection and Absorption of light

Transmission: In transmission imaging, the radiation goes through a sample and is measured by a detector placed on the far side of the sample. The energy transferred from the incoming radiation to the molecule(s) can be calculated as the difference between the quantity of photons emitted by the source and the quantity measured by the detector. The solid sample must, of course, be IR transparent over an appreciable wavelength range. Absorption Reflection: A general overview of the process is: the sample is adsorbed on a metal surface; the IR beam is aligned at grazing incidence; and the metal surface constrains which components of the incident radiation and of the dipole moment can interact (the dipole selection rule).

Figure 4.2: Incident radiation interacts with adsorbate and gets reflected by metal


Figure 4.3: Dipole selection rule for RAIRS

Figure 4.4: Only Dipole moments with contribution along surface normal interact with incident radiation

In this case the IR beam is specularly reflected from the front face of a highly reflective sample, for example a metallic single-crystal surface. As a molecule sits on a surface, it will vibrate. Such vibrations can be studied by shining infrared light onto the surface at a grazing angle of incidence. If the molecule has a dipole moment, then the molecule can absorb infrared light, but only at certain fixed frequencies. Hence, an infrared spectrum of light reflected from the surface will show absorption peaks which are characteristic of the molecule and its method of bonding to the surface. This is the basis of the RAIRS technique. Vibrations can only be detected if the vibration has a dipole component perpendicular to the surface. Diffused Reflection: In a diffuse reflectance measurement, the same energy-difference measurement is made, but the source and detector are located on the same side of the sample, and the photons that are measured have re-emerged from the illuminated side of the sample rather than passed through it. The energy may be measured at one

Figure 4.5: Diffuse and specular reflection

or multiple wavelengths; when a series of measurements is made, the response curve is called a spectrum. This type of NIR imaging has the following principle and process: most of the light is contributed by scattering centers beneath the surface. In a powdered crystalline substance containing small crystallites, an impinging ray is partially reflected (a few

percent) by the first particle, enters it, is again reflected by the interface with the second particle, enters it, impinges on the third, and so on, generating a series of "primary" scattered rays in random directions, which, in turn, through the same mechanism, generate a large number of "secondary" scattered rays, which generate "tertiary" rays, and so on, until they arrive at the surface and exit in random directions. We get back, in all directions, all the light we sent in, so the substance appears white, in spite of the fact that it is made of transparent objects (crystals). Usually, the sample must be ground and mixed with a non-absorbing matrix such as KBr. Diluting ensures a deeper penetration of the incident beam into the sample, which increases the contribution of the scattered component in the spectrum and minimizes the specular reflection component. The specular reflectance component in diffuse reflectance spectra causes changes in band shapes and their relative intensities; dilution of the sample with a non-absorbing matrix minimizes these effects. This is shown below in the spectral data for caffeine,

Figure 4.6: Spectra showing greatly improved results for sample dilution

where the upper spectrum is diluted in KBr and demonstrates very high quality with sharp, well-defined absorbance bands. The lower spectrum is of undiluted caffeine measured by diffuse reflectance and shows distorted bands. The upper spectrum of diluted caffeine is clearly of higher spectral quality than that of the undiluted caffeine. Some factors to be considered for this technique are:
Particle size: reducing the size of the sample particles reduces the contribution of reflection from the surface; smaller particles improve the quality of spectra.
Refractive index: refractive-index effects result in specular reflectance contributions (spectra of highly reflecting samples will be more distorted by the specular reflectance component); this effect can be significantly reduced by sample dilution.
Homogeneity: samples prepared for diffuse reflectance measurements should be uniform and well mixed; non-homogeneous samples will lack reproducibility and will be difficult to quantify.
Packing: the required sample depth is governed by the amount of sample scattering; the sample should be loosely but evenly packed in the cup to maximize IR beam penetration and minimize spectral distortions.

The above scheme continues to be valid in the case that the material is absorbent. In this case, diffused rays will lose some wavelengths during their walk through the material, and will emerge colored.

5. Steps in NIR Imaging


A typical NIR chemical imaging instrument collects many thousands of spatially distinct NIR spectra simultaneously in a single data-set. These spatially resolved data can be acquired over a microscopic or macroscopic portion of a sample. A generalized process outline is shown below.

Figure 5.1: Process outline for NIR Imaging

The data is shown as a series of wavelength-resolved images and spatially resolved spectra. An image at a single wavelength has contrast based on the strength of the spectral features at that wavelength. NIR spectral imaging, the most frequently used mode, deals with narrow spectral bands over a contiguous spectral range and produces the spectra of all pixels in the scene. NIR spectral sensors collect information as a set of 'images'. In the NIR field the most common kind of measurement carried out with hyperspectral imagers is a reflectance measurement. Although reflectance is always measured by taking the ratio between scattered and incident light, there exist many different definitions of reflectance; the main parameters for their classification are the illumination and detection geometries.

Figure 5.2: Basic steps in NIR inspection


Figure 5.3: Hyper spectral Imaging data

As shown in the figure, the data-set, commonly referred to as a hypercube, consists of a sequence of images of a sample recorded over a series of infrared wavelengths or frequencies. The data can be viewed as a collection of frequency-resolved images. The image size and magnification are configurable, and can be increased or decreased, as required, by simple selection of the appropriate image-formation optics.
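The hypercube described above is conveniently handled as a three-dimensional array. The following minimal numpy sketch (array shape, wavelength range and values are placeholders, not real data) shows how a single-wavelength image and a single-pixel spectrum are extracted.

```python
import numpy as np

# Sketch: a hyperspectral data-set ("hypercube") as a 3-D array.
# Shape and contents are illustrative, not real measurement data.
rows, cols, n_bands = 128, 128, 256
wavelengths = np.linspace(900, 1700, n_bands)        # nm, assumed NIR range
hypercube = np.random.rand(rows, cols, n_bands)      # placeholder reflectance values

# A wavelength-resolved image: one 2-D slice at the band closest to 1450 nm
band = np.argmin(np.abs(wavelengths - 1450))
image_at_1450nm = hypercube[:, :, band]              # shape (rows, cols)

# A spatially resolved spectrum: the full spectrum of one pixel
spectrum_of_pixel = hypercube[64, 64, :]             # shape (n_bands,)

print(image_at_1450nm.shape, spectrum_of_pixel.shape)
```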

6. NIR Imaging Sensors


In the near-infrared (NIR), several possible choices are available: focal plane arrays (FPAs) offering a great flexibility in terms of wavelength range and noise are available in PbS, Ge, InGaAs and HgCdTe. All but HgCdTe can work at room temperature.

InGaAs Linear Image Sensor
The InGaAs linear image sensor is a self-scanning photodiode array designed specifically as a detector for multichannel spectroscopy. The InGaAs PIN linear image sensor offers a number of features such as a large photosensitive area, high quantum efficiency, wide dynamic range due to low dark current and high saturation charge, excellent output linearity and uniformity, as well as low power consumption. The standard types use a sapphire window that has superior infrared transmission.

Figure 6.1: InGaAs Image Sensor

PbS, PbSe Photoconductive Detectors
PbS and PbSe photoconductive detectors are infrared detectors making use of the photoconductive effect, in which resistance is reduced when infrared radiation enters the detecting elements. PbS and PbSe have superior features such as high detection capability and fast response speed, and they also operate at room temperature. However, the dark resistance, photosensitivity and response characteristics change depending on the ambient temperature, so care is required to ensure the best results.


InAs, InSb Photovoltaic Detectors
Like InGaAs PIN photodiodes, InAs and InSb detectors are photovoltaic detectors having p-n junctions. Their spectral response ranges correspond to those of the PbS and PbSe detectors, but the InAs and InSb detectors have faster response speeds and better S/N.

MCT, InSb Photoconductive Detectors
Like PbS and PbSe photoconductive detectors, MCT (HgCdTe) and InSb photoconductive detectors are infrared detectors making use of the photoconductive effect, in which the resistance of the detector element decreases when exposed to light.

Pyroelectric Detector
The pyroelectric detector is a thermal type of infrared detector that operates at room temperature. It consists of a PZT element exhibiting the pyroelectric effect, a high-value resistor and a low-noise FET, sealed in a metal package to protect against external noise. The pyroelectric detector itself does not have wavelength dependence, but combining it with various window materials makes it usable for various applications.

Two-Color Detectors
To expand the spectral response range, two or more types of infrared detectors are sometimes stacked in a sandwich structure or arrayed on a plane; these are often called two-color detectors. Combinations of Si-PbS, Si-PbSe, Si-InGaAs, Si-Ge, Ge-InSb and InSb-HgCdTe are available. The upper detector also serves as a short-wavelength cutoff filter for the lower detector.

7. Chemometrics
Chemometrics is the chemical discipline that uses mathematical and statistical methods to obtain, in an optimal way, relevant information on material systems. Measurements related to the chemical composition of a substance are taken, and the value of a property of interest is inferred from them through some mathematical relation. Chemometrics is applied to solve both descriptive and predictive problems in the experimental life sciences, especially in chemistry. In descriptive applications, properties of chemical systems are modeled with the intent of learning the underlying relationships and structure of the system (i.e., model understanding and identification). In predictive applications, properties of chemical systems are modeled with the intent of predicting new properties or behavior of interest. Chemometrics is typically used for one or more of three primary purposes: to explore patterns of association in data; to track properties of materials on a continuous basis; and to prepare and use multivariate classification models.

Exploratory Data Analysis
Exploratory data analysis can reveal hidden patterns in complex data by reducing the information to a more comprehensible form. Such a chemometric analysis can expose possible outliers and indicate whether there are patterns or trends in the data, but the relationships between samples can be difficult to

discover when the data matrix has more than three features. Exploratory algorithms such as principal component analysis (PCA) and hierarchical cluster analysis (HCA) are designed to reduce large, complex data sets into a series of optimized views that emphasize the natural groupings in the data and the variables most strongly influencing those patterns.

Continuous Property Regression
In many applications it is expensive, time consuming or difficult to measure a property of interest directly. Such cases require the analyst to predict something of interest based on related properties that are easier to measure. This leads to the development of a calibration model which correlates the information in the set of known measurements to the desired property. Chemometric algorithms for performing regression include partial least squares (PLS) and principal-component-based regression, and they are designed to avoid problems associated with noise and correlations in the data. Chemometric regression lends itself handily to the on-line monitoring and process control industry, where fast and inexpensive systems are needed to test and make decisions about product quality.

Classification Modeling
Many applications require that samples be assigned to predefined categories, or "classes". A classification model is used to predict a sample's class by comparing the sample to a previously analyzed experience set in which the categories are already known. K-nearest neighbour (KNN) and soft independent modeling of class analogy (SIMCA) are the primary, reliable chemometric workhorses; they include the ability to reveal unusual samples in the data.

Techniques in Chemometrics
Some of the techniques used in chemometrics are: multivariate calibration; pattern recognition, clustering and classification; multivariate curve resolution; and multivariate statistical process control. Multivariate analysis is a critical utility in chemometric applications. The data resulting from infrared measurements number in the thousands per sample and are multivariate (depending upon multiple variables). The structure of these data is conducive to techniques such as PCA and PLS, primarily because, while the data sets may be highly multivariate, strong and linear low-rank structure is present. PCA and PLS have been shown over time to be very effective at exploiting the interrelationships in the data and providing compact coordinate systems for further numerical analysis such as regression, clustering, and pattern recognition.

Advantages of NIR Imaging
Remote sampling is possible (good for hazardous materials), and no sample preparation is required, saving time. It can be used in a wide range of applications (physical and chemical), revealing relationships that are difficult to observe by other means. NIR technology has a good signal-to-noise ratio, which makes it easier for technicians and scientists to read and evaluate the results of a given NIR test. NIR is inexpensive in comparison to other spectroscopic techniques. NIR can be used for analyzing large samples, because near-infrared light penetrates further than mid-infrared light.
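As a rough illustration of the exploratory-data-analysis step described above, the following sketch reduces a placeholder matrix of spectra to a few principal components using scikit-learn (assumed to be available); the data and component count are illustrative only.

```python
import numpy as np
from sklearn.decomposition import PCA

# Sketch: exploratory analysis of a set of NIR spectra with PCA.
# "spectra" is a placeholder matrix (samples x wavelengths), not real data.
rng = np.random.default_rng(0)
spectra = rng.normal(size=(50, 700))            # 50 samples, 700 wavelength channels

# Mean-centre the data, then project onto the first few principal components
pca = PCA(n_components=3)
scores = pca.fit_transform(spectra - spectra.mean(axis=0))

# The scores plot (PC1 vs PC2) is what is inspected for groupings and outliers
print(scores.shape)                              # (50, 3)
print(pca.explained_variance_ratio_)
```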


NIR detection is more sensitive than visible fluorescence and equal to or more sensitive than chemiluminescence. At longer wavelengths, biomolecules exhibit greatly reduced autofluorescence, resulting in lower background and enhanced sensitivity when NIR fluorophores are used for detection.

8. Applications of NIR Based Inspection Systems


I. NIR Based Weathered Wood Inspection System
NIR spectroscopy involves the measurement of the wavelength and intensity of the absorption of near-infrared light by a specific material. NIR spectroscopy is a promising candidate for monitoring changes in lumber surface characteristics due to natural weathering. Estimating the residual strength of structural members is an important tool for inspection professionals. Quantitative techniques that can provide an indication of the strength of weathered members are needed by engineers in order to more reliably predict a safe service-load capacity for timber structures.

Materials

The specimens can be of three types of materials: untreated wood, CCA (chromated copper arsenate) treated wood, and DDAC (didecyl dimethyl ammonium chloride) treated wood. Treated specimens are re-dried to 16 percent moisture content after treatment.

Field Exposure

The woods are normally placed on racks above ground and are naturally exposed to weathering for different periods (0, 12, 24, 36, 48 or 60 months). After field exposure, all woods are stored indoors at approximately 12 percent moisture content for several days prior to industrial use. Hence inspection for weathering is required before industrial use.

NIR Inspection System Hardware

The specimen woods to be inspected are kept on a conveyor belt which transports the wood to the inspecting chamber. The inspecting chamber contains specialized lighting; a halogen lamp is used as the source for the inspection. A collimating lens focuses the light onto the specimen wood, and a collecting lens collects the scattered light from the wood and passes it into the monochromator. The light is dispersed and focused onto a CCD detector to measure its intensity, and the digitized intensity signal is grabbed by the computer through the frame grabber for calculation. The system can therefore work unattended, even when performing the reference calibration operation.

NIR Spectroscopy

In the inspecting chamber, all spectra obtained from the specimen weathered wood are measured and displayed on the control panel. NIR measurements are usually carried out at wavelengths between 350 nm and 2,500 nm. Spectra are collected at four equally spaced locations along the centre line of the weathered surface for each specimen. Ten scans are collected and averaged into a single average spectrum at each location. The four averaged spectra collected on each specimen are also averaged to provide a single spectrum that is then used for further analysis. The data set is further reduced by averaging the spectra

that are collected at 1 nm intervals to a spectral data set at 5 nm intervals. Multiplicative Scatter Correction (MSC), a transformation method used to compensate for additive and/or multiplicative effects in spectral data, is applied to each exposure group separately.
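A minimal sketch of how MSC of the kind described here could be implemented is given below; regressing each spectrum against the mean spectrum is a common choice and is our assumption, not a prescription taken from the original study.

```python
import numpy as np

def msc(spectra):
    """Multiplicative Scatter Correction: regress each spectrum against the
    mean spectrum and remove the fitted additive and multiplicative effects.
    spectra: 2-D array, shape (n_samples, n_wavelengths)."""
    reference = spectra.mean(axis=0)
    corrected = np.empty_like(spectra, dtype=float)
    for i, s in enumerate(spectra):
        # Least-squares fit s = a + b * reference
        b, a = np.polyfit(reference, s, deg=1)
        corrected[i] = (s - a) / b
    return corrected
```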

Multivariate Analysis

The spectra collected on each sample are used for principal component analysis (PCA) and projection to latent structures (PLS) modeling. All of the NIR spectra are combined into a single data matrix (X-matrix), while the weathering exposure time constitutes the response matrix (Y-matrix). The X-matrix is mean-centered and variance-normalized prior to performing the multivariate analysis. Calibration models (CALB) are constructed with about two thirds of the samples using full cross-validation. The model is then used to predict the response of the test set (TEST), which contains the remaining one third of the samples not included in the original model. This conservative approach ensures that the predictive capabilities of the model are reliable.
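The calibration/test protocol described above can be sketched as follows, assuming scikit-learn is available; the spectra and responses are placeholders, and the number of PLS components is an arbitrary illustrative choice.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import train_test_split, cross_val_predict

# Placeholder data: X holds NIR spectra, y holds exposure time in months
rng = np.random.default_rng(1)
X = rng.normal(size=(90, 300))
y = rng.uniform(0, 60, size=90)

# Roughly two thirds for calibration (CALB), one third held out as the test set (TEST)
X_cal, X_test, y_cal, y_test = train_test_split(X, y, test_size=1/3, random_state=0)

pls = PLSRegression(n_components=5)
y_cv = cross_val_predict(pls, X_cal, y_cal, cv=len(y_cal))   # full (leave-one-out) cross-validation
pls.fit(X_cal, y_cal)
y_pred = pls.predict(X_test)                                  # prediction on unseen samples
```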

NIR Data for Wood

Figure 8.1. Relationship between the actual exposure time and the NIR-predicted weathering time (PLS model Calibration).

It is noted that the differences between weathered and unweathered wood are clear in both the absorbance intensity and the shape of the spectral bands. This is reflected particularly in the lower wavelength range (<1,400 nm). In general, the major peaks (>1,400 nm) for weathered wood are greatly reduced due to a loss of wood compounds. If decay is also present, the physical and mechanical properties of the bulk material will be affected, as decay fungi are able to cause rapid strength losses in wood associated with a loss of weight. These NIR spectral features are strongly correlated with natural degradation in weathered wood. NIR is particularly sensitive to subtle differences associated with hydrogen bonding in wood, and the differences

between carbohydrate and lignin hydroxyls, which might be impacted by the natural weathering process. The strong correlation coefficient between predicted and actual exposure time indicates that the NIR spectra are very sensitive to changes in the wood structure that are associated with degradation in weathered wood. NIR spectral analysis, in conjunction with multivariate statistical analysis, can therefore potentially monitor the changing surface conditions of wood structural members subjected to natural weathering.

Control Actions
Based upon the decision from the NIR spectral analysis and multivariate analysis, using machine vision application software for image processing, a robotic mechanical arm picks out the woods that are unsuitable for industrial use.

Conclusion

Hence NIR-based inspection of weathered wood can effectively determine the strength of the wood for industrial use, quickly and without destroying the specimen. It can also work unattended. This technology is widely adopted for mass inspection in the production process, which increases the overall profit for the firm.


9. Projects on NIR based Inspection System at CEERI


I. NIR Based Quality Assessment of Edible Oils

Introduction

Quality assessment of edible oil is of great importance to a variety of food industries, especially the fried food industry. Edible oils that are more stable and give desired characteristics to the food are preferred. The stability of an edible oil increases with increasing degree of hydrogenation; however, during the hydrogenation process trans fatty acids are generated, which are associated with an increased risk of hazards to human health. This has become a major concern for the use of hydrogenated oil by consumers and in the fried food industry. Recently, as an alternative to hydrogenated frying oil, partially hydrogenated oils with a lower degree of unsaturation are being increasingly adopted. Evaluation of degraded oils using conventional methods, which include measurement of total polar compounds (TPC), free fatty acid (FFA) level, carbonyl value, viscosity, etc., is tedious, time consuming, and not amenable to on-line assessment. Degradation of oil during frying is accompanied by changes in free fatty acid (FFA) level, color and viscosity, and by an increase in the number of polar molecules as a result of thermal and oxidative breakdown of the hydrocarbon chain in the oil. Near-infrared spectroscopy instruments are recognized as effective tools for quality control in the food industry due to their fast and non-destructive measurements; there is no requirement for hazardous reagents and solvents, and sample handling is more flexible. Edible oils include soybean oil, mustard oil, olive oil, sunflower oil and others. The edible oil produced in industry contains a mixture of hydrogenated and non-hydrogenated oil.

Illumination
The edible oil sample is filled into a Petri dish underneath the fiber bundle of the hyperspectroradiometer at a suitable distance. The distance is calculated based on the field of view of the fiber bundle. A DC-regulated fiber-optic illuminator is used as the light source. Two fiber-optic light-guiding branches are mounted on a frame to guide light to the sample.

Spectral Data Acquisition Hardware
Spectral data are acquired using a portable hyperspectroradiometer which measures reflectance spectra over a wavelength range of 350 to 2,500 nm with a spectral resolution of 3 nm. Light reflected from the sample is collected by the fiber-optic bundle and sent to the hyperspectroradiometer for intensity measurement. A white reference panel with approximately 100% reflectance across the entire spectrum is used as a reference standard. At each wavelength reading, the ratio between the light intensity from the sample and that from the white reference is calculated. All spectra are recorded and stored on a PC. The spectral range of each spectrum is 350 to 2,500 nm, and the reflectance is normalized so that the values lie between 0 and 1.

Figure 9.1: Apparatus setup for spectral data collection
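The per-wavelength ratio against the white reference described above amounts to the following small helper (function and variable names are our own, not from the original system):

```python
import numpy as np

def reflectance(sample_counts, white_reference_counts):
    """Per-wavelength reflectance: ratio of sample intensity to the intensity
    from a ~100% reflectance white reference, normalized to the range 0-1."""
    r = np.asarray(sample_counts, dtype=float) / np.asarray(white_reference_counts, dtype=float)
    return np.clip(r, 0.0, 1.0)
```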


Data Analysis
Due to significant noise observed in the ranges of 350-400 nm and 1,750-2,500 nm in all the spectra, only the spectral range of 400-1,750 nm is used for calibration and data analysis. The three parameters measured alongside the spectra, TPC, FFA (acid value) and viscosity, are used as reference data for further processing. Wavelength regions which have little or no significance are removed so as to simplify the calibration models and reduce the computation time; the selected wavelength regions can also be used as a basis for designing on-line systems with lower costs. Concentration residuals, the differences between the actual and predicted concentrations for samples, are used for outlier detection in the training set.

Figure 9.2: A reflectance spectrum of non-hydrogenated (0:100) soybean oil within the range of 400-1,750 nm

The study found that, with a fine spectral resolution, the spectral reflectance of the oils can be used to effectively predict the three quality parameters, i.e. acid value, TPC, and viscosity. Partial least squares calibration models using the wavelength range of 400-1,750 nm were able to track the changes of the three parameters of the heated oil samples with high accuracy. Based on the PLS plot for each constituent, the common spectral regions of most variation (the feature wavelength ranges) were found within the ranges of 450-550, 850-950, 1,140-1,180 and 1,200-1,300 nm.

Figures 9.3: Predicted versus actual values using the PLS calibration models with the feature wavelength ranges for (A) acid value, (B) total polar compounds, and (C) viscosity.

Conclusion

NIR spectroscopy has proven to be a fast technique with the capability of simultaneous prediction of oil quality parameters. Analysis can be done within a few minutes once a calibration model is developed. Hence it is used in inspection automation for assessing edible oil quality, offering better quality products while satisfying consumers and raising the profit for the company.

II. NIR Inspection System for the Plastic Film Thickness Determination of Transparent Films and Layers

Measurement Principle

Interference of light originates from the interaction of light from a spectrally broad light source partially reflected at optical boundaries caused by refractive index changes in the material. Interference occurs if the layers are transparent, smooth, and parallel, allowing both reflected signals to interact. The difference in path length between the two rays is a function of the angle of incidence, the refractive index and the geometrical thickness of the layer. At a fixed layer thickness, constructive and destructive interference occurs at different wavelengths; therefore, the reflected spectrum of such a thin film shows a periodically modulated intensity. Analysis by, for example, a Fourier-transform algorithm provides this information without destroying, changing or even touching the surface. If two layers have different thicknesses and refractive indices, simultaneous thickness evaluation of both layers can be performed.
Figure 9.4: Principle of Interference.

Hardware Implementation

For these inspection systems, modern high-end spectrometer systems based on diode array and fibre-optic technology are used. Due to their extremely small and compact design, diode array spectrometers are ideal instruments for film thickness determination during running production processes, and their high-speed electronics also make high-frequency measurements possible.

Figure 9.5: Practical setup for the inspection of plastic films

Choice of Measurement Range
The choice of the measurement range is determined by the thickness and the refractive index of the layer. Usually, thick films are evaluated using the NIR spectra, while for thin films UV/Vis spectra are analyzed. For example, if three foils with refractive index n = 1.49 have thicknesses of 10 µm, 20 µm and 30 µm, then the 10 µm foil will show a well evaluable signal in the visible range, whereas the 20 µm and 30 µm foils will show a good signal in the NIR range.

Evaluation

The analysis of the interference spectra is done with the help of a Fourier transformation. Together with knowledge of the refractive index, the film thickness of foils and layers can be calculated. The figure shows the interference spectrum, the Fourier-transformed spectrum and the calculated film thickness of a layer.
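A minimal sketch of this Fourier-transform evaluation on synthetic data is shown below; it assumes normal incidence and a spectrum sampled uniformly in wavenumber, and the numbers are illustrative only, not those of the actual inspection system.

```python
import numpy as np

# Sketch: film thickness from an interference spectrum via FFT, assuming
# normal incidence and a known refractive index. Synthetic data, not a real measurement.
n = 1.49                      # refractive index of the foil (example value from the text)
d_true = 20e-6                # true thickness, 20 micrometres (synthetic)

# Spectrum sampled uniformly in wavenumber sigma = 1/lambda over the NIR
sigma = np.linspace(1 / 2500e-9, 1 / 1000e-9, 4096)       # 1/m
spectrum = 1 + np.cos(4 * np.pi * n * d_true * sigma)     # periodically modulated reflectance

# FFT of the (mean-removed) spectrum; the peak sits at the optical path 2*n*d
fft_mag = np.abs(np.fft.rfft(spectrum - spectrum.mean()))
path = np.fft.rfftfreq(sigma.size, d=sigma[1] - sigma[0])  # conjugate axis, in metres
peak = path[np.argmax(fft_mag[1:]) + 1]                    # skip the zero-frequency bin

print("estimated thickness: %.1f um" % (peak / (2 * n) * 1e6))   # approximately 20 um
```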

Conclusion

Thus, through an NIR-based inspection system, plastic layer thickness can be determined without destroying the plastic, allowing effective segregation in the industry.
Figure 9.6: Graphical User Interface For the Inspection system


ARTIFICIAL NEURAL NETWORK


1. An Introduction
It has been a matter of great curiosity how the human brain can process the visual information of the world so accurately and intuitively within a fraction of a second. This has inspired computer designers, engineers and programmers to develop machines which can think and learn from experience, with the goal of designing automated intelligent systems. The detailed working of the human brain is still not fully understood, but brain anatomy suggests that the human nervous system consists of over 100 billion neurons connected by synapses. Studies of neuron anatomy indicate more than 1,000 synapses on the input and output of each neuron. Although a neuron's switching time (a few milliseconds) is about a million times slower than current computer elements, neurons have a thousand-fold greater connectivity than today's supercomputers.

Figure 1.1: Human brain and neural cells network

Figure 1.2: A neuron cell and synapse

A biological neuron consists of the following parts: 1. The soma or cell body is a large, round central body in which almost all the logical functions of the neuron are realized. 2. The axon (output) is a nerve fibre attached to the soma which serves as the final output channel of the neuron; an axon is usually highly branched. 3. The dendrites (inputs) represent a highly branching tree of fibres; these long, irregularly shaped nerve fibres (processes) are attached to the soma. 4. Synapses are specialized contacts on a neuron which are the termination points for the axons from other neurons. Today's supercomputers can perform over a billion calculations per second, but they are not capable of comprehending the meaning of shapes that the brain easily classifies. Conventional computers fail in many real-world problems that call for automated, intelligent, complex pattern recognition. Hence a new processing model, known as the Artificial Neural Network (ANN) or simply neural network, can be developed by borrowing features from the physiology of the brain.

In its most general form, an artificial neural network is a machine that is designed to model the way in which the brain performs a particular task or function of interest; the network is usually implemented using electronic components or simulated in software on a digital computer. Our interest will be confined largely to neural networks that perform useful computations through a process of learning. To achieve good performance, neural networks employ a massive interconnection of simple computing cells referred to as neurons or processing units. We may thus offer the following definition of a neural network viewed as an adaptive machine. A neural network is a massively parallel distributed processor that has a natural propensity for storing experiential knowledge and making it available for use. It resembles the brain in two respects: 1. Knowledge is acquired by the network through a learning process. 2. Interneuron connection strengths, known as synaptic weights, are used to store the knowledge. The procedure used to perform the learning process is called a learning algorithm. Artificial neural networks are also referred to as neuro-computers, connectionist networks, parallel distributed processors, etc.
I. Models of a Neuron
A neuron has a set of n synapses associated with its inputs, each characterized by a weight. A signal xi, i = 1, 2, ..., n, at the ith input is multiplied (weighted) by the weight wi. The weighted input signals are summed, giving a linear combination of the input signals w1x1 + ... + wnxn. A "free weight" (or bias) w0, which does not correspond to any input, is added to this linear combination, forming the weighted sum z = w0 + w1x1 + ... + wnxn. A nonlinear activation function F is applied to the weighted sum; the value of the activation function, y = F(z), is the neuron's output.

Figure 1.3: Schematic representation of a model of a neuron

f(x1, ..., xn) = F(w0 + w1x1 + ... + wnxn)

where f is the function to be learned, x1, ..., xn are the inputs and F is the activation function.

Activation function: An activation function limits the amplitude of the output of a neuron. The activation function is generally non-linear; linear functions are limited because the output is simply proportional to the input.


Figure 1.4: Different types of activation functions
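The neuron model above can be written directly in Python; the sigmoid is used here simply as one common choice of nonlinear activation, and the weights are illustrative.

```python
import numpy as np

# Sketch of the neuron model above: weighted sum plus bias, passed through
# a nonlinear activation function (here a logistic sigmoid, as one common choice).
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def neuron_output(x, w, w0, activation=sigmoid):
    """y = F(w0 + w1*x1 + ... + wn*xn)."""
    z = w0 + np.dot(w, x)
    return activation(z)

# Example: 3 inputs with illustrative weights and bias
x = np.array([0.5, -1.0, 2.0])
w = np.array([0.8, 0.2, -0.4])
print(neuron_output(x, w, w0=0.1))
```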

II. Learning Process
Learning is a process by which the free parameters of a neural network are adapted through a continuing process of stimulation by the environment in which the network is embedded. The type of learning is determined by the manner in which the parameter changes take place. This definition of the learning process implies the following sequence of events: 1. The neural network is stimulated by an environment. 2. The neural network undergoes changes as a result of this stimulation. 3. The neural network responds in a new way to the environment, because of the changes that have occurred in its internal structure. Let wkj(n) denote the value of the synaptic weight wkj at time n. At time n an adjustment Δwkj(n) is applied to the synaptic weight wkj(n), yielding the updated value wkj(n + 1) = wkj(n) + Δwkj(n). A prescribed set of well-defined rules for the solution of a learning problem is called a learning algorithm.
Hebbian Learning

Hebb's postulate of learning is the oldest and most famous of all learning rules:


1. If two neurons on either side of a synapse (connection) are activated simultaneously (i.e. synchronously), then the strength of that synapse is selectively increased. 2. If two neurons on either side of a synapse are activated asynchronously, then that synapse is selectively weakened or eliminated.

Figure: Synaptic connection.

To formulate Hebb's postulate of learning in mathematical terms, consider a synaptic weight wkj with presynaptic and postsynaptic activities denoted by xj and yk, respectively. According to Hebb's postulate, the adjustment applied to the synaptic weight wkj at time n is Δwkj(n) = F(yk(n), xj(n)).

As a special case we may use the activity product rule,


Δwkj(n) = η yk(n) xj(n)

where η is a positive constant that determines the rate of learning. This rule clearly emphasizes the correlational nature of a Hebbian synapse.
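A minimal sketch of the activity product rule, applied to a whole weight matrix, is shown below; the learning rate and activity values are illustrative.

```python
import numpy as np

# Sketch of the activity product (Hebbian) rule above:
# delta_w_kj = eta * y_k * x_j, applied to every weight in the matrix.
eta = 0.1                                    # learning-rate constant

def hebbian_update(W, x, y, eta=eta):
    """W[k, j] += eta * y[k] * x[j] for all presynaptic j and postsynaptic k."""
    return W + eta * np.outer(y, x)

# One update step with illustrative activities
W = np.zeros((2, 3))
x = np.array([1.0, 0.0, 1.0])                # presynaptic activities
y = np.array([1.0, 0.5])                     # postsynaptic activities
W = hebbian_update(W, x, y)
print(W)
```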
III. Training Methods in ANN
a) Fixed weights: No learning is required for fixed-weight networks, so the learning mode is neither supervised nor unsupervised.

b) Supervised learning: In supervised training, both the inputs and the outputs are provided. The network processes the inputs and compares its resulting outputs against the desired outputs. Errors are then propagated back through the system, causing the system to adjust the weights which control the network. This process occurs over and over as the weights are continually tweaked. The set of data which enables the training is called the training set. During the training of a network the same set of data is processed many times as the connection weights are refined. An example of this architecture is the multilayer perceptron.
c) Unsupervised learning

Figure: Supervised learning

In unsupervised training, the network is provided with inputs but not with desired outputs. The system itself must then decide what features it will use to group the input data. This is often referred to as self-organization or adaptation. Examples: Kohonen network, ART.
Figure: Unsupervised learning


2. Similarities and Differences between ANN and Human Brain.

I. Similarities between ANN and Human Brain. The following are some similarities between ANN and human brain: Knowledge is acquired by the network/brain from its environment through a learning process.

Interneuron connection strengths, known as synaptic weights are used to store the acquired knowledge.

Both ANN and Human Brain derive their respective computing power through: a) the massive parallel distributed structure.

b) ability to learn and therefore generalize, which means ability to produce reasonable outputs for inputs not encountered during training.

II. Differences between ANNs and Human Brain
The differences between the brain and ANNs include: ANNs are always constructed and trained for a particular, specific task and purpose, whereas the human brain has no such limitation on its capabilities and is easily re-trainable for any new situation.

ANN performs faster and more accurately, as compared to human brain, for functions or tasks which are primarily memory-based. Whatever an ANN learns gets hard-coded into the circuitry of the system and is not altered unless the ANN is trained again.

The structure of an ANN is decided in advance by its designer and its behaviour is thus predictable to quite a large extent, whereas the behaviour and structure of the human brain, whether studied through its macroscopic structure or its micro-analysis, still remain an area of active research.

The human brain is more fault tolerant and does not require complete or exact information when recalling an event or recreating the same output sequences, whereas the degree of fault tolerance of an ANN is still a considerable research issue.


3. Types of Artificial Neural Networks


Broadly the artificial neural networks are classified as follows: I. Feedforward neural network
The feedforward neural network is the first and simplest type of artificial neural network devised. In this network the information moves in only one direction: forward. From the input nodes, data goes through the hidden nodes (if any) to the output nodes; there are no cycles or loops in the network. Feedforward networks can be constructed from different types of units, the simplest example being the perceptron. Continuous neurons, frequently with sigmoid activation, are used in the context of backpropagation of error.
Perceptron
a) Single-layer perceptron
The perceptron is the simplest form of a neural network, used for the classification of a special type of patterns, namely those which are linearly separable. It consists of a single McCulloch-Pitts neuron with adjustable synaptic weights and bias (threshold). The McCulloch-Pitts model of a neuron is sometimes used with the output taking the value -1 or +1, according to whether the weighted sum of the inputs lies below or above the threshold.

Figure 3.1: Perceptron

The single-layer perceptron shown has a single neuron. Such a perceptron is limited to performing pattern classification with only two classes. Rosenblatt (1958) proved that if the patterns (vectors) used to train the perceptron are drawn from linearly separable classes, then the perceptron algorithm converges and positions the decision surface in the form of a hyperplane between the classes. The proof of convergence of the algorithm is known as the perceptron convergence theorem.
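A minimal sketch of single-layer perceptron training with bipolar targets on a toy linearly separable problem is given below; the data, learning rate and stopping rule are illustrative choices.

```python
import numpy as np

# Sketch of the single-layer perceptron described above, trained on a
# linearly separable toy problem with bipolar (-1/+1) targets.
def train_perceptron(X, t, lr=1.0, epochs=100):
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for x, target in zip(X, t):
            y = 1 if (np.dot(w, x) + b) >= 0 else -1     # McCulloch-Pitts style output
            if y != target:
                w += lr * target * x                      # adjust weights toward the target
                b += lr * target
                errors += 1
        if errors == 0:                                   # converged (classes are separable)
            break
    return w, b

# Toy linearly separable data: class +1 above the line x1 + x2 = 1, class -1 below
X = np.array([[2.0, 2.0], [1.5, 1.0], [0.0, 0.0], [-1.0, 0.5]])
t = np.array([1, 1, -1, -1])
w, b = train_perceptron(X, t)
print(w, b)
```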

b) Multilayer Perceptron (MLP) as a multilayer feedforward network
Multilayer neural networks consist of units arranged in layers. Each layer is composed of nodes connected to every node in the subsequent layer. Each multilayer perceptron is composed of a minimum of three layers: an input layer, one or more hidden layers and an output layer. The input layer distributes the inputs to subsequent layers; input nodes have linear activation functions and no thresholds. Except for the input nodes, each node is a neuron (or processing element) with a nonlinear activation function. The MLP is a modification of the standard linear perceptron and can distinguish data that are not linearly separable.

Figure 3.2: A multilayer perceptron that solves the XOR problem

II. Backpropagation Networks (BPN)
A network should learn the relationship among a set of example patterns and should be able to apply the same relationship to new input patterns. The network would then focus on the features of an arbitrary input that resemble other patterns seen previously, such as those pixels in a noisy image that look like a known character, and ignore the noise. Such a system is a backpropagation network. Multilayer perceptrons have been applied successfully to solve some difficult and diverse problems by training them in a supervised manner with a highly popular algorithm known as the error backpropagation algorithm. Basically, error backpropagation consists of two passes through the different layers of the network: a forward pass and a backward pass. In the forward pass, an activity pattern (input vector) is applied to the sensory nodes of the network, and its effect propagates through the network, layer by layer; finally, a set of outputs is produced as the actual response of the network. During the forward pass the synaptic weights of the network are all fixed. During the backward pass, on the other hand, the synaptic weights are all adjusted in accordance with the error-correction rule. Specifically, the actual response of the network is subtracted from a desired (target) response to produce an error signal. This error signal is then propagated backward through the network, against the direction of the synaptic connections, hence the name error backpropagation. The synaptic weights are adjusted so as to make the actual response of the network move closer to the desired response. (A minimal code sketch of this procedure is given after Figure 3.4 below.)

Figure 3.3: A Backpropagation Network showing the information flow

III. Recurrent neural network (RNN)
Contrary to feedforward networks, recurrent neural networks (RNNs) are models with bidirectional data flow. While a feedforward network propagates data linearly from input to output, RNNs also propagate data from later processing stages to earlier stages. RNNs can be used as general sequence processors.

Figure 3.4: A Simple recurrent neural network
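The following is a minimal sketch of the two-pass error-backpropagation procedure of Section II, trained on the XOR problem with a small numpy network; the network size, learning rate and number of epochs are illustrative choices, not values used in the project.

```python
import numpy as np

# Sketch: error backpropagation on a 2-4-1 multilayer perceptron for XOR.
rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(size=(2, 4)); b1 = np.zeros(4)       # input -> hidden weights
W2 = rng.normal(size=(4, 1)); b2 = np.zeros(1)       # hidden -> output weights
lr = 0.5

for epoch in range(20000):
    # Forward pass: activity propagates layer by layer with the weights fixed
    H = sigmoid(X @ W1 + b1)
    Y = sigmoid(H @ W2 + b2)
    # Backward pass: the error signal propagates back and the weights are adjusted
    err_out = (Y - T) * Y * (1 - Y)                   # output-layer error term
    err_hid = (err_out @ W2.T) * H * (1 - H)          # hidden-layer error term
    W2 -= lr * H.T @ err_out;  b2 -= lr * err_out.sum(axis=0)
    W1 -= lr * X.T @ err_hid;  b1 -= lr * err_hid.sum(axis=0)

print(np.round(Y.ravel(), 2))                         # approaches [0, 1, 1, 0]
```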

Bidirectional Associative Memory (BAM)

The BAM consists of two layers of processing elements that are fully interconnected between the layers. The units may, or may not, have feedback connections to themselves. But there are bidirectional connections between the two layers.

The bidirectional associative memory is a content-addressable memory. A BAM consists of neurons arranged in two layers, say A and B. The neurons are bipolar binary. The neurons in one layer are fully interconnected to the neurons in the second layer; there is no interconnection among neurons in the same layer. The weights from layer A to layer B are the same as the weights from layer B to layer A. The BAM dynamics involves the interaction of the two layers. Because the memory processes information in time and involves bidirectional data flow, it differs in principle from a linear associator, although both networks are used to store association pairs.
Figure 3.5: The BAM neural network with two layers A and B.

Hopfield Memory

The Hopfield memory is a derivative of the BAM. Each neuron is connected to every other neuron in both directions (the neurons are not directly connected to themselves), and the weights are symmetric, i.e., wij = wji for all i, j. The Hopfield network (like similar attractor-based networks) is of historic interest, although it is not a general RNN, as it is not designed to process sequences of patterns; instead it requires stationary inputs. It is an RNN in which all connections are symmetric. Invented by John Hopfield in 1982, it guarantees that its dynamics will converge. If the connections are trained using Hebbian learning then the Hopfield network can perform as a robust content-addressable memory, resistant to connection alteration. The Boltzmann Machine

Figure 3.6: The Hopfield memory

The Boltzmann machine differs from the other networks in that the output of its processing elements is a stochastic function of the inputs, rather than a deterministic function. The output of a given node is calculated using probabilities, rather than a threshold value or sigmoid output function. The learning combines energy minimization with entropy maximization, consistent with the use of the Boltzmann distribution to describe the energy-state probabilities. These differences are a direct result of applying ideas from statistical mechanics to the neural network. Counterpropagation Network (CPN)

The counterpropagation network is a more recently developed neural network model which is a novel combination of previously existing networks. Instead of employing a single learning algorithm throughout the network, the CPN uses a different learning procedure on each layer. The learning algorithms allow the CPN to train quite rapidly compared with other networks. The tradeoff is that the CPN may not always yield sufficient accuracy for some applications. Nevertheless, the CPN remains a good choice for some classes of problems, and it provides an excellent vehicle for rapid prototyping of other applications. There are four major components: a) an input layer that performs some processing on the input data, b) a processing element called an instar, c) a layer of instars known as a competitive network, and d) a structure known as an outstar which gives an output.

One of the major applications of the CPN is an image classification problem in a spacecraft orientation system: recognizing a spacecraft irrespective of its angle so that a robotic arm can grasp it perpendicular to its long axis at a space station.

Figure 3.7: The CPN network with five layers: two input layers (x and y), one cluster layer and two output layers (x and y). The CPN gets its name from the fact that the input vectors in the input layers and the output layers propagate in opposite directions. In the figure, W and U are the updated weights to the cluster layer, and V and T are the updated weights to the output layers.

Self-Organizing Maps (SOM)

The self-organizing map (SOM) is an unsupervised neural network providing a mapping from a high-dimensional input space to a low-dimensional, usually 1- or 2-dimensional, output space while preserving topological relations as faithfully as possible. Because of Kohonen's work in developing the theory of competition, competitive processing elements are often referred to as Kohonen units. Self-organizing maps are topology-preserving maps in which units located physically next to each other respond to classes of input vectors that are likewise next to each other. High-dimensional input vectors are in fact projected down onto the two-dimensional map in a way that maintains the natural order of the input vectors. This dimensional reduction can allow us to easily visualize important relationships among the data that otherwise might go unnoticed.
Physical neural network

A physical neural network includes electrically adjustable resistance material to simulate artificial synapses. Examples include the ADALINE neural network developed by Bernard Widrow in the 1960s and the memristor based neural network developed by Greg Snider of HP Labs in 2008.


4. Adaptive Resonance Theory (ART) Networks


One of the very interesting and useful features of the human brain is its ability to learn many new things without necessarily forgetting or altering the things learned in the past. It would be highly desirable if we could impart this capability to ANNs. Most ANNs are specific to a particular pattern-classification problem, i.e. they are trained by a typical set of exemplars or training patterns. During training, the network learns by adjusting its weight values. Once trained, the network is deemed adequate for only that class of pattern-classification problems. If at some future time we want to add another pattern class to the learning memory of the system, it is not possible, as the system has been designed and trained for only that particular class. Training the system for the new class of patterns would lead to the system forgetting the earlier learned patterns, as the weight values would now adjust themselves to this new pattern. As a consequence the system has to be trained again on all the exemplars, including the new ones, with no guarantee that the earlier classifications are preserved. Also, if an ANN is presented with a previously unseen input pattern, there is generally no built-in mechanism for the network to appreciate the novelty of the input. What has been described above is the famous stability-plasticity dilemma, which can be summarized by the following questions: How can a learning system remain adaptive in response to a significant input, yet remain stable in response to irrelevant inputs? How does the system know when to switch between its plastic and stable modes? How can the system retain previously learned information while continuing to learn new things?

The answers to the above questions were given by Carpenter and Grossberg in their ART network architectures. A key to solving the stability-plasticity dilemma is to add a feedback mechanism between the competitive layer and the input layer of the competitive network architecture. This facilitates the learning of new information without erasing old information, automatic switching between stable and plastic modes, and stabilization of the encoding of the classes by the nodes. In an ART network, information, in the form of processing element outputs, reverberates back and forth between the layers. If the proper pattern develops, a stable oscillation ensues, which is the neural network equivalent of resonance, and only during this resonance can learning occur. Before the network has achieved this resonant state, no learning takes place, because the time required for changes in the processing elements' weights is much longer than the time it takes to achieve resonance. A resonant state can occur either when the system is presented with a previously learned input, reinforcing the memory of that input, or when the input does not match any previously stored pattern, in which case the network enters a resonant state whereupon the new pattern is stored for the first time. Thus, the network quickly responds to previously learned data, yet remains able to learn when novel data are presented.


I. ART1 Networks
The ART1 structure has the following important features: it is designed to cluster binary input vectors, allowing for great variation in the number of nonzero components; the user has direct control of the degree of similarity among patterns placed on the same cluster unit; and it consists of two fields of units, the F1 units and the F2 cluster units, together with a reset unit that controls the degree of similarity of patterns placed on the same cluster unit.

The F1 and F2 layers are connected by two sets of weighted pathways, bottom-up and top-down. In addition, several supplemental units are included to provide neural control of the learning process. The learning process is designed so that it is not required either that the patterns be presented in a fixed order or that the number of patterns to be stored be known in advance (i.e. more patterns can be added to the data set during the training period, if desired). The bottom-up and top-down weights are governed by differential equations. ART1 is operated in fast-learning mode, in which the weights reach equilibrium during each learning trial (presentation of a pattern). Since the activations of the F1 units do not change during this resonance phase, the equilibrium weights can be determined exactly and iterative solution of the differential equations is not necessary.

The architecture consists of computational and supplemental units.
a. Computational units
The computational units for ART1 consist of F1 units (input and interface units), F2 cluster units, and a reset unit that implements user control over the degree of similarity of patterns placed on the same cluster.

Figure 4.I.1: Typical ART1 architecture [Carpenter & Grossberg 1987b]: input units S1...Sn in the F1(a) layer, interface units X1...Xn in the F1(b) layer, cluster units Y1...Yn in the F2 layer, the reset unit R, and the bottom-up (bij) and top-down (tji) weighted pathways.

b. Basic Structure of ART1
Each unit in the F1(a) (input) layer is connected to the corresponding unit in the F1(b) (interface) layer. Each unit in the F1(a) and F1(b) layers is connected to the reset unit, which in turn is connected to every F2 unit (not shown in the figure). Each unit in the F1(b) layer is connected to each unit in the F2 (cluster) layer by two weighted pathways: the F1(b) unit Xi is connected to the F2 unit Yj by bottom-up weights bij, and each unit Yj in the F2 layer is connected to each unit Xi in the F1(b) layer by top-down weights tji. The F2 layer is a competitive layer in which only the uninhibited node with the largest net input becomes active.

Algorithm
Nomenclature:
n = number of components in the input vector.
m = maximum number of clusters that can be formed.
bij = bottom-up weights (from F1(b) unit Xi to F2 unit Yj).
tji = top-down weights (from F2 unit Yj to F1(b) unit Xi).
ρ = vigilance parameter.
s = binary input vector (an n-tuple).
x = activation vector for the F1(b) layer (binary).
||x|| = norm of vector x, defined as the sum of the components xi.

Description
Initially the binary input vector s is presented to the F1(a) layer, and the signals are sent to the corresponding X units of the F1(b) layer. These F1(b) units then broadcast to the F2 layer over connection pathways with bottom-up weights. Each F2 unit computes its net input, and the units compete for the right to be active. The unit with the largest net input sets its activation to 1; all others have an activation of 0. Let the index of the winning unit be J. This winning unit becomes the candidate to learn the input pattern. A signal is then sent down from F2 to F1(b) (multiplied by the top-down weights). The X units (in the interface portion of the F1 layer) remain on only if they receive non-zero signals from both the F1(a) and F2 units. The norm ||x|| of the vector x (the activation vector for the interface portion of F1) gives the number of components in which the top-down weight vector tJ for the winning F2 unit and the input vector s are both 1; this is referred to as the match. If the ratio of ||x|| to ||s|| is greater than or equal to the vigilance parameter, the weights (top-down and bottom-up) for the winning cluster unit are adjusted. However, if the ratio is less than the vigilance parameter, the candidate unit is rejected, another candidate must be chosen on this learning trial, and the activations of the F1 units are reset to zero. The same input vector again sends its signal to the interface units, which again send the bottom-up signal to the F2 layer, and the competition is repeated (excluding the inhibited units). The process continues until either a satisfactory match is found (a candidate is accepted) or all units are

112 inhibited. The action can be taken if all units are inhibited for e.g. reducing the value of vigilance parameter, allowing less similar patterns to be placed on same cluster, or to increase the number of cluster units or designate the input pattern as non clustered. Training Algorithm Step0. Initialize parameters: L > 1, 0 < 1. Initialize weights 0< bij(0) < tji(0) = 1. Step1. While stopping condition is false, do Steps 2-13. Step2. For each training input, do Steps 3-12. Step3. Set activations of all F2 units to zero. Set activations of F1(a) units to input vector s. Step4. Compute the norm of s: s =
i.

Step5. Send input signal from F1(a) to the F1(b) layer: xi = si. Step6. For each F2 node that is not inhibited: If yj = -1, then yj Step7. While reset is true, do Steps 8-11. Step8. Find J such that yJ yj for all nodes j. If yj = -1, then all nodes are inhibited and this pattern cant be clustered. Step9. Recompute activation x of F1(b): xi = sitJi. Step10.Compute the norm of vector x: x = Step11. Test for reset:
i. ijxi.

113

If

, then

yJ = -1 (inhibit node J) (and continue, executing Step 7 again). If


, then proceed to Step 12.

Step12. Update the weights for node J (fast learning): bij(new) <

tji(new) = xi. Step13. Test for stopping condition. Comments. Step 3 removes all inhibitions from previous learning trial (presentation of a pattern). Setting y = -1 for an inhibited node (in Step 6) will prevent that node from being a winner. Since all weights and signals in the net are nonnegative a unit with a negative activation can never have the largest activation. In Step 8, in case of a tie, take J to be the smallest such index. In Step 9, unit Xi is on only if it receives both an external signal si and a signal sent down from F2 to F1, tJi. The stopping condition in Step 13 might consist of any of the following: No weight changes, no units reset, or maximum number of epochs reached. Parameters PARAMETER L bij PERMISSIBLE RANGE L>1 0 1 (vigilance parameter) 0< bij(0) < (bottom-up weights) SAMPLE VALUE 2 0.9

tji

tji(0) = 1 (top-down weights)

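The training algorithm above can be condensed into a short program. Below is a minimal sketch of ART1 fast learning in Python with NumPy, written for illustration rather than taken from any library; the function name art1_train, the fixed-epoch stopping condition and the particular initial value chosen for the bottom-up weights are our own assumptions.

```python
import numpy as np

def art1_train(patterns, m, rho=0.9, L=2.0, n_epochs=10):
    """Minimal ART1 fast-learning sketch (assumes each pattern has at least one 1).

    patterns : list of binary (0/1) input vectors of length n
    m        : maximum number of cluster (F2) units
    rho      : vigilance parameter, 0 < rho <= 1
    L        : ART1 parameter, L > 1
    Returns the cluster index assigned to each pattern (None if unclustered).
    """
    n = len(patterns[0])
    b = np.full((n, m), L / (L - 1.0 + n) / 2.0)   # bottom-up weights, 0 < b(0) < L/(L-1+n)
    t = np.ones((m, n))                            # top-down weights, t(0) = 1
    assignment = [None] * len(patterns)

    for _ in range(n_epochs):
        for idx, s in enumerate(patterns):
            s = np.asarray(s, dtype=float)
            norm_s = s.sum()
            inhibited = np.zeros(m, dtype=bool)
            while True:
                y = np.where(inhibited, -1.0, s @ b)    # Step 6: net input of uninhibited F2 units
                J = int(np.argmax(y))                   # Step 8: winner (smallest index on ties)
                if y[J] == -1.0:
                    assignment[idx] = None              # all units inhibited: cannot cluster
                    break
                x = s * t[J]                            # Step 9: recompute F1(b) activations
                norm_x = x.sum()
                if norm_x / norm_s >= rho:              # Step 11: vigilance (match) test
                    b[:, J] = L * x / (L - 1.0 + norm_x)   # Step 12: fast-learning update
                    t[J] = x
                    assignment[idx] = J
                    break
                inhibited[J] = True                     # reset: inhibit J and try another unit
    return assignment
```

With such a sketch, a low vigilance (e.g. rho = 0.4) merges more patterns into each cluster, while a high vigilance (e.g. rho = 0.9) forms more, finer clusters, which is the behaviour reported in the conclusions below.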

II. ART2

ART2 is designed to perform, for continuous-valued input vectors, the same type of tasks that ART1 performs for binary-valued input vectors. It has a more complex F1 field because continuous-valued input vectors may be arbitrarily close together. In ART2 the F1 field contains a combination of normalization and noise suppression, in addition to the comparison of bottom-up and top-down signals needed for the reset mechanism. Two types of continuous-valued inputs can be distinguished: (i) Noisy binary signals: patterns whose information is conveyed primarily by which components are on, or virtually off, rather than by the differences in magnitude of the components that are positive. The equilibrium weights found by the fast-learning mode are suitable for such inputs, but they are more difficult to calculate than in ART1 because the differential equations for the weight updates depend on activations in the F1 layer, which change as learning progresses. (ii) Truly continuous signals: patterns in which the range of values of the components carries significant information, so that the weight vector for a cluster unit is interpreted as an exemplar of the patterns placed on that unit. For these inputs the slow-learning mode is more appropriate. The price of ART2's ability to deal with analog patterns is primarily an increase in the complexity of the F1 processing layer: the ART2 F1 layer consists of six sublayers and several gain control (supplemental) units.

Architecture
The F1 layer consists of six types of units (w, x, u, v, p and q units). There are n units of each type (where n is the dimension of an input pattern), i.e. n w units, n x units, and so on; only one unit of each type is shown in the figure. A supplemental unit (N) between the w and x units receives signals from all the w units, computes the norm of the vector w, and sends an inhibitory signal to each of the x units; each x unit also receives an excitatory signal from the corresponding w unit. A similar supplemental unit performs the same role between the p and q units, and another does so between the v and u units. Each x unit is also connected to the corresponding v unit. The connections between the units pi (of the F1 layer) and yj (of the F2 layer) show the weights that multiply the signals transmitted over those paths (bottom-up weights bij and top-down weights tji). The activation of the winning F2 unit is d, where 0 < d < 1. The symbol N indicates normalization: the vector q of activations of the q units is just the vector p of activations of the p units, normalized to approximately unit length, and similarly x is a normalized version of w. The activation function is applied at the x and q units.


[Figure shows the F2 layer units yj, the reset units Ri, and the F1 sublayer units wi, xi, ui, vi, pi and qi for one input component si, together with the bottom-up weights bij, the top-down weights tji and the fixed F1 connection weights a, b and c.]
Figure 4.II.1: Typical ART2 architecture [Carpenter & Grossberg 1987b]

The action of the F2 layer (units yj) is similar to that of ART1: the units compete in a winner-take-all mode for the right to learn each input pattern. As in ART1, learning occurs only if the top-down weight vector for the winning unit is sufficiently similar to the input vector. In ART2, some processing of the input vector is necessary because the magnitudes of real-valued vectors may vary more than those of the binary input vectors of ART1. ART2 treats small components as noise and does not distinguish between patterns that are merely scaled versions of each other. The p units play the role of the interface F1 units of the ART1 architecture, and the role of the supplemental units of ART1 is incorporated within the F1 layer. Units xi and qi apply an activation function to their net input; this function suppresses any components of the vectors of activations at these levels that fall below the user-selected value θ. The connection paths from u to w and from q to v have fixed weights a and b, respectively.

Algorithm
The input signal s = (s1, s2, ..., si, ..., sn) continues to be sent while all of the actions described below are performed. At the beginning of a learning trial, all activations are set to zero. The computation cycle (for a particular learning trial) within the F1 layer can be thought of as originating with the computation of the activation of ui (the activation of unit vi, normalized to approximately unit length). Next, a signal is sent from each ui to its associated units wi and pi, whose activations are then computed: unit wi sums the signal it receives from ui and the input signal si, while unit pi sums the signal it receives from ui and the top-down signal it receives if there is an active F2 unit.

The activations of xi and qi are normalized versions of the signals at wi and pi, respectively. An activation function is applied at each of these units before the signal is sent to vi, which then sums the signals it receives concurrently from xi and qi; this completes one cycle of updating the F1 layer. The activation function used is

f(x) = x if x ≥ θ, and f(x) = 0 if x < θ.

With this function, the activations of the u and p units reach equilibrium after two updates of the F1 layer. The function treats any signal that is less than θ as noise and suppresses it (sets it to zero); the value of the parameter θ is specified by the user. Noise suppression helps the net achieve stable cluster formation. The net is stable when the first cluster unit chosen for each input pattern is accepted and no reset occurs; for slow learning, the weight vectors for the clusters converge to stable values. After the activations of the F1 units have reached equilibrium, the p units send their signals to the F2 layer, where a winner-take-all competition chooses the candidate cluster unit to learn the input pattern. The units ui and pi in the F1 layer send their signals to the corresponding reset unit Ri. The reset mechanism can check for a reset each time it receives a signal from pi, since the necessary computations are based on the value of that signal and the most recent signal the unit Ri received from ui; however, this is done only when pi first receives a top-down signal. After the conditions for reset have been checked, the candidate cluster unit either will be rejected as not similar enough to the input pattern or will be accepted. If the cluster unit is rejected, it becomes inhibited (prevented from further participation in the current learning trial), and the cluster unit with the next largest net input is chosen as the candidate. This process continues until an acceptable cluster unit is chosen (or the supply of available cluster units is exhausted). When a candidate cluster unit that passes the reset conditions is chosen, learning occurs.

Learning:
Slow learning: Only one iteration of the weight update equations occurs on each learning trial. A large number of presentations of each pattern is required, but relatively little computation is done on each trial; these repeated presentations are treated as epochs in the algorithm.
Fast learning: The weight updates continue until the weights reach equilibrium on each trial. Only a few epochs are required, but a large number of iterations through the weight-update/F1-update portion of the algorithm must be performed on each learning trial (presentation of a pattern).

Training Algorithm
The algorithm can be used for either fast learning or slow learning. In fast learning, iterations of weight changes followed by updates of the F1 activations proceed until equilibrium is reached. In slow learning, only one iteration of the weight-update step is performed on each trial, but a large number of learning trials is required for the net to stabilize.

Parameters:
n: number of input units (F1 layer).
m: number of cluster units (F2 layer).
a, b: fixed weights in the F1 layer; sample values are a = 10, b = 10. Setting either a = 0 or b = 0 produces instability in the net; other than that, the net is not particularly sensitive to the values chosen.

c: fixed weight used in testing for reset; a sample value is c = 0.1. A small c gives a larger effective range of the vigilance parameter.
d: activation of the winning F2 unit; a sample value is d = 0.9. However, c and d must be chosen to satisfy the inequality cd/(1 - d) ≤ 1.
e: a small parameter introduced to prevent division by zero when the norm of a vector is zero. This value prevents the normalization to unity from being exact.
θ: noise suppression parameter; a sample value is θ = 1/√n. The sample value may be larger than desired in some applications. Components of the input vector (and of the other vectors in the F1 loop) that are less than this value are set to zero.
α: learning rate; a smaller value will slow the learning in either fast or slow learning mode. However, a smaller value will ensure that the weights (as well as the placement of patterns on clusters) eventually reach equilibrium in slow learning mode.
ρ: vigilance parameter; along with the initial bottom-up weights, it determines how many clusters will be formed. It is generally between 0.7 and 1, and its value sometimes also depends upon c and d.
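For reference, the sample values quoted above can be collected into one configuration, as in the hypothetical Python snippet below; the dictionary name and the particular values of n, m and α are illustrative assumptions, and θ is set to 1/√n as suggested.

```python
import math

n = 25   # dimension of the input vectors (example value)
m = 10   # maximum number of cluster units (example value)

art2_params = {
    "a": 10.0,                    # fixed F1 weight on the u -> w path
    "b": 10.0,                    # fixed F1 weight on the q -> v path
    "c": 0.1,                     # fixed weight used in the reset test
    "d": 0.9,                     # activation of the winning F2 unit
    "e": 1e-7,                    # small constant preventing division by zero
    "theta": 1.0 / math.sqrt(n),  # noise suppression parameter (sample value)
    "alpha": 0.6,                 # learning rate (example value)
    "rho": 0.9,                   # vigilance parameter, typically 0.7 to 1
}

# sanity check: c and d must satisfy c*d/(1-d) <= 1
assert art2_params["c"] * art2_params["d"] / (1 - art2_params["d"]) <= 1.0
```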

Initial weights:
tji(0) = 0; the top-down weights must initially be small to ensure that no reset will occur for the first pattern placed on a cluster unit.
bij(0) ≤ 1 / ((1 - d)√n); this prevents the possibility of a new winner being chosen during resonance as the weights change. Larger values of bij(0) encourage the net to form more clusters.

Calculations for the Algorithm (update of F1 activations)
Unit J is the winning F2 node after the competition (if no winning unit has been chosen, d is zero). The calculations for wi and pi can be done in parallel, as can the calculations for xi and qi. The F1 activation updates are:

ui = vi / (e + ||v||),
wi = si + a ui,
pi = ui + d tJi,
xi = wi / (e + ||w||),
qi = pi / (e + ||p||),
vi = f(xi) + b f(qi).

The activation function is:
f(x) = x if x ≥ θ, and f(x) = 0 if x < θ.
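This update cycle translates directly into code. The following is a minimal NumPy sketch of one F1 update; the helper names suppress and update_F1 are our own, the norm is taken to be the Euclidean norm, and tJ is assumed to be the top-down weight vector of the current winning unit (with d = 0 when no winner has yet been chosen).

```python
import numpy as np

def suppress(x, theta):
    """Noise-suppression activation: f(x) = x if x >= theta, else 0."""
    return np.where(x >= theta, x, 0.0)

def update_F1(s, v, tJ, a, b, d, e, theta):
    """One cycle of F1 activation updates for ART2 (all arguments are length-n vectors
    except the scalars a, b, d, e, theta). Returns the updated activations."""
    u = v / (e + np.linalg.norm(v))      # ui = vi / (e + ||v||)
    w = s + a * u                        # wi = si + a*ui
    p = u + d * tJ                       # pi = ui + d*tJi (d = 0 if no F2 winner yet)
    x = w / (e + np.linalg.norm(w))      # xi = wi / (e + ||w||)
    q = p / (e + np.linalg.norm(p))      # qi = pi / (e + ||p||)
    v = suppress(x, theta) + b * suppress(q, theta)   # vi = f(xi) + b*f(qi)
    return u, w, p, x, q, v
```

Starting from v = 0 and calling update_F1 twice reproduces the two update passes of Step 3 in the algorithm below.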

Algorithm:
Step 0. Initialize parameters: a, b, c, d, e, θ, α, ρ.
Step 1. Do Steps 2-9 N_EP times. (Perform the specified number of epochs of training.)
Step 2. For each input vector s, do Steps 3-8.
Step 3. Update F1 unit activations:
        ui = 0, wi = si, pi = 0, qi = 0,
        xi = si / (e + ||s||), vi = f(xi).
        Update F1 unit activations again:
        ui = vi / (e + ||v||), wi = si + a ui, pi = ui,
        xi = wi / (e + ||w||), qi = pi / (e + ||p||),
        vi = f(xi) + b f(qi).
Step 4. Compute signals to the F2 units: yj = Σi bij pi.
Step 5. While reset is true, do Steps 6-7.
Step 6. Find the F2 unit with the largest signal. (Define J such that yJ ≥ yj for j = 1, ..., m.) If yJ = -1, all cluster units are inhibited and this pattern cannot be clustered.
Step 7. If tJ = 0, proceed to Step 8. If tJ ≠ 0, then check for reset:
        ui = vi / (e + ||v||),
        pi = ui + d tJi,
        ri = (ui + c pi) / (e + ||u|| + c ||p||).
        If ||r|| < ρ - e, then yJ = -1 (inhibit J). (Reset is true; repeat Step 5.)
        If ||r|| ≥ ρ - e, then
            wi = si + a ui,
            xi = wi / (e + ||w||),
            qi = pi / (e + ||p||),
            vi = f(xi) + b f(qi).
        Reset is false; proceed to Step 8.
Step 8. Update the weights for the winning unit J. If tJ ≠ 0, then
        tJi = α d ui + [1 + α d (d - 1)] tJi,
        biJ = α d ui + [1 + α d (d - 1)] biJ.
Step 9. Test the stopping condition for the number of epochs.

In this algorithm we have used the following facts:
1. Reset cannot occur during resonance (Step 8).
2. A new winning unit cannot be chosen during resonance.
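The reset test of Step 7 and the weight update of Step 8 can likewise be sketched as below; the function names are illustrative, the Euclidean norm is assumed, and the update shown is a single (slow-learning) iteration; for fast learning it would be alternated with F1 updates until the weights stop changing.

```python
import numpy as np

def passes_reset(u, p, c, e, rho):
    """Step 7 reset test: accept the candidate cluster unit if ||r|| >= rho - e."""
    r = (u + c * p) / (e + np.linalg.norm(u) + c * np.linalg.norm(p))
    return np.linalg.norm(r) >= rho - e

def update_weights(tJ, bJ, u, alpha, d):
    """Step 8: one iteration of the weight update for the winning unit J."""
    factor = 1.0 + alpha * d * (d - 1.0)
    tJ_new = alpha * d * u + factor * tJ    # top-down weights of unit J
    bJ_new = alpha * d * u + factor * bJ    # bottom-up weights into unit J
    return tJ_new, bJ_new
```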


CONCLUSIONS AND RECOMMENDATIONS


Today every firm or company faces the challenge of surviving and prospering through economic growth amid intense competition in the market. For an industry this challenge means being able to produce its product in sufficient quantity according to market demand while maintaining, or even improving, quality so that its market share grows. The inference from the on-plant study is that the machine vision based inspection system is today a staple technology for any industry, not only for the production of quality products and robust control but also for its economic growth. There are choices regarding the technology a company or firm should opt for, depending upon the product and the budget, and success in the field also depends on this very choice.

In an MV based inspection system, the choice of the illumination source and illumination geometry is very important, because lighting quality determines the robustness of the MVS: image quality depends on lighting irrespective of camera and frame grabber parameters, and the camera is less versatile than the human eye in uncontrolled conditions. Designing and following a rigorous lighting analysis sequence will minimize time, effort and resources. Similarly, a basic understanding of the sensor attributes, the frame grabber and the application needs is required for narrowing the search for the right sensor, as the sensor is the heart of the camera. Depending upon the required resolution, colour or monochrome application, image processing speed and space limitations in the plant, one has to opt for the most suitable sensor technology and frame grabber. Equally, within the imaging techniques there are options for how the product should be presented to the camera, either in bright field or dark field, and whether to use strobed or steady imaging. Additionally, there are a large number of image processing techniques, each with its merits and demerits. Moreover, real-world problems in industrial production are imprecise and uncertain, and soft computing techniques have emerged as a solution for conditions where imprecision and uncertainty come into the picture. There is also the choice between PC hardware and embedded hardware: most MV applications are PC based, but for specialized applications embedded hardware can be the better option, depending on the product. All these options add lustre to MV based inspection systems. Recently, combined soft computing techniques such as neuro-fuzzy techniques or combined neural and genetic algorithm techniques, which offer combined capabilities, are being employed in appliances to make them smart and user friendly.

A number of industries, such as pharmaceuticals, edible oil, fruit and wood, today opt for Near Infrared (NIR) based inspection systems, as such systems are contactless and allow remote inspection. There are a number of advantages for an industry in opting for NIR based inspection systems: they are inexpensive, have a good signal-to-noise ratio, can inspect a large number of samples and are more sensitive than visible fluorescence and chemiluminescence. NIR spectroscopy is a proven and fast technique with the capability of simultaneous prediction of product parameters; analysis can be done within a few minutes once a calibration model is developed. Hence it is also suited for online inspection systems, and in India, too, a number of companies in sectors such as edible oil and pharmaceuticals have chosen this technology.

In industry, a large variety of situations require the system to learn new parameters or classes of situational variables as and when a new breed of situations becomes apparent. The system needs to remain stable on new input patterns without interfering with the previous patterns that it already recognizes correctly. ART networks are the key to this stability-plasticity dilemma. Owing to this need of today's industry, the Adaptive Resonance Theory (ART) neural networks were implemented as a re-trainable classifier.

The results pertaining to the ART1 implementation are as follows. We implemented a basic alphabet classification where the inputs are in different fonts and are affected by noise. We fed the network the input alphabet patterns shown in the figures below (ART1 takes only binary input).

[Figures (i)-(viii): the eight binary input alphabet patterns, in different fonts and with added noise.]

When the vigilance parameter, which measures the required degree of similarity between input patterns placed on the same cluster, was set to 0.4, the clusters formed were as follows:

[Table: Cluster vs. Alphabet, classification results for vigilance 0.4.]

When the vigilance parameter was set to 0.9, the clusters formed were as follows:

[Table: Cluster vs. Alphabet, classification results for vigilance 0.9.]

Thus ART1 successfully classified all the input alphabet patterns even when the patterns were affected by noise or were in a different font. Moreover, when an entirely new class of input occurred, the network was able to learn the new pattern without destroying the previously learned patterns, which is vividly seen in the second case, with the vigilance parameter at 0.9.

The main results for the ART2 network were the following. We employed ART2 to classify shapes, which is a significant problem in machine vision based inspection applications. We fed the network images of a variety of shapes, shown below:

[Figures (i)-(xv): the fifteen input shape images, differing in size, orientation and dimensions.]

The classification result of the ART2 network is shown below:

Table 1: [Clusters and the shapes assigned to each by the ART2 network.]


Thus it can be inferred from the above classification that the ART2 network correctly determined the class of the shapes even when their size, orientation and dimensions differed. Additionally, as with the ART1 network, when the network encountered a new class of shape image it retained the previously learnt shapes and classified the new shape correctly. This confirms the ART networks as re-trainable classifiers. The vigilance parameter of an ART network must be chosen correctly and depends on the input types as well as the input sequences; this is the main limitation of ART networks. Combined neural and fuzzy networks, or neural networks combined with genetic algorithms, contain the capabilities of both techniques, making them powerful tools for making appliance products smart and more user friendly.


REFERENCES

[1] Freeman, James A. and Skapura, David M. (1999). Neural Networks: Algorithms, Applications, and Programming Techniques, International Student Edition, Addison Wesley Longman.
[2] Haykin, Simon (1998). Neural Networks: A Comprehensive Foundation, Prentice Hall International, Inc.
[3] Hornberg, Alexander (2006). Handbook of Machine Vision, Wiley.
[4] Fausett, Laurene (1994). Fundamentals of Neural Networks: Architectures, Algorithms, and Applications, Prentice Hall.
[5] Pandya, Abhijit S. and Macy, Robert B. (1995). Pattern Recognition with Neural Networks, CRC Press.
[6] Schalkoff, Robert J. (1997). Artificial Neural Networks, McGraw-Hill.

Internet resources referred to during the project period:
www.machinevisiononline.com
www.ieee.org
www.graphics.stanford.edu
www.cf.cs.ac.uk
www.wikipedia.org
www.ai-junkie.com
