
Module 8

Image Processing and Computer Vision


1. Image Representation and Color Spaces:

Digital Images: These are essentially grids of tiny squares called pixels, each containing a color value. Computers process images by manipulating these pixel values.

Color Spaces: Different methods exist to represent color information in an image. Common ones include:

RGB (Red, Green, Blue): The most commonly used space, representing colors by combining intensities of these three primary colors.

HSV (Hue, Saturation, Value): Represents color in terms of hue (the color itself), saturation (color intensity), and value (brightness).
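A minimal sketch of working with these color spaces in OpenCV (note that OpenCV loads images in BGR order, a common variant of RGB; "photo.jpg" is a placeholder path):

```python
import cv2

# Load an image as a grid of pixels; OpenCV returns a NumPy array in BGR order.
img_bgr = cv2.imread("photo.jpg")  # placeholder path

# Convert BGR -> HSV: each pixel becomes (hue, saturation, value).
img_hsv = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2HSV)

# Inspect the same pixel in both color spaces.
print("BGR pixel:", img_bgr[0, 0])
print("HSV pixel:", img_hsv[0, 0])
```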
2. Applications of Computer Vision:
Computer vision utilizes image processing techniques to extract meaningful information from images. Here are some
application domains:

Medical Imaging: Analyzing X-rays, MRIs for disease detection and diagnostics.

Autonomous Vehicles: Object detection, lane recognition for self-driving cars.

Security and Surveillance: Facial recognition, anomaly detection in video surveillance.

Robotics: Object manipulation, navigation based on visual data.

Consumer Electronics: Gesture recognition, image editing tools.


3. Convolution Operation and Image Filtering:

Convolution: A mathematical operation that "slides" a filter (a small matrix) over an image, performing element-wise multiplication between the filter and the corresponding pixel values. The sum of these products is then assigned to the center pixel of the filter's output.

Image Filtering: By applying different filters, we can achieve various effects on an image:
• Blurring Filters: Average surrounding pixels, smoothing out details (e.g., box blur, Gaussian blur).
• Sharpening Filters: Enhance edges by amplifying differences between neighboring pixels (e.g., Laplacian filter).
• Noise Reduction Filters: Mitigate noise in images by smoothing or using statistical methods (e.g., median filter).
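A minimal sketch of these filters in OpenCV, with the box blur written as an explicit convolution kernel ("photo.jpg" is a placeholder path):

```python
import cv2
import numpy as np

img = cv2.imread("photo.jpg", cv2.IMREAD_GRAYSCALE)  # placeholder path

# Box blur: convolve with a 3x3 averaging kernel via filter2D.
box_kernel = np.ones((3, 3), np.float32) / 9.0
blurred = cv2.filter2D(img, -1, box_kernel)

# Gaussian blur: a weighted average, heavier toward the kernel center.
gaussian = cv2.GaussianBlur(img, (5, 5), sigmaX=0)

# Sharpening: a Laplacian-style kernel amplifies center-vs-neighbor differences.
sharpen_kernel = np.array([[0, -1, 0],
                           [-1, 5, -1],
                           [0, -1, 0]], np.float32)
sharpened = cv2.filter2D(img, -1, sharpen_kernel)

# Median filter: replaces each pixel with the neighborhood median,
# a statistical method that works well on salt-and-pepper noise.
denoised = cv2.medianBlur(img, 5)
```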
4. Edge Detection Techniques:

Edges in an image represent boundaries between objects. These techniques help identify these boundaries:

Sobel Operator: Calculates the gradient magnitude in horizontal and vertical directions, highlighting strong edges.

Canny Edge Detection: A multi-stage algorithm that considers edge strength and direction, and suppresses weak edges to provide a clean edge map.
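A minimal sketch of both techniques in OpenCV (the Canny thresholds below are illustrative):

```python
import cv2
import numpy as np

img = cv2.imread("photo.jpg", cv2.IMREAD_GRAYSCALE)  # placeholder path

# Sobel: horizontal and vertical gradients, combined into a magnitude map.
grad_x = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=3)
grad_y = cv2.Sobel(img, cv2.CV_64F, 0, 1, ksize=3)
magnitude = np.sqrt(grad_x**2 + grad_y**2)

# Canny: gradient computation, non-maximum suppression, and hysteresis
# thresholding (the two values are the weak/strong edge thresholds).
edges = cv2.Canny(img, 100, 200)
```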
1. Object Detection and Localization:

Object Detection: Aims to identify and locate all instances of a specific object class (e.g., pedestrians, cars) within an image or video.

Object Localization: Focuses on pinpointing the exact location of a single instance of an object class, often achieved by drawing a bounding box around it.
2. Haar Cascades and SSD - Object Detection Techniques:

Haar Cascades: An older method using rectangular features to identify objects efficiently. They are fast but less accurate compared to modern methods.

Single Shot MultiBox Detector (SSD): A deep learning-based approach that utilizes a convolutional neural network to simultaneously predict bounding boxes and object class probabilities for all objects in an image. This offers higher accuracy but requires more computational resources.
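A minimal Haar-cascade sketch using OpenCV's bundled frontal-face cascade as the example object class (the image path is a placeholder):

```python
import cv2

# OpenCV ships pre-trained cascade XML files; the frontal-face one is used here.
cascade_path = cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
detector = cv2.CascadeClassifier(cascade_path)

img = cv2.imread("photo.jpg")  # placeholder path
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# detectMultiScale scans the image at multiple scales and returns
# bounding boxes (x, y, width, height) for each detection.
faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
```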
3. Applying Pre-trained Models for Object Detection:

• Pre-trained models like SSD are trained on massive datasets and can be fine-tuned for specific object classes relevant to your task.
• This saves time and effort compared to training a model from scratch. Popular frameworks like TensorFlow and PyTorch offer pre-trained models and tools for fine-tuning.
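As a hedged sketch, here is one way to load torchvision's pre-trained SSD (this assumes a recent torchvision release; the image path is a placeholder):

```python
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

# Load torchvision's SSD pre-trained on COCO; "DEFAULT" selects the
# best available weights in recent torchvision versions.
model = torchvision.models.detection.ssd300_vgg16(weights="DEFAULT")
model.eval()

img = Image.open("photo.jpg").convert("RGB")  # placeholder path
with torch.no_grad():
    predictions = model([to_tensor(img)])

# Each prediction dict holds bounding boxes, class labels, and confidence scores.
boxes = predictions[0]["boxes"]    # (N, 4) tensor of [x1, y1, x2, y2]
scores = predictions[0]["scores"]  # confidence per detected object
labels = predictions[0]["labels"]  # COCO class indices
```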
4. Introduction to Facial Recognition Techniques:
• Face Detection: The initial step, identifying the presence and location of faces within an
image or video.
• Face Recognition: Matches the detected face against a database of known faces to identify
the individual.
5. Building a Simple Facial Recognition System:

Here's a simplified breakdown:
• Data Collection: Gather a dataset of images containing faces of the individuals you want to recognize.
• Face Detection: Use a pre-trained face detection model to locate faces in your images.

Cont...

• Feature Extraction: Extract facial features from the detected faces, like distances between eyes, nose shape, etc. These features represent a unique "faceprint" for each individual.
• Training a Classifier: Train a machine learning model (e.g., Support Vector Machine) to associate extracted features with known identities in your dataset.
• Recognition: For a new image, detect the face, extract features, and use the trained classifier to predict the identity based on the closest match in the database.
Additional Content:

• Image Segmentation: This technique groups pixels into meaningful segments corresponding to objects or regions of interest within an image. It's useful for tasks like self-driving cars (segmenting lanes) or medical imaging (segmenting tumors).
• Image Classification: Determines the overall class of an image (e.g., cat, landscape). While object detection focuses on individual objects, classification treats the entire image as a single entity.
• Deep Learning Architectures: Convolutional Neural Networks (CNNs) are the workhorses behind many computer vision tasks. Understanding their basic structure and function will provide a deeper understanding of how these algorithms achieve impressive results.
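As a minimal sketch of a CNN's basic structure in PyTorch (the layer sizes, input resolution, and class count are illustrative assumptions, not from the original):

```python
import torch
import torch.nn as nn

class TinyCNN(nn.Module):
    """Minimal CNN: stacked convolution + pooling layers feeding a classifier head."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),  # learn 16 filters
            nn.ReLU(),
            nn.MaxPool2d(2),                             # downsample by 2
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 8 * 8, num_classes)  # assumes 32x32 input

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))

# A batch of one 32x32 RGB image produces one score per class.
logits = TinyCNN()(torch.randn(1, 3, 32, 32))
```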
Unveiling the Secrets of Facial Recognition: Techniques, Algorithms, and Building a System

Facial recognition has become a prominent technology with a wide range of applications. Let's delve deeper into its core concepts:

1. Unveiling the Face: Face Detection

The Crucial First Step: Face detection acts as the gatekeeper, identifying the presence and location of faces within an image or video. It essentially separates the "wheat from the chaff," focusing on facial regions before recognition kicks in.
Techniques for Face Detection

Viola-Jones Framework (Haar Cascades): This traditional method utilizes predefined features like edges and corners to identify rectangular regions resembling faces. It's computationally efficient but may struggle with variations in pose, lighting, or occlusion (partial covering of the face).

Deep Learning-based Approaches: Convolutional Neural Networks (CNNs) are revolutionizing face detection. These algorithms learn powerful features directly from large datasets, achieving higher accuracy and handling variations better compared to Haar cascades.
2. Recognizing the Face: Unveiling Identity

Beyond Detection: Matching the Faceprint: Once a face is detected, facial recognition takes over. Its goal is to identify the individual by comparing the detected face against a database of known faces.
The Power of Facial Features: Facial recognition algorithms extract a unique set of features from the detected face. These features often include:

Geometric Measurements: Distances between eyes, nose, mouth, etc.

Facial Landmarks: Locations of key points on the face like eye corners, tip of the nose, etc.
Matching Algorithms: The extracted features act as a "faceprint" for the person. Recognition algorithms include:
• Support Vector Machines (SVMs): Learn a separating hyperplane in high-dimensional feature space to distinguish between different identities.
• Nearest Neighbors: Compare the extracted features with those in the database, identifying the closest match (nearest neighbor) as the person's identity.
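A minimal sketch of both matching strategies using scikit-learn; the 128-dimensional "faceprints" and identity labels below are synthetic stand-ins for real extracted features:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier

# Illustrative data: each row is one face's feature vector ("faceprint"),
# each label an identity. Real vectors would come from a feature extractor.
faceprints = np.random.rand(20, 128)           # 20 faces, 128-D features (made up)
identities = np.random.randint(0, 4, size=20)  # 4 hypothetical people

# SVM: learns separating hyperplanes between identities in feature space.
svm = SVC(kernel="linear").fit(faceprints, identities)

# Nearest neighbor: predicts the identity of the closest stored faceprint.
knn = KNeighborsClassifier(n_neighbors=1).fit(faceprints, identities)

new_face = np.random.rand(1, 128)  # features from a newly detected face
print("SVM says:", svm.predict(new_face))
print("1-NN says:", knn.predict(new_face))
```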
3. Building a Simple Facial Recognition System: A Hands-on Approach

While building a robust facial recognition system is a complex endeavor, let's explore a simplified approach:

Data Collection: Gather a dataset of images containing faces of the individuals you want to recognize. Ensure variations in pose, lighting, and expressions are captured for better performance.

Face Detection: Utilize a pre-trained face detection model (Haar cascade or a CNN-based model) to locate faces in your images. Libraries like OpenCV offer pre-trained models for this purpose.
Cont...

Feature Extraction: Employ libraries like dlib or OpenCV to extract facial features (e.g., distances between eyes) from the detected faces.

Training a Classifier: Train a machine learning model like an SVM using labeled data (images with corresponding identities). The model learns the association between extracted features and known identities.

Recognition on New Images: For a new image, follow steps 1-3 to detect the face and extract features. Feed these features to the trained classifier to predict the identity based on the closest match in the database.
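As one concrete (hedged) way to wire these steps together, the sketch below uses OpenCV's LBPH recognizer in place of hand-measured distances plus an SVM. It assumes the opencv-contrib-python package, one face per training image, and placeholder file names and labels:

```python
import cv2
import numpy as np

# Steps 1-2: detect and crop faces from labeled training images.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def crop_face(path):
    gray = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2GRAY)
    x, y, w, h = cascade.detectMultiScale(gray, 1.1, 5)[0]  # assume one face
    return cv2.resize(gray[y:y + h, x:x + w], (100, 100))

training = {"alice_1.jpg": 0, "alice_2.jpg": 0, "bob_1.jpg": 1}  # placeholders
faces = [crop_face(p) for p in training]
labels = np.array(list(training.values()), dtype=np.int32)

# Steps 3-4: LBPH extracts local texture features and learns per-identity models
# (requires the opencv-contrib-python build of OpenCV).
recognizer = cv2.face.LBPHFaceRecognizer_create()
recognizer.train(faces, labels)

# Step 5: recognize a new image by predicting the closest match.
label, distance = recognizer.predict(crop_face("unknown.jpg"))
print("Predicted identity:", label, "- distance (lower = closer):", distance)
```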
Important Considerations:

Data Quality: The performance of your system heavily relies on the quality and diversity of your dataset. More data with variations leads to better recognition accuracy.

Computational Resources: Training deep learning models for face recognition can be computationally expensive. Consider cloud platforms or GPUs for faster training.

Privacy Concerns: Facial recognition raises ethical and privacy concerns. Ensure responsible use and user consent when deploying such systems.
Unveiling the Mysteries of Image Segmentation: Techniques and Applications

Image segmentation is a fundamental technique in computer vision, aiming to partition an image into meaningful segments. These segments typically correspond to objects, regions with similar characteristics, or distinct boundaries. Here's a breakdown of key concepts, applications, and advanced methods:
1. Applications of Image Segmentation:

Medical Imaging: Segmenting tumors, organs, or blood vessels in X-rays, MRIs, and CT scans for diagnostics and analysis.

Autonomous Vehicles: Identifying lanes, traffic signs, and pedestrians for safe navigation.

Object Detection and Recognition: Isolating objects of interest for better recognition and classification.

Content-based Image Retrieval: Matching images based on segmented objects or regions.
2. Segmentation Techniques: A Spectrum of Approaches

There are various segmentation techniques, each with its strengths and weaknesses. Here are some common methods:

Thresholding: A simple approach that separates pixels based on their intensity value. It works well for images with high contrast between foreground and background but struggles with more complex scenarios.
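A minimal thresholding sketch with OpenCV, including Otsu's automatic threshold selection (the image path is a placeholder):

```python
import cv2

img = cv2.imread("photo.jpg", cv2.IMREAD_GRAYSCALE)  # placeholder path

# Fixed threshold: pixels above 127 become foreground (255), the rest 0.
_, binary = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)

# Otsu's method picks the threshold automatically from the intensity histogram.
otsu_value, otsu_binary = cv2.threshold(
    img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
print("Otsu chose threshold:", otsu_value)
```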
Region-based Segmentation: Groups pixels with similar characteristics (e.g., color, intensity) into regions. Techniques include:
• Region Growing: Starts with a seed point and iteratively incorporates neighboring pixels that meet a similarity criterion.
• Region Merging: Starts with individual pixels and merges neighboring regions based on similarity criteria.
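A minimal region-growing sketch; the tolerance value and 4-connectivity below are illustrative assumptions:

```python
import numpy as np
from collections import deque

def region_grow(gray, seed, tol=10):
    """Grow a region from a seed pixel, absorbing 4-connected neighbors
    whose intensity is within `tol` of the seed's intensity."""
    h, w = gray.shape
    mask = np.zeros((h, w), bool)
    seed_val = int(gray[seed])
    queue = deque([seed])
    mask[seed] = True
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and not mask[ny, nx] \
                    and abs(int(gray[ny, nx]) - seed_val) <= tol:
                mask[ny, nx] = True
                queue.append((ny, nx))
    return mask

# Example: a bright square on a dark background, recovered from one seed inside it.
img = np.zeros((50, 50), np.uint8)
img[10:30, 10:30] = 200
print(region_grow(img, seed=(15, 15)).sum())  # 400 pixels in the grown region
```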
3. Advanced Segmentation Methods: Delving Deeper

Beyond basic methods, advanced segmentation techniques address challenges like complex object shapes, overlapping objects, and uneven illumination:

Watershed Transformation: Simulates water flowing downhill on a topographic landscape derived from the image intensity. Watershed lines are formed at ridges, separating objects like catchment basins. This method is effective for segmenting objects with distinct intensity variations.

GrabCut Interactive Segmentation: Leverages user interaction to guide the segmentation process. The user roughly outlines the object of interest (foreground) and the background, and the algorithm refines the segmentation based on these constraints. This method is beneficial for segmenting objects with complex shapes or in cluttered backgrounds.
Here's a deeper look at Watershed and GrabCut methods:

Watershed Transformation:
• The Analogy: Imagine an image as a 3D landscape where pixel intensity represents elevation. Darker pixels are valleys, and brighter pixels are hills.
• Flooding Simulation: Watershed algorithms simulate flooding this landscape, starting from pre-defined minimum points (seeds, often placed manually). Water from different "flooded basins" will eventually meet at ridge lines.
• Catchment Basins as Segments: These ridge lines represent the boundaries between objects. The resulting segments (catchment basins) correspond to distinct objects in the image.
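A minimal marker-based watershed sketch with OpenCV, following the common distance-transform recipe for seeding the basins ("coins.jpg" is a placeholder path):

```python
import cv2
import numpy as np

img = cv2.imread("coins.jpg")  # placeholder path
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Rough foreground/background split via Otsu thresholding.
_, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Peaks of the distance transform act as the seed "minimum points".
dist = cv2.distanceTransform(binary, cv2.DIST_L2, 5)
_, sure_fg = cv2.threshold(dist, 0.5 * dist.max(), 255, cv2.THRESH_BINARY)
sure_fg = sure_fg.astype(np.uint8)

# Sure background, and the unknown strip in between for watershed to flood.
sure_bg = cv2.dilate(binary, np.ones((3, 3), np.uint8), iterations=3)
unknown = cv2.subtract(sure_bg, sure_fg)

# Label the seeds (1 = background, 2+ = objects, 0 = region to flood).
_, markers = cv2.connectedComponents(sure_fg)
markers = markers + 1
markers[unknown == 255] = 0

# Flood: pixels labeled -1 afterwards are ridge lines between catchment basins.
markers = cv2.watershed(img, markers)
img[markers == -1] = (0, 0, 255)
```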
GrabCut Interactive Segmentation:

• User in the Loop: GrabCut leverages user interaction to guide the segmentation process. The user roughly outlines the foreground (object of interest) and the background using a scribble tool.
• Graph Cut Optimization: The algorithm builds a graph where pixels are nodes and neighboring pixels are connected with edges. Weights are assigned to edges based on color similarity. The goal is to minimize a cost function by assigning foreground or background labels to pixels, considering user scribbles and color similarities.
• Refined Segmentation: GrabCut iteratively refines the segmentation based on user-provided scribbles and the cost function optimization, achieving a more accurate segmentation of the desired object.
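A minimal GrabCut sketch; for simplicity it seeds the algorithm with a rough bounding rectangle rather than scribbles (the rectangle coordinates and path are made up for illustration):

```python
import cv2
import numpy as np

img = cv2.imread("photo.jpg")  # placeholder path

# Rough rectangle around the object of interest: (x, y, width, height).
rect = (50, 50, 200, 300)  # illustrative coordinates

mask = np.zeros(img.shape[:2], np.uint8)
bgd_model = np.zeros((1, 65), np.float64)  # internal GMM state for background
fgd_model = np.zeros((1, 65), np.float64)  # internal GMM state for foreground

# Five iterations of graph-cut optimization refine the foreground estimate.
cv2.grabCut(img, mask, rect, bgd_model, fgd_model, 5, cv2.GC_INIT_WITH_RECT)

# Mask values: 0/2 = (probable) background, 1/3 = (probable) foreground.
foreground = np.where((mask == 1) | (mask == 3), 255, 0).astype(np.uint8)
result = cv2.bitwise_and(img, img, mask=foreground)
```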
Choosing the Right Method:

The best segmentation method depends on the specific image characteristics and the desired outcome. Here are some general guidelines:

For simple images with high contrast: Thresholding can be effective.

For images with distinct regions of similar color or intensity: Region-based segmentation is a good choice.

For images with complex object shapes or uneven illumination: Consider advanced methods like Watershed or GrabCut.
Student Image Processing Project Ideas with Computer Vision Techniques:

Here are some project ideas for students to explore various computer vision techniques, categorized by difficulty level:
Beginner Level:
• Smart Photo Album:
• Implement a program that automatically organizes a photo collection based on image content.
• Use color analysis to categorize photos (e.g., beach photos, sunsets).
• Apply face detection to identify and group photos based on people in them.
• Explore basic image segmentation to separate foreground objects from the background for categorization.
Real-time Color Filter:
• Develop a program that applies different color filters (e.g., grayscale, sepia) to a live webcam feed.
• Experiment with color space conversions (e.g., RGB to HSV) to achieve various effects.
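A minimal webcam filter sketch; the sepia matrix below is the commonly used kernel rearranged here for OpenCV's BGR channel order (an assumption worth verifying visually):

```python
import cv2
import numpy as np

# Sepia as a 3x3 color matrix applied to every pixel's (B, G, R) vector.
SEPIA_BGR = np.array([[0.131, 0.534, 0.272],
                      [0.168, 0.686, 0.349],
                      [0.189, 0.769, 0.393]])

cap = cv2.VideoCapture(0)  # default webcam
while True:
    ok, frame = cap.read()
    if not ok:
        break
    sepia = cv2.transform(frame, SEPIA_BGR)        # matrix multiply per pixel
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY) # grayscale filter
    cv2.imshow("sepia", sepia)
    cv2.imshow("grayscale", gray)
    if cv2.waitKey(1) & 0xFF == ord("q"):          # press q to quit
        break
cap.release()
cv2.destroyAllWindows()
```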
Image Mosaic Creation:
• Implement an algorithm that creates a mosaic image using smaller input images.
• Explore techniques like image tiling or content-aware image stitching.
Intermediate Level:
• Traffic Sign Recognition:
• Train a simple image classifier to recognize common traffic signs (stop, yield, speed limit).
• Utilize pre-trained models like MobileNet and fine-tune them for traffic sign recognition tasks.
• Explore techniques for data augmentation (artificially creating variations of traffic signs) to improve model robustness.
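A hedged fine-tuning sketch with torchvision's MobileNetV2; the 43-class count matches the GTSRB traffic sign benchmark and is an assumption about the dataset you choose:

```python
import torch.nn as nn
import torchvision

NUM_SIGN_CLASSES = 43  # e.g., the GTSRB benchmark has 43 sign classes

# Load MobileNetV2 pre-trained on ImageNet and swap its classifier head
# for one with as many outputs as traffic sign classes.
model = torchvision.models.mobilenet_v2(weights="DEFAULT")
model.classifier[1] = nn.Linear(model.last_channel, NUM_SIGN_CLASSES)

# Freeze the pre-trained feature extractor; train only the new head at first.
for param in model.features.parameters():
    param.requires_grad = False
```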
Gesture Recognition System:
• Develop a system that recognizes hand gestures from webcam footage.
• Apply techniques like background subtraction to isolate the hand region.
• Explore feature extraction methods to represent hand shapes and track finger movements.
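A minimal background-subtraction sketch using OpenCV's MOG2 model to isolate moving regions such as a hand:

```python
import cv2

# MOG2 models each pixel's background distribution and flags moving
# regions (e.g., a hand) as foreground.
subtractor = cv2.createBackgroundSubtractorMOG2(history=200, detectShadows=False)

cap = cv2.VideoCapture(0)  # webcam
while True:
    ok, frame = cap.read()
    if not ok:
        break
    fg_mask = subtractor.apply(frame)              # white = moving foreground
    hand = cv2.bitwise_and(frame, frame, mask=fg_mask)
    cv2.imshow("isolated foreground", hand)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```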
Virtual Try-On App:
• Implement a program that allows users to virtually try on clothes or accessories using their webcam.
• Utilize facial landmark detection to accurately place the virtual item on the user's face.
• Explore image segmentation to separate the user's face from the background for seamless integration.
Advanced Level:
• Self-Driving Car Lane Detection:
• Train a deep learning model to detect lane markings in road images captured from a simulated car environment.
• Utilize techniques like vanishing point estimation to understand the road perspective.
• Explore semantic segmentation to differentiate between lanes, road markings, and other objects on the road.
Facial Expression Recognition:
• Train a model to recognize facial expressions (happiness, sadness, anger) from video footage.
• Utilize facial landmark detection to identify key points on the face.
• Explore feature extraction methods to capture facial muscle movements and expressions.
Medical Image Segmentation for Disease Detection:
• Develop a system to segment specific regions of interest (e.g., tumors) in medical images (X-rays, MRIs).
• Utilize deep learning models like U-Net, specifically designed for medical image segmentation tasks.
• Explore data pre-processing techniques for handling medical image formats and variations.
