List the Basic Tasks of a Computer Vision System
Detection
Classification
Segmentation
Localization
Image processing: the goal of image processing is to produce an image that is better suited to
our purpose. Image processing is often used to prepare an image for further analysis or to help
human users detect crucial details more easily.
Traditional vision:
Semantic segmentation: pixel-wise classification, which results in a mask where each pixel is
assigned to a class
Photodiodes: semiconductor devices that sense light and operate
on the principle of the photoelectric effect
Analog device
Store charge when photons arrive
Reading process is done row by row
Image properties: each pixel value describes the light intensity at that specific position. Pixels are
usually represented with 8 bits, where 0 means black and 255 means white.
Intensity transform: a method of image enhancement in which the same transformation is applied to every
pixel. The new value of a pixel depends only on its old value, not on the neighboring pixels.
Intensity transformations can also be used on color images, in which case the transformation is
applied to each channel independently.
Variants:
Threshold
It is the simplest method of segmenting images. From a grayscale image, thresholding can
be used to create binary images.
If the pixel intensity is below the threshold, the output is 0; if it is above, the output is 1. The result is a
binary image.
Hysteresis thresholding: in this method two thresholds are set, which results in two binary images.
Combining the two (pixels that pass only the lower threshold are kept only where they connect to pixels
that pass the higher one) results in an image with proper edges.
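As a minimal sketch of simple thresholding, assuming the grayscale image is stored as a list of rows of 8-bit values (the function name `threshold` and the sample values are illustrative only):

```python
def threshold(image, t):
    """Return a binary image: 1 where the pixel is above t, otherwise 0."""
    return [[1 if pixel > t else 0 for pixel in row] for row in image]


# Hypothetical 2x3 grayscale image with 8-bit intensities.
gray = [[10, 200, 90],
        [130, 40, 255]]
binary = threshold(gray, 127)
```

The choice of `t` decides which pixels count as foreground; 127 here is just the middle of the 8-bit range.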
Histogram
Underexposed: the histogram has a positive skew (the values are concentrated in the lower part), because
the image was recorded with too short an exposure time
Overexposed: the values are concentrated in the upper part of the histogram, because even dark
pixels are too bright
These problems can be solved with the histogram equalization algorithm. The goal of this algorithm is
to provide a better approximation of the uniform distribution; nevertheless, some information is
lost due to the merging of intensity levels.
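Histogram equalization can be sketched as follows. This is one common formulation (mapping each intensity through the normalized cumulative histogram), written for 8-bit images stored as lists of rows; the names `equalize`, `lut`, and `dark` are assumptions made for the example.

```python
def equalize(image, levels=256):
    """Spread the intensities of an image using its cumulative histogram."""
    pixels = [p for row in image for p in row]
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative histogram (number of pixels with intensity <= level).
    cdf, total = [], 0
    for count in hist:
        total += count
        cdf.append(total)

    cdf_min = min(c for c in cdf if c > 0)
    n = len(pixels)
    if n == cdf_min:  # constant image: nothing to equalize
        return [row[:] for row in image]

    # Lookup table from old intensity to new intensity.
    lut = [max(0, round((c - cdf_min) / (n - cdf_min) * (levels - 1))) for c in cdf]
    return [[lut[p] for p in row] for row in image]


# Hypothetical underexposed image: all values sit in the dark end.
dark = [[10, 10, 12],
        [12, 14, 14]]
equalized = equalize(dark)
```

After the mapping the darkest input level becomes 0 and the brightest becomes 255, so the values cover the whole range.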
Image noise, noise types: capturing images with a device may cause noise and speckles. The most
common is Gaussian noise, which is a consequence of the noisy nature of the image sensor and the
surrounding electronics
Noises: Gaussian noise
Salt-and-pepper noise: this noise is not too common, but it changes the value of the affected pixels
significantly
Quantization error (periodic noise): a consequence of analog-to-digital
conversion
Convolution filtering:
Filters
Smoothing
Sharpening
Edge detection filters
Each element of a smoothing kernel is non-negative and the elements sum to 1; if the sum differs, a
brightening/darkening step also happens. If the kernel is separable, 1D filtering in each
direction can be done instead of 2D filtering
Problems:
o Due to averaging, some image details can be blurred, resulting in unsharp images
o Due to averaging, outlier values are only pulled toward the average, which means that
salt-and-pepper noise is not eliminated
Rank (median) filter: this filter orders the pixels in the window by intensity value, so the minimum,
maximum, or median value can be selected
The filter leaves the edges in an image largely untouched, while salt-and-pepper noise is filtered
very effectively.
This filter is very slow
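The difference between averaging and median filtering on salt-and-pepper noise can be illustrated with a small sketch. It assumes a 3x3 window, leaves the border pixels unchanged for simplicity, and uses illustrative names (`median_filter3`, `mean_filter3`, `noisy`).

```python
from statistics import median


def _window(image, y, x):
    """The 3x3 neighborhood around (y, x) as a flat list."""
    return [image[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def median_filter3(image):
    out = [row[:] for row in image]
    for y in range(1, len(image) - 1):
        for x in range(1, len(image[0]) - 1):
            out[y][x] = median(_window(image, y, x))
    return out


def mean_filter3(image):
    out = [row[:] for row in image]
    for y in range(1, len(image) - 1):
        for x in range(1, len(image[0]) - 1):
            out[y][x] = sum(_window(image, y, x)) // 9
    return out


# Flat gray image with a single "salt" pixel in the middle.
noisy = [[100] * 5 for _ in range(5)]
noisy[2][2] = 255
```

Here `median_filter3(noisy)` restores the center to 100, while `mean_filter3(noisy)` only smears the outlier over its neighborhood.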
Derivative filters: the easiest method to detect edges is to calculate derivatives in each direction to find
abrupt changes in pixel intensity
Derivatives amplify noise, so several false detections can be included. To reduce this
problem, we can apply Gaussian smoothing first
Prewitt: its kernels combine smoothing with a derivative, and together they can detect edges in every direction
Sobel: the operators smooth in the direction perpendicular to the derivative, so they are less sensitive to noise
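A minimal sketch of Sobel edge detection on a list-of-rows image. The two kernels are the standard Sobel kernels; the helper names and the test image are assumptions for illustration, and the kernels are applied as a correlation (without flipping), which only changes the sign of the derivative, not its magnitude.

```python
import math

SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]   # derivative along x, smoothing along y
SOBEL_Y = [[-1, -2, -1],
           [0, 0, 0],
           [1, 2, 1]]    # derivative along y, smoothing along x


def apply_kernel(image, kernel, y, x):
    return sum(kernel[dy + 1][dx + 1] * image[y + dy][x + dx]
               for dy in (-1, 0, 1) for dx in (-1, 0, 1))


def sobel_magnitude(image):
    """Gradient magnitude for interior pixels; border pixels stay 0."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = apply_kernel(image, SOBEL_X, y, x)
            gy = apply_kernel(image, SOBEL_Y, y, x)
            out[y][x] = math.hypot(gx, gy)
    return out


# Vertical step edge between column 2 and column 3.
step = [[0, 0, 0, 100, 100] for _ in range(4)]
magnitude = sobel_magnitude(step)
```

The response is zero in the flat region and large next to the step.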
Canny algorithm
With a single threshold for edge detection, the user faces a trade-off
If the threshold is too high, the lower-contrast parts of edges are not detected
If the threshold is too low, false detections appear in the lower-contrast parts of the image
So the aim of this algorithm is to find a compromise.
1. Gradient calculation: with simple derivative filters we can calculate the direction of the image gradient and its
magnitude
2. Non-maximum suppression: following the direction of the gradient, pixels with the largest gradient value are
kept, while pixels with smaller values are set to 0
3. Edge thinning: as a result of this step, the originally blurred edges become sharp
4. Thresholding: two thresholds are set; pixels above the high threshold are marked as real edges, and pixels
above the low threshold are kept only if they connect to real edges
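The final two-threshold step can be sketched as a region-growing procedure over a gradient-magnitude image. This is a simplified illustration, not a full Canny implementation; the function name `hysteresis`, the 8-connectivity, and the sample values are assumptions.

```python
def hysteresis(magnitude, low, high):
    """Keep strong pixels (>= high) and weak pixels (>= low) connected to them."""
    h, w = len(magnitude), len(magnitude[0])
    edges = [[0] * w for _ in range(h)]

    # Seed with the strong pixels.
    stack = [(y, x) for y in range(h) for x in range(w) if magnitude[y][x] >= high]
    for y, x in stack:
        edges[y][x] = 1

    # Grow into 8-connected neighbors that pass the low threshold.
    while stack:
        y, x = stack.pop()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (0 <= ny < h and 0 <= nx < w
                        and not edges[ny][nx] and magnitude[ny][nx] >= low):
                    edges[ny][nx] = 1
                    stack.append((ny, nx))
    return edges


# One strong pixel, two connected weak pixels, one isolated weak pixel.
grad = [[100, 60, 55, 10, 60]]
result = hysteresis(grad, low=50, high=90)
```

The isolated weak pixel at the end is rejected even though it passes the low threshold, because it does not connect to a strong pixel.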
Interpolation methods: during image processing it often occurs that an image is not available in the
format that is needed, and people also like to have high-resolution images. To obtain these, it is
necessary to determine the values of the new pixels through interpolation methods.
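As one example of such a method, bilinear interpolation computes a new pixel value from the four surrounding pixels. The sketch below assumes a list-of-rows image and fractional coordinates inside the image; the function name `bilinear` is illustrative.

```python
def bilinear(image, y, x):
    """Interpolate the intensity at fractional position (y, x)."""
    h, w = len(image), len(image[0])
    y0, x0 = int(y), int(x)
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)  # clamp at the border
    fy, fx = y - y0, x - x0

    # Interpolate along x on the two rows, then along y between them.
    top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
    bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


small = [[0, 100],
         [100, 200]]
center = bilinear(small, 0.5, 0.5)
```

Nearest-neighbor interpolation would instead copy the closest pixel, which is faster but produces blocky results.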
Fourier transformation:
The Fourier transformation can be applied to inputs specified in the image plane
The Fourier transformation of an image is two-dimensional
Each component of the transformation is described by its frequency, amplitude (with phase), and direction; it is
well suited to periodic signals
The Fourier transform of a real-valued image has point symmetry about the origin
FFT can be used if both sides of the image have a power-of-2 number of pixels
In the Fourier transform of a real-valued image, every frequency component has a complex conjugate pair
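The conjugate-pair property can be checked numerically. For brevity the sketch uses a 1D signal and a direct (slow) DFT rather than an FFT; the same symmetry holds for the 2D transform of a real-valued image. The names `dft`, `signal`, and `spectrum` are illustrative.

```python
import cmath


def dft(values):
    """Direct discrete Fourier transform of a 1D sequence."""
    n = len(values)
    return [sum(values[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


# Real-valued signal whose length is a power of 2.
signal = [1.0, 4.0, 2.0, 7.0, 3.0, 0.0, 5.0, 6.0]
spectrum = dft(signal)

# Component k and component n - k are complex conjugates of each other.
n = len(signal)
symmetric = all(abs(spectrum[k] - spectrum[n - k].conjugate()) < 1e-9
                for k in range(1, n))
```

Because of this symmetry, half of the spectrum of a real-valued input is redundant.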
Ideal filters: there are two types of ideal filters
1. Low-pass filter: passes the low frequencies; used to reduce image noise
2. High-pass filter: passes the high frequencies; used to find edges (sharpening)
Butterworth filter: ideal high-pass and low-pass filters have drawbacks, which can be solved by using the
Butterworth filter; this filter has a more sophisticated, smooth spectrum
Cosine transform (DCT): the discrete cosine transform is similar to the DFT; it also interprets the image as a
periodic function, but it stores only real numbers
Fast cosine transformation (FCT): it accelerates the calculation of the discrete cosine transform (DCT)
JPEG:
One of the most popular image compression standards; it uses the DCT and achieves a compression ratio of about 1:10
In JPEG compression the original image, which takes a lot of memory, is compressed into
an image that takes little memory; nevertheless, the image loses some information
The information that is lost in the compression is information that our vision does not perceive.
Deconvolution: this operation is best interpreted in the frequency domain, where the spectrum of the
image is not multiplied but divided by the spectrum of the filter.
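The division of spectra can be demonstrated on a small 1D example. The sketch assumes a circular (periodic) blur with a known kernel whose spectrum has no zeros and no added noise; in practice the division is unstable wherever the filter spectrum is close to zero. All names and values are illustrative.

```python
import cmath


def dft(values, inverse=False):
    """Direct DFT; with inverse=True the inverse transform."""
    n = len(values)
    sign = 1 if inverse else -1
    out = [sum(values[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out


def circular_convolve(signal, kernel):
    n = len(signal)
    return [sum(signal[j] * kernel[(i - j) % n] for j in range(n)) for i in range(n)]


signal = [0.0, 1.0, 4.0, 2.0]
kernel = [0.6, 0.2, 0.0, 0.2]          # hypothetical blur kernel
blurred = circular_convolve(signal, kernel)

# Deconvolution: divide the spectrum of the blurred signal by that of the filter.
G, H = dft(blurred), dft(kernel)
restored = [v.real for v in dft([g / h for g, h in zip(G, H)], inverse=True)]
```

With these assumptions `restored` matches the original `signal` up to floating-point error.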
Template matching: the template matching algorithm can be applied to detect non-changing objects. We have a
reference image (the template), which is matched to the image in all positions.
To carry out template matching, the convolution (correlation) between the template and the image is calculated
To achieve the best results in template matching, we need to minimize a cost function
Actually, template matching looks for edge detection
Scaling, rotation, and distortion make template matching difficult. If one of them occurs, then
there should be new templates that match these cases. The algorithm must be trained for these scenarios.
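A minimal sketch of template matching with one possible cost function, the sum of squared differences (SSD). The template is compared with the image at every position and the position with the lowest cost wins; the function name and the sample data are assumptions.

```python
def match_template(image, template):
    """Return (cost, y, x) of the best match using the sum of squared differences."""
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    best = None
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            cost = sum((image[y + dy][x + dx] - template[dy][dx]) ** 2
                       for dy in range(th) for dx in range(tw))
            if best is None or cost < best[0]:
                best = (cost, y, x)
    return best


scene = [[0, 0, 0, 0],
         [0, 0, 9, 8],
         [0, 0, 7, 6],
         [0, 0, 0, 0]]
patch = [[9, 8],
         [7, 6]]
cost, row, col = match_template(scene, patch)
```

Because the template is compared pixel by pixel, a scaled or rotated copy of the object would not produce a low cost with this template.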
Harris detector
It also uses the local structure matrix
It can also be used to detect edge-like image points, where the result of the Harris detector is a large negative number
Compared to the Harris detector, KLT gives a result more similar to human perception
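The sign behavior of the Harris response can be sketched from the entries of the local structure matrix. The formula below is the usual Harris measure (determinant minus k times the squared trace); the value k = 0.04 is a commonly used empirical choice and, like the sample numbers, an assumption of this example.

```python
def harris_response(sxx, syy, sxy, k=0.04):
    """Harris measure from the structure matrix [[sxx, sxy], [sxy, syy]].

    sxx, syy, sxy are the windowed sums of Ix*Ix, Iy*Iy, and Ix*Iy.
    """
    det = sxx * syy - sxy * sxy
    trace = sxx + syy
    return det - k * trace * trace


corner = harris_response(100.0, 100.0, 0.0)  # strong gradients in both directions
edge = harris_response(100.0, 0.0, 0.0)      # strong gradient in one direction only
flat = harris_response(0.0, 0.0, 0.0)        # no gradient
```

The response is positive for the corner, negative for the edge, and zero for the flat region.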
Invariances:
The KLT and Harris detectors are invariant to only two transformations
1. Additive intensity change
2. Rotation
SIFT detection:
The main principle of SIFT is that detection is done with the help of different scale factors
Corner detection is done with DoG (difference of Gaussians) kernels.
The SIFT algorithm generates the descriptor code for the image corners; this code consists of 128 numbers
The most revolutionary idea of SIFT is the incorporation of rotational invariance
ORB detector:
Uses the FAST algorithm to detect corners
The algorithm analyzes the neighborhood of each pixel along a circle
The detected pixels are filtered with the KLT criterion
ORB descriptor:
Uses the BRIEF description method
There are 256 predefined point pairs in the neighborhood of the keypoint
Scale invariance
Additive intensity transformation
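A BRIEF-style binary descriptor can be sketched as one intensity comparison per point pair. This is a simplified illustration: the pairs here are drawn at random with a fixed seed rather than taken from ORB's predefined pattern, the patch is not smoothed or rotated, and all names are assumptions.

```python
import random


def brief_descriptor(patch, pairs):
    """One bit per point pair: 1 if the first point is darker than the second."""
    return [1 if patch[y1][x1] < patch[y2][x2] else 0
            for (y1, x1), (y2, x2) in pairs]


def hamming(a, b):
    """Number of differing bits; the usual distance between binary descriptors."""
    return sum(x != y for x, y in zip(a, b))


rng = random.Random(0)
SIZE = 9  # hypothetical 9x9 neighborhood around the keypoint
pairs = [((rng.randrange(SIZE), rng.randrange(SIZE)),
          (rng.randrange(SIZE), rng.randrange(SIZE))) for _ in range(256)]

patch = [[rng.randrange(200) for _ in range(SIZE)] for _ in range(SIZE)]
brighter = [[p + 20 for p in row] for row in patch]  # additive intensity change

distance = hamming(brief_descriptor(patch, pairs), brief_descriptor(brighter, pairs))
```

Since only comparisons between pixels are stored, adding a constant to every pixel leaves the descriptor unchanged, so `distance` is 0.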
Dimension Reduction
Pixels that carry irrelevant information are removed, and relevant pixels are
merged together
Principal component analysis
Assumes a normal distribution with zero mean
The mean (expected value) of the variables is subtracted
Optimal dimension reduction algorithm for normal distribution
Kalman filter
It is an algorithm that uses a series of measurements observed over time to estimate the state of a system
We have two estimates: one is from the measurements, the other is from the prediction step
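How the two estimates are combined can be sketched with a one-dimensional Kalman filter. The sketch assumes a constant-state model (the prediction is simply the previous estimate) with process noise variance `q` and measurement noise variance `r`; these symbols and the function name are assumptions of the example.

```python
def kalman_step(x, p, z, q, r):
    """One predict/update cycle for a scalar state.

    x, p: previous estimate and its variance
    z:    new measurement
    q, r: process and measurement noise variances
    """
    # Prediction step (constant-state model).
    x_pred = x
    p_pred = p + q

    # Update step: weight prediction and measurement by their uncertainty.
    gain = p_pred / (p_pred + r)
    x_new = x_pred + gain * (z - x_pred)
    p_new = (1 - gain) * p_pred
    return x_new, p_new


estimate, variance = kalman_step(x=0.0, p=1.0, z=10.0, q=0.0, r=1.0)
```

With equal uncertainty in prediction and measurement the gain is 0.5, so the new estimate lies halfway between the two.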
SIFT
It is a feature detection algorithm in computer vision that detects and describes local features in
images
SIFT is divided into two parts: the detector and the descriptor
The fundamental principle of the SIFT detector is that detection is made with the help of
different scaling factors, and each feature is also assigned a scale variable that will
be used for the descriptor code.
The fundamental principle of the SIFT descriptor is to generate the descriptor code for the
image corners; this code consists of 128 numbers. The neighborhood is always selected from the
same image scale as the corner
Binary image:
It consists of pixels that can have one of exactly two colors, usually black and white. This means
that each pixel is stored as a single bit, 0 or 1. Binary images often arise in digital image
processing as masks or as the result of thresholding.
Learning algorithm:
Machine learning is the study of computer algorithms that improve automatically through experience
The algorithms build a mathematical model based on sample data in order to make predictions
or decisions
Structure of learning algorithm
The notion of machine learning can be described with a formula: y = f(x, θ), where x is the input, y is the
output, and θ denotes the parameters
Each learning algorithm has its loss function/cost function
Also, the algorithm has optimization methods to drive and accelerate the learning
process
Learning Types:
Regression: a learning process where the output (Y) is a continuous number
Supervised learning: the correct output is known for the training inputs (if the correct
output is only partially known, the learning is called semi-supervised)
Unsupervised learning: only the inputs are available
Reinforcement learning: the algorithm must be able to make a sequence of
decisions, but there is generally no feedback after each decision
Difficulties of learning:
Underfitting: if we test the model on the validation set, the result will be about the same as on the training
set, with a high error on both
Overfitting: after a while, the validation error will begin to increase
The algorithm is only able to decrease the training error further if it begins to memorize the
input-output pairs
A more complex algorithm can memorize more of the dataset
Image classification
Consists of several consecutive steps:
capturing the image (digitization), preprocessing and enhancement, feature extraction, decision
making
Linear regression:
Linear regression approximates the relationships between the inputs and outputs of the
training dataset with a linear equation.
The goal of linear regression is to minimize the sum of the squared errors
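For a single input variable, the line that minimizes the squared error has a closed-form solution. The sketch below assumes the model y = slope * x + intercept; the function name `fit_line` and the sample data are illustrative.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Points that lie exactly on y = 2x + 1.
slope, intercept = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
```

With noisy data the fitted line no longer passes through every point, but it still minimizes the sum of squared vertical distances.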
SVM (support vector machine)
Extension of linear approximation to classification
We try to find the hyperplane that separates the two classes with the highest
certainty
The points nearest to the hyperplane should be as far from it as possible.
Dropout
Sets some of the activations to 0, and the other activations are calculated taking
these 0s into account
Batch Normalization
The main principle is that the mean and the standard deviation of each activation are calculated in
each iteration, and the activations are normalized according to the calculated statistics
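The normalization step can be sketched for one activation across a batch. This is a simplified illustration: it omits the learned scale and shift parameters that batch normalization layers normally apply afterwards, and the small constant `eps` (added for numerical stability) and the names are assumptions.

```python
import math


def batch_norm(values, eps=1e-5):
    """Normalize one activation over a batch to zero mean and unit variance."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return [(v - mean) / math.sqrt(variance + eps) for v in values]


# Values of a single activation for a batch of four samples.
normalized = batch_norm([1.0, 2.0, 3.0, 4.0])
```

The statistics are recomputed for every batch, so the same input can be normalized slightly differently in different iterations.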
Erosion and dilation:
The erosion step eliminates small, noise-like objects, while the dilation step regrows large objects to
their original size
The main drawback of erosion and dilation is that they modify the size of the objects
For this reason we do not use them individually, but combine them
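Combining the two operations (erosion followed by dilation, usually called opening) can be sketched on a binary image. The sketch assumes a 3x3 square structuring element and treats pixels outside the image as 0; the function names and the test image are illustrative.

```python
def _neighborhood(image, y, x):
    """3x3 neighborhood of (y, x); positions outside the image count as 0."""
    h, w = len(image), len(image[0])
    return [image[y + dy][x + dx] if 0 <= y + dy < h and 0 <= x + dx < w else 0
            for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def erode(image):
    """A pixel stays 1 only if its whole neighborhood is 1."""
    return [[1 if all(_neighborhood(image, y, x)) else 0
             for x in range(len(image[0]))] for y in range(len(image))]


def dilate(image):
    """A pixel becomes 1 if any pixel in its neighborhood is 1."""
    return [[1 if any(_neighborhood(image, y, x)) else 0
             for x in range(len(image[0]))] for y in range(len(image))]


def opening(image):
    return dilate(erode(image))


# A 3x3 object plus one noise pixel in the corner.
clean = [[1 if 1 <= y <= 3 and 1 <= x <= 3 else 0 for x in range(6)] for y in range(6)]
noisy = [row[:] for row in clean]
noisy[5][5] = 1
opened = opening(noisy)
```

The erosion removes the noise pixel and shrinks the object to a single pixel; the dilation then regrows the object to its original size.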
Skeletonizing
The skeleton is a representation in which each pixel with value 1 has exactly one or two
neighbors with value 1
The skeleton is calculated with an iterative erosion algorithm
Object labeling and counting
The purpose of labeling and counting objects is to assign a distinct label to each separate
object
The steps of the algorithm:
1. The first unlabeled pixel with value 1 has to be found and labeled with label L.
2. Label L is assigned to each neighbor of the evaluated pixel that has a value of 1, and
the second step is called for each of them.
3. Jump to step 1 and increment L
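The labeling steps can be sketched as a flood fill. The sketch assumes 4-connectivity and replaces the recursive call of the second step with an explicit stack so that large objects do not exceed the recursion limit; the names are illustrative.

```python
def label_objects(binary):
    """Return (labels, count): a label image and the number of objects."""
    h, w = len(binary), len(binary[0])
    labels = [[0] * w for _ in range(h)]
    current = 0
    for y in range(h):
        for x in range(w):
            # Step 1: find the next unlabeled pixel with value 1.
            if binary[y][x] == 1 and labels[y][x] == 0:
                current += 1
                labels[y][x] = current
                stack = [(y, x)]
                # Step 2: spread the label to all connected pixels.
                while stack:
                    cy, cx = stack.pop()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] == 1 and labels[ny][nx] == 0):
                            labels[ny][nx] = current
                            stack.append((ny, nx))
            # Step 3: continue the scan; the label is incremented at the next object.
    return labels, current


mask = [[1, 1, 0, 0],
        [0, 0, 0, 1],
        [1, 0, 1, 1]]
labels, count = label_objects(mask)
```

The final value of the label counter is the number of objects in the image.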
Segmentation
Segmentation is the process of dividing a digital image into multiple segments
The goal of segmentation is to simplify or change the representation of an image into
something that is more meaningful and easier to analyze
If segmentation does not happen on an object, but on an object class/category basis, then the
procedure is called semantic segmentation
Semantic segmentation
Segmentation is essential for image analysis tasks. Semantic segmentation describes the
process of connecting each pixel of an image with a class label