
Computer vision aims to extract, analyze and produce relevant information from an image or a set of images by means of algorithms.

List the basic tasks of a computer vision system:

• Detection
• Classification
• Segmentation
• Localization

Image processing: the goal of image processing is to produce an image that is more advantageous for our purpose. Image processing is often used to prepare images for further analysis or to help human users detect crucial details more easily.

Traditional vision:

• record input image
• image correction
• extract features
• make decision

Semantic segmentation: pixel-wise classification, which results in a mask where each pixel is assigned to a class.

Situation interpretation: well-founded decisions made on the basis of visual information.

Problems during classification:

• illumination (pixel values of the same object can differ greatly between images)
• deformation
• intra-class variation

Imaging: the process of creating a visual representation of an object.

Photodiodes: semiconductor devices that sense light, operating on the principle of the photoelectric effect:

• the P-N (PIN) junction is hit by photons
• an electron-hole pair is generated
• this happens in the depletion region of the diode
• the electric field moves the charge carriers away

CCD (charge-coupled device):

• Analog device
• Stores charge when photons arrive
• Reading process is done row by row

CMOS (complementary metal-oxide-semiconductor):

• Each cell has its own amplifier
• Processing is faster
• Micro lenses mounted onto each cell direct the photons
• Lower energy consumption
• Cheaper
• Smaller delay
• Smaller dimensions

Image properties: each pixel describes the light intensity at its specific position. Pixels are usually represented with 8 bits: 0 means black and 255 means white.

Intensity transform: a method of image enhancement in which the given transformation is applied to every pixel. The new value of a pixel depends only on its old value; it does not depend on the neighboring pixels.

Intensity transformations can also be used on color images, in which case the transformation is applied to each channel independently.
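
A minimal sketch of such a point-wise transform, assuming an 8-bit grayscale image stored as a list of rows. The function name and the parameters `gain` and `bias` are hypothetical names chosen for this example.

```python
def intensity_transform(image, gain=1.0, bias=0.0):
    """Apply g = gain * f + bias to every pixel independently.

    `image` is a list of rows of integers in 0..255 (an assumption of
    this sketch); results are rounded and clipped back into that range.
    """
    return [
        [min(255, max(0, round(gain * pixel + bias))) for pixel in row]
        for row in image
    ]


image = [[0, 50, 100], [150, 200, 250]]
brighter = intensity_transform(image, bias=40)                  # additive constant
more_contrast = intensity_transform(image, gain=1.5, bias=-60)  # multiply, then re-center
```

For a color image the same function could be called once per channel.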

Variants:

• If a constant is added, the lightness of the image changes
• If the image is multiplied by a constant, its contrast increases or decreases
• A constant can be added after the multiplication step to center the new intensity values

Threshold

• It is the simplest method of segmenting images. From a grayscale image, thresholding can be used to create a binary image
• If the pixel intensity is below the threshold the result is 0, if it is above the threshold the result is 1
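
A minimal sketch of thresholding on a list-of-rows grayscale image. How a pixel exactly equal to the threshold is treated is an assumption of this example (here it becomes 0).

```python
def threshold(image, t):
    """Return a binary image: 1 where the pixel is above t, otherwise 0."""
    return [[1 if pixel > t else 0 for pixel in row] for row in image]


gray = [[10, 200], [128, 127]]
binary = threshold(gray, 127)
```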

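Hysteresis thresholding: in this method we set two thresholds, which results in two binary images. Pixels above the high threshold are kept; pixels that are only above the low threshold are kept if they connect to them. Combining the two images in this way results in an image with proper edges.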

Histogram

• it describes the relative frequency of intensity values in an image channel
• it helps us to detect and correct the effects of image acquisition

Underexposed: the histogram has a positive skew (values are concentrated in the lower part), because the image was recorded with too short an exposure time.

Overexposed: the values are concentrated in the upper part of the histogram, because even dark pixels are too bright.

These problems can be solved with the histogram equalization algorithm. The goal of this algorithm is to give a better approximation of the uniform distribution; nevertheless, some information is lost because neighboring intensity levels are merged.
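
A minimal sketch of histogram equalization, assuming an 8-bit grayscale image stored as a list of rows. It uses the common mapping based on the cumulative histogram; the function name is chosen for this example only.

```python
def equalize(image, levels=256):
    """Spread the intensity values so the histogram is closer to uniform."""
    pixels = [p for row in image for p in row]
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative histogram: number of pixels with intensity <= level.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    n = len(pixels)
    cdf_min = next(c for c in cdf if c > 0)
    if n == cdf_min:          # constant image, nothing to equalize
        return [row[:] for row in image]

    # Look-up table mapping each old intensity to its new value.
    lut = [max(0, round((c - cdf_min) / (n - cdf_min) * (levels - 1))) for c in cdf]
    return [[lut[p] for p in row] for row in image]


dark = [[50, 50], [51, 52]]   # underexposed: values crowded together
stretched = equalize(dark)
```

The darkest value is mapped to 0 and the brightest to 255; intensity levels that share a look-up entry are merged, which is where information is lost.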

Image noise, noise types: capturing images with a device can cause noise and speckles. The most common is Gaussian noise, which is a consequence of the noisy nature of the image sensor and the surrounding electronics.

Noise types:

• Gaussian noise
• Salt and pepper noise
• Quantization error (periodic noise)

Salt and pepper noise: this noise is not too common, but it changes the value of the affected pixels in a significant manner.

Quantization error (periodic noise): a consequence of analog-to-digital conversion.

Convolution filtering:

• It is the most essential method to correct image errors and noise
• A small filtering window (kernel) is slid over the image, and each pixel is set to a weighted sum of its own value and the values of its neighbors
• The filter weights are chosen to sum up to 1, otherwise the image would become darker or lighter
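
A minimal sketch of convolution filtering with a square, odd-sized kernel. Leaving the border pixels unchanged is a simplification assumed for this example; real implementations usually pad the image instead.

```python
def convolve2d(image, kernel):
    """Convolve `image` (list of rows) with `kernel`; borders stay unchanged."""
    k = len(kernel) // 2
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(k, h - k):
        for x in range(k, w - k):
            acc = 0.0
            for j in range(-k, k + 1):
                for i in range(-k, k + 1):
                    # kernel is flipped, as required by convolution
                    acc += kernel[k - j][k - i] * image[y + j][x + i]
            out[y][x] = acc
    return out


box = [[1 / 9] * 3 for _ in range(3)]   # weights sum to 1: brightness is preserved
image = [
    [0, 0, 0, 0],
    [0, 9, 9, 0],
    [0, 9, 9, 0],
    [0, 0, 0, 0],
]
smoothed = convolve2d(image, box)
```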

Filters

• Smoothing
• Sharpening
• Edge detection filters

Smoothing (linear filters):

• Each element of the kernel is non-negative and they sum up to 1; if the sum differs, a brightening/darkening step also happens. Many smoothing kernels are separable, which means that instead of one 2D filtering two 1D filterings can be done
• Problems:
o Due to averaging, some image details can be blurred, resulting in unsharp images
o Due to averaging, significantly different (outlier) values are only averaged in, not removed, which means that salt and pepper noise is not eliminated

Rank filtering: this filter solves the problems of linear filtering

• This filter orders the pixels in the window by their intensity value, thus the minimum, maximum or median value can be selected
• The median filter leaves the edges in an image untouched, while salt and pepper noise is filtered very effectively
• This filter is slow compared to linear filters
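
A minimal sketch of a median (rank) filter, assuming a list-of-rows grayscale image and leaving the border pixels unchanged for simplicity.

```python
def median_filter(image, size=3):
    """Replace each interior pixel by the median of its size x size window."""
    k = size // 2
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(k, h - k):
        for x in range(k, w - k):
            window = sorted(
                image[y + j][x + i]
                for j in range(-k, k + 1)
                for i in range(-k, k + 1)
            )
            out[y][x] = window[len(window) // 2]   # middle element = median
    return out


noisy = [
    [10, 10, 10],
    [10, 255, 10],   # a single "salt" pixel
    [10, 10, 10],
]
cleaned = median_filter(noisy)
```

Sorting every window is what makes this filter slower than a linear one.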

Edge detection:

• The easiest method to detect edges is to calculate the derivatives in each direction and look for abrupt changes in pixel intensity
• Derivatives amplify noise, so several false detections can be included. To solve this problem, we can apply Gaussian smoothing

Prewitt operator: combines the derivative with smoothing, similarly to a Gaussian kernel; with one kernel per direction, edges in every direction can be detected.

Sobel operator: smooths in the direction perpendicular to the derivative (that is, along the edge), so it is less sensitive to noise.
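
A minimal sketch of the Sobel operator using the usual 3x3 kernels. The kernels are applied without flipping, which only changes the sign of the derivatives and not the gradient magnitude; border pixels are set to 0 as a simplification.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # derivative in x, smoothing in y
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # derivative in y, smoothing in x


def sobel_magnitude(image):
    """Return the gradient magnitude of a list-of-rows grayscale image."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0
            for j in range(3):
                for i in range(3):
                    p = image[y + j - 1][x + i - 1]
                    gx += SOBEL_X[j][i] * p
                    gy += SOBEL_Y[j][i] * p
            out[y][x] = math.hypot(gx, gy)
    return out


step_edge = [[0, 0, 10, 10]] * 3      # vertical edge between columns 1 and 2
edges = sobel_magnitude(step_edge)
```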

Sharpening:

• The sum of the weights is 1
• This filter behaves similarly to edge detection, but it is able to highlight fine detail changes

Edges: the main drawback of edges is that the image changes significantly in only one direction locally; if the edge moves perpendicular to that direction (along itself), the motion cannot be detected.
To solve this, image corners can be used instead of edges. By definition, corners are pixels from which the image intensity changes in a significant manner in each direction.
One of the most useful image corner detectors is the KLT detector.

Canny algorithm
With this algorithm the user can set a threshold for edge detection.
If the threshold is too high, the lower contrast parts of edges are not detected.
If the threshold is too low, false detections appear in the lower contrast parts of the image.
The aim of the algorithm is to find a compromise.

The Canny algorithm has the following steps:

1. Gradient calculation: with simple derivative filters we calculate the magnitude and the direction of the image gradient
2. Non-maximum suppression: following the direction of the gradient, pixels with the biggest gradient value are kept while pixels with smaller values are set to 0
3. As a result of this step, the originally blurred edges become sharp (thin)
4. Thresholding: two thresholds are set; pixels above the high threshold are marked as real edges, and pixels above the small threshold are marked as edges only if they are connected to a real edge

Interpolation methods: during image processing it often occurs that an image is not available in the format (size) that someone wants; people also like to have images with high resolution. To get this, the values of the new pixels have to be determined through interpolation methods.

Interpolations: there are 3 types

1. Nearest neighbor approximation:
    • the easiest method
    • it approximates the image with a piecewise constant (not continuous) function
    • the value of the new pixel is equal to the value of the nearest neighbor pixel
    • the nearest neighbor idea is also used in image classification, where the decision is based on image intensity

2. Linear interpolation:
    • it uses a linear function
    • the new value is a distance-weighted average of the neighboring pixels
    • can be extended to 4 neighbors
3. Bilinear interpolation:
    • this method is an extension of linear interpolation
    • does not cause overshoot
    • smooths the image slightly
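
A minimal sketch comparing nearest neighbor and bilinear interpolation when sampling a list-of-rows image at a non-integer position. The function names are chosen for this example, and the position is assumed to lie inside the image.

```python
import math


def nearest(image, x, y):
    """Value of the closest pixel to column x, row y."""
    return image[int(y + 0.5)][int(x + 0.5)]


def bilinear(image, x, y):
    """Distance-weighted average of the 4 pixels around column x, row y."""
    h, w = len(image), len(image[0])
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * image[y0][x0] + fx * image[y0][x1]
    bottom = (1 - fx) * image[y1][x0] + fx * image[y1][x1]
    return (1 - fy) * top + fy * bottom


img = [[0, 10], [20, 30]]
center = bilinear(img, 0.5, 0.5)     # average of all four pixels
closest = nearest(img, 0.2, 0.9)     # picks the pixel at row 1, column 0
```

Because the bilinear result is always a weighted average of existing values, it cannot overshoot the range of its four neighbors.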

Fourier transformation:
• The Fourier transformation can be applied to inputs defined on the image plane
• For images the Fourier transformation is two-dimensional
• Each component of the transform is described by its frequency, amplitude and direction; the transform interprets the image as a periodic signal
• The Fourier transform of a (real-valued) image is symmetric
• The FFT can be used if both sides of the image have a power-of-2 number of pixels
• In the Fourier transform every frequency component has a complex conjugate pair
Ideal filters: there are two types of ideal filters
1. Low-pass filter: keeps the low frequencies; used to reduce image noise
2. High-pass filter: keeps the high frequencies; used to find edges (sharpening)
Butterworth filter: the ideal high-pass and low-pass filters have drawbacks (their sharp cut-off causes ringing), which can be solved by using a Butterworth filter, which has a smoother spectrum.
Cosine transform (DCT): the discrete cosine transform is similar to the DFT; it also interprets the image as a periodic function, but it stores only real numbers.
Fast cosine transformation (FCT): it accelerates the calculation of the discrete cosine transform (DCT).
JPEG:
• One of the most popular image compression standards; it uses the DCT and typically reaches a compression ratio of about 10:1
• In JPEG compression an original image that takes up too much memory is compressed into an image that takes up little memory; nevertheless, the image loses some information
• The information that is lost in the compression is information that our vision does not recognize.
Deconvolution: this operation is easiest to interpret in the frequency domain: the spectrum of the image is not multiplied but divided by the spectrum of the filter.
Template matching: the template matching algorithm can be applied to detect objects whose appearance does not change. We have a reference image (template), which is matched to the image in all positions.
• To carry out template matching, the convolution (correlation) between the template and the image is calculated
• To achieve the best result in template matching, a cost function has to be minimized
• Template matching can also be carried out on edge-detected images
• Scaling, rotation and distortion make template matching difficult. If any of them occurs, new templates are needed that match these cases; the algorithm must be prepared for these scenarios.
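
A minimal sketch of template matching that minimizes a cost function over all positions. The sum of squared differences is used as the cost here, which is one common choice and an assumption of this example rather than the method prescribed above.

```python
def match_template(image, template):
    """Return (cost, x, y) of the position with the lowest squared difference."""
    th, tw = len(template), len(template[0])
    best = None
    for y in range(len(image) - th + 1):
        for x in range(len(image[0]) - tw + 1):
            cost = sum(
                (image[y + j][x + i] - template[j][i]) ** 2
                for j in range(th)
                for i in range(tw)
            )
            if best is None or cost < best[0]:
                best = (cost, x, y)
    return best


scene = [
    [0, 0, 0, 0],
    [0, 0, 5, 9],
    [0, 0, 7, 3],
    [0, 0, 0, 0],
]
patch = [[5, 9], [7, 3]]
cost, x, y = match_template(scene, patch)
```

A scaled or rotated copy of the patch would no longer give a low cost, which is why separate templates are needed for those cases.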
Harris detector
• It also uses the local structure matrix
• It can also be used to detect edge-like image points: for these the result of the Harris detector is a large negative number
• Compared to the Harris detector, KLT gives a result more similar to human perception
Invariances:
The KLT and Harris detectors are invariant to only two transformations:
1. Additive intensity change
2. Rotation
SIFT detector:
• The main principle of SIFT is that detection is done with the help of different scale factors
• Corner detection is done with DoG (difference of Gaussians) kernels
• The SIFT algorithm generates a descriptor code for the image corners; this code consists of 128 numbers
• The most revolutionary idea of SIFT is the incorporation of rotational invariance
ORB detector:
• Uses the FAST algorithm to detect corners
• The algorithm analyzes the pixels on a circle around the examined pixel
• The detected pixels are filtered with the KLT criterion
ORB descriptor:
• Uses the BRIEF description method
• There are 256 predefined point pairs in the neighborhood of the key point
• Invariant to scale
• Invariant to additive intensity transformation

Dimension reduction
• Pixels carrying irrelevant information are removed and the relevant pixels are merged together
Principal component analysis (PCA)
• Assumes a normal distribution with zero mean
• This is ensured by subtracting the mean (expected value) of the variables
• It is the optimal dimension reduction algorithm for normally distributed data

Linear discriminant analysis (LDA)

• Requires labels corresponding to the data
• The aim of LDA is to reduce dimensionality in such a way that the information useful for separating the classes remains
• LDA intends to maximize the inter-class variance
Hidden Markov model
A process is called a Markov process if the next state depends only on the current state.
• When using a hidden Markov model, the first task is to determine the probability of a sequence of hidden states
• The second is to find the most probable sequence of hidden states

Kalman filter
• it is an algorithm that uses a series of measurements observed over time
• we have two estimates: one from the measurement, the other from the prediction step

SIFT
• is a feature detection algorithm in computer vision that detects and describes local features in images
• SIFT is divided into two parts: the detector and the descriptor
• The fundamental principle of the SIFT detector is that detection is made with the help of different scaling factors, and each feature is also assigned a scale variable that will be used for the descriptor code
• The fundamental principle of the SIFT descriptor is to generate the descriptor code for the image corners; this code consists of 128 numbers. The neighborhood is always selected from the same image scale as the corner

Binary image:
It consists of pixels that can have one of exactly two colors, usually black and white. This means that each pixel is stored as a single bit, 0 or 1. Binary images often arise in digital image processing as masks or as the result of thresholding.

Learning algorithm:
• Machine learning is the study of computer algorithms that improve automatically through experience. The algorithms build a mathematical model based on sample data in order to make predictions or decisions
Structure of a learning algorithm
• The notion of machine learning can be described with the formula y = f(x, θ), where x is the input, y the output and θ the parameters
• Each learning algorithm has its loss function (cost function)
• The algorithm also has an optimization method to carry out and accelerate the learning process

Learning types:
• Regression: the output (y) is a continuous number
• Supervised learning: the correct output is known for the training inputs; if it is only partially known, we speak of semi-supervised learning
• Unsupervised learning: only the inputs are available
• Reinforcement learning: the algorithm must be able to make a sequence of decisions, but there is generally no feedback after each decision
Difficulties of learning:
• Underfitting: if we test the model on the validation set, the result will be about the same as on the training set, and both errors are high
• Overfitting: after a while, the validation error begins to increase
• From that point the algorithm is only able to decrease the training error further by memorizing the input-output pairs
• A more complex algorithm is able to memorize more of the dataset
Image classification
• Consists of several consecutive steps:
• Capturing the image (digitization), preprocessing and enhancing, feature extraction, decision making
Linear regression:
• Linear regression approximates the relationship between the inputs and outputs of the training dataset with a linear equation
• The goal of linear regression is to minimize the squared error
SVM (support vector machine)
• Extension of linear approximation to classification
• We try to find the hyperplane that separates the two classes with the highest certainty
• The points nearest to the hyperplane should be as far from it as possible

Gradient based algorithm:

Uses derivatives to find the optimal value of the function.
It has 3 main steps: search direction, step size, convergence check
• search direction: given by the derivative, which in one dimension is called the slope
• step size: how far we move in the search direction
• convergence check: stop when the optimum is reached
Higher order derivatives
• Since the gradient based algorithm uses only first order derivatives, it is slow, and the step size can cause oscillation near the minimum. This can be solved by using higher order derivatives
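
A minimal sketch of first-order gradient descent in one dimension, showing how the step size influences convergence. The test function f(x) = (x - 3)^2, the tolerance and the two step sizes are assumptions chosen for this example.

```python
def gradient_descent(grad, x0, step, tol=1e-8, max_iter=1000):
    """Minimize a 1D function given its derivative `grad`.

    Returns the final position and the number of iterations used.
    """
    x = x0
    for iteration in range(max_iter):
        g = grad(x)              # search direction: the negative slope
        if abs(g) < tol:         # convergence check
            return x, iteration
        x = x - step * g         # step size controls how far we move
    return x, max_iter


def grad(x):
    return 2 * (x - 3)           # derivative of f(x) = (x - 3)^2


x_small, n_small = gradient_descent(grad, x0=0.0, step=0.1)
x_osc, n_osc = gradient_descent(grad, x0=0.0, step=0.95)
```

With the larger step each update jumps over the minimum to the other side, so the iterate oscillates around x = 3 and needs more iterations than with the smaller step.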
Newton method
• uses a second order Taylor series approximation, where the position of the minimum can be found by setting the derivative of the approximation to 0
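
A minimal sketch of the Newton method in one dimension: setting the derivative of the second order approximation to 0 gives the update x_new = x - f'(x) / f''(x). The example function f(x) = x^4 - 3x^2 + 2 and the starting point are assumptions chosen for illustration.

```python
def newton_minimize(d1, d2, x0, tol=1e-10, max_iter=50):
    """Find a stationary point using first (d1) and second (d2) derivatives."""
    x = x0
    for _ in range(max_iter):
        step = d1(x) / d2(x)     # minimum of the local quadratic model
        x -= step
        if abs(step) < tol:
            break
    return x


def d1(x):
    return 4 * x**3 - 6 * x      # f'(x) for f(x) = x^4 - 3x^2 + 2


def d2(x):
    return 12 * x**2 - 6         # f''(x)


x_min = newton_minimize(d1, d2, x0=2.0)
```

No step size has to be chosen by hand here, because the second derivative scales the step automatically.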
Hessian matrix
• stores all second derivatives of the function, which can be used to determine the position of the minimum
Backpropagation
• Calculates the derivatives of the loss function. First we calculate the output of all nodes given the inputs and the node functions; this is called the forward pass
• If we know the node functions and the derivative of the loss function with respect to the output of the nodes, we can calculate the derivatives going backwards
• Thus, we can calculate the gradients with respect to all inputs and weights
• With backpropagation we can calculate the derivative between any two points of the network
• It helps us to understand how the loss depends on the weights and how to change the weights in order to decrease the loss
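
A minimal sketch of a forward and backward pass on a tiny network with one hidden node. The network (y = w2 * relu(w1 * x + b1)), the squared error loss and all variable names are assumptions made for this example.

```python
def forward_backward(x, t, w1, b1, w2):
    """Return the loss and its derivatives with respect to the weights."""
    # Forward pass: compute the output of every node.
    h = w1 * x + b1
    a = max(0.0, h)              # ReLU activation
    y = w2 * a
    loss = (y - t) ** 2

    # Backward pass: apply the chain rule from the loss towards the inputs.
    d_y = 2.0 * (y - t)
    d_w2 = d_y * a
    d_a = d_y * w2
    d_h = d_a * (1.0 if h > 0 else 0.0)
    d_w1 = d_h * x
    d_b1 = d_h
    return loss, {"w1": d_w1, "b1": d_b1, "w2": d_w2}


loss, grads = forward_backward(x=2.0, t=1.0, w1=0.5, b1=0.1, w2=-1.5)
```

Moving each weight a small step against its gradient, for example w1 - 0.01 * grads["w1"], decreases the loss.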
Convolutional networks
Layer types:
• Fully connected layer
• Convolution layer
• Pooling
• Activations/nonlinearities

Types of neural networks

AlexNet
• a convolutional architecture with around 10 layers
VGG:
• this neural network has variants with layer numbers between 16 and 19
Inception:
• uses the fact that convolutional layers do not have to follow each other sequentially; they can also run in parallel
GoogLeNet: this neural network consists of several Inception blocks
DenseNet
• a neural network based on ResNet; within a dense block it uses connections not only between consecutive layers.

Dropout
• sets some of the activations to 0, and the other activations are calculated taking these 0s into account
Batch normalization
• The main principle is that the average and the standard deviation of each activation are calculated in each iteration, and the activation is normalized according to the calculated statistics
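
A minimal sketch of the normalization step for one activation over a batch. The learnable scale and shift parameters that usually follow are left out, and the small constant `eps` is an assumption added to avoid division by zero.

```python
import math


def batch_norm(batch, eps=1e-5):
    """Normalize the values of one activation across a batch."""
    n = len(batch)
    mean = sum(batch) / n
    var = sum((v - mean) ** 2 for v in batch) / n
    return [(v - mean) / math.sqrt(var + eps) for v in batch]


activations = [1.0, 2.0, 3.0, 4.0]
normalized = batch_norm(activations)
```

After this step the values have approximately zero mean and unit standard deviation.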
Erosion and dilation:
• The erosion step eliminates small, noise-like objects, while the dilation step regrows the large objects to their original size
• The main drawback of erosion and dilation is that they modify the size of the objects
• For this reason they are usually not used individually but in combination
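
A minimal sketch of erosion followed by dilation (an opening) on a binary image stored as a list of rows. The 3x3 square neighborhood and the treatment of pixels outside the image as 0 are assumptions of this example.

```python
OFFSETS = [(j, i) for j in (-1, 0, 1) for i in (-1, 0, 1)]


def _neighborhood(image, y, x):
    """Values in the 3x3 window around (y, x); outside the image counts as 0."""
    h, w = len(image), len(image[0])
    return [
        image[y + j][x + i] if 0 <= y + j < h and 0 <= x + i < w else 0
        for j, i in OFFSETS
    ]


def erode(image):
    """A pixel stays 1 only if its whole 3x3 neighborhood is 1."""
    return [
        [1 if all(_neighborhood(image, y, x)) else 0 for x in range(len(image[0]))]
        for y in range(len(image))
    ]


def dilate(image):
    """A pixel becomes 1 if any pixel in its 3x3 neighborhood is 1."""
    return [
        [1 if any(_neighborhood(image, y, x)) else 0 for x in range(len(image[0]))]
        for y in range(len(image))
    ]


def opening(image):
    return dilate(erode(image))


binary = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1],   # isolated noise pixel
]
cleaned = opening(binary)
```

The noise pixel disappears during erosion, while the 3x3 object shrinks to one pixel and is regrown to its original size by the dilation.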
Skeletonizing
• The skeleton is a representation in which each pixel with value 1 has exactly one or two neighbors with value 1
• The skeleton is calculated with an iterative erosion algorithm
Object labeling and counting
The purpose of labeling and counting objects is to assign a separate label to each separate object.
The steps of the algorithm:
1. The first unlabeled pixel with value 1 has to be found and labeled with label L
2. Label L is assigned to each neighbor of the evaluated pixel that has a value of 1, and the second step is called for each of them
3. Increment L and jump to step 1
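
A minimal sketch of these labeling steps on a binary image stored as a list of rows. An explicit stack replaces the recursive call, and 4-connectivity (up, down, left, right) is assumed as the neighbor definition.

```python
def label_objects(image):
    """Return a label image and the number of objects found."""
    h, w = len(image), len(image[0])
    labels = [[0] * w for _ in range(h)]
    current = 0
    for y in range(h):
        for x in range(w):
            # Step 1: find the next unlabeled pixel with value 1.
            if image[y][x] == 1 and labels[y][x] == 0:
                current += 1
                labels[y][x] = current
                stack = [(y, x)]
                # Step 2: spread the label to all connected pixels.
                while stack:
                    cy, cx = stack.pop()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and image[ny][nx] == 1 and labels[ny][nx] == 0):
                            labels[ny][nx] = current
                            stack.append((ny, nx))
    return labels, current


binary = [
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [1, 0, 0, 1],
]
labels, count = label_objects(binary)
```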

Segmentation
• Segmentation is the process of dividing a digital image into multiple segments
• The goal of segmentation is to simplify or change the representation of an image into something that is more meaningful and easier to analyze
• If segmentation is done not per object but per object class/category, the procedure is called semantic segmentation

Semantic segmentation
• Segmentation is essential for image analysis tasks. Semantic segmentation is the process of connecting each pixel of an image with a class label
