CV s2015 Lec 1


EC-803 Computer Vision

Lecture-1:
Course Introduction
Basic Transformations- Translation,
Scaling and Rotation, both in 2D & 3D
MATLAB or OpenCV

NUST College of E&ME, Spring 2015

Course Introduction
Instructor:

Mahmood Akhtar, PhD


(MAHMOOD@UNSWALUMNI.COM)

Lecture Timing: Thu 1730-2030 hrs, CR (DCE)-16

Topics:
Basic Transformations, Camera Model and Imaging
Geometry, Camera Calibration, Multiview Geometry,
Stereopsis, Structure From Motion, Linear Filters, Edges,
Texture, Segmentation by: Clustering Pixels; Split and
Merge; Mean Shift Algorithm; Graph-Theoretic
Clustering; Fitting a Model- Hough Transform; etc,
Tracking, Model-Based Vision, Finding Templates Using
Classifiers

Geometric Transformations- to change sets of
points representing some object (study about
translation, scaling, rotation, etc.)
Camera Model and Imaging Geometry- image
formation process, camera coordinates and 3D
world coordinates aligned / not aligned, how to
deal with different situations
Camera Calibration- process of estimating the
parameters of a pinhole camera model,
approximating the camera that produced a
given photograph or video, camera matrix

Multiview Geometry- to understand how
several views of the same scene constrain its 3D
structure and camera configurations
Stereopsis- algorithms that mimic our ability to
fuse the pictures recorded by our two eyes and
exploit the difference between them to gain
a strong sense of depth
Structure From Motion- to estimate the 3D
shape of a scene from multiple pictures when
camera positions and parameters are a priori
unknown and may change over time

Linear Filters- smoothing by averaging, Gaussian smoothing,
derivatives and finite differences, filters and
templates, scale and image pyramids
Edges and Texture- noise and edge detectors (Laplacian
and gradient-based); extracting image
structure, analysis and synthesis using oriented
pyramids
Segmentation- subdividing an image or video into
its constituent regions or objects as required;
applications: summarising videos, finding machine
parts, finding people, finding buildings in satellite
images and searching a collection of images

Tracking- the problem of generating an inference
about the motion of an object given a sequence
of images. Major applications: motion capture,
recognition from motion, surveillance, and
targeting
Model-Based Vision- object recognition as a
correspondence problem: understanding the
relationship between the position of image
features and the position and orientation of an
object; application: registration of VOI in medical
imaging systems

Finding Templates Using Classifiers- a classifier is
anything that takes a feature set as an input and
produces a class label. Here, we will learn
about techniques for building classifiers, with
examples of their use in vision applications
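The definition of a classifier above (feature set in, class label out) can be made concrete with a small sketch. It uses a nearest-mean rule, which is only one simple choice and is not claimed to be the method taught in the course; the names `ClassModel`, `squaredDistance` and `classify` are hypothetical, and plain C++ is used so that it builds without any vision library.

```cpp
#include <cstddef>
#include <limits>
#include <string>
#include <vector>

using Features = std::vector<double>;  // one feature set (assumed fixed length)

// Hypothetical per-class model: a label plus the mean feature vector of that class.
struct ClassModel {
    std::string label;
    Features mean;
};

// Squared Euclidean distance over the common length of the two vectors.
double squaredDistance(const Features& a, const Features& b) {
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size() && i < b.size(); ++i) {
        const double diff = a[i] - b[i];
        sum += diff * diff;
    }
    return sum;
}

// Nearest-mean classifier: returns the label whose mean is closest to x.
// Returns an empty string when no models are supplied.
std::string classify(const std::vector<ClassModel>& models, const Features& x) {
    std::string best;
    double bestDistance = std::numeric_limits<double>::infinity();
    for (const ClassModel& model : models) {
        const double d = squaredDistance(model.mean, x);
        if (d < bestDistance) {
            bestDistance = d;
            best = model.label;
        }
    }
    return best;
}
```

For example, with class means (1, 0) labelled "edge" and (0, 0) labelled "flat", the feature set (0.9, 0.1) would be assigned the label "edge".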


Text Book & References:
David A. Forsyth and Jean Ponce, Computer Vision: A Modern
Approach, 2002 Ed (available from local market)
Class slides & selected research papers
to be distributed by the instructor
Mubarak Shah, Fundamentals of Computer Vision, 1997
(soft copy available online)
Linda Shapiro and George Stockman, Computer Vision, 2000
(soft copy available online)
Rafael C. Gonzalez and Richard E. Woods, Digital Image
Processing, 3rd Edition, 2009 (available from local market)

Prerequisites:
Digital image processing
Working knowledge of C++ programming
Knowledge related to:
Euclidean and projective geometry
Linear Algebra
Vector calculus
Probability & Statistics

Yahoo Group: CV_CEME_S2015

Grading Policy*:
Surprise quizzes (Min 6)           8%
Programming assignments (Min 3)    7%
Sessional exam I                  15%
Sessional exam II                 15%
Project                           15%
Final exam                        40%

*Relative final grading policy applies


Quizzes & Assignments:
Please make sure you visit the CV_CEME_S2015 group every
day for notifications about assignments & other related
material to be uploaded from time to time
Quizzes: 6 to 8, carrying 8% weight in the total marks
(best x out of y can be considered for the benefit of
students)
Assignments: min 3, carrying 7% weight in the total
marks. These may be written or programming
assignments. The submission deadline will be given with the
assignment. Assignments submitted after the deadline
will not be accepted and will carry ZERO MARKS. Cheated
(i.e., matching) assignments will get ZERO MARKS.

Project:
The project will carry 15% weight in the total marks
The project is to be conducted individually (i.e., no
grouping)
Your project is most likely going to be an OpenCV
implementation of a recent CV related algorithm / work
Students are encouraged to visit IEEE Xplore for the 27th IEEE Conf.
on CVPR and should start looking into different research
articles (published in 2014)
Project topics / problems should be selected and approval
should be obtained within the first four weeks of the course.
Project presentations will commence from week 13 onwards and
projects (i.e., CD containing draft of proposed novel work,
implementation code, presentation, etc.) will not be accepted
after the submission deadline.
Projects consisting of downloaded code or presentations will
not be accepted and will carry ZERO MARKS

Vision
Process of discovering what is present in the world
and where it is by looking


What is Computer Vision?
Given one or more images, extract properties of the 3D
world:
- Traffic scene
- Number of vehicles
- Type of vehicles
- Location of closest obstacle
- Assessment of congestion
- Location of the scene captured
-


Computer Vision
goal is to emulate human vision (which is limited to
the visual band of electromagnetic (EM) spectrum),
including learning and being able to make inferences
and take actions based on visual inputs


Why Computer Vision?

An image is worth 1000 words


Many biological systems rely on vision
The world is 3D and dynamic
Cameras and computers are cheap


Applications of Computer Vision

Autonomous cars, Planes, Missiles, Robots, ...


Space exploration
Aid to the blind, Sign language recognition
Manufacturing, Quality control
Surveillance, Security, Biometrics
Image retrieval
Medical imaging & analysis
...


Overview
Real World
Image Formation and Camera Geometry
- Modeling and Calibration
- Image rectification
Processing on Single Image
- Linear Filters
- Edge detection
- Texture
Multiple Images
- Multi-view geometry
- Stereo imaging
- Structure from motion
Segmentation
- Impose some order on groups of pixels to separate them from each other or infer shape information
Interpretation
- Interpret objects using geometric information
Recognition
- Recognize objects using probabilistic techniques
Action

Computer Vision Focuses on:

What information should be extracted?


How can it be extracted?
How should it be represented?
How can it be used to achieve the goal?


Related Disciplines

Image processing
Pattern recognition
Computer graphics
Artificial intelligence
Machine learning


Related Disciplines
[Figure: diagram relating DATA and IMAGES through Data Processing, Computer Vision, Computer Graphics and Image Processing]

Active Research Topics

Object recognition
Human behavior analysis
Internet and computer vision
Biometrics and soft biometrics
Large scale 3D reconstruction (city level)
Medical image processing
Vision for robotics


Computer Vision Publications


Journals
IEEE Trans. on Pattern Analysis and Machine
Intelligence (TPAMI)
International Journal of Computer Vision (IJCV)
IEEE Trans. on Image Processing


Computer Vision Publications


Conferences
International Conference on Computer Vision
(ICCV), once every two years
IEEE Conf. on Computer Vision and Pattern
Recognition (CVPR), once a year
European Conference on Computer Vision (ECCV),
once every two years


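Basic Transformations
Translation:

\[ x' = x + x_0 , \quad y' = y + y_0 , \quad z' = z + z_0 \]

(2D)
\[
\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} =
\begin{bmatrix} 1 & 0 & x_0 \\ 0 & 1 & y_0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
\]

(3D)
\[
\begin{bmatrix} x' \\ y' \\ z' \\ 1 \end{bmatrix} =
\begin{bmatrix} 1 & 0 & 0 & x_0 \\ 0 & 1 & 0 & y_0 \\ 0 & 0 & 1 & z_0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}
\]

Images courtesy of Dr Imtiaz A Taj (MAJU)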
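As a minimal sketch of the 2D translation above, the following plain C++ builds the 3x3 homogeneous matrix and applies it to a point. The names `Vec3`, `Mat3`, `translation2D` and `apply` are hypothetical helpers chosen for illustration; the standard library is used instead of OpenCV so that the snippet compiles on its own.

```cpp
#include <array>

using Vec3 = std::array<double, 3>;                 // homogeneous 2D point (x, y, 1)
using Mat3 = std::array<std::array<double, 3>, 3>;  // 3x3 homogeneous transform

// Translation matrix for the offset (x0, y0), laid out as in the 2D equation.
Mat3 translation2D(double x0, double y0) {
    return {{{1.0, 0.0, x0},
             {0.0, 1.0, y0},
             {0.0, 0.0, 1.0}}};
}

// Matrix-vector product: transforms the homogeneous point p by m.
Vec3 apply(const Mat3& m, const Vec3& p) {
    Vec3 out{0.0, 0.0, 0.0};
    for (int row = 0; row < 3; ++row)
        for (int col = 0; col < 3; ++col)
            out[row] += m[row][col] * p[col];
    return out;
}
```

Applying `translation2D(2.0, -1.0)` to the point (3, 4, 1) gives (5, 3, 1), matching x' = x + x0 and y' = y + y0.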

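Cartesian Coordinate System (Euclidean Geometry)
\[ W = \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} \]

Homogeneous Coordinate System (Projective Geometry)
\[ W_h = \begin{bmatrix} kX \\ kY \\ kZ \\ k \end{bmatrix}, \quad k \neq 0 \]

Converting from homogeneous back to Cartesian coordinates:
\[
W = \begin{bmatrix} W_1 \\ W_2 \\ W_3 \end{bmatrix} =
\begin{bmatrix} W_{h1} / W_{h4} \\ W_{h2} / W_{h4} \\ W_{h3} / W_{h4} \end{bmatrix}
\]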
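The conversion between the two coordinate systems can be sketched as follows. The function names `toHomogeneous` and `toCartesian` are hypothetical, and rejecting a zero fourth component with an exception is an assumption of this sketch rather than something the slides specify.

```cpp
#include <array>
#include <stdexcept>

using Vec3 = std::array<double, 3>;  // Cartesian point (X, Y, Z)
using Vec4 = std::array<double, 4>;  // homogeneous point (kX, kY, kZ, k)

// Cartesian -> homogeneous: multiply by the scale k and append k.
Vec4 toHomogeneous(const Vec3& w, double k = 1.0) {
    return {k * w[0], k * w[1], k * w[2], k};
}

// Homogeneous -> Cartesian: divide the first three components by the fourth.
Vec3 toCartesian(const Vec4& wh) {
    if (wh[3] == 0.0)
        throw std::domain_error("fourth component is zero: no Cartesian equivalent");
    return {wh[0] / wh[3], wh[1] / wh[3], wh[2] / wh[3]};
}
```

Any non-zero k represents the same Cartesian point, so `toCartesian(toHomogeneous(w, k))` recovers `w` up to floating-point rounding.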

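Basic Transformations
Scaling:

\[ x' = s_x x , \quad y' = s_y y , \quad z' = s_z z \]

(2D)
\[
\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} =
\begin{bmatrix} s_x & 0 & 0 \\ 0 & s_y & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
\]

(3D)
\[
\begin{bmatrix} x' \\ y' \\ z' \\ 1 \end{bmatrix} =
\begin{bmatrix} s_x & 0 & 0 & 0 \\ 0 & s_y & 0 & 0 \\ 0 & 0 & s_z & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}
\]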
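A short sketch of the 3D scaling matrix applied to a homogeneous point is given below. `Vec4`, `Mat4`, `scaling3D` and `apply` are hypothetical names for illustration, written in plain C++ rather than with a matrix library.

```cpp
#include <array>

using Vec4 = std::array<double, 4>;                 // homogeneous 3D point (x, y, z, 1)
using Mat4 = std::array<std::array<double, 4>, 4>;  // 4x4 homogeneous transform

// Scaling matrix with factors (sx, sy, sz) on the diagonal and 1 in the last entry.
Mat4 scaling3D(double sx, double sy, double sz) {
    Mat4 m{};  // value-initialised, so every entry starts at zero
    m[0][0] = sx;
    m[1][1] = sy;
    m[2][2] = sz;
    m[3][3] = 1.0;
    return m;
}

// Matrix-vector product: transforms the homogeneous point p by m.
Vec4 apply(const Mat4& m, const Vec4& p) {
    Vec4 out{0.0, 0.0, 0.0, 0.0};
    for (int row = 0; row < 4; ++row)
        for (int col = 0; col < 4; ++col)
            out[row] += m[row][col] * p[col];
    return out;
}
```

Scaling the point (1, 1, 4, 1) by factors (2, 3, 0.5) gives (2, 3, 2, 1); the homogeneous component is left unchanged.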

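Basic Transformations
Rotation (2D):
- around the origin, counter-clockwise by angle \theta
\[
\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} =
\begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
\]

- around an arbitrary point r (not the origin): translate r to the
origin, rotate, then translate back
\[ p' = T_{r} \, R_{\theta} \, T_{-r} \, p \]
where T_{t} is the translation by offset t and R_{\theta} is the
rotation around the origin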
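Rotation around an arbitrary point can be sketched by composing three homogeneous matrices: move the centre to the origin, rotate, and move back. This assumes the usual counter-clockwise rotation convention; the helper names (`multiply`, `rotation2D`, `rotationAbout`, and so on) are hypothetical, and the code is plain C++ rather than an OpenCV call.

```cpp
#include <array>
#include <cmath>

using Vec3 = std::array<double, 3>;                 // homogeneous 2D point (x, y, 1)
using Mat3 = std::array<std::array<double, 3>, 3>;  // 3x3 homogeneous transform

Mat3 translation2D(double x0, double y0) {
    return {{{1.0, 0.0, x0}, {0.0, 1.0, y0}, {0.0, 0.0, 1.0}}};
}

// Counter-clockwise rotation around the origin by theta radians.
Mat3 rotation2D(double theta) {
    const double c = std::cos(theta);
    const double s = std::sin(theta);
    return {{{c, -s, 0.0}, {s, c, 0.0}, {0.0, 0.0, 1.0}}};
}

// Matrix product a * b (b is applied to a point first, then a).
Mat3 multiply(const Mat3& a, const Mat3& b) {
    Mat3 out{};
    for (int row = 0; row < 3; ++row)
        for (int col = 0; col < 3; ++col)
            for (int k = 0; k < 3; ++k)
                out[row][col] += a[row][k] * b[k][col];
    return out;
}

// Rotation by theta around the centre (cx, cy): T(c) * R(theta) * T(-c).
Mat3 rotationAbout(double theta, double cx, double cy) {
    return multiply(translation2D(cx, cy),
                    multiply(rotation2D(theta), translation2D(-cx, -cy)));
}

Vec3 apply(const Mat3& m, const Vec3& p) {
    Vec3 out{0.0, 0.0, 0.0};
    for (int row = 0; row < 3; ++row)
        for (int col = 0; col < 3; ++col)
            out[row] += m[row][col] * p[col];
    return out;
}
```

For instance, rotating the point (2, 1) by a quarter turn around the centre (1, 1) yields approximately (1, 2). The order of the factors matters: swapping the two translations would rotate around a different point.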

MATLAB, or OpenCV
Image processing: the process of manipulating
image data to make it suitable for computer
vision applications or for presentation
to humans
Computer vision goes beyond image
processing: it helps to obtain relevant information
from images and make decisions based on that
information
Steps for a typical computer vision application:
Image acquisition -> Image manipulation -> Obtaining
relevant information -> Decision making
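The four steps of a typical application can be illustrated with a toy pipeline. Everything here is an assumption made for illustration: the "acquired" image is a hard-coded 4x4 grid, the manipulation step is a simple threshold, the extracted information is the fraction of bright pixels, and the decision is a comparison against a chosen minimum. A real application would replace each stage with camera input and library routines.

```cpp
#include <cstddef>
#include <vector>

using Image = std::vector<std::vector<int>>;  // grayscale values in 0..255

// Step 1, image acquisition: a synthetic image stands in for a camera frame.
Image acquire() {
    return {{10, 12, 200, 210},
            {11, 220, 230, 13},
            {9, 14, 15, 12},
            {10, 11, 13, 12}};
}

// Step 2, image manipulation: binarise with a fixed threshold (1 = bright).
Image binarise(const Image& input, int level) {
    Image mask = input;
    for (auto& row : mask)
        for (int& pixel : row)
            pixel = (pixel >= level) ? 1 : 0;
    return mask;
}

// Step 3, obtaining relevant information: fraction of bright pixels in the mask.
double brightFraction(const Image& mask) {
    std::size_t bright = 0;
    std::size_t total = 0;
    for (const auto& row : mask)
        for (int pixel : row) {
            bright += (pixel != 0) ? 1 : 0;
            ++total;
        }
    return total == 0 ? 0.0 : static_cast<double>(bright) / static_cast<double>(total);
}

// Step 4, decision making: report an object when enough pixels are bright.
bool objectPresent(double fraction, double minimumFraction) {
    return fraction >= minimumFraction;
}
```

With the synthetic image and a threshold of 128, four of the sixteen pixels are bright, so the fraction is 0.25 and `objectPresent(0.25, 0.2)` reports an object.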

Most popular methods to develop computer vision
applications: OpenCV with C/C++, MATLAB and AForge
MATLAB is the easiest but least efficient way to
process images
- an interpreter, not made to go fast, but it gives you the
opportunity to play with its functionalities
OpenCV is computationally the most efficient of the three
- designed for real time applications
- code written in optimized C / C++
- can take advantage of multicore processors
- further automatic optimization possible using IPP libraries
AForge has qualities in between OpenCV and MATLAB
MATLAB is a kind of sandbox for "playing" and learning (and
relatively slow). OpenCV is dedicated and specific (and fast)

OpenCV is the hardest of the three to learn, mainly because it
lacks proper documentation and error handling code
But OpenCV has lots of basic inbuilt image
processing functions (over 500 functions)
It is worthwhile to learn computer vision with OpenCV
Useful webpages on this topic:
http://opencv.org/
http://opencv-srf.blogspot.com/2010/09/what-is-opencv.html

Assignment- 1
Download and install the latest release of OpenCV.
Build and run your first OpenCV program.
Related Tutorials:
- Installing OpenCV 3 on Ubuntu:
http://rodrigoberriel.com/2014/10/installing-opencv-3-0-0-on-ubuntu-14-04/
- Using OpenCV 3 with Eclipse:
http://rodrigoberriel.com/2014/10/using-opencv-3-0-0-with-eclipse/