
Ph. D.

Study Plan

Sune Keller

IT University, Copenhagen

Last Revision: September 2006

Project: PDE-based Video Processing

Starting date: 01-09-2004

Thesis submission date: 31-08-2007

Student

Sune Høgild Keller
ITU, Department of Innovation, Image Group
Rued Langgaardsvej 7, DK-2300 Kbh. S.
Phone: +45 7218 5093
E-mail: sunebio@itu.dk

Supervisors

Professor Mads Nielsen
ITU, Department of Innovation, Image Group
Rued Langgaardsvej 7, DK-2300 Kbh. S.
Phone: +45 7218 5075
E-mail: malte@itu.dk

Ass. Professor François Bernard Lauze
ITU, Department of Innovation, Image Group
Rued Langgaardsvej 7, DK-2300 Kbh. S.
Phone: +45 7218 5070
E-mail: francois@itu.dk

Changes October 2006

The following sections have been updated:

• Papers, Reports, Presentations and Conferences – More talks, one conference, one paper, one
  technical report, one planned paper.
• Educational Requirements – I have taken a course, and another is running longer than planned.
• Leave – A new possible partner is involved in the commercialization project, but nothing is
  certain and it is thus left out of the time schedule.
• Teaching and Student Project Supervision – I have taught and supervised more and done other
  duty work.
• Milestones – A pre-visit was cancelled, and an extra visit to La Rochelle was added.

Changes April 2006


The following sections have been updated:
• Study Visit Abroad – I will most likely not do a longer stay.
• Educational Requirements – I have taken another course.
• Teaching and Student Project Supervision – I have taught and supervised more and begun
  registering other duty work.
• Papers, Reports, Presentations and Conferences – Talks – more activities.
• Leave – I might still go on leave to do industrial development of my project, but nothing is
  certain yet and thus it is left out of the plan.
• Milestones – These have been shifted around to account for not going on leave in this period
  after all and doing other work instead.

Changes October 2005


The following sections have been updated:
• Educational Requirements – I have taken another course.
• Teaching and Student Project Supervision – I have taught and supervised more.
• Papers, Reports, Presentations and Conferences – more activities.
• Leave – a new section.
• Time Schedule – leave and switch of two milestones.

Project Description

Objective

Advanced model-based schemes for digital image and image sequence restoration, especially inpainting,
have been developed over the last 5-10 years. Today rather simple schemes are used for digital video
processing in broadcasting chains. The objective of this project is to apply and further develop methods
used for restoration and inpainting to the field of video processing, focusing on video format conversion
involving enhancement of space and time resolution, a work already begun in my master's thesis [1].

Background

Digital technology has invaded the television and video media: the DVD has taken over from VHS,
digital cameras and video input cards have made it possible to watch and edit video on PCs, and at the
same time CRT displays are being replaced by modern and larger plasma and LCD displays and
projectors. Conventional television sets, PC screens, flat panel displays, projectors and digital cameras
have different display formats. They differ by screen height and width in pixels (spatial resolution), by
the number of frames displayed each second (temporal resolution) and by the manner in which each
frame is scanned. This gives rise to a number of different video formats, and conversion between the
formats is needed, especially up-conversion, which requires enhancement in the form of more pixels of
information than are contained in the original video sequence.
Schemes used for up-conversion in video processing today are often developed ad hoc and heuristically
to fit a certain low-cost hardware platform, and the focus seems to be on solving the practical problem at
hand without any thought for the underlying theory ([2], [3]). Schemes developed for inpainting of regions
of missing image (sequence) data have their origin in an attempt to model images and image sequences
as physical entities, taking into account that an image sequence is a projection of the real, physical world.
By translating the causality, ordering and coherence of the physical world onto the recordings of the
world using mathematical models of variation, high quality digital inpainting can be done ([4]). This
theoretical framework and its modelling are valid for video processing in general, and my master's thesis
shows that it can be applied with success to the video format conversion called deinterlacing. The
master's thesis was the first attempt to convert an inpainting scheme into a video format conversion
scheme, and this Ph. D. project will continue that work.
Why has nobody else tried this before? Because inpainting is a rather new discipline in image
sequence processing and the schemes are computationally heavy. It has therefore never been introduced
to the world of video processing, a world mainly rooted in electrical engineering, while inpainting has
largely been researched in the world of mathematics and computer science.
A key element of high quality image sequence processing is to estimate the motion in the depicted
scene in order to establish correlation over time between neighbouring frames in the sequence. Motion
estimation (ME) computes the optical flow in a sequence, and the flow is then used in motion
compensated (MC) schemes, giving substantial improvements over non-motion compensated schemes
([3], [5] and [6]). In video processing, very simple methods for ME and the subsequent MC schemes are
used, whereas ME and MC in inpainting integrate the flow in the aforementioned advanced theoretical
framework ([7], [8]).
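
As a rough illustration of what ME means in its very simplest form, the sketch below estimates one global integer displacement between two frames by exhaustive search, minimising the sum of absolute differences (SAD). It is only in the spirit of the simple methods mentioned above and is not the PDE-based optical flow of [7] and [8]. The function name estimate_shift, the search radius and the cyclic border handling (np.roll) are assumptions made for the example.

```python
import numpy as np

def estimate_shift(prev, curr, radius=3):
    """Return the integer (dy, dx) that best maps prev onto curr.

    Hypothetical helper: tries every shift in a small window and keeps
    the one with the lowest sum of absolute differences (SAD).
    Borders wrap around, which is an assumption of this sketch.
    """
    prev = np.asarray(prev, dtype=float)
    curr = np.asarray(curr, dtype=float)
    best, best_sad = (0, 0), np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.roll(prev, (dy, dx), axis=(0, 1))
            sad = np.abs(curr - shifted).sum()
            if sad < best_sad:
                best, best_sad = (dy, dx), sad
    return best
```

A real scene has different motion in different regions, so a practical scheme would estimate a flow vector per block or per pixel rather than one shift per frame.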

Theoretical Framework

The theoretical framework uses Bayesian inference as its first step. In Bayes' rule two probability terms
are formulated, the likelihood and the prior. The likelihood term mathematically states the data
already in existence: a set of pixels you want to keep. The prior term is a mathematical model of what you
think the result is supposed to be: how you, based on the known data, can fill in the blanks between the
kept original pixels. It is typically given as a variation that tries to model the causality, order and
coherence in video recordings of the world, e.g. that changes over time are due to motion and lighting.
Variations like Total Variation (TV) are such (weak) priors. Even though TV seems a simple model
mathematically, it is very complex and one of the most advanced priors currently applicable to image
sequences.
The a posteriori probability resulting from the likelihood and the prior in Bayes' rule is then to be
maximized (MAP: maximum a posteriori) to get a result as close to the real/correct solution as possible
given your mathematical models. To do so, you reformulate the problem as an energy functional of the
image sequence, where the task is to minimize the energy to get the optimal solution. The minimization
problem is in turn rewritten as a set of partial differential equations (PDEs) to be solved. In a few cases
the optimal solution can be found directly, but in most cases iterative methods are needed to get as close
to the optimal solution as possible.
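
The chain from prior and likelihood to an iterative PDE solver can be sketched for a single 2D image. The sketch below is a minimal illustration under stated assumptions, not the scheme of [1] or [7]: the likelihood is taken as a hard constraint (known pixels are never changed), the prior is a smoothed total variation energy E(u) = sum over pixels of sqrt(|grad u|^2 + eps^2), and the energy is minimised by explicit gradient descent. The name tv_inpaint, the parameters steps, dt and eps, and the initialisation by the mean of the known pixels are all assumptions of the example.

```python
import numpy as np

def tv_inpaint(image, known, steps=500, dt=0.02, eps=0.1):
    """Fill pixels where `known` is False by descending a smoothed TV energy.

    Hypothetical sketch: intensities are assumed to lie roughly in [0, 1],
    and dt is kept below eps / 4 so the explicit scheme stays stable.
    """
    u = np.asarray(image, dtype=float).copy()
    # Crude initial guess for the unknown pixels.
    u[~known] = u[known].mean()
    for _ in range(steps):
        # Forward differences; the last column/row gets a zero gradient.
        ux = np.diff(u, axis=1, append=u[:, -1:])
        uy = np.diff(u, axis=0, append=u[-1:, :])
        mag = np.sqrt(ux**2 + uy**2 + eps**2)
        px, py = ux / mag, uy / mag
        # Backward differences give the matching discrete divergence.
        div = np.diff(px, axis=1, prepend=0.0) + np.diff(py, axis=0, prepend=0.0)
        # Gradient descent step, applied to the unknown pixels only.
        u[~known] += dt * div[~known]
    return u
```

Setting the mask so that every other row is unknown turns this into a purely spatial toy version of deinterlacing a single field. The smoothing parameter eps slightly blurs sharp edges, which is a property of this simplified energy.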

Research Questions
First, the key issues that need to be addressed. The resolution enhancements to be researched are:
• Deinterlacing – the conversion of interlaced scan video to progressive scan video by creating the
  missing lines – has been the subject of my master's thesis. A motion adaptive (MA) total variation
  PDE-based deinterlacer has been developed, implemented and tested. Using ME and MC instead
  of MA is expected to improve PDE-based deinterlacing significantly. So the question is: Will
  PDE-based MC deinterlacing work, and how much better than known methods will it be?
  Deinterlacing can be seen as an enhancement either spatially or temporally (see [1] or [3] for
  details) and is therefore closely related to the next two enhancements.
• Super Resolution (SR) is enhancement of the resolution in the 2D spatial dimensions only.
  Deinterlacing gives a doubling of the pixel density in a given image sequence, but SR can give
  either less (e.g. PAL 576x720 to XGA 768x1024 pixels), the same or more (e.g. 576x1024 to
  1600x2000), and the question is: How much can you increase the resolution and still get a high
  image quality? This also depends on whether the goal is to make stills, to increase a TV input to
  the resolution of an LCD screen or something else. So will PDE-based (MC) SR work, and how
  well will it work in a given setting? This can be decided by testing and comparing to the outputs
  of known SR methods.
• Enhanced resolution in time can either result in a higher rate of frames per second (fps), e.g. 50
  fps instead of 25 fps, or lead to high quality slow motion, super slow (SS), which is the extension
  of an image sequence in time. Will PDE-based enhanced resolution in time work, how many new
  frames can be inserted in a sequence without loss of quality, and how does it compare to known
  methods?
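
The resolution figures in the list above can be turned into pixel-count ratios with a few lines of arithmetic. The helper name pixel_gain is an assumption for illustration, as is the 288x720 field size used for the deinterlacing case (one interlaced PAL field holds half of the 576 lines).

```python
def pixel_gain(src, dst):
    """Ratio of pixel counts between two (rows, columns) formats."""
    return (dst[0] * dst[1]) / (src[0] * src[1])

# Deinterlacing: one 288-line field up to a full 576-line frame.
deinterlacing = pixel_gain((288, 720), (576, 720))

# SR examples from the text: slightly less than, and well above, a doubling.
pal_to_xga = pixel_gain((576, 720), (768, 1024))
large_sr = pixel_gain((576, 1024), (1600, 2000))

print(f"deinterlacing: x{deinterlacing:.2f}")
print(f"PAL to XGA:    x{pal_to_xga:.2f}")
print(f"576x1024 to 1600x2000: x{large_sr:.2f}")
```

The PAL to XGA case comes out just under a factor of two and the 1600x2000 case at more than five, which matches the "less" and "more" cases named in the text.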
These three enhancements can also be combined for certain uses, e.g. if an interlaced PAL signal is to be
shown on a 768x1024 progressive scan plasma screen, then deinterlacing followed by SR is needed.

Besides these three resolution enhancements, other PDE-based applications can be investigated to the
extent time permits:
• Given two camera positions, e.g. for a football match, any camera angle in between the two can
  be generated at the viewer's choice.
• A scene is described by a high resolution 2D photographic image and a low resolution 3D depth
  map acquired by a laser scan. Combining the information from these two can give a high
  resolution 3D image of the scene. This can be applied to other multimodal settings as well, e.g. in
  medical imaging to transfer information from a high resolution MR scan to a low resolution PET
  scan to get SR PET.

Other interesting areas of application can most likely be found. The key issue for these, including
the two given just above, is whether the problem can be described and solved using the theoretical
framework outlined in this study plan.

Some additional issues and questions that are highly likely to be addressed as a part of the development of
the resolution enhancement schemes are:

• Statistical image sequence analysis to detect whether a sequence is progressive or interlaced in
  order to choose whether to deinterlace or not.
• Motion/optical flow: improvement and optimization of ME for the PDE-based MC schemes.
• As most ME methods, incl. PDE-based ME, are optimized for progressive image sequences,
  special care has to be taken when redesigning them for use on interlaced image sequences.
• Total Variation is a rather primitive model of images and image sequences. Can the use of other
  distributions/variations than TV improve the schemes? Can these other priors then give solvable
  PDEs that improve the results?
• Can the schemes be improved by better numerical implementations?
• Can the schemes be improved by better data initialization?
• Collecting a set of image sequences making up a good general representation of video material
  to give realistic testing.
• Finding and using test sequences used by others for easy comparison of results.
• Improvement of existing objective evaluation methods for video format conversions to make
  them give a better measure of the subjective quality experienced by the human visual system
  (HVS).
• Can the commonly used gradient descent solution for PDEs be replaced by other iterative
  methods to improve the quality of the results and/or reduce the computational complexity?
• Can iterative methods be improved?
• Can iterative solutions be replaced by direct solutions?

Plan from the beginning

The overall plan to understand and develop PDE based video processing is given here. It is a less
structured parallel to the Time Schedule that follows later.

Phase 1:

Get a full and in-depth understanding of the theoretical framework outlined in this study plan, the MC
inpainting work done by François Lauze and Mads Nielsen ([7]) and work on PDE-based motion
estimation ([7], [8] and others).

Attain broader knowledge of image sequence resolution enhancement and motion through literature
studies.

Identification of research questions: By attaining further knowledge on the subject of PDE-based video
processing, I might very possibly need to refine, add to and redefine the research questions given in this
study plan to optimize the outcome of my work.

First MC PDE-based scheme: A method for frame rate doubling.

Search for an industrial/business partner, trying to make first contact based on collaboration with CCBR
(CCBR being a business partner but not yet in the field of TV/video/film).

Phase 2:

Develop, implement and test PDE-based MC deinterlacing, spatial super resolution and super
slow/super resolution in time. Do combinations of the three for specific task(s).

Phase 3:

Find other areas of application. Define the problems in the framework of Bayesian inference and
solution by PDEs, then develop, implement and test solutions.

Phase 4:
Write thesis.

Study Visit(s) Abroad

The intention of one long and/or several smaller study visit(s) abroad, incl. conferences, workshops etc.,
is to stay in an external research environment and get other angles on one's work. In the best case
scenario it will also result in international research collaboration.

My plan is to attend as many conferences, workshops etc. as possible where image sequence processing
is a subject.

I will visit The Mathematical Image Analysis Group of Professor Joachim Weickert at the University of
Saarland, Germany for 2-4 weeks in fall and winter 2006, and I also plan to go for a short visit to
Bernard Besserer at L3i, The University of La Rochelle, France for two weeks in April 2007.

In January 2005 I visited the image group at University of North Carolina (UNC) in Chapel Hill, my main
host being Prof. Stephen Pizer. In a very packed two-day programme, I learned a great deal about medical
image analysis, segmentation and advanced video processing and analysis.

Educational Requirements
As a Ph. D. student I am required to obtain 30 ECTS by attending courses, summer schools, conferences,
workshops etc. I am already in the process of fulfilling the requirement by having completed the
following Ph. D. courses:

Foundations of Image Analysis, Ph.D. course at ITU held by Ole Fogh Olsen, Mads Nielsen,
Kim Steenstrup Pedersen and François Lauze, fall 2004, 7.5 ECTS.

Pattern Recognition, Ph.D. study group at ITU headed by Marleen de Bruijne, fall 2004, 7.5
ECTS.

Statistical Models of Images, Ph.D. course at ITU held by Kim S. Pedersen and Martin Lilholm,
spring 2005, 2.5 ECTS.

Non-Linear Shape Modelling, Ph.D. course at ITU hosted by Ole Fogh Olsen, lectured by Xavier
Pennec, Sarang Joshi and Mads Nielsen, fall 2005, 4.5 ECTS.

Stochastic Differential Equations in Image Analysis, Ph.D. course at ITU hosted by Mads
Nielsen, lectured by Bo Markussen, KVL, DK, Anne Cuzol, Rennes, FR, Gheorghe Postelnicu,
Boston, US, spring 2006, 3 ECTS.

Ongoing: Image Canon, Ph.D. seminar/study group at ITU organized by Ole Fogh Olsen, sessions headed
by different members of the Image Group at ITU, 2006, 7.5 ECTS.

Total 25 ECTS so far (32.5 with ongoing).

I intended to take the pedagogical course for Ph.D. students at ITU but did not have the time in August
2005 when it was offered. Now I have almost no course credits left to earn and do not expect to take the
course as planned earlier.

Independent Studies
Besides taking courses, an important part of a Ph. D. study is to follow up on the development within
your area of research by conference attendance, reading papers and other literature. Also some of the
background knowledge needed for my research might not be covered in available courses and must
therefore be obtained by reading relevant literature. The references [2], [3] and [5] – [8] are examples of
this.

Besides living in the little isolated world of my own research project, it is also important to study
literature within areas related to but not covered by my project, e.g. topics of video and image processing
not related to inpainting and video format conversions. Examples are the above-mentioned Ph.D. study
groups Pattern Recognition and Image Canon.

Presentational Requirements
According to my contract I am to do a certain amount of duty work at ITU. 560 hours are to be spent
teaching. Another 280 hours are to be spent on other non-administrative presentational work at ITU to
possibly relieve the scientific staff. So far I have been assigned to the committee that is to get the library
at ITU up and running at full scale, but I do not expect to spend much time on this, if the committee ever
gets any work to do. The use of the remainder of the 280 hours will be decided by Mads Nielsen. So far I
have also spent time attending group meetings, being a member of the PhD study board and being the
webmaster of the home page of the Image Group.
Teaching and Supervision of Student Projects
To meet the requirement of 560 hours of teaching I have taught and supervised:
