
Digital Image Processing

By Dr. M. Arfan Jaffar

Semester Projects

Instructions
Semester projects are part of the digital image processing course. A project can be carried
out individually or in a group of two, three, or four students. The workload of the project will
be approximately 70 hours per student, and each group will be required to justify its workload.
Projects will be allocated on a first-come, first-served basis. Groups will be required to submit:
1. Names/registration numbers of group members
2. Titles of three projects from the list given below (in order of preference, the first being
the most preferred).
Students not meeting the above deadline will be required to carry out the projects
allocated to them by the instructor.

List of Projects offered

1. Detecting Pornographic Video Content


With the rise of large-scale digital video collections, automatically detecting adult video
content has gained significant importance for applications such as content filtering and the
detection of illegal material. Most systems represent videos with key frames and then apply
techniques well known for static images. Investigating motion as an additional
discriminative cue for pornography detection can yield higher accuracy rates.

2. Automatic Video Tagging Using Content Redundancy


The analysis of the leading social video sharing platform YouTube reveals a high amount
of redundancy, in the form of videos with overlapping or duplicated content. This
redundancy can provide useful information about connections between videos.
Automatic analysis can reveal these links using robust content-based video analysis
techniques and exploit them to generate new tag assignments. Different tag propagation
methods are needed to obtain richer video annotations automatically. These methods can
provide the user with additional information about videos, and lead to enhanced feature
representations for applications such as automatic data organization and search.

3. Tube Filer: An Automatic Web Video Categorizer


While hierarchies are powerful tools for organizing content in other application areas,
current web video platforms offer only limited support for taxonomy-based browsing.
Automatic multimodal categorization of videos into a genre hierarchy, and support for
additional fine-grained hierarchy levels based on unsupervised learning, are desirable.

4. A Real-Time Video Automatic License Plate Recognition (M-ALPR)


The road transport departments of countries around the world endorse a specification
for car plates, including the font and size of characters, that must be followed by car
owners. However, there are cases where this specification is not followed. A new
methodology to segment and recognize license plates automatically needs to be
developed. The new system must be able to segment plates of different lengths, i.e.,
plates with different numbers of characters and digits.

5. Face Recognition from Video


In most paradigms of face recognition it is assumed that while a set of training images is
available for each individual in the database, the input (test data) consists of a single
shot. However, in many scenarios the recognition system has access to a set of face
images of the person to be recognized. We want to use this fact to do a better job in
recognition. This can be viewed in the framework of a general problem in classification:
given sets of observations from each class and a set of observations from an unknown
class, what is the best way to label this new set? In other words, how does one compare
sets of observations?
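As a starting point, one minimal baseline for set-to-set comparison is the minimum pairwise distance between feature vectors, sketched below with NumPy on hypothetical toy feature data; stronger methods compare subspaces or distributions of the sets instead.

```python
import numpy as np

def set_distance(set_a, set_b):
    """Minimum pairwise Euclidean distance between two sets of
    feature vectors (rows): a simple baseline for set-to-set matching."""
    # Pairwise differences via broadcasting: one slice per (a, b) pair.
    diffs = set_a[:, None, :] - set_b[None, :, :]
    return np.sqrt((diffs ** 2).sum(axis=2)).min()

def classify_set(test_set, gallery):
    """Label a test set with the gallery identity whose stored image
    set is closest under set_distance."""
    return min(gallery, key=lambda name: set_distance(test_set, gallery[name]))

# Toy example: two identities with well-separated feature clusters.
rng = np.random.default_rng(0)
gallery = {
    "alice": rng.normal(0.0, 0.1, size=(5, 8)),
    "bob": rng.normal(1.0, 0.1, size=(5, 8)),
}
test = rng.normal(1.0, 0.1, size=(3, 8))  # drawn near "bob"
print(classify_set(test, gallery))  # expected: bob
```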

6. Gunshot Location and Surveillance


Approximately 60 to 80 percent of urban gunfire is never reported by citizens. When
citizens do call in, the average delay between the gunfire and the call is 2.5 minutes. And
reports often include imprecise information about shot direction, address, and number
of shooters, hampering law enforcement’s efforts to protect the public and apprehend
the suspect. A gunshot detection and location system relies on acoustic sensors and
GPS. The GPS coordinates that the system calculates can be sent to video surveillance
cameras that slew to the incident scene and nearby streets. The system also
automatically alerts dispatch and call centers in seconds, providing precise details
including the exact map location, nearest street address, indications of the number of
shooters, number of rounds fired, and types of weapons discharged. Complete
information provides the situational awareness that law enforcement needs to assess
the severity of the incident.

7. Perimeter Security and Asset Protection


Public safety agencies frequently need to lock down areas and establish clear and
defined perimeters. Examples include data centers, petroleum refineries, water
treatment facilities, police stations and other public buildings, railroad tracks, and
airfield areas of operations. Examples of temporary perimeters include sports and
concert venues, construction lay-down yards, hazmat incident scenes, and train wrecks.
Inadequate perimeter security on government and private buildings can be costly. For
example, damage to critical infrastructure can affect the entire community. And the
theft of tools and other materials in lay-down yards for building projects has been
estimated to range from 5 to 10 percent. The cost of building delays resulting from theft
generally far exceeds the cost of the assets themselves. Automated notification of law
enforcement in response to perimeter breaches can help to prevent theft in lay-down
yards, underscoring the value of public-private partnerships.

Law enforcement agencies can protect perimeters and assets at low cost by connecting
wired or wireless video surveillance cameras anywhere they cannot or do not want to
erect a fence. For asset protection, agencies can use video surveillance cameras to
create a buffer zone that extends from the fence line to the protected area surrounding
the asset. This gives law enforcement additional time to take action before the intruder
leaves the vicinity.
8. Learning Static Object Segmentation From Motion
Segmentation
Image segmentation is the discovery of salient regions in static images and has a history
reaching back to the Gestalt psychologists. There are many computer vision approaches
to this problem, but they are difficult to compare because there is no readily accessible
ground truth. In recent years, this situation has improved as researchers have used
human-segmented data to train and test boundary detection algorithms.

9. Vision Interfaces
This project develops algorithms and systems for perceptive interfaces, which enable
users to interact with machines using natural expression and gesture, and also allow
machines to understand a user's physical environment. Computer vision algorithms can
support two very useful forms of interaction: first, enabling machines to interact with
people through multimodal conversation, and second, allowing devices to recognize
objects of interest to a user and provide situated search for information about those
objects. Enabling machines to understand multimodal communication and reference is
extremely valuable in many application areas.

10. A Hybrid Approach to the Skull Stripping Problem in MRI


Stripping the skull and other non-brain tissues from the structural images of the head is
a challenging and critical component for a variety of post-processing tasks. Large
anatomical variability among brains, different acquisition methods, and the presence of
artifacts increase the difficulty of designing a robust algorithm; thus, current techniques
are often error-prone and require manual intervention.

11. An Efficient Dense and Scale-Invariant Spatio-Temporal Interest Point Detector

As video becomes a ubiquitous source of information, video analysis and action
recognition have received a lot of attention lately. In this context, local viewpoint-
invariant features, so successful in the field of object recognition and image matching,
have been extended to the spatio-temporal domain. These extensions take the 3D
nature of video data into account and localize features not only spatially but also over
time.
12. An innovative algorithm for key frame extraction in video
summarization
The growing interest of consumers in the acquisition of and access to visual information
has created a demand for new technologies to represent, model, index and retrieve
multimedia data. Very large databases of images and videos require efficient algorithms
that enable fast browsing and access to the information sought. In the case of videos,
in particular, much of the visual data offered are simply redundant, and we must find a
way to retain only the information strictly needed for functional browsing and querying.
Video summarization, aimed at reducing the amount of data that must be examined in
order to retrieve a particular piece of information in a video, is consequently an
essential task in applications of video analysis and indexing.
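One simple key-frame criterion, sketched below with NumPy on a synthetic two-shot clip, keeps a frame whenever its gray-level histogram differs sharply from that of the last selected key frame; real summarizers use richer features, and the threshold here is an illustrative choice.

```python
import numpy as np

def key_frames(frames, bins=16, threshold=0.3):
    """Pick frame i as a key frame when its gray-level histogram differs
    sharply (L1 distance) from the last selected key frame's histogram."""
    def norm_hist(frame):
        h, _ = np.histogram(frame, bins=bins, range=(0, 256))
        return h / h.sum()
    keys = [0]                      # the first frame is always kept
    ref = norm_hist(frames[0])
    for i in range(1, len(frames)):
        h = norm_hist(frames[i])
        if np.abs(h - ref).sum() > threshold:   # L1 distance, in [0, 2]
            keys.append(i)
            ref = h
    return keys

# Toy clip: 10 dark frames then 10 bright frames (a single hard cut).
rng = np.random.default_rng(1)
video = np.concatenate([rng.integers(0, 60, size=(10, 32, 32)),
                        rng.integers(180, 255, size=(10, 32, 32))])
print(key_frames(video))  # expected: [0, 10]
```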

13. Automatic Face Naming with Caption-based Supervision


Over the last decades, large digital multimedia archives have appeared through
digitization efforts by broadcasting services, through news-oriented media publishing
online, and through user-provided content concentrated on websites such as YouTube
and Flickr. Ongoing efforts are directed at developing methods to allow access to such
archives in a user-oriented and semantically-meaningful way. The volume of data in
such archives is generally large, and the semantic concepts of interest differ greatly
between different archives. As a result, there is a great interest in ‘unsupervised’
systems for automatic content analysis in such archives. These contrast with
‘supervised’ systems which require manual annotations to link content to semantic
concepts.

14. Cross-Media Alignment of Names and Faces


Aligning names and faces as found in images and captions of online news websites is a
laborious task. Developing accurate technologies for linking names and faces is valuable
when retrieving or mining information from multimedia collections.

Such cross-media alignment brings a better understanding of the cross-media


documents, as it couples the different sources of information and makes it possible to
resolve ambiguities that may arise from single-medium analysis (e.g., confusion
between the senior and junior George Bush). At the same time, it builds a cross-media
model for each person in a fully unsupervised manner, which in turn allows the system
to name the faces appearing in new images (with or without captions) or to show a
picture of the people mentioned in new texts. Because there are usually several names
mentioned in the text and several faces shown in the image, and not all of the names
have a corresponding face and vice versa, there are many possible alignments to choose
from, making cross-media linking a non-trivial problem. However, when analyzing a large
corpus of cross-media stories (images with captions), the repeated re-occurrence of
particular face-name pairs provides evidence that they might indeed be linked to the
same person. This is based on the assumption that the two modalities are correlated at
least to some extent, a reasonable assumption for news stories where both modalities
describe the same event.

15. Detecting Objects in Large Image Collections and Videos by Efficient Subimage Retrieval

From an image retrieval user's point of view, the possibility of object-based instead of
image-based queries is a clear benefit, because the relevance of images is no longer
affected by changes in image viewpoint or background clutter. However, most image
retrieval systems internally rely on purely global image representations, and they are
not able to handle queries that match only small regions within the images. Object
localization methods, on the other hand, are in principle capable of answering the
question of whether and where an object occurs in an image, but they are typically
overburdened when having to deal with very many candidate images, because they do
not scale well in terms of runtime and memory usage. Consequently, most existing
systems for object-based image retrieval either achieve high detection accuracy but
work only for small image collections, or they can handle large sets of candidate images
but are limited in the types of local queries they can answer.

16. Evaluation of Local Spatio-Temporal Features for Action Recognition

Local image and video features have been shown successful for many recognition tasks
such as object and scene recognition, as well as human action recognition. Local space-
time features capture characteristic shape and motion in video and provide a relatively
independent representation of events with respect to their spatio-temporal shifts and
scales, as well as background clutter and multiple motions in the scene. Such features
are usually extracted directly from video and therefore avoid possible failures of other
pre-processing methods such as motion segmentation and tracking.

17. Learning Color Names from Real-World Images


Color names are linguistic labels that humans attach to colors. We use them routinely
and seemingly without effort to describe the world around us. They have been primarily
studied in the fields of visual psychology, anthropology and linguistics. Color naming is
different from the thoroughly explored field of color imaging, where the main goal is to
decide, given an acquisition of an object with a certain color, if objects in other
acquisitions have the same (or a different) color. Based on physical or statistical models
of reflection and acquisition systems object colors can be described independent of
incidental scene events such as illuminant color and viewing angle. The research
question of color naming is different: given a color measurement, the algorithm should
predict with which color name humans would describe it. Color naming also enables
different functionalities; for example, within a content-based retrieval context it allows
the search to be steered toward objects of a certain color name. The user might query
an image search engine for "red cars": the system recognizes the color name "red" and
orders the retrieved "car" results based on their resemblance to the human usage of
"red". Apart from the retrieval task, color names are applicable to automatic content
labeling of images, colorblind assistance, and linguistic human-computer interaction.
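A minimal illustration of the prediction step is nearest-neighbour assignment to a handful of hand-picked RGB prototypes; the prototype values below are assumptions for the sketch only, whereas the project would learn color-name models from real-world images, preferably in a perceptual space such as Lab.

```python
import numpy as np

# Illustrative (hand-picked) color-name prototypes in RGB space.
PROTOTYPES = {
    "red":   (220, 30, 30),
    "green": (30, 180, 60),
    "blue":  (40, 60, 210),
    "white": (245, 245, 245),
    "black": (15, 15, 15),
}

def color_name(rgb):
    """Assign the nearest prototype name to an RGB measurement
    (nearest neighbour in RGB; a perceptual space would work better)."""
    rgb = np.asarray(rgb, dtype=float)
    return min(PROTOTYPES,
               key=lambda n: np.linalg.norm(rgb - np.asarray(PROTOTYPES[n], dtype=float)))

print(color_name((200, 40, 50)))   # expected: red
print(color_name((20, 30, 180)))   # expected: blue
```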

18. Digital Image/Video Watermarking


The revolution in digital information has brought about profound changes in our society.
Many of the advantages of digital information have also generated new challenges and
new opportunities for innovation. Authenticating digital information with sufficient
imperceptibility and high detection resolution is a challenge of today's research. The
objective of image authentication is to reject malicious manipulations and accept
content-preserving manipulations, for which the traditional cryptographic signature may
not be suitable. Several watermarking techniques have been proposed to authenticate
digital images.

Applications of watermarking based authentication include trusted cameras, video


surveillance, digital insurance claim evidence, journalistic photography, and digital rights
management systems. It can be used commercially, such as GeoVision’s GV-Series digital
video recorders for digital video surveillance to prevent tampering. It can also be used
for real time services such as broadcast monitoring and security in communication.

19. Face Recognition:


The aim of this project is to locate faces in a given image. The image can contain one
or more faces (or, in some cases, no faces). This is unlike a face identification system,
in which the aim is to recognize a person from his or her photographs.

20. Barcode Reading:


We are all familiar with barcodes printed on various supermarket products. The project
involves decoding the barcode on a typical market product from a digital image of the
barcode.
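The core low-level step is recovering the bar and space widths from one scanline of the image. A NumPy sketch, using an assumed fixed binarization threshold; a real reader would also normalize the widths and decode them against the symbology's code tables:

```python
import numpy as np

def bar_widths(scanline, threshold=128):
    """Binarize one scanline of a barcode image and return the run
    lengths of alternating bars (dark) and spaces (light)."""
    bits = (np.asarray(scanline) < threshold).astype(int)  # 1 = dark bar
    # Indices where the value changes mark run boundaries.
    edges = np.flatnonzero(np.diff(bits)) + 1
    runs = np.diff(np.concatenate(([0], edges, [len(bits)])))
    return bits[0], runs  # first value says whether we start on a bar

# Toy scanline: 3-wide bar, 2-wide space, 4-wide bar, 1-wide space.
line = [0] * 3 + [255] * 2 + [0] * 4 + [255] * 1
starts_dark, runs = bar_widths(line)
print(int(starts_dark), runs.tolist())  # expected: 1 [3, 2, 4, 1]
```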

21. Fingerprint Identification:


Given the fingerprint images of a person, the major task of this project is to
identify/match them with images in a given set of fingerprint images and to find the
best match.

22. Fuzzy Image Enhancement:


We have studied image enhancement using conventional mathematical/statistical
techniques. This project involves the use of fuzzy logic to solve such problems.
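One classical fuzzy approach that could serve as a baseline is contrast intensification in the style of Pal and King: fuzzify gray levels to memberships in [0, 1], push memberships away from 0.5 with the INT operator, and defuzzify back to gray levels. A minimal NumPy sketch:

```python
import numpy as np

def fuzzy_intensify(img, iterations=1):
    """Contrast enhancement by fuzzy intensification: map gray levels
    to memberships, push them away from 0.5, and map back."""
    mu = np.asarray(img, dtype=float) / 255.0            # fuzzification
    for _ in range(iterations):
        low = mu <= 0.5
        mu = np.where(low, 2 * mu**2, 1 - 2 * (1 - mu)**2)  # INT operator
    return (mu * 255).astype(np.uint8)                   # defuzzification

img = np.array([[64, 128], [192, 255]], dtype=np.uint8)
out = fuzzy_intensify(img)
print(out)  # dark pixels get darker, bright pixels get brighter
```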

23. Fuzzy Image Restoration:


Image restoration is the process of bringing degraded images back to their original form.
The traditional techniques are discussed in the course; this project makes use of fuzzy
techniques for restoration.

24. Image Fusion:


Sometimes two images of the same scene are available in different spectral ranges, e.g., in
the visible and infrared frequency bands. Taken individually, each image lacks information
that may be available in the other. Image fusion combines such multi-spectral images to
form a single image that is more informative than the individual images.
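As a baseline, assuming the two images are already registered, a pixel-wise choose-max rule keeps the stronger response from either band; pyramid- or wavelet-based fusion is the usual refinement. A NumPy sketch on toy data:

```python
import numpy as np

def fuse_max(img_a, img_b):
    """Naive pixel-wise maximum fusion of two registered grayscale
    images, keeping the stronger response from either band."""
    a = np.asarray(img_a, dtype=float)
    b = np.asarray(img_b, dtype=float)
    return np.maximum(a, b).astype(np.uint8)

visible = np.array([[200, 10], [10, 10]], dtype=np.uint8)   # detail top-left
infrared = np.array([[10, 10], [10, 180]], dtype=np.uint8)  # detail bottom-right
fused = fuse_max(visible, infrared)
print(fused)  # keeps both the visible and the infrared detail
```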

25. Fourier Descriptors for Leaf Classification:


The method of Fourier descriptors provides a way of identifying leaves from their images.
The advantage of Fourier descriptors is that they are scale, translation, and rotation
invariant.
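A minimal sketch of the descriptor computation with NumPy, treating contour points as complex numbers x + iy: dropping the DC term gives translation invariance, taking magnitudes removes rotation and starting-point effects, and dividing by the first harmonic normalizes scale. The toy rectangle contour is illustrative only; a real system would sample the leaf boundary densely.

```python
import numpy as np

def fourier_descriptors(contour, n_coeffs=3):
    """Shape descriptor from a closed contour given as complex points
    x + iy, invariant to translation, rotation, start point, and scale."""
    coeffs = np.fft.fft(np.asarray(contour, dtype=complex))
    coeffs[0] = 0.0        # drop DC term: translation invariance
    mags = np.abs(coeffs)  # drop phase: rotation/start-point invariance
    mags = mags / mags[1]  # normalize by first harmonic: scale invariance
    return mags[2:2 + n_coeffs]

# A rectangle contour, then the same rectangle scaled, rotated, and shifted.
rect = np.array([2+1j, 0+1j, -2+1j, -2-1j, 0-1j, 2-1j])
moved = 3.0 * rect * np.exp(1j * 0.7) + (5 - 2j)
d1 = fourier_descriptors(rect)
d2 = fourier_descriptors(moved)
print(np.allclose(d1, d2))  # expected: True
```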

26. Restoration of Bound Document Images:


Distortions and degradations take place during photo or document scanning, especially
when the document is bound at one side. The project will involve exploring different DIP
techniques for restoring such images.

27. Automatic Coin Counter:


Many machines, such as automatic vending outlets, accept coins. You will be given an
image of different coins (some of which may be overlapping). We intend to count the
coins automatically from this image using DIP techniques.
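For the non-overlapping case, counting reduces to labeling connected foreground components after thresholding; a pure-NumPy flood-fill sketch on a toy binary image is below. Overlapping coins would instead call for something like a circular Hough transform.

```python
import numpy as np

def count_blobs(binary):
    """Count connected foreground regions (4-connectivity) with an
    iterative flood fill: a baseline counter for non-overlapping coins."""
    img = np.asarray(binary).astype(bool).copy()
    count = 0
    for r in range(img.shape[0]):
        for c in range(img.shape[1]):
            if img[r, c]:
                count += 1
                stack = [(r, c)]
                while stack:  # erase this whole blob
                    y, x = stack.pop()
                    if 0 <= y < img.shape[0] and 0 <= x < img.shape[1] and img[y, x]:
                        img[y, x] = False
                        stack += [(y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)]
    return count

coins = np.zeros((8, 8), dtype=int)
coins[1:3, 1:3] = 1   # coin 1
coins[5:7, 4:7] = 1   # coin 2
print(count_blobs(coins))  # expected: 2
```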

28. Counting Chocolate Chip Cookies:


To automate the process of industrial inspection, digital image processing (DIP)
techniques are widely being used. Food inspection is one such application of DIP
techniques. This project involves the counting and locating of chocolate chips in the
images of chocolate chip cookies.

29. Digital Image Watermarking:


Watermarking has been used over many centuries for authentication purposes. Some
secret information (which could itself be an image) can be hidden in another image
without making any apparent (perceptual) change to this cover image. Information
about the true owner of an image can be hidden within it to protect the owner's
authorization rights. The problem is becoming more pressing with the widespread use
of multimedia content on digital devices and its preparation and transmission over the
internet. The project team is expected to implement at least one hidden watermarking
technique in this project.
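A minimal (and deliberately fragile) example of hiding data imperceptibly is least-significant-bit embedding, sketched below with NumPy; robust schemes embed in transform coefficients instead.

```python
import numpy as np

def embed_lsb(cover, watermark_bits):
    """Hide a bit string in the least significant bits of the first
    len(bits) pixels of a grayscale cover image."""
    stego = np.asarray(cover, dtype=np.uint8).copy()
    flat = stego.ravel()
    for i, bit in enumerate(watermark_bits):
        flat[i] = (flat[i] & 0xFE) | bit   # clear the LSB, then set it
    return stego

def extract_lsb(stego, n_bits):
    """Read the hidden bits back out of the stego image."""
    return [int(p) & 1 for p in stego.ravel()[:n_bits]]

cover = np.full((4, 4), 200, dtype=np.uint8)
bits = [1, 0, 1, 1, 0, 0, 1, 0]
stego = embed_lsb(cover, bits)
print(extract_lsb(stego, len(bits)))  # expected: [1, 0, 1, 1, 0, 0, 1, 0]
# Each pixel changes by at most 1 gray level, hence the imperceptibility.
print(int(np.abs(stego.astype(int) - cover.astype(int)).max()))  # expected: 1
```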
30. Image Colorization:
Given a gray-scale image, the project team will develop and implement various
techniques to colorize such images to make them realistically colorful.

31. Photo-Stitching (Creating Panoramic Views from Multiple Views):

Scenic photographs are generally taken at various orientations and angles. We plan to
stitch such images in such a way that a panoramic view is created from these multiple
images. The process of automatic image registration will be applied in this project.
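Under the simplifying assumption of a purely horizontal translation between two already-exposure-matched images, registration reduces to finding the overlap width that minimizes the squared difference; a NumPy sketch on toy data is below. Real panoramas need feature matching and homography estimation.

```python
import numpy as np

def stitch_horizontal(left, right, max_overlap=None):
    """Stitch two images that overlap horizontally: pick the overlap
    width minimizing the mean squared difference between the right edge
    of `left` and the left edge of `right`, then concatenate."""
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    max_overlap = max_overlap or min(left.shape[1], right.shape[1])
    best_w, best_err = 1, np.inf
    for w in range(1, max_overlap + 1):
        err = np.mean((left[:, -w:] - right[:, :w]) ** 2)
        if err < best_err:
            best_err, best_w = err, w
    return np.hstack([left, right[:, best_w:]])

# Toy scene: two crops of one image with a 3-column overlap.
base = np.arange(40, dtype=float).reshape(4, 10)
left, right = base[:, :7], base[:, 4:]
pano = stitch_horizontal(left, right)
print(pano.shape)                        # expected: (4, 10)
print(bool(np.array_equal(pano, base)))  # expected: True
```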

32. Image Morphing:


Given two images, image morphing starts from one image and gradually transforms it
into the second while generating and displaying the in-between images. Different
morphing techniques will be used in this project.
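The intensity-blending half of morphing is a cross-dissolve; the sketch below generates in-between frames with NumPy, while a full morph would additionally warp geometry between corresponding feature points.

```python
import numpy as np

def cross_dissolve(img_a, img_b, steps=5):
    """Generate in-between frames by blending pixel intensities:
    frame t is (1 - t) * A + t * B for t from 0 to 1."""
    a = np.asarray(img_a, dtype=float)
    b = np.asarray(img_b, dtype=float)
    return [((1 - t) * a + t * b).astype(np.uint8)
            for t in np.linspace(0.0, 1.0, steps)]

start = np.zeros((2, 2), dtype=np.uint8)
end = np.full((2, 2), 200, dtype=np.uint8)
frames = cross_dissolve(start, end, steps=5)
print([int(f[0, 0]) for f in frames])  # expected: [0, 50, 100, 150, 200]
```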

33. Forensic Image Processing:


Images taken at a crime scene are generally masked behind strong backgrounds. The
extraction of key information such as fingerprints, foot or shoe prints, handwritten
scripts etc., becomes very difficult in such circumstances. To help the investigators in
such circumstances, image enhancement techniques are used. This project involves the
study and implementation of such techniques.

34. Removal of Blocking Effects in Compressed Images:


Some image compression techniques (e.g., JPEG compression) lead to blocking effects in
the compressed images. This project aims to investigate and implement methods to
remove these blocking effects from the compressed images.
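A crude baseline, assuming the standard 8x8 JPEG block grid, is to smooth just the pixel pairs straddling each block boundary; a NumPy sketch on a synthetic two-block image:

```python
import numpy as np

def deblock(img, block=8):
    """Soften block artifacts by averaging each boundary pixel toward
    the mean of the pixel pair straddling the block edge."""
    out = np.asarray(img, dtype=float).copy()
    h, w = out.shape
    for x in range(block, w, block):      # vertical block boundaries
        avg = (out[:, x - 1] + out[:, x]) / 2
        out[:, x - 1] = (out[:, x - 1] + avg) / 2
        out[:, x] = (out[:, x] + avg) / 2
    for y in range(block, h, block):      # horizontal block boundaries
        avg = (out[y - 1, :] + out[y, :]) / 2
        out[y - 1, :] = (out[y - 1, :] + avg) / 2
        out[y, :] = (out[y, :] + avg) / 2
    return out.astype(np.uint8)

# Two flat 8x8 blocks with a hard step between them.
img = np.hstack([np.full((8, 8), 100), np.full((8, 8), 140)]).astype(np.uint8)
out = deblock(img)
# The 40-level step at the boundary is halved to 20.
print(int(img[0, 7]), int(img[0, 8]), "->", int(out[0, 7]), int(out[0, 8]))
```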

35. Research Oriented Projects:


A number of papers related to various DIP areas are available. Some details of these
projects are given in class. The aim of these projects is to implement one of the
conference/journal papers published in recent years.
Project Deliverables
Will be announced later.

Presentation Schedule
The schedule of the presentations will be announced after the allocation of projects.
