
Lecture 2

Multimedia Databases

Yonsei University
2nd Semester, 2009
Sanghyun Park
Contents
 Introduction to MMDBMS
 Motivation
 Content-based retrieval
 Generic MMDBMS structure

 Image Retrieval
Information Sciences, 178(22), pp. 4301-4313,
November 2008, Elsevier

 Conclusion
Motivation (1/2)
 Multimedia is a much more powerful communication tool
than traditional data in our daily life
 Image showcases, graphic design, TV commercials, speech,
movies, mobile phone multimedia messages, etc.

 There is an urgent need for more advanced systems for
organizing and managing these multimedia data types
 Traditional relational databases are ‘no longer’ suitable for
complex multimedia data
 Automatic and robust systems which produce, transmit,
analyze, manage, and search multimedia data in a reliable
way are required
Motivation (2/2)
 How much video data?
 5,660 motion pictures are produced every year, amounting to
almost 6,078 hours
 Large amount of accumulated stock
 Effective organization and tools are needed for browsing or
retrieving content of interest from such a huge amount of
video resources
Content-Based Retrieval
 It searches the database based on similarity of media
content
 Find similar images from the database
 Identify the position of a short query video in a long video
 Retrieve the most similar video clip w.r.t. the query clip

 It should support multiple features

 It should support spatial-temporal retrieval
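
As a rough illustration of searching by similarity of media content, the sketch below ranks stored feature vectors by their distance to a query vector. The Euclidean distance, the three-dimensional vectors, and the names `euclidean`, `most_similar`, and `database` are assumptions made for this example; the slides do not prescribe a particular similarity function.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def most_similar(query, database, k=2):
    """Return the names of the k stored items closest to the query vector."""
    ranked = sorted(database.items(), key=lambda item: euclidean(query, item[1]))
    return [name for name, _ in ranked[:k]]


# Hypothetical feature vectors (e.g. coarse color histograms), not from the slides.
database = {
    "img_a": [0.9, 0.1, 0.0],
    "img_b": [0.1, 0.8, 0.1],
    "img_c": [0.8, 0.2, 0.1],
}

result = most_similar([1.0, 0.0, 0.0], database, k=2)
print(result)
```

A linear scan like this is only workable for small collections, which is why the indexing structures discussed in the architecture matter for larger ones.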


A Generic Architecture of MMDBMS

[Figure: generic MMDBMS architecture. Labels: media object, compression, feature extraction, indexing, DBMS, search engine, query construction, query feature construction, feature query, MM result, feedback.]

 Media organization: organize the features for retrieval
(i.e., index the features with effective structures)

 Media query processing: an efficient search algorithm with a
similarity function should be designed to fit the indexing
structure
• Two categories of image retrieval
– Keyword-based: the current major technology
– Content-based

• Keyword-based image retrieval (KBIR)
– Uses the file name or keywords as the search condition
– Search accuracy depends on the quality of the keywords
– Annotating images with keywords is difficult

• Content-based image retrieval (CBIR)
– Uses visual features (e.g., color distribution)
– Good visual feature extraction is difficult
• Content-based image retrieval
– Extract visual features (e.g., color, texture, shape)
– QBIC: color, texture, sample images and sketches
– Virage: color layout, texture and the contour of an object
– VisualSEEK: top, bottom, right and left color relation

• Weak points of CBIR
– Expensive computation time
– Inconvenient search condition
• A text query is easier for the user to supply than an image
• Keyword-based image retrieval
– Use descriptive keywords as a search condition
– The image retrieval accuracy increases significantly if
the keywords describe an image accurately
– Cheng et al.: keyword by regional clustering
– Jeon et al.: semantic hierarchy for refining confidence
– ALIPR: real-time computerized annotation system based
on the pixel information

• Weak points of KBIR
– Expensive keyword annotation
• Relevance feedback
– To improve quality of image retrieval

• Previous usage of relevance feedback
– Modify the initial retrieval results or the search conditions
– Return more refined result(s)

• The motivation of our method
– Apply the user’s feedback to image database refinement
• Two parts in the proposed model
– Image collection: collect images into the image DB
– Image retrieval: search for the images that the user wants
Image I, as stored in the image database

• Visual features (VF1, VF2, …)
– Low-level features for the content-based search
– Example values: (1, 10, -9), (3, -7, 48), (0, -1, 11)

• Keywords and confidences
– Annotated keywords for the keyword-based search
– Confidence: the degree of relevance between the image and the keyword
– Example: bench 5, chair 3, dove 4, grass 2
[Figure: image collection. A new image is used as the query for content-based image retrieval; keywords and confidences are collected from the search results (PC 1.5, laptop 1.4, mouse 1.2, monitor 1.1, …, CD 0.6, tree 0.4, keyboard 0.1), and the mapping keywords for the new image are PC 1.5, laptop 1.4, mouse 1.2, monitor 1.1.]
[Figure: keyword-based image retrieval. The query keyword "Bench" is submitted and the search results are returned.]
Initial retrieval results (one line per retrieved image, listing its keywords and confidences)

– bench 4.0, bag 3.0, tree 1.2
– leather 2.5, bench 3.2, leaf 0.5
– chair 4.0, bench 2.1, black 3.2
– grass 2.0, bench 1.8, sun 0.5
– ship 3.0, bench 1.2, dove 2.4
– candy 2.1, animal 1.8, bench 0.8
– table 3.0, bench 0.5, laptop 2.7
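
The initial results appear to be ordered by the confidence of the query keyword "bench". A minimal sketch of that ranking follows, using the keyword and confidence values from the slide; the image names `image1` to `image7` and the function `keyword_search` are hypothetical, and sorting by descending confidence is an assumption inferred from the order of the results.

```python
# Keywords and confidences per image, copied from the slide.
# The image names are hypothetical labels for this example.
images = {
    "image1": {"bench": 4.0, "bag": 3.0, "tree": 1.2},
    "image2": {"leather": 2.5, "bench": 3.2, "leaf": 0.5},
    "image3": {"chair": 4.0, "bench": 2.1, "black": 3.2},
    "image4": {"grass": 2.0, "bench": 1.8, "sun": 0.5},
    "image5": {"ship": 3.0, "bench": 1.2, "dove": 2.4},
    "image6": {"candy": 2.1, "animal": 1.8, "bench": 0.8},
    "image7": {"table": 3.0, "bench": 0.5, "laptop": 2.7},
}


def keyword_search(db, keyword):
    """Return (image, confidence) pairs for images annotated with the
    keyword, highest confidence first."""
    hits = [(name, kws[keyword]) for name, kws in db.items() if keyword in kws]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


ranking = keyword_search(images, "bench")
for name, confidence in ranking:
    print(name, confidence)
```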

• Are the initial retrieval results sufficient?
– Yes, if automatic keyword annotation has high accuracy
– But automatic keyword annotation generally shows a low
accuracy level
• Relevance feedback : improve retrieval accuracy
– Positive image: images of interest to the user
– Negative image

[Figure: the initial retrieval results, with two of the images marked positive by the user.]

• Rearrangement of images
– The rearrangement order should be based on visual features
– Which visual feature plays a critical role in
distinguishing positive from negative images?
• Discrimination power
– Ex) a query keyword ‘forest’
• The user is likely to focus on color rather than shape or pattern
when giving feedback
[Figure: the retrieved images ranked separately by Visual feature 1, Visual feature 2, and Visual feature 3, with the two positive images marked in each ranking.]
• Discrimination Power of VFj (jth visual feature)

$$DP_j = \frac{Po_j + Ne_j}{N_p + N_n}$$

– $N_p$: # of positive images
– $N_n$: # of negative images
– $Po_j$: # of positive images among the top $N_p$ images
– $Ne_j$: # of negative images among the bottom $N_n$ images

• Weight of VFj

$$w_j = \frac{DP_j}{\sum_{k=1}^{n} DP_k}$$
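
The two definitions above can be sketched in a few lines of Python. This is a sketch under stated assumptions, not the authors' implementation: the numerator of the discrimination power is read as the sum of Po_j and Ne_j, every image that is not marked positive is treated as negative, and the rankings, image names, and function names are hypothetical.

```python
def discrimination_power(ranking, positives):
    """DP of one visual feature, given the images ranked by that feature.

    Assumption: every image not in `positives` counts as negative.
    """
    n_p = len(positives)
    n_n = len(ranking) - n_p
    # Positive images that landed in the top N_p positions.
    po = sum(1 for img in ranking[:n_p] if img in positives)
    # Negative images that landed in the bottom N_n positions.
    ne = sum(1 for img in ranking[n_p:] if img not in positives)
    return (po + ne) / (n_p + n_n)


def feature_weights(dps):
    """Normalize discrimination powers so that the weights sum to 1."""
    total = sum(dps)
    if total == 0:
        # No feature discriminates at all: fall back to equal weights.
        return [1.0 / len(dps)] * len(dps)
    return [dp / total for dp in dps]


# Hypothetical rankings of seven images under two visual features.
positives = {"image1", "image4"}
ranking_vf1 = ["image4", "image6", "image3", "image7", "image1", "image2", "image5"]
ranking_vf2 = ["image1", "image4", "image2", "image3", "image5", "image7", "image6"]

dp1 = discrimination_power(ranking_vf1, positives)
dp2 = discrimination_power(ranking_vf2, positives)
weights = feature_weights([dp1, dp2])
print(dp1, dp2, weights)
```

In this example the second feature places both positive images at the top, so it receives the larger weight.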
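Weighted scores of the retrieved images (figure labels: Image, Iavg). Each cell is value × weight; images are numbered in the initial retrieval order.

Image   VF1          VF2          SUM    Feedback
1       0.3 × 0.2    0.8 × 1.0    0.86   positive
2       0.3 × 0.2    0.5 × 1.0    0.56
3       0.4 × 0.2    0.5 × 1.0    0.58
4       0.8 × 0.2    0.6 × 1.0    0.76   positive
5       0.3 × 0.2    0.3 × 1.0    0.36
6       0.6 × 0.2    0.1 × 1.0    0.22
7       0.4 × 0.2    0.2 × 1.0    0.28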
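
The arithmetic in this example can be checked with a short script. The values and the two weights (0.2 and 1.0) are taken from the slide; the variable names and the step of sorting by descending sum are assumptions of this sketch.

```python
# (VF1 value, VF2 value) per image, in the initial retrieval order.
values = [
    (0.3, 0.8),
    (0.3, 0.5),
    (0.4, 0.5),
    (0.8, 0.6),
    (0.3, 0.3),
    (0.6, 0.1),
    (0.4, 0.2),
]
weights = (0.2, 1.0)  # weights of VF1 and VF2 as shown on the slide

# Weighted sum per image, rounded to two decimals as on the slide.
sums = [round(sum(v * w for v, w in zip(row, weights)), 2) for row in values]

# Image numbers (1-based, initial order) sorted by descending weighted sum.
rearranged = sorted(range(1, len(sums) + 1), key=lambda i: sums[i - 1], reverse=True)

print(sums)
print(rearranged)
```

Sorting by the weighted sum moves the two positive images (1 and 4) to the front of the list.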



Rearranged results (one line per image, listing its keywords and confidences)

– bench 4.0, bag 3.0, tree 1.2 (positive)
– grass 2.0, bench 1.8, sun 0.5 (positive)
– chair 4.0, bench 2.1, black 3.2
– leather 2.5, bench 3.2, leaf 0.5
– ship 3.0, bench 1.2, dove 2.4
– table 3.0, bench 0.5, laptop 2.7
– candy 2.1, animal 1.8, bench 0.8
Confidence modification

– bench 4.5, bag 3.0, tree 1.2 (positive)
– grass 2.0, bench 2.5, sun 0.5 (positive)
– chair 4.0, bench 2.3, black 3.2 (additional)
– leather 2.5, bench 3.2, leaf 0.5
– ship 3.0, bench 1.2, dove 2.4
– table 3.0, laptop 2.7 (bench no longer listed)
– candy 2.1, animal 1.8, bench 0.3

Motivation for the additional images: the amount of user feedback is too small
• Image database
– 9,281 images used in CalTech image research
– Each image has a suitable keyword (the answer keyword)
– Visual features of each image
• Extracted by MPEG-7 XM software
• Five features: color layout, color structure, homogeneous texture,
edge histogram, and region shape

• Training set
– 360 images (about 4% of the total number of images)
• Test set
– 8,921 images
• Parameter decision
– ThresholdSize: # of additional images
• Growth rate of recall

$$\text{Growth rate of recall} = \frac{RE(\text{ExtendedFeedback}) - RE(\text{NaiveFeedback})}{RE(\text{NaiveFeedback})} \times 100$$

• Growth rate of precision

$$\text{Growth rate of precision} = \frac{PR(\text{ExtendedFeedback}) - PR(\text{NaiveFeedback})}{PR(\text{NaiveFeedback})} \times 100$$
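
Both growth rates share the same form, so one helper covers recall and precision. The function name `growth_rate` and the recall figures in the example are hypothetical and are not results from the experiments.

```python
def growth_rate(extended, naive):
    """Relative improvement of extended feedback over naive feedback, in percent."""
    if naive == 0:
        raise ValueError("naive-feedback value must be non-zero")
    return (extended - naive) / naive * 100


# Hypothetical recall values, chosen only to illustrate the formula.
recall_growth = growth_rate(extended=0.75, naive=0.5)
print(recall_growth)
```

A positive value means that extended feedback retrieved better results than naive feedback on that measure, and a negative value means it did worse.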
Conclusion
 Multimedia is a powerful tool for communication

 To increase retrieval accuracy, it is better to combine
keyword-based search, content-based search, and
relevance feedback

 High dimensional index problem


 Feature extraction
 Dimensionality reduction
