INFORMATION RETRIEVAL SYSTEMS
III B.TECH - I SEMESTER

COMPUTER SCIENCE AND ENGINEERING

SVSV PRASAD SANABOINA

UNIT-V

Information Visualization
The primary focus on Information Retrieval Systems has been in the
areas of indexing, searching and clustering versus information
display.

This has been due to the inability of technology to provide the
technical platforms needed for sophisticated display, academics
focusing on the more interesting algorithm-based search aspects of
information retrieval, and the multi-disciplinary nature of the
human-computer interface (HCI).

The core technologies needed to address sophisticated information
visualization have matured, supporting productive research and
implementation into commercial products.
• modify representations of data and information or the display
condition (e.g., changing color scales)

• use the same representation while showing changes in data (e.g.,
moving between clusters of items showing new linkages)

• animate the display to show changes in space and time

• enable interactive input from the user to allow dynamic movement
between information spaces and allow the user to modify data
presentation to optimize personal preferences for understanding the
data
If information retrieval had achieved development of the perfect
search algorithm providing close to one hundred per cent precision
and recall, the need for advances in information visualization would
not be so great.

Thus, any technique that can reduce the user overhead of finding
the needed information will supplement algorithmic
achievements in finding potential relevant items. Information
Visualization addresses how the results of a search may be optimally
displayed to the users to facilitate their understanding of what the
search has provided and their selection of most likely items of
interest to read.
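
The precision and recall figures mentioned above can be made concrete
with a small sketch. It assumes the standard set-based definitions
(precision is the fraction of retrieved items that are relevant, recall
is the fraction of relevant items that are retrieved); the function
name and the document identifiers are hypothetical and used only for
illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query.

    retrieved: item ids returned by the search (assumed input format)
    relevant:  item ids judged relevant to the query
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = len(retrieved & relevant)  # relevant items that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical result list and relevance judgments.
retrieved_ids = ["d1", "d2", "d3", "d4"]
relevant_ids = ["d2", "d4", "d7"]

p, r = precision_recall(retrieved_ids, relevant_ids)
print(f"precision={p:.2f} recall={r:.2f}")
```

In this example half of the retrieved items are relevant and one
relevant item is missed, which is the kind of gap the user must close
by inspecting the displayed results.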
Visual displays can consolidate the search results into a form easily
processed by the user’s cognitive abilities, but in general they do
not answer the specific retrieval needs of the user other than
suggesting database coverage of the concept and related concepts.

https://www.youtube.com/watch?v=xAm2yiYrHDI
Introduction to Information Visualization
The theory of visualization began over 2400 years ago.

The philosopher Plato discerned that we perceive objects through
the senses, using the mind. Our perception of the real world is a
translation from physical energy from our environment into
encoded neural signals.

The mind is continually interpreting and categorizing our
perception of our surroundings.
Use of a computer is another source of input to the mind’s
processing functions. Text-only interfaces reduce the complexity
of the interface but also restrict use of the more powerful
information processing functions the mind has developed since
birth.

Information visualization is a relatively new discipline growing
out of the debates in the 1970s on the way the brain processes and
uses mental images. It required significant advancements in
technology and information retrieval techniques to become a
possibility.

One of the earliest researchers in information visualization was
Doyle, who in 1962 discussed the concept of “semantic road
maps” that could provide a user with a view of the whole database.
The road maps show the items that are related to a specific
semantic theme. The user could use this view to focus a query on
a specific semantic portion of the database. The concept was
extended in the late 1960s, emphasizing a spatial organization
that maps to the information in the database.

In the 1990s technical advancements along with exponential
growth of available information moved the discipline into practical
research and commercialization. Information visualization
techniques have the potential to significantly enhance the user’s
ability to minimize resources expended to locate needed
information.
The way users interact with computers changed with the
introduction of user interfaces based upon Windows, Icons,
Menus, and Pointing devices (WIMPs).

Although a move in the right direction toward a more natural human
interface, these technologies still required humans to perform
activities optimized for the computer to understand. A
better approach was stated by Donald A. Norman (Rose-96
a. reduce the amount of time to understand the results of a search
and likely clusters of relevant information

b. yield information that comes from the relationships between
items versus treating each item as independent

c. perform simple actions that produce sophisticated information
search functions
Information Visualization Technologies
• The theories associated with information visualization are being
applied in commercial and experimental systems to determine the
best way to improve the user interface, facilitating the localization
of information.

• They have been applied to many different situations and
environments (e.g., from weather forecasting to architectural design).

• The goals for displaying the results of searches fall into two
major classes:
  • document clustering and
  • search statement analysis.
• The goal of document clustering is to present the user with a
visual representation of the document space constrained by the
search criteria.
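
As a rough illustration of the document clustering goal, the sketch
below groups a hit list into clusters that a display could then draw
as regions of the document space. It is a minimal sketch, not the
method of any system cited here: it assumes raw term counts as the
document representation, cosine similarity as the measure, a simple
single-pass grouping, and an arbitrary threshold of 0.3. All function
names and the sample hits are hypothetical.

```python
import math
from collections import Counter


def term_vector(text):
    """Bag-of-words term counts (assumption: whitespace tokenization)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_hits(hits, threshold=0.3):
    """Single-pass clustering of a hit list.

    Each hit joins the first cluster whose seed document is at least
    `threshold` similar; otherwise it starts a new cluster.
    Returns a list of clusters, each a list of hit indices.
    """
    clusters = []  # list of (seed_vector, member_indices)
    for i, text in enumerate(hits):
        vec = term_vector(text)
        for seed, members in clusters:
            if cosine(seed, vec) >= threshold:
                members.append(i)
                break
        else:
            clusters.append((vec, [i]))
    return [members for _, members in clusters]


# Hypothetical hit list returned by a search.
hits = [
    "weather forecasting with satellite images",
    "satellite images for weather forecasting models",
    "architectural design of office buildings",
    "office buildings and architectural design trends",
]
print(cluster_hits(hits))
```

A display layer could place each cluster in its own area of the screen
so the user sees at a glance that the hits fall into two themes.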
Multimedia Information Retrieval

Definition: Multimedia information retrieval is the process of
satisfying a user’s stated information need by identifying all
relevant text, graphics, audio (speech and non-speech audio),
imagery, or video documents or portions of documents from a
document collection.
Multimedia Information Retrieval

• Spoken Language Audio Retrieval
• Non-Speech Audio Retrieval
• Graph Retrieval
• Imagery Retrieval
• Video Retrieval
Spoken Language Audio Retrieval
• Just as a user may wish to search the archives of a large text
collection, the ability to search the content of audio sources
such as speeches, radio broadcasts, and conversations would be
valuable for a range of applications.

• An assortment of techniques has been developed to support
the automated recognition of speech (Waibel and Lee 1990).
These are applicable to a range of areas such as
speaker verification, transcription, and command and control.
For example, Jones et al. (1997) report a comparative evaluation
of speech and text retrieval in the context of the Video Mail
Retrieval (VMR) project.
• While speech transcription word error rates may be high (as
much as 50% or more depending upon the source, speaker,
dictation vs. conversation, environmental factors and so on),
redundancy in the source material helps offset these error rates
and still supports effective retrieval. In Jones et al.’s speech/text
comparative experiments, using standard information retrieval
evaluation techniques, speaker-dependent techniques retain
approximately 95% of the performance of retrieval of text
transcripts, and speaker-independent techniques about 75%.
However, system scalability remains a significant challenge.
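
To make the word error rate figure concrete, the sketch below computes
it in the usual way: the word-level edit distance (substitutions,
deletions, and insertions) between a reference transcript and the
recognizer output, divided by the number of reference words. The
function name and the sample sentences are hypothetical; this is an
illustration of the measure, not the evaluation code used by Jones et
al.

```python
def word_error_rate(reference, hypothesis):
    """Word error rate = word-level edit distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical reference transcript and recognizer output:
# one substitution ("starts" -> "start") and one insertion ("today").
reference = "the meeting starts at nine"
hypothesis = "the meeting start at nine today"
print(word_error_rate(reference, hypothesis))
```

Because insertions are counted, the rate can exceed 100% when the
recognizer output is much longer than the reference.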
Non-Speech Audio Retrieval
In addition to content-based access to speech audio, noise/sound
retrieval is also important in such fields as music and movie/video
production.

Thom Blum et al. (1997) describe a user-extensible sound
classification and retrieval system, called SoundFisher
(www.musclefish.com), that draws from several disciplines,
including signal processing, psychoacoustics, speech recognition,
computer music, and multimedia databases.
Just as image indexing algorithms use visual feature vectors to
index and match images, Blum et al. use a vector of directly
measurable acoustic features (e.g., duration, loudness, pitch,
brightness) to index sounds. This enables users to search for
sounds within specified feature ranges.

For example, Figure 10.2a illustrates the analysis of male laughter
on several dimensions including amplitude, brightness, bandwidth,
and pitch. Figure 10.2b shows an end-user content-based retrieval
application that enables a user to browse and/or query a sound
database by acoustic properties (e.g., pitch, duration).
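
The idea of indexing sounds by a feature vector and searching within
feature ranges can be sketched as follows. This is a minimal
illustration only, not the design of the system described by Blum et
al.: the `Sound` record, the `search_by_ranges` function, the units,
and all feature values are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sound:
    """One indexed sound with directly measurable acoustic features."""
    name: str
    duration: float    # seconds (assumed unit)
    loudness: float    # normalized 0..1 (assumed scale)
    pitch: float       # Hz (assumed unit)
    brightness: float  # normalized 0..1 (assumed scale)


def search_by_ranges(sounds, **ranges):
    """Return sounds whose features fall inside every given range.

    `ranges` maps a feature name to an inclusive (low, high) pair,
    e.g. pitch=(100.0, 500.0).
    """
    matches = []
    for sound in sounds:
        if all(low <= getattr(sound, feature) <= high
               for feature, (low, high) in ranges.items()):
            matches.append(sound)
    return matches


# Invented feature values, for illustration only.
sounds = [
    Sound("laughter_male", duration=2.1, loudness=0.6, pitch=140.0, brightness=0.45),
    Sound("door_slam", duration=0.4, loudness=0.9, pitch=90.0, brightness=0.30),
    Sound("violin_note", duration=3.0, loudness=0.5, pitch=440.0, brightness=0.70),
]

for match in search_by_ranges(sounds, pitch=(100.0, 500.0), duration=(1.0, 5.0)):
    print(match.name)
```

A linear scan is enough for a small collection; a larger database
would need an index over the feature dimensions.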
Graph Retrieval
Another important media class is graphics, including tables and
charts (e.g., column, bar, line, pie, scatter). Graphs are
constructed from more primitive data elements such as points,
lines, and labels.

An innovative example of a graph retrieval system is SageBook
(Chuah, Roth, and Kerpedjiev 1997), created at Carnegie Mellon
University (see www.cs.cmu.edu/Groups/sage/sage.html).

SageBook enables both search and customization of stored data
graphics. Just as we may require an audio query during audio
retrieval, SageBook supports data-graphic query, representation
(i.e., content description), indexing, search, and adaptation
capabilities.
Imagery Retrieval
• Increasing volumes of imagery -- from web page images to
personal collections from digital cameras -- have escalated the
need for more effective and efficient imagery access.

• Researchers have identified needs for indexing and search of
not only the metadata associated with the imagery (e.g.,
captions, annotations) but also retrieval directly on the content
of the imagery.

• Initial algorithm development has focused on the automatic
indexing of visual features of imagery (e.g., color, texture, shape)
which can be used as a means for retrieving similar images
without the burden of manual indexing (Niblack and Jain, 1993,
1994, 1995). However, the ultimate objective is semantic-based
access to imagery.
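
As one way to picture indexing on a visual feature, the sketch below
ranks images by color similarity. It is an illustrative sketch under
stated assumptions, not the algorithm of any system cited here: images
are given as non-empty lists of 0-255 RGB tuples, color is quantized
into 4 levels per channel, and similarity is histogram intersection.
The function names and the tiny sample "images" are hypothetical.

```python
from collections import Counter


def color_histogram(pixels, levels=4):
    """Normalized histogram over quantized (r, g, b) bins.

    pixels: non-empty list of (r, g, b) tuples with values 0..255.
    """
    step = 256 // levels
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {color_bin: n / total for color_bin, n in counts.items()}


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: shared mass between two normalized histograms."""
    return sum(min(v, h2.get(k, 0.0)) for k, v in h1.items())


def rank_by_color(query_pixels, collection):
    """Return image names ordered from most to least similar in color."""
    query_hist = color_histogram(query_pixels)
    scored = [
        (histogram_intersection(query_hist, color_histogram(pixels)), name)
        for name, pixels in collection.items()
    ]
    return [name for _, name in sorted(scored, reverse=True)]


# Invented 8-pixel "images", for illustration only.
collection = {
    "sunset": [(250, 100, 30)] * 6 + [(40, 20, 90)] * 2,
    "beach": [(240, 110, 40)] * 5 + [(30, 120, 220)] * 3,
    "forest": [(20, 130, 40)] * 8,
}
query = [(245, 105, 35)] * 8  # a mostly orange query image

print(rank_by_color(query, collection))
```

Color alone says nothing about what an image depicts, which is why
feature matching of this kind stops short of semantic access.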
Video Retrieval
The ability to support content-based access to video promises access to video
mail (Jones et al., 1997), videotaped meetings (Kubala et al., 1999),
surveillance video, and broadcast television. For example, Maybury, Merlino,
and Morey (1997) report on the ability to create “personalcasts” from news
broadcasts via the Broadcast News Navigator (BNN) system.

BNN is a web-based tool that automatically captures, annotates, segments,
summarizes and visualizes stories from broadcast news video.

What QBIC is to static stock imagery, BNN is to broadcast news video. BNN
integrates text, speech, and image processing technologies to perform
multistream analysis of video to support content-based search and retrieval.
BNN addresses the problem of time-consuming, manual video
acquisition/annotation techniques that frequently result in inconsistent,
error-ridden or incomplete video catalogues.