
Chapter #4

Query Languages

Information Retrieval in Practice


By: Eng. Ali Hassan Ahmed
Keyword-Based Querying
A query is the formulation of a user's information need
Keyword-based queries are popular

1. Single-Word Queries (data retrieval)
2. Context Queries
3. Boolean Queries
4. Natural Language (information retrieval)
Single-Word Queries
 A query is formulated as a single word
 A document is modeled as a long sequence of words
 A word is a sequence of letters surrounded by separators
 What counts as a letter and what as a separator? e.g., ‘on-line’ is one word or two, depending on how the hyphen is treated
 The division of the text into words is not arbitrary
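The definition of a word above can be sketched in a few lines of Python. This is a minimal illustration, not a standard tokenizer: the function name `tokenize` and the `extra_letters` parameter are assumptions made here to show how the choice of letters and separators changes the result for 'on-line'.

```python
import re

def tokenize(text, extra_letters=""):
    """Split text into words: maximal runs of letters.

    Every character that is not a letter acts as a separator.
    `extra_letters` is a hypothetical knob: characters listed there
    (for example "-") are treated as letters instead of separators.
    """
    pattern = "[A-Za-z" + re.escape(extra_letters) + "]+"
    return [word.lower() for word in re.findall(pattern, text)]

# Hyphen treated as a separator: 'on-line' becomes two words.
print(tokenize("On-line retrieval"))
# Hyphen treated as a letter: 'on-line' stays one word.
print(tokenize("On-line retrieval", extra_letters="-"))
```

Whichever choice is made, the same rule has to be applied to both the documents and the queries, otherwise a single-word query cannot match.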
Context Queries
 Definition
- Search words in a given context
 Types
 Phrase
>a sequence of single-word queries
>e.g., ‘enhance retrieval’
 Proximity
>a sequence of single words or phrases, together with a maximum allowed distance between them
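The two context query types can be sketched over a document represented as a list of words. This is an illustration under stated assumptions: the function names are made up here, distance is measured in word positions, and the proximity check requires the second word to follow the first (the slide does not say whether order matters).

```python
def positions(words, term):
    """Return every position at which `term` occurs in `words`."""
    return [i for i, word in enumerate(words) if word == term]

def phrase_match(words, phrase):
    """True if the words of `phrase` occur contiguously and in order."""
    n = len(phrase)
    return any(words[i:i + n] == phrase for i in range(len(words) - n + 1))

def proximity_match(words, first, second, max_distance):
    """True if `second` follows `first` within `max_distance` positions.

    Assumption: distance is the difference between word positions.
    """
    return any(0 < j - i <= max_distance
               for i in positions(words, first)
               for j in positions(words, second))

doc = "methods to enhance the retrieval of documents".split()
print(phrase_match(doc, ["enhance", "retrieval"]))       # words are not adjacent
print(proximity_match(doc, "enhance", "retrieval", 2))   # two positions apart
```

A phrase query is the strictest case: every word must sit exactly one position after the previous one.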
Boolean Queries
 Definition
 A syntax composed of atoms that retrieve documents and of Boolean operators that work on their operands
 e.g., translation AND syntax OR syntactic
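A Boolean query can be evaluated with set operations once each atom returns a set of document identifiers. The sketch below uses a tiny made-up collection and a hypothetical `atom` helper. The slide does not state operator precedence, so both possible groupings of the example are shown; which one is intended is an assumption.

```python
# Toy collection (invented for illustration).
docs = {
    1: "translation of syntax trees",
    2: "syntactic translation rules",
    3: "syntax highlighting",
    4: "machine translation",
    5: "syntactic analysis",
}

# Inverted index: word -> set of ids of the documents containing it.
index = {}
for doc_id, text in docs.items():
    for word in text.split():
        index.setdefault(word, set()).add(doc_id)

def atom(word):
    """An atom retrieves the set of documents containing `word`."""
    return index.get(word, set())

# AND is set intersection, OR is set union.
grouped_or = atom("translation") & (atom("syntax") | atom("syntactic"))
grouped_and = (atom("translation") & atom("syntax")) | atom("syntactic")

print(sorted(grouped_or))
print(sorted(grouped_and))
```

The two groupings return different sets on this collection, which is why a Boolean query language needs either explicit parentheses or a fixed precedence rule.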
Natural Language

A query is an enumeration of words and context queries
All the documents matching a portion of the user query are retrieved
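A minimal sketch of this behaviour: a document is retrieved if it contains at least one query word. Ordering the results by the number of matched words is an assumption added here for illustration (the slide only says that partially matching documents are retrieved), and the function name `retrieve` is hypothetical.

```python
def retrieve(query, docs):
    """Return ids of documents matching any query word.

    Assumption: results are ordered by how many distinct query words
    they contain, ties broken by document id.
    """
    query_words = set(query.lower().split())
    scored = []
    for doc_id, text in docs.items():
        matched = query_words & set(text.lower().split())
        if matched:  # a partial match is enough to be retrieved
            scored.append((len(matched), doc_id))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]

docs = {
    1: "query languages for text retrieval",
    2: "cooking with garlic",
    3: "text retrieval",
}
print(retrieve("languages for text retrieval", docs))
```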
Pattern Matching
 Data retrieval
 A pattern is a set of syntactic features that must
occur in a text segment
 Types
 Words
 Prefixes
e.g., ‘comput’ -> ‘computer’, ‘computation’, ‘computing’, etc.
 Suffixes
e.g., ‘ters’ -> ‘computers’, ‘testers’, ‘painters’, etc.
 Substrings
e.g., ‘tal’ -> ‘coastal’, ‘talk’, ‘metallic’, etc.
 Ranges
e.g., between ‘held’ and ‘hold’ -> ‘hoax’ and ‘hissing’ (lexicographical order)
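The pattern types listed above can be sketched as simple filters over a vocabulary. The word list reuses the examples from this slide; the function names are made up for illustration, and treating the range bounds as exclusive is an assumption.

```python
vocabulary = [
    "coastal", "computation", "computer", "computers", "held", "hissing",
    "hoax", "hold", "metallic", "painters", "talk", "testers",
]

def match_prefix(prefix):
    """Words that begin with `prefix`."""
    return [w for w in vocabulary if w.startswith(prefix)]

def match_suffix(suffix):
    """Words that end with `suffix`."""
    return [w for w in vocabulary if w.endswith(suffix)]

def match_substring(substring):
    """Words that contain `substring` anywhere."""
    return [w for w in vocabulary if substring in w]

def match_range(low, high):
    """Words lexicographically between `low` and `high` (bounds excluded)."""
    return [w for w in vocabulary if low < w < high]

print(match_prefix("comput"))
print(match_suffix("ters"))
print(match_substring("tal"))
print(match_range("held", "hold"))
```

A linear scan is enough for a sketch; keeping the vocabulary sorted is what would let a real system answer prefix and range patterns with binary search instead.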
Allowing errors
 Retrieve all text words that are ‘similar’ to the given word
 edit distance:
the minimum number of character insertions, deletions, and replacements needed to make two strings equal, e.g., ‘flower’ and ‘flo wer’ are at edit distance 1
 maximum allowed edit distance:
the query specifies the maximum number of allowed errors for a word to match the pattern
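The edit distance defined above is commonly computed with dynamic programming. The sketch below is one such implementation, with unit cost for each insertion, deletion, and replacement; the helper `similar_words` and the sample word list are assumptions added to show how a maximum allowed distance filters candidates.

```python
def edit_distance(a, b):
    """Minimum number of insertions, deletions, and replacements
    needed to turn string `a` into string `b`."""
    # previous[j] holds the distance between a[:i-1] and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # delete char_a
                current[j - 1] + 1,                    # insert char_b
                previous[j - 1] + (char_a != char_b),  # replace, or free match
            ))
        previous = current
    return previous[-1]

def similar_words(pattern, words, max_errors):
    """Words within `max_errors` edits of `pattern`."""
    return [w for w in words if edit_distance(pattern, w) <= max_errors]

print(edit_distance("flower", "flo wer"))
print(similar_words("flower", ["flower", "flo wer", "flowers", "tower", "floor"], 1))
```

Only two rows of the table are kept at a time, so memory grows with the length of the second string rather than with the product of both lengths.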
Structural Queries
 Mixing contents and structure in queries
- contents: words, phrases, or patterns
- structural constraints: containment, proximity,
or other restrictions on structural elements
 Three main structures
- Fixed structure
- Hypertext structure
- Hierarchical structure
Fixed Structure
Document: a fixed set of fields
Ex.: a mail has a sender, a receiver, a date, a subject, and a body field
Example query: search for the mails sent to a given person with “Notes” in the Subject field
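The mail example can be sketched with each document as a dictionary that has the same fixed set of fields. The sample mails, addresses, and the function name `search_mails` are invented for illustration.

```python
# Every document has exactly the same fields (sample data is made up).
mails = [
    {"sender": "ann@example.org", "receiver": "bob@example.org",
     "date": "2024-01-05", "subject": "Lecture notes", "body": "See attached."},
    {"sender": "carl@example.org", "receiver": "bob@example.org",
     "date": "2024-01-06", "subject": "Lunch", "body": "Bring your notes."},
    {"sender": "ann@example.org", "receiver": "dana@example.org",
     "date": "2024-01-07", "subject": "Notes from class", "body": "Thanks."},
]

def search_mails(mails, receiver, subject_word):
    """Mails sent to `receiver` whose Subject field contains `subject_word`."""
    wanted = subject_word.lower()
    return [mail for mail in mails
            if mail["receiver"] == receiver
            and wanted in mail["subject"].lower().split()]

for mail in search_mails(mails, "bob@example.org", "Notes"):
    print(mail["date"], mail["subject"])
```

The second mail mentions "notes" only in its body, so it is not returned: the query restricts the content match to one field.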
Hypertext
A hypertext is a directed graph in which the nodes hold some text (text contents) and the links represent connections between nodes or between positions inside nodes (structural connectivity)
Hypertext: WebGlimpse
WebGlimpse: combines browsing and searching on the Web
Hierarchical Structure
WAIS (Wide Area Information Service)
 Beginning in the 1990s
 Query databases through the Internet
Lists of References
 Overlapping and nesting of elements are not allowed
 All elements must be of the same type, e.g., only sections, or only paragraphs.
 A reference is a pointer to a region of the database.
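The constraints above can be sketched as a validity check on a list of references. The representation is an assumption made for illustration: each reference is a `(kind, start, end)` tuple pointing at a region of the database, with `end` exclusive, and the function name is hypothetical.

```python
def is_valid_reference_list(references):
    """Check the constraints of a list of references.

    Each reference is a (kind, start, end) tuple, `end` exclusive
    (an assumed representation). The list is valid when all elements
    have the same kind and no two regions overlap or nest.
    """
    kinds = {kind for kind, _, _ in references}
    if len(kinds) > 1:
        return False  # mixed element types, e.g. sections and paragraphs
    ordered = sorted(references, key=lambda ref: ref[1])
    # After sorting by start, overlap or nesting shows up as a region
    # that begins before the previous one has ended.
    return all(prev[2] <= nxt[1] for prev, nxt in zip(ordered, ordered[1:]))

print(is_valid_reference_list([("section", 0, 10), ("section", 10, 25)]))
print(is_valid_reference_list([("section", 0, 30), ("section", 5, 10)]))
```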
Proximal Nodes
 This model tries to find a good compromise
between expressiveness and efficiency.
 It does not define a specific language, but a model in which it is shown that a number of useful operators can be included while still achieving good efficiency.
