СL 7
OF COMPUTATIONAL LINGUISTICS
INFORMATION RETRIEVAL SYSTEMS
Definition of information retrieval systems (IRS):
“Information retrieval is a field concerned with the structure, analysis, organization, storage, searching, and retrieval of information.” It is the activity of obtaining information resources relevant to an information need from a collection of information resources (Calvin Mooers, 1950).
• Proximity is used to restrict the distance allowed within an item between two search terms.
• Contiguous Word Phrases
• Fuzzy Searches provide the capability to locate spellings of words that are similar to the entered search term (“computer,” “compiter,” “conputer,” “computter,” “compute”).
• Term Masking (does not work for finding ranges)
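The four search capabilities above can be illustrated with a minimal Python sketch. It is not taken from the slides: the function names, the token-based distance measure, the similarity cutoff of 0.8, and the `*`/`?` mask syntax are all assumptions chosen for illustration, and real systems may define each feature differently.

```python
import difflib
import fnmatch
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def proximity_match(tokens, term_a, term_b, max_distance):
    """True if both terms occur at most max_distance token positions apart."""
    positions_a = [i for i, t in enumerate(tokens) if t == term_a]
    positions_b = [i for i, t in enumerate(tokens) if t == term_b]
    return any(abs(a - b) <= max_distance
               for a in positions_a for b in positions_b)


def phrase_match(tokens, phrase):
    """True if the words of the phrase occur contiguously and in order."""
    words = tokenize(phrase)
    n = len(words)
    return n > 0 and any(tokens[i:i + n] == words
                         for i in range(len(tokens) - n + 1))


def fuzzy_match(vocabulary, term, cutoff=0.8):
    """Return vocabulary words whose spelling is similar to the term.

    Similarity is difflib's ratio; the cutoff value is an assumption.
    """
    limit = max(1, len(vocabulary))
    return difflib.get_close_matches(term, vocabulary, n=limit, cutoff=cutoff)


def masked_match(vocabulary, pattern):
    """Return words matching a mask where * is any run of characters
    and ? is exactly one character (hypothetical mask syntax)."""
    return [w for w in vocabulary if fnmatch.fnmatchcase(w, pattern)]
```

For example, `masked_match(["computer", "computing", "compile"], "comput*")` keeps the first two words. Because a mask is only a pattern over characters, it has no notion of numeric order, which is one way to read the remark that term masking does not work for finding ranges.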
Boolean Queries
• AND: both terms must be found
• OR: either term found
• NOT: record containing keyword omitted
• ( ): used for nesting
• +: equivalent to AND
• –: equivalent to AND NOT
A document is retrieved if the query is logically true as an exact match in the document.
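Exact-match Boolean retrieval can be sketched as set operations over an inverted index. The code below is an illustrative assumption, not the method of any particular system: `build_index` and `boolean_search` are hypothetical names, operators must be written in upper case, NOT binds tighter than AND, AND binds tighter than OR, and the + and – shorthands are left out for brevity.

```python
import re


def build_index(docs):
    """Map each lowercase word to the set of document ids containing it."""
    index = {}
    for doc_id, text in docs.items():
        for word in set(re.findall(r"[a-z0-9]+", text.lower())):
            index.setdefault(word, set()).add(doc_id)
    return index


def boolean_search(query, index, all_ids):
    """Evaluate a query with AND, OR, NOT and ( ); return matching ids."""
    tokens = re.findall(r"\(|\)|[A-Za-z0-9]+", query)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def parse_or():
        result = parse_and()
        while peek() == "OR":
            advance()
            result = result | parse_and()   # either term found
        return result

    def parse_and():
        result = parse_not()
        while peek() == "AND":
            advance()
            result = result & parse_not()   # both terms must be found
        return result

    def parse_not():
        if peek() == "NOT":
            advance()
            return all_ids - parse_not()    # omit records with the keyword
        return parse_atom()

    def parse_atom():
        token = advance()
        if token == "(":
            result = parse_or()
            advance()                       # consume the closing ")"
            return result
        return index.get(token.lower(), set())

    return parse_or()
```

A short usage example with three made-up documents:

```python
docs = {1: "digital libraries and web search",
        2: "image retrieval for web search",
        3: "music retrieval"}
index = build_index(docs)
hits = boolean_search("(digital OR music) AND NOT image", index, set(docs))
```

Each document either satisfies the query or does not, so the result is an unranked set of ids.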
Retrieval techniques by Nicholas Belkin and Bruce Croft
Applications
• Digital libraries
• Media search
- Blog search
- Image retrieval
- Music retrieval
- News search
- Video retrieval
• Search engines
- Site search
- Desktop search
- Enterprise search
- Federated search
- Mobile search
- Web search
• Question answering
Tasks
• Task 1. Find information on the features that make an information retrieval system effective.
• Task 3. Choose three or more search engines, find some information with each, and compare the output:
1. DuckDuckGo
2. Shodan
3. TinEye
4. Ecosia
5. The Wayback Machine
6. FindSounds
7. Dogpile
8. Million Short
9. elgooG
Thank you!