
Chapter Three

Health Information Retrieval



GIZAW HAILIYE
DEPARTMENT OF HEALTH INFORMATICS
FEBRUARY 2021
Information Retrieval

Definition: Information retrieval (IR) is the activity of obtaining information resources relevant to an information need from a collection of information resources.
It is a field concerned with the structure, analysis, organization, storage, searching, and retrieval of information.
An information retrieval process begins when a user enters a query into
the system.

Queries are formal statements of information needs. User queries are matched against the information in the database. Depending on the application, the data objects may be, for example, text documents, images, audio, mind maps or videos.
Most IR systems compute a numeric score on how well each object in
the database matches the query, and rank the objects according to this
value.
The top ranking objects are then shown to the user.
The process may then be iterated if the user wishes to refine the query.
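The score-then-rank step described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the score is a raw count of query-term occurrences, and the function names and the three sample documents are hypothetical, not taken from any particular IR system.

```python
from collections import Counter


def score(query: str, document: str) -> int:
    """Count how often the distinct query terms occur in the document."""
    doc_counts = Counter(document.lower().split())
    return sum(doc_counts[term] for term in set(query.lower().split()))


def rank(query: str, documents: dict[str, str], top_k: int = 3) -> list[tuple[str, int]]:
    """Return up to top_k (doc_id, score) pairs, best first, dropping zero scores."""
    scored = [(doc_id, score(query, text)) for doc_id, text in documents.items()]
    scored = [pair for pair in scored if pair[1] > 0]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


# Hypothetical mini-collection used only for this example.
documents = {
    "d1": "malaria prevention in children",
    "d2": "hip fracture in the elderly",
    "d3": "malaria and tuberculosis in Africa malaria control",
}

results = rank("malaria africa", documents)
print(results)
```

Refining the query and calling `rank` again corresponds to the iteration step: the user looks at the top-ranking objects, adjusts the query, and re-runs the search.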

Every online database, every search engine, everything that is searched online is based in some way on principles developed in IR.
Information retrieval is at the heart of searching in systems such as DIALOG, LexisNexis and others.
Understanding the basics of IR is a prerequisite for understanding how searching of online systems works.

“Information retrieval embraces the intellectual aspects of the description of information and its specification for search, and also whatever systems, techniques, or machines are employed to carry out the operation.” (Calvin Mooers)
Example: determine which plays of Shakespeare contain the words Brutus AND Caesar AND NOT Calpurnia.

One way to do that is to start at the beginning and to read through all
the text, noting for each play whether it contains Brutus and Caesar and
excluding it from consideration if it contains Calpurnia.
The simplest form of document retrieval is for a computer to do this
sort of linear scan through documents.
This process is commonly referred to as grepping through text, after
the Unix command grep, which performs this process.
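As a rough sketch, the linear scan for the Shakespeare example might look like the following. The play texts are hypothetical one-line stand-ins (real plays would be read from files), and `linear_scan` is an illustrative name.

```python
def linear_scan(plays: dict[str, str]) -> list[str]:
    """Return titles whose text contains Brutus and Caesar but not Calpurnia."""
    matches = []
    for title, text in plays.items():
        words = set(text.lower().split())  # every text is re-read for every query
        if "brutus" in words and "caesar" in words and "calpurnia" not in words:
            matches.append(title)
    return matches


# Hypothetical stand-ins for the full play texts.
plays = {
    "Julius Caesar": "brutus caesar calpurnia antony",
    "Hamlet": "hamlet brutus caesar polonius",
    "Macbeth": "macbeth duncan banquo",
}

print(linear_scan(plays))
```

The cost of this approach grows with the total amount of text, because each new query walks through every document again.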
The way to avoid linearly scanning the texts for each query is to index
the documents in advance.
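One common form of such an index is an inverted index, which maps each word to the set of documents that contain it. The sketch below assumes whitespace tokenization and uses hypothetical one-line stand-ins for the play texts; `build_index` is an illustrative name.

```python
from collections import defaultdict


def build_index(plays: dict[str, str]) -> dict[str, set[str]]:
    """Map each word to the set of play titles that contain it."""
    index: dict[str, set[str]] = defaultdict(set)
    for title, text in plays.items():
        for word in text.lower().split():
            index[word].add(title)
    return index


# Hypothetical stand-ins for the full play texts.
plays = {
    "Julius Caesar": "brutus caesar calpurnia antony",
    "Hamlet": "hamlet brutus caesar polonius",
    "Macbeth": "macbeth duncan banquo",
}

index = build_index(plays)  # built once, in advance

# Brutus AND Caesar AND NOT Calpurnia, answered from the index alone.
answer = (index["brutus"] & index["caesar"]) - index["calpurnia"]
print(sorted(answer))
```

The texts are read once when the index is built; each later query only combines the stored sets.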
Components of IR

Information retrieval locates relevant documents on the basis of user input such as keywords or example documents.
An IR process consists of three components:
I. Query or Documents
II. IR System
III. Ranked Results

1. Query/Collections: the system stores only a representation of each document or query, which means that the original text of a document is lost once it has been processed to generate its representation.
2. IR System: performs the actual retrieval function, executing the search strategy in response to a query.
3. Ranked Results: the set of documents returned by a retrieval run in order of relevance; these results can be used to improve the subsequent run.
Sources of Information

The quantity of information can be overwhelming.
Information from some websites may be:
 Biased
 Out of date
 Poor quality

The key to efficient searching is knowing where reliable and relevant information can be found.

Generally, information sources in healthcare can be grouped into four:

I. Systematic reviews/meta-analyses:
 These secondary sources of information consist of compilations of original articles that have been vetted by independent researchers and clinicians.
 The most important vetting organization is the Cochrane Collaboration.

II. Clinical Practice Guidelines:
 These reviews deal with large disease groups and treatment strategies.

III. Critically Appraised Topics (CATs):
 A CAT is a short summary of evidence on a specific clinical question.

IV. Original articles containing primary data:
 Mainly original articles based on randomized controlled trials (RCTs).
Choosing appropriate sources

The type of information source and search strategy to choose depends on the subject area (medicine, dentistry, occupational therapy, etc.) and the type of question being asked (drug effect, diagnostic problem, screening issue, etc.):
1. Questions pertaining to treatment alternatives or therapeutic effects involving common illnesses → systematic reviews/meta-analyses.
2. General recommendations pertaining to more common illnesses → Clinical Practice Guidelines.
3. Answers to specific clinical issues → CATs.
4. More specialized issues and new research findings → original articles.
Developing a search strategy
Example
Boolean (Search) Operators

• Connect terms and locate records containing matching terms
• Inserted in a search box: AND, OR, NOT
• Must be in UPPERCASE when used
• AND and NOT operators are processed in a left-to-right sequence, and are processed before any OR operators
• OR operators are then also processed from left to right
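The processing order in the bullets above can be illustrated with a small evaluator. This is a sketch under stated assumptions, not the behaviour of any specific database: terms are single words, operators are uppercase, there are no parentheses, and both `evaluate` and the record-ID index are hypothetical.

```python
def evaluate(query: str, index: dict[str, set[int]]) -> set[int]:
    """Evaluate a flat Boolean query: AND/NOT first (left to right), then OR."""
    # Split the tokens into groups separated by OR.
    groups: list[list[str]] = [[]]
    for token in query.split():
        if token == "OR":
            groups.append([])
        else:
            groups[-1].append(token)

    result: set[int] = set()
    for group in groups:
        # A valid group is: term (operator term)*, so its length must be odd.
        if len(group) % 2 == 0:
            raise ValueError(f"malformed query: {query!r}")
        partial = set(index.get(group[0], set()))  # copy, so the index is untouched
        for operator, term in zip(group[1::2], group[2::2]):
            postings = index.get(term, set())
            if operator == "AND":
                partial &= postings
            elif operator == "NOT":
                partial -= postings
            else:
                raise ValueError(f"unexpected token: {operator!r}")
        result |= partial  # OR groups are combined last, left to right
    return result


# Hypothetical index: search term -> IDs of the records that contain it.
index = {"renal": {1, 2}, "kidney": {2, 3}, "failure": {3, 4}}

print(evaluate("renal OR kidney AND failure", index))
```

Under this order, `renal OR kidney AND failure` means `renal OR (kidney AND failure)`. A system that processed every operator strictly left to right would instead compute `(renal OR kidney) AND failure`, which is a different and smaller result for this index.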
AND Operator (to combine two concepts and narrow a search)

The AND operator is used to combine two concepts, e.g. hip AND fracture. The result is the shaded area where the two terms overlap: AND retrieves only items containing all the search terms.
AND Operator (to combine three concepts)

The AND operator is used to combine three concepts, e.g. hip AND fracture AND elderly. The result is the shaded area common to all three terms.
OR Operator (info containing one or other term; will broaden a search)

renal OR kidney: the shaded area covers both terms, with the overlap in the middle containing both search terms. OR retrieves items containing either search term or both search terms.
NOT Operator (in one term but not the other; will narrow a search)

pig NOT guinea: the shaded area covers only the first term (pig). NOT eliminates items containing the second term (guinea), including items that contain both terms.
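The three operators correspond to set operations on the records retrieved for each term: AND is intersection, OR is union, and NOT is difference. A minimal sketch using Python sets, where the record IDs are made up purely for illustration:

```python
# Hypothetical record IDs retrieved for each single search term.
hip = {1, 2, 3, 4}
fracture = {3, 4, 5}
elderly = {4, 5, 6}
renal = {10, 11}
kidney = {11, 12}
pig = {20, 21, 22}
guinea = {22, 23}

hip_and_fracture = hip & fracture                # AND: records in both sets
hip_fracture_elderly = hip & fracture & elderly  # AND with three concepts
renal_or_kidney = renal | kidney                 # OR: records in either set
pig_not_guinea = pig - guinea                    # NOT: first set minus second

print(sorted(hip_and_fracture))
print(sorted(hip_fracture_elderly))
print(sorted(renal_or_kidney))
print(sorted(pig_not_guinea))
```

Each added AND term can only shrink the result, each added OR term can only grow it, and NOT removes record 22 here because it contains both pig and guinea.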
Other search engine functions

• Phrase or proximity searching: “…” or (…)
– allows you to search for an exact phrase
– E.g. “information literacy”
– E.g. prevention AND (malaria parasite)
• Truncation/wildcards: *
– allows you to search for alternative word endings
– E.g. child* for child OR childs OR children
– E.g. parasite* for parasite OR parasites
• Alternate spellings: ?
– can be used to substitute for a single character anywhere in a word
– E.g. wom?n would search for “woman” and “women”
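The effect of `*` and `?` can be tried out with Python's standard `fnmatch` module, whose patterns happen to use the same two symbols (`*` for any run of characters, `?` for exactly one character). This is only an analogy: each database defines its own wildcard syntax, and the vocabulary list and `expand` helper below are hypothetical.

```python
from fnmatch import fnmatchcase


def expand(pattern: str, vocabulary: list[str]) -> list[str]:
    """Return the vocabulary words matched by a wildcard pattern."""
    return [word for word in vocabulary if fnmatchcase(word, pattern)]


# Hypothetical list of words that occur in the indexed records.
vocabulary = ["child", "childs", "children", "woman", "women", "parasite", "parasites"]

print(expand("child*", vocabulary))
print(expand("wom?n", vocabulary))
print(expand("parasite*", vocabulary))
```

A truncated search such as child* is therefore equivalent to ORing together every matching word in the database's vocabulary.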
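Africa AND (malaria OR tuberculosis)

[Venn diagram: three circles labelled malaria, tuberculosis and africa]

Africa AND (malaria OR tuberculosis): the shaded area is where the africa circle overlaps the malaria or tuberculosis circles.
The OR operator, nested in parentheses, retains the items in each term; the AND operator then combines that set with the second concept, Africa.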
Nesting Concept Sets and Boolean Logic
[Venn diagram: three overlapping circles labelled Set 1, Set 2 and Set 3]

Set 1: (child$ OR p?diatric$)
AND
Set 2: (otitis media OR middle ear infection$)
AND
Set 3: (antibiotic$ OR antibacterial agent$)
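The nesting pattern on this slide (OR the synonyms inside each concept set, then AND the sets together) can be assembled mechanically. A minimal sketch, in which `build_query` is a hypothetical helper and the `$` and `?` symbols are simply passed through as written on the slide:

```python
def build_query(concept_sets: list[list[str]]) -> str:
    """OR the synonyms inside each set, then AND the bracketed sets together."""
    groups = ["(" + " OR ".join(terms) + ")" for terms in concept_sets]
    return " AND ".join(groups)


concept_sets = [
    ["child$", "p?diatric$"],                      # Set 1
    ["otitis media", "middle ear infection$"],     # Set 2
    ["antibiotic$", "antibacterial agent$"],       # Set 3
]

query = build_query(concept_sets)
print(query)
```

The parentheses make the grouping explicit, so the result does not depend on the order in which a given search system processes AND and OR.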
More Search Techniques

• Field Specific Searching
– author, title, journal, date, url, etc.
• Language Restrictions, Humans or Animals, Gender and other limits
• Relevancy Ranking
– a grading that gives extra weight to a document when the search terms appear in the headline or are capitalized
– the score of every document found is calculated as 100% multiplied by the cosine of the angle formed by the weights vector for the request and the weights vector for the document found
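A common way to turn the angle between the request vector and the document vector into a percentage is cosine similarity: identical directions score 100%, and vectors with no terms in common score 0%. The sketch below assumes raw term counts as the weights; real systems usually apply more elaborate weighting, and `cosine_score` is an illustrative name.

```python
import math
from collections import Counter


def cosine_score(query: str, document: str) -> float:
    """Return 100 * cos(angle) between the term-count vectors of query and document."""
    q = Counter(query.lower().split())
    d = Counter(document.lower().split())
    dot = sum(q[term] * d[term] for term in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(sum(v * v for v in d.values()))
    return 100.0 * dot / norm if norm else 0.0


print(round(cosine_score("malaria africa", "malaria africa"), 2))
print(round(cosine_score("malaria", "malaria africa"), 2))
print(round(cosine_score("malaria africa", "hip fracture"), 2))
```

Documents are then listed in descending order of this score, so that those pointing in nearly the same direction as the request appear first.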
Research4Life

Research4Life is designed to enhance the scholarship, teaching, research and policy-making of the many thousands of students, faculty, scientists, and medical specialists in the developing world, focusing on health, agriculture, environment and other life, physical and social sciences.

Research4Life is the collective name for five programs: HINARI, AGORA, OARE, ARDI, and GOALI.

It provides developing countries with free or low-cost access to academic and professional peer-reviewed content online.
