Assignment 3 BIM IR
Assignment 3
Submitted by:
Saqlain Nawaz 2020-CS-135
Supervised by:
Sir Khaldoon Syed Khurshid
Libraries Used:
1. os:
○ Purpose: Provides functions for interacting with the operating system;
used here for file operations and directory traversal.
2. nltk:
○ Purpose: The Natural Language Toolkit (NLTK) library is used for
natural language processing tasks such as tokenization, stemming,
and part-of-speech tagging.
3. nltk.corpus.stopwords:
○ Purpose: NLTK's stopwords corpus provides a list of common English
stopwords, which are words typically excluded from text analysis due to
their high frequency and low informativeness.
4. nltk.stem.PorterStemmer:
○ Purpose: The PorterStemmer class from NLTK implements the Porter
stemming algorithm, which reduces words to a common stem (for example,
"running" becomes "run") so that variants of a word match during analysis.
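The way these libraries fit together can be sketched as follows. This is a minimal illustration, not the assignment's actual code: the function names load_nltk_tools, preprocess, and read_documents are hypothetical, the regex tokenizer is an assumption (the report does not say which tokenizer is used), and the restriction to .txt files is also assumed. The NLTK imports are kept inside load_nltk_tools because the stopwords corpus must be downloaded once with nltk.download("stopwords") before it can be read.

```python
import os
import re


def load_nltk_tools():
    """Return (stop_words, stem) built from NLTK.

    Requires the nltk package and its 'stopwords' corpus to be installed.
    """
    from nltk.corpus import stopwords
    from nltk.stem import PorterStemmer

    stop_words = set(stopwords.words("english"))
    stem = PorterStemmer().stem
    return stop_words, stem


def preprocess(text, stop_words, stem):
    """Lowercase, tokenize, drop stopwords, and stem the remaining tokens."""
    # Assumption: a simple regex tokenizer stands in for whatever the script uses.
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [stem(token) for token in tokens if token not in stop_words]


def read_documents(directory):
    """Walk a directory tree and return {file name: file text} for .txt files."""
    documents = {}
    for root, _dirs, files in os.walk(directory):
        for name in sorted(files):
            if name.endswith(".txt"):
                path = os.path.join(root, name)
                with open(path, encoding="utf-8") as handle:
                    documents[name] = handle.read()
    return documents
```

With NLTK available, a call would look like stop_words, stem = load_nltk_tools() followed by preprocess(text, stop_words, stem) for each document returned by read_documents.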
Code Flow:
Main Execution:
● The script obtains the path of the directory that contains the code file, then
builds the inverted index and the binary term-document matrix by calling the
create_index function.
● It enters a loop where the user can input search queries interactively.
● For each query, it builds the query representation, scores the documents
against it, ranks them, retrieves the top 2 documents, and presents the results.
● The loop continues until the user enters "exit".
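The flow described above can be sketched in a few functions. This is an illustrative sketch under stated assumptions, not the submitted script: only the name create_index comes from the report, and its signature here is guessed. The helpers represent_query, score_documents, rank_top_k, and search_loop are hypothetical, documents are assumed to be already preprocessed into term lists, and the scoring weight log(1 + (N - df + 0.5) / (df + 0.5)) is a smoothed BIM-style weight chosen for illustration, since the report does not state the exact formula.

```python
import math


def create_index(docs):
    """docs maps doc_id -> list of terms. Returns (inverted_index, matrix, vocab)."""
    inverted_index = {}
    for doc_id in sorted(docs):
        for term in docs[doc_id]:
            inverted_index.setdefault(term, set()).add(doc_id)
    vocab = sorted(inverted_index)
    # Binary term-document matrix: matrix[doc_id][i] is 1 if vocab[i] occurs in doc_id.
    matrix = {
        doc_id: [1 if doc_id in inverted_index[term] else 0 for term in vocab]
        for doc_id in sorted(docs)
    }
    return inverted_index, matrix, vocab


def represent_query(query_terms, vocab):
    """Binary query vector over the same vocabulary as the matrix."""
    terms = set(query_terms)
    return [1 if term in terms else 0 for term in vocab]


def score_documents(query_vec, matrix, inverted_index, vocab):
    """Sum an assumed BIM-style weight over terms present in both query and document."""
    n_docs = len(matrix)
    scores = {}
    for doc_id, row in matrix.items():
        score = 0.0
        for i, term in enumerate(vocab):
            if query_vec[i] and row[i]:
                df = len(inverted_index[term])
                score += math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        scores[doc_id] = score
    return scores


def rank_top_k(scores, k=2):
    """Sort by descending score, breaking ties by doc_id, and keep the first k."""
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:k]


def search_loop(docs, read=input, write=print):
    """Interactive loop: read a query, show the top 2 documents, stop on 'exit'."""
    inverted_index, matrix, vocab = create_index(docs)
    while True:
        query = read("Enter query (or 'exit'): ").strip()
        if query.lower() == "exit":
            break
        # Assumption: queries are only lowercased and split on whitespace here.
        query_vec = represent_query(query.lower().split(), vocab)
        scores = score_documents(query_vec, matrix, inverted_index, vocab)
        for doc_id, score in rank_top_k(scores, k=2):
            write(f"{doc_id}\t{score:.3f}")
```

The read and write parameters default to input and print, so the loop behaves interactively by default but can also be driven from a list of canned queries when experimenting.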