CT2 Set A

SRM Institute of Science and Technology
College of Engineering and Technology SET A

School of Computing
DEPARTMENT OF COMPUTING TECHNOLOGIES
SRM Nagar, Kattankulathur – 603203, Chengalpattu District, Tamilnadu
Academic Year: 2023 (ODD)
Test: CLAT-2 (ANSWER KEY) Date: 02/11/2023
Course Code & Title: 18CSE359T & NATURAL LANGUAGE PROCESSING
Duration: 2 periods
Year & Sem: IV Year & VII Semester Max. Marks: 50 Marks
Course Articulation Matrix:

S.No. Course Outcome PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12 PO13 PO14 PO15
1 CO1 3 3 - - - - - - - - - - - - -
2 CO2 3 2 - 3 - - - - - - - - - - -
3 CO3 3 2 2 3 - - - - - - - - - - -
4 CO4 3 3 2 2 - - - - - - - - - - -
5 CO5 3 2 2 2 - - - - - - - - - 2 -
Part - A
(10*1 = 10 Marks) Answer all Questions.
Q. No Questions Marks BL CO PO PI Code
1 Consider the following sentences: “The horse ran up the hill. 1 2 5 1 2.1.2
It was very steep. It soon got tired.” What type of
ambiguity is introduced by the word “it”?
a. Syntactic
b. Pragmatic
c. Cataphoric
d. Anaphoric
2 Spam email detection comes under which domain? 1 2 5 1 2.1.3
a. Text categorization
b. NER
c. Text Classification
d. Sentiment Analysis
3 Which of the following is an efficient representation of 1 1 5 1 2.2.2
text data?
a. Bag of words
b. TF-IDF
c. Word vector
d. BERT
4 For hate speech detection from Facebook messages, 1 1 5 4 2.2.3
________ NLP technique is used.
a. Text classification
b. Information Retrieval
c. Information summarization
d. Information Indexing
5 What is machine translation? 1 2 5 4 2.2.3
a. Converts one human language to another.
b. Converts human language to machine language.
c. Converts any human language to English
d. Converts Machine language to Human Language
6 What is the full form of NLG? 1 1 6 4 1.3.1
a. Natural Language Generation
b. Natural Language genes
c. Natural Language growth
d. Natural Language Generator
7 Which task takes a sound clip of one or more people 1 2 6 4 2.1.3
speaking and produces the textual representation of the speech?
a. Text-to-speech
b. Speech-to-text
c. Both A and B
d. None of the above
8 What are the input and output of an NLP system? 1 1 6 1 3.4.2
a. Speech and Noise
b. Noise and Written text
c. Noise and value
d. Speech and written text
9 In NLP, which algorithm decreases the weight of 1 2 6 1 2.2.2
commonly used words and increases the weight of
words that are rarely used in a collection of documents?
a. Term Frequency (TF)
b. Word2vec
c. Latent Dirichlet Allocation (LDA)
d. Inverse Document Frequency (IDF)
10 Which of the below are NLP use cases? 1 1 6 4 2.2.3
a. Speech biometric
b. Facial recognition
c. Detecting objects from an image
d. Text summarization
Part - B
(4*5 = 20 Marks) Answer all Questions.
11 Explain direct machine translation. 5 1 2 1 1.6.1
Ans: Direct machine translation uses a simple rule
structure: it breaks the source sentence into words,
looks each word up in a bilingual dictionary, and then
adjusts the output for morphology and syntax. This
method is time-intensive to build, as rules must be
written for every word in the dictionary.
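The steps in this answer (split into words, look up each word, then adjust word order) can be sketched as follows. This is only a toy illustration: the English-to-Spanish dictionary, the word-class sets, and the single adjective-noun reordering rule are assumptions made for the example, not part of the answer key.

```python
# Toy direct (dictionary-based) machine translation, English -> Spanish.
# DICTIONARY, ADJECTIVES, NOUNS and the reordering rule are illustrative assumptions.
DICTIONARY = {
    "the": "el",
    "red": "rojo",
    "car": "coche",
    "is": "es",
    "big": "grande",
}
ADJECTIVES = {"red", "big"}
NOUNS = {"car"}


def direct_translate(sentence):
    words = sentence.lower().rstrip(".").split()
    # Step 1: word-for-word dictionary lookup; unknown words pass through unchanged.
    pairs = [(w, DICTIONARY.get(w, w)) for w in words]
    # Step 2: syntactic adjustment; English adjective-noun order becomes noun-adjective.
    output = []
    i = 0
    while i < len(pairs):
        if (i + 1 < len(pairs)
                and pairs[i][0] in ADJECTIVES
                and pairs[i + 1][0] in NOUNS):
            output.extend([pairs[i + 1][1], pairs[i][1]])
            i += 2
        else:
            output.append(pairs[i][1])
            i += 1
    return " ".join(output)


print(direct_translate("The red car is big."))  # el coche rojo es grande
```

Even this small example needs a hand-written rule for one word-order difference, which hints at why the approach scales poorly.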
12 Describe the selection restriction in semantic 5 1 2 4 2.2.3
interpretation.
Ans: Selectional restrictions place semantic constraints
on the arguments a predicate can take and account for
the implausibility of sentences such as “Colorless green
ideas sleep furiously.” They have been used in natural
language understanding for disambiguation and pronoun
resolution.
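A selectional restriction can be pictured as a set of semantic features that a verb requires of its subject. The sketch below is a minimal illustration under that assumption; the feature lexicon and the verb constraints are hypothetical and hand-written for the example.

```python
# Toy selectional-restriction check.
# NOUN_FEATURES and SUBJECT_RESTRICTIONS are hypothetical, hand-written lexicons.
NOUN_FEATURES = {
    "dog": {"animate", "concrete"},
    "child": {"animate", "human", "concrete"},
    "idea": {"abstract"},
    "rock": {"concrete"},
}

# Each verb lists the semantic features its subject must carry.
SUBJECT_RESTRICTIONS = {
    "sleep": {"animate"},
    "think": {"human"},
    "fall": {"concrete"},
}


def satisfies_restriction(subject, verb):
    """Return True if the subject has every feature the verb requires."""
    required = SUBJECT_RESTRICTIONS[verb]
    return required <= NOUN_FEATURES[subject]


print(satisfies_restriction("dog", "sleep"))   # True
print(satisfies_restriction("idea", "sleep"))  # False: ideas are not animate
```

The same check can help with disambiguation: a reading of a word that violates the verb's restriction can be ranked below one that satisfies it.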
13 Why is semantic interpretation assumed to be a 5 2 3 4 2.2.3
compositional process?
Ans: The central problem of semantic interpretation is
plain: people have no trouble understanding the
meanings of sentences in their language that they have
never heard before. Thus, it must be possible to
determine the meaning of a novel sentence on the basis
of the meanings of its component parts and the way
they are combined.
14 Define the following 5 1 3 1 2.2.2
a. Vector space model
b. Term frequency
c. Inverse document frequency
Ans:
Vector space model:
The vector space model is an algebraic model that
represents objects (like text) as vectors. This makes it
easy to determine the similarity between words or the
relevance between a search query and a document.
Cosine similarity is often used to determine the
similarity between vectors.
Term Frequency:
Term frequency (TF) means how often a term occurs in
a document. In the context of natural language, terms
correspond to words or phrases.
Inverse document frequency:
Inverse Document Frequency (IDF) is a weight that
reflects how rare a term is across a collection of
documents. The more documents a term appears in, the
lower its score, and the lower the score, the less useful
the term is for telling documents apart.
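The three definitions fit together as in the sketch below. It assumes one common formulation (relative term frequency and IDF = log(N / df)); other variants exist, and the three-document corpus is made up for the example.

```python
import math

# Tiny illustrative corpus (an assumption for this example).
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs are pets",
]
tokenized = [d.split() for d in docs]
vocab = sorted({w for doc in tokenized for w in doc})


def term_frequency(term, doc):
    # Relative frequency of the term within one document.
    return doc.count(term) / len(doc)


def inverse_document_frequency(term, corpus):
    # log(N / df): terms found in many documents get a low weight.
    df = sum(1 for doc in corpus if term in doc)
    return math.log(len(corpus) / df) if df else 0.0


def tfidf_vector(doc, corpus):
    # One dimension per vocabulary term, weighted by TF * IDF.
    return [term_frequency(t, doc) * inverse_document_frequency(t, corpus)
            for t in vocab]


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


vectors = [tfidf_vector(doc, tokenized) for doc in tokenized]
print(inverse_document_frequency("the", tokenized))  # lower: appears in 2 of 3 documents
print(inverse_document_frequency("cat", tokenized))  # higher: appears in 1 of 3 documents
print(cosine_similarity(vectors[0], vectors[1]))     # documents 1 and 2 share several terms
```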
Part - C
(2*10 = 20 Marks) Answer any two Questions.
15 Explain text summarization and multiple document text 10 1 2 1 1.6.1
summarization with a neat diagram
Ans: Text summarization is the process of generating a
short, fluent, and most importantly accurate summary of
a longer text document. The main idea behind automatic
text summarization is to be able to find a short subset of
the most essential information from the entire set and
present it in a human-readable format. As online textual
data grows, automatic text summarization methods have
the potential to be very helpful because more useful
information can be read in a short time.
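One simple way to "find a short subset of the most essential information" is extractive summarization: score each sentence and keep the best ones. The sketch below is a minimal word-frequency scorer, not a full summarization system; the stopword list, the sentence splitter, and the scoring rule are assumptions chosen for brevity.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"the", "a", "an", "of", "to", "is", "in", "and", "it", "that", "are", "for"}


def content_words(text):
    # Lowercase word tokens with stopwords removed.
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def summarize(text, num_sentences=1):
    # Split into sentences after ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(content_words(text))

    def score(sentence):
        # Average document-level frequency of the sentence's content words.
        tokens = content_words(sentence)
        return sum(freq[w] for w in tokens) / len(tokens) if tokens else 0.0

    # Pick the highest-scoring sentences, then restore their original order.
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:num_sentences])
    return " ".join(sentences[i] for i in chosen)


text = "Summarization shortens text. Summarization keeps key text content. Birds fly south."
print(summarize(text, 1))  # Summarization shortens text.
```

For multiple documents, the same scorer could be run over the concatenated texts, though a real system would also need to remove redundant sentences.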
16 Describe different ways of building belief models in a 10 3 3 1 1.7.1
conversational agent
1. Rule-based
2. Retrieval-based
3. Generative methods
4. Ensemble methods
5. Grounded learning
6. Interactive learning
17 Explain the vector space model of information retrieval 10 4 3 4 1.7.1
Ans:
The Vector Space Model is an algebraic model used for
Information Retrieval. It represents natural language
documents and queries formally as vectors in a
multi-dimensional space, so that the system can decide
which documents are similar to each other and which
are most relevant to a given query.
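A retrieval step under this model can be sketched as ranking documents by the cosine of the angle between each document vector and the query vector. The example below uses raw term counts as vector components for simplicity; TF-IDF weights could be substituted. The documents and query are made up for illustration.

```python
import math
from collections import Counter


def to_vector(text):
    # Sparse term-count vector: one dimension per distinct word.
    return Counter(text.lower().split())


def cosine(u, v):
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank(query, documents):
    # Score every document against the query, best match first.
    q = to_vector(query)
    scored = [(cosine(q, to_vector(d)), d) for d in documents]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


documents = [
    "information retrieval with vectors",
    "cooking pasta at home",
    "vector space model for information retrieval",
]
results = rank("vector space retrieval", documents)
for score, doc in results:
    print(round(score, 3), doc)
```

Note that "vectors" and "vector" count as different terms here; a real system would usually apply stemming or lemmatization before building the vectors.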
Course Outcome (CO) and Bloom’s Level (BL) Coverage in Questions
Question Paper Setter Approved by the Audit Professor/Course Coordinator
