1-QP KEY A1NLP Cat1 Key


SCHOOL OF COMPUTER SCIENCE AND ENGINEERING

Continuous Assessment Test - I, January 2020


B. Tech - Winter Semester 2019-20

Course Code : CSE4022 Duration : 90 Minutes


Course Title : Natural Language Processing Max. Marks : 50
Course Nbr : 2181,2183,2175,6629 Slot : A1

Answer all the questions


1. A. Define ambiguity. How is language ambiguous? (4)

Ambiguity is the presence of two or more possible meanings in a single passage.


Ambiguous language describes speech that doesn't have a singular meaning but represents different ideas,
objects, or individuals. This makes language more efficient. If we used one specific word for every concept,
object, or type of person then there would be too many words to make language easy to use.

B. Find the type of ambiguity (4)

i. John loved his son, and so did Sam.


ii. Mary ate a salad with spinach from California for lunch on Tuesday

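i. John loved his son, and so did Sam. Semantic ambiguity: Sam may have loved John's son or his own son.

ii. Mary ate a salad with spinach from California for lunch on Tuesday. Syntactic ambiguity: the
prepositional phrases ("with spinach", "from California") can attach to different parts of the sentence.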

C. What is meant by anaphoric ambiguity? (2)

Anaphoric ambiguity occurs when the text offers two or more potential antecedent candidates for a
referring expression, either in the same sentence or in a preceding one.

2. Assume that you are consulted by your local police station. They need insight into the types of crimes in
your area over the past years. There isn't a ready-made dataset available, but you are free to use the digital
archives of local newspapers. How will you apply the stages of NLP to the digital archive of newspapers? List
the different types of insights that you find as you subject the data to each stage of NLP. Augment your
answer with a diagram. (15)

3. A. Are numbers ubiquitous in all types of texts in every language? Justify with an example. (5)
Numbers are ubiquitous in all types of texts in every language, but their representation in the text can
vary greatly. For most applications, sequences of digits and certain types of numerical expressions,
such as dates and times, money expressions, and percentages, can be treated as a single token.
Example: March 26, $3.9 to $4 million, and Sept. 24 could each be treated as a single token.
Similarly, phrases such as 76 cents a share and $3-a-share convey roughly the same meaning, despite the
difference in hyphenation, and the tokenizer should normalize the two phrases to the same number of
tokens (either one or four). Tokenizing numeric expressions requires knowledge of the syntax of
such expressions, since numerical expressions are written differently in different languages.
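
The single-token treatment of numeric expressions described above can be sketched with a regular expression. This is only an illustration under stated assumptions: the pattern `TOKEN` and the function `tokenize_numeric` are hypothetical names, the rules cover English-style money ranges, month-day dates, and percentages only, and the month rule is loose enough to misfire on words such as "Mayor 5".

```python
import re

# Hypothetical, English-only token pattern (an assumption, not part of the answer key).
# Alternatives are tried left to right, so numeric expressions win over plain words.
TOKEN = re.compile(r"""
    \$\d+(?:\.\d+)?(?:\s+to\s+\$\d+(?:\.\d+)?)?(?:\s+(?:million|billion))?   # money, optional range and unit
  | \b(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*\.?\s+\d{1,2}  # month name or abbreviation plus day
  | \d+(?:\.\d+)?%                                                           # percentages
  | \w+                                                                      # ordinary words and bare numbers
  | [^\w\s]                                                                  # any single punctuation mark
""", re.VERBOSE)

def tokenize_numeric(text):
    """Return tokens, keeping money, date, and percent expressions whole."""
    return TOKEN.findall(text)

print(tokenize_numeric("March 26, $3.9 to $4 million, and Sept. 24"))
# ['March 26', ',', '$3.9 to $4 million', ',', 'and', 'Sept. 24']
```

A tokenizer for another language would need its own rules, since date order, currency symbols, and decimal separators differ.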

B. Define tokenization in sentence segmentation and do the tokenization for the sentence "God is Great! I
won the lottery". (4)

['God', 'is', 'Great', '!', 'I', 'won', 'the', 'lottery']
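
A minimal sketch of word-level tokenization for the sentence in the question, assuming a simple rule: runs of word characters form one token and each punctuation mark is its own token. The function name `tokenize` is illustrative only; library tokenizers apply many more rules.

```python
import re

def tokenize(text):
    """Split text into word tokens and single-character punctuation tokens."""
    # \w+ matches a run of letters, digits, or underscores;
    # [^\w\s] matches one character that is neither a word character nor whitespace.
    return re.findall(r"\w+|[^\w\s]", text)

print(tokenize("God is Great! I won the lottery"))
# ['God', 'is', 'Great', '!', 'I', 'won', 'the', 'lottery']
```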

C. Find your observation in the following sentence based on the punctuation ambiguities and brief about
them. (4)

Peter arrived in Singapore in January 1996, on his twenty-second birthday. Less than a year later, he had
married the boss's daughter Yi Ling. I'd like you to meet Mr.Mark Porter, Ms.Elizabeth Taylor, Capt. Eliot
Saunders and his wife Mrs.Saunders. I began teaching at UCLA on Mon. 29th Aug. 2018, after five years
with UNICEF.
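Periods and apostrophes. A period may end a sentence or mark an abbreviation (Mr., Ms., Capt., Mrs., Mon.,
Aug.); an apostrophe may mark a possessive (boss's) or a contraction (I'd).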
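
The period ambiguity can be illustrated with a small sentence splitter. This is a sketch under assumptions, not a complete segmenter: the `ABBREVIATIONS` set is a hand-picked hypothetical list, and a boundary is accepted only when sentence-final punctuation is followed by whitespace and a capital letter.

```python
import re

# Hypothetical abbreviation list (an assumption); a real system would use a larger lexicon.
ABBREVIATIONS = {"Mr.", "Ms.", "Mrs.", "Capt.", "Mon.", "Aug."}

def split_sentences(text):
    """Split on . ! ? followed by whitespace and a capital, skipping known abbreviations."""
    sentences, start = [], 0
    for match in re.finditer(r"[.!?]\s+(?=[A-Z])", text):
        candidate = text[start:match.start() + 1]
        # If the word carrying the period is an abbreviation, this is not a boundary.
        if candidate.split()[-1] in ABBREVIATIONS:
            continue
        sentences.append(candidate.strip())
        start = match.end()
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences

text = ("I'd like you to meet Mr. Porter and Capt. Eliot Saunders. "
        "I began teaching at UCLA on Mon. 29th Aug. 2018.")
for sentence in split_sentences(text):
    print(sentence)
```

With this input the splitter returns two sentences: the periods after Mr. and Capt. are skipped as abbreviations, and those after Mon. and Aug. are never candidates because a digit follows them.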

D. ASCII is the encoding standard used to represent alphanumeric characters in digital form. (1)

E. Text segmentation is the process of converting text corpus into its component words and sentences.
(1)

4. Explain in detail inflectional and derivational morphology with suitable detailed examples. (10)
