Corpus Ling Final Revision Spring 2021

Final Revision: Corpus Linguistics

Spring 2021

1. A researcher wants to know the most common figures of speech in
contemporary Arabic poetry. He should use ... corpus.
a. an annotated b. a raw c. a historical d. a multilingual

2. A POS tagger is ...
a. a corpus annotated for the grammatical classes of words.
b. a corpus annotated for the grammatical classes of phrases.
c. a computer program that annotates words for their grammatical
classes.
d. a computer program that annotates phrases for their grammatical
classes.

3. To guarantee maximum annotation accuracy, you had better
rely on ... annotation.
a. manual b. automatic

4. If a corpus has 1 million tokens and 350 thousand types, its lexical
richness rate will be ...
a. 350,000 / 1,000,000 b. 1,000,000 / 350,000
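
As a study aid, the relationship between tokens, types, and lexical richness can be sketched in a few lines of Python. This is a minimal illustration, assuming lexical richness means the standard type/token ratio and that tokens are obtained by lowercasing and splitting on whitespace; the function name `type_token_ratio` and the sample sentence are hypothetical and do not come from the exam.

```python
def type_token_ratio(text: str) -> float:
    """Return the number of types divided by the number of tokens.

    Assumption: tokens are lowercased, whitespace-separated strings.
    A real corpus tool would use a proper tokenizer.
    """
    tokens = text.lower().split()
    if not tokens:
        raise ValueError("text contains no tokens")
    types = set(tokens)  # unique word forms
    return len(types) / len(tokens)


# Hypothetical sample: 11 tokens, 8 types.
sample = "the cat sat on the mat and the dog sat too"
ratio = type_token_ratio(sample)
print(f"type/token ratio: {ratio:.3f}")
```

The ratio always lies between 0 and 1, and a value closer to 1 indicates less repetition in the text.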

5. The accuracy rate of a POS tagger equals ...
a. the number of correctly tagged words divided by the total number
of types
b. the number of correctly tagged words divided by the total number
of tokens
c. the number of mistakenly tagged words divided by the total
number of types
d. the number of mistakenly tagged words divided by the total
number of tokens
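
As a study aid, a tagger's accuracy rate can be checked against a manually tagged reference with a short Python sketch. This is an illustration only, assuming one tag per token and a gold-standard list of the same length; the function name `tagging_accuracy` and the tag labels are hypothetical.

```python
def tagging_accuracy(gold: list[str], predicted: list[str]) -> float:
    """Return the share of tokens whose predicted tag matches the gold tag."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have one tag per token")
    if not gold:
        raise ValueError("no tokens to evaluate")
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


# Hypothetical four-token sentence; the tagger gets the last token wrong.
gold_tags = ["PRON", "VERB", "DET", "NOUN"]
predicted_tags = ["PRON", "VERB", "DET", "VERB"]
accuracy = tagging_accuracy(gold_tags, predicted_tags)
print(f"accuracy: {accuracy:.2f}")
```

Every running word in the text counts once in the denominator, even if the same word form occurs many times.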

6. To date, there are no automatic corpus annotation tools for
Arabic.
a. true b. false

7. Which of the following research questions is the most accurate?
a. What topics did women discuss on Twitter?
b. What topics did Egyptian women discuss on Twitter?
c. What topics do Egyptian women discuss on Twitter in Arabic?
d. What topics do Egyptian women discuss on Twitter in Arabic
from January to July 2020?

8. One main purpose of corpus linguistics is to ...
a. answer future questions
b. find regular patterns across language
c. identify those who misuse grammatical rules

9. To compare Egyptian and Lebanese Arabic at the vocabulary
level, you need a ... corpus.
a. comparable b. parallel c. multilingual d. comparative

10. Word types are best defined as ...
a. unique non-duplicate words b. low-frequency words

11. Spoken data is more accessible than written data.
a. true b. false

12. A corpus of all Shakespearean sonnets is a ... corpus.
a. monolingual, general b. monolingual, specialized
c. monolingual, synchronic d. monolingual, comparable

13. To find all the derivations of ‘break’ in COCA, what should you
type in the search box?
a. =break b. break* c. [break] d. break.nn

14. How many languages do parallel corpora have?
a. exactly two b. two or more c. three or more d. only one

15. Lexicography depends on ... corpora.
a. general b. specialized

16. To get the synonyms of ‘hear’ in COCA, what should you type in
the search box?
a. hear* b. [hear] c. =hear d. “hear”

17. The scientific method is ...
a. intuitive and subjective b. experimental and subjective
c. intuitive and objective d. experimental and objective

18. To study the meaning changes of a word, you need a ... corpus.
a. synchronic b. diachronic

19. ‘I cannot do this anymore.’ How many word tokens are in this
sentence?
a. 7 b. 5 c. 3 d. 14

20. Translators make the best use of ... corpora.
a. parallel b. comparable
