Corpus Linguistics - Semantic


Corpus-based study of

Syaza Akila Binti Zahari, 1421428
Nur Aqileen Nabiha Binti Zarmi, 1423986
Arifah Binti Zakaria, 1413376
Nur Insyirah Binti Ishak, 1420586
Literature Review
Title Corpus-Based Semantic Filtering in Discovering Derivational Relations

Author Maciej Piasecki, Radosław Ramocki and Paweł Minda

Journal Artificial Intelligence: Methodology, Systems and Applications.


Objectives To develop a method for the classification of Polish word pairs that are derivationally
associated into a set of semantic derivational relations.

Methodology Corpus: Polish

- used Derywator, a morphological recogniser of Polish derivatives.
- semantic classification of the potential derivational pairs.

Findings - Semantic classifiers produce better results than Derywator.

- The semantic classification significantly outperformed
classification based only on word-form analysis.
Research Questions
1. What is corpus-based semantics?
2. What are the things that people look at in studying
corpus-based semantics?
3. What are some of the corpora that people use to study
semantics?
4. How do people use semantics in their daily lives?

1. What is corpus-based semantics?
Corpus-based - studies that involve the investigation of corpora,

i.e. collections of (pieces of) texts that have been gathered according to specific criteria and are
generally analysed automatically.

Semantics - the study of the meanings of words and phrases in a language

- a sub-discipline of Linguistics
- Tries to understand what meaning is as an element of language and how it is constructed by
language as well as interpreted, obscured and negotiated by speakers and listeners of

Corpus-based research assumes the validity of linguistic forms and structures derived from
linguistic theory. The primary goal of research is to analyse the systematic patterns of variation
and use for those pre-defined linguistic features.
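As a minimal sketch of what analysing the use of pre-defined linguistic features can look like in practice, the Python example below counts how often two chosen word forms occur in a tiny corpus. The sentences, the feature set ("begin" and "start") and the function names are all invented for illustration; a real study would work with a much larger corpus gathered according to specific criteria.

```python
import re
from collections import Counter

# Toy corpus: invented sentences standing in for a real text collection.
TOY_CORPUS = [
    "They begin the meeting at nine.",
    "We start work early and start again after lunch.",
    "The show will begin soon.",
]

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def feature_counts(corpus, features):
    """Count how often each pre-defined word form occurs in the corpus."""
    counts = Counter()
    for document in corpus:
        for token in tokenize(document):
            if token in features:
                counts[token] += 1
    return counts

# The features are fixed in advance, as in a corpus-based (not corpus-driven) study.
counts = feature_counts(TOY_CORPUS, {"begin", "start"})
print(counts["begin"], counts["start"])
```

The feature set is chosen before looking at the data, which mirrors the corpus-based assumption that the linguistic categories come from theory and the corpus is used to describe their patterns of use.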
2. What are the things that people look at in studying corpus-based semantics?
-Semantic Similarity Based on Corpus Statistics and Lexical Taxonomy

presents a new approach for measuring semantic similarity or distance between words and

-Corpus-Based Cognitive Semantics: A Contrastive Study of Phrasal Verbs in English and Russian

analyzed five verbs that express "begin" in English and Russian

English: begin, start. Russian: načinat'/načat', načinat'sja/načat'sja and stat'

-Distributional Memory: A General Framework for Corpus-Based Semantics
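
The studies listed above share the idea that the meaning of a word can be estimated from corpus statistics. A minimal sketch of that idea in Python is given below: each word is represented by the counts of the words that occur near it, and two words are compared with cosine similarity. This is a simplified window-based illustration and not the method of any of the cited papers; the sentences, the window size and the function names are assumptions made for the example.

```python
import math
from collections import Counter, defaultdict

# Toy corpus: invented sentences, far too small for real conclusions.
TOY_SENTENCES = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases the dog",
    "the car needs fuel",
]

def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words found within `window` positions."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i, target in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[target][tokens[j]] += 1
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

vectors = cooccurrence_vectors(TOY_SENTENCES)
sim_cat_dog = cosine(vectors["cat"], vectors["dog"])
sim_cat_car = cosine(vectors["cat"], vectors["car"])
print(round(sim_cat_dog, 3), round(sim_cat_car, 3))
```

In this toy corpus "cat" and "dog" share more context words than "cat" and "car", so the first similarity comes out higher. With real corpora the raw counts are usually reweighted before comparison.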

3. What are some of the corpora that people use to study semantics?

Linguistic corpora are usually mixed collections of texts.

For example, collections of newspaper articles, book chapters, web pages and conversations.

Stone and Dennis (2011) used two types of corpora in their semantic research. The first is
TASA (Touchstone Applied Science Associates), which provides texts on a diverse range of topics
as well as novels and newspaper articles. The second is drawn from the Wikipedia encyclopedia.
4. How do people use semantics in their daily lives?

- Language is acquired initially by replicating sounds for verbal speech and
replicating images for written speech (Langacker 1987).
- These sounds and images need to be assigned meaning, and this is where
semantics comes in.
- A lot of the meaning attached to language is bestowed through inferences.
- Human beings write things, and the reader infers the meaning of the text
based on the information available to him or her.
Language use in:

1) Face-to-face conversation
- Social interaction that allows us to detect body language, feelings, tone and reactions.
2) Online conversation
- Interaction over the Internet, including e-mail, instant messaging (IM), feedback on blogs, contact
forms on websites, industry forums, chat rooms and social networking sites.
3) Search Engine Optimization (SEO)
- Focused on optimizing a business's online presence so that its web pages will be displayed by
search engines when a user enters a local search for its products or services.
References

Baroni, M., & Lenci, A. (2010). Distributional Memory: A General Framework for Corpus-Based
Semantics. Association for Computational Linguistics.

Biber, D. (2009, December 17). Corpus-Based and Corpus-Driven Analyses of Language Variation and Use. In
The Oxford Handbook of Linguistic Analysis. Oxford University Press. Retrieved December 3, 2017, from

Divjak, D., & Gries, S. Th. (2009). Corpus-Based Cognitive Semantics: A Contrastive Study of Phrasal
Verbs in English and Russian.

Jiang, J. J., & Conrath, D. W. (1997). Semantic Similarity Based on Corpus Statistics and Lexical Taxonomy.
Proceedings of International Conference Research on Computational Linguistics (ROCLING X).

Quinci, C. (2015). Defining and Developing Translation Competence for Didactic Purposes: Some
Insights from Product-Oriented Research. In Y. Cui & W. Zhao (Eds.), Handbook of Research on
Teaching Methods in Language Translation and Interpretation (pp. 179-198). Hershey, PA: IGI Global.

Semantics. (n.d.). Retrieved December 3, 2017, from https://www.merriam-

Stone, B., & Dennis, S. (2011). Semantic models and corpora choice when using Semantic Fields to
predict eye movement on web pages. International Journal of Human-Computer Studies, 69(11), pp.
720-740. Retrieved December 2, 2017, from

Barbu, E., Baroni, M., Murphy, B., & Poesio, M. (2009). Strudel: A Corpus-Based Semantic Model Based
on Properties and Types. Cognitive Science: A Multidisciplinary Journal, 34(2), pp. 222-254.
Retrieved December 2, 2017, from
