
Annotated Bibliography

Shade, Benjamin, and Eduardo G. Altmann. "Quantifying the Dissimilarity of Texts." Information, vol. 14, no. 5, 2023, p. 271. ProQuest Central Student; Publicly Available Content Database, https://doi.org/10.3390/info14050271.

Published on May 2, 2023, "Quantifying the Dissimilarity of Texts" by Benjamin Shade and Eduardo Altmann evaluates how well different measures quantify the dissimilarity between texts. The article primarily compares the Jaccard distance and the Jensen-Shannon divergence against sentence embeddings created with the all-MiniLM-L6-v2 model, testing all three on three tasks: clustering texts by author, by subject, and by time period. The Jensen-Shannon divergence performed strongly across all three tasks, the vector embeddings also performed well, and the Jaccard distance was the least effective. The article discusses language models at a fairly advanced level, giving it a higher barrier to entry than the other articles I have reviewed. Its techniques for quantifying dissimilarity are very useful for comparing a translated text against a reference to score the translation's accuracy, which in turn can be used to fine-tune the models involved. However, while the information is very useful, I wouldn't annotate this article yet, as it is too advanced for the moment.
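To make the two baseline measures concrete, here is a minimal Python sketch of how the Jaccard distance and the Jensen-Shannon divergence between two texts could be computed; the whitespace tokenization and the example sentences are my own simplifications for illustration, not details taken from the article.

from collections import Counter
import math

def jaccard_distance(text_a, text_b):
    """Jaccard distance: 1 minus the overlap between the two texts' word sets."""
    a, b = set(text_a.lower().split()), set(text_b.lower().split())
    return 1.0 - len(a & b) / len(a | b)

def jensen_shannon_divergence(text_a, text_b):
    """JSD between the word-frequency distributions of the two texts (in bits)."""
    ca, cb = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    vocab = sorted(set(ca) | set(cb))
    total_a, total_b = sum(ca.values()), sum(cb.values())
    p = [ca[w] / total_a for w in vocab]
    q = [cb[w] / total_b for w in vocab]
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]

    def kl(x, y):
        # Kullback-Leibler divergence; zero-probability terms contribute nothing.
        return sum(xi * math.log2(xi / yi) for xi, yi in zip(x, y) if xi > 0)

    return (kl(p, m) + kl(q, m)) / 2

source = "the cat sat on the mat"
candidate = "the cat is sitting on the mat"
print(jaccard_distance(source, candidate))           # ~0.43
print(jensen_shannon_divergence(source, candidate))  # 0 (identical) to 1 (disjoint)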

Son, Jungha, and Boyoung Kim. "Translation Performance from the User's Perspective of Large Language Models and Neural Machine Translation Systems." Information, vol. 14, no. 10, 2023, p. 574. ProQuest Central Student; Publicly Available Content Database, https://doi.org/10.3390/info14100574.

Published on October 19, 2023, "Translation Performance from the User's Perspective of Large Language Models and Neural Machine Translation Systems" by Jungha Son and Boyoung Kim compares the translation abilities of a large language model, ChatGPT, with those of the neural machine translation systems Google Translate and Microsoft Translator, using corpora from the Workshop on Machine Translation (WMT) as benchmarks. From this article, I could annotate the parts on the metrics used to compare the systems, namely their BLEU, chrF, and TER scores, as well as their performance on specific language pairs, to understand how language models are used in translation and how their capabilities are graded and scored. Skimming through the article, I could also note down the translation techniques it details, along with the background knowledge I need to further build my foundation. Finally, I will definitely read and annotate this article, as it is incredibly useful for building up the foundations of my Senior Project before I dive into higher-level knowledge, such as the specific techniques it also covers.
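As a concrete illustration of the three metrics Son and Kim report, here is a minimal Python sketch using the sacrebleu library; the hypothesis and reference sentences are invented placeholders, not data from the article.

# Scoring system translations with BLEU, chrF, and TER via sacrebleu
# (pip install sacrebleu). Example sentences are invented placeholders.
from sacrebleu.metrics import BLEU, CHRF, TER

hypotheses = ["The cat is sitting on the mat.", "He reads a book every night."]
references = [["The cat sat on the mat.", "He reads a book each night."]]

bleu, chrf, ter = BLEU(), CHRF(), TER()
print(bleu.corpus_score(hypotheses, references))  # n-gram overlap; higher is better
print(chrf.corpus_score(hypotheses, references))  # character n-gram F-score; higher is better
print(ter.corpus_score(hypotheses, references))   # translation edit rate; lower is better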

Zhu, Wenhao, et al. "Multilingual Machine Translation with Large Language Models: Empirical Results and Analysis." Publicly Available Content Database, 2023, www.proquest.com/working-papers/multilingual-machine-translation-with-large/docview/2799277250/se-2?accountid=41498.

Published on April 10, 2023, with the current version revised as of October 29, 2023, "Multilingual Machine Translation with Large Language Models: Empirical Results and Analysis" by Wenhao Zhu and others addresses two primary questions: "1) How LLMs perform MMT over massive languages?" and "2) Which factors affect the performance of LLMs?" The article first compares the performance of different LLMs, including ChatGPT and LLaMA2-7B, at translating from English into other languages, measuring them against Google Translate and concluding that general-purpose LLMs still have a long way to go compared to a dedicated translation system like Google Translate. The article then turns to the second question by identifying scenarios where LLMs excel or struggle. This article is incredibly useful to me because it shows the capabilities of different LLMs across different areas, along with their strengths and weaknesses in translation. I will annotate this source, as it will prove incredibly useful for the foundations of my research while also pointing out different directions I should explore and study further.
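To make concrete how this kind of zero-shot LLM translation is typically invoked, here is a minimal Python sketch; the openai client usage, model name, and prompt wording are my own assumptions for illustration, not the paper's exact setup.

# Minimal sketch of zero-shot LLM translation, roughly the kind of query
# such comparisons run. Model name and prompt are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def translate(sentence: str, src: str = "English", tgt: str = "German") -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model, not the paper's exact configuration
        messages=[{
            "role": "user",
            "content": f"Translate the following sentence from {src} to {tgt}:\n{sentence}",
        }],
    )
    return response.choices[0].message.content

print(translate("The weather is nice today."))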
