INTERNATIONAL UNIVERSITY
SCHOOL OF LANGUAGES
February/2023
VIETNAM NATIONAL UNIVERSITY – HO CHI MINH CITY
INTERNATIONAL UNIVERSITY
SCHOOL OF LANGUAGES
DECLARATION OF
UNDERGRADUATE THESIS PROJECT PAPER AND COPYRIGHT
Student: “I hereby certify that the attached material is my original work. No other
person’s work or ideas have been used without acknowledgement. It has not been
submitted, either wholly or substantially, for a degree in this university or elsewhere.”
______________________________
Signature of student
______________________________
Student’s ID No.
Date: 02/2023
Supervisor: “I hereby declare that I have read this thesis project paper and in my
opinion, this paper is sufficient in terms of scope and quality for the award of the
Degree of Bachelor of Arts in English Linguistics and Literature.”
______________________________
Signature of Supervisor
______________________________
Name of Supervisor
Date: 02/2023
VIETNAM NATIONAL UNIVERSITY – HO CHI MINH CITY
INTERNATIONAL UNIVERSITY
SCHOOL OF LANGUAGES
THESIS APPROVAL
by
Tran Ngoc Hong Phuc
APPROVED:
__________________________________________ _______________
Thesis Supervisor Approval Date
Vo Thanh Nga
__________________________________________ ____________
Thesis Reviewer Approval Date
Vu Hoa Ngan
__________________________________________ _______________
Dean Approval Date
Nguyen Huy Cuong
Acknowledgement
coming to an end after this thesis. Needless to say, I was not able to finish this paper
on my own. In fact, there were a lot of people supporting me throughout this semester
supervisor, Ms. Vo Thanh Nga, M.A for her patience and dedication to my paper
supervision. It was my greatest honor to work with her, as I learnt a great deal
from her in many aspects. Her guidance enabled me to structure my ideas, form the
content from the smallest details and finally complete this paper on time. She always
expressed her professionalism throughout the time we were working together on this
Secondly, I would like to express my gratitude to Ms. Vu Hoa Ngan for her
Department for their lectures and helpful guidance during my coursework and data
collecting.
Dang Hoai Phuong for all the resources needed for my thesis and remote support.
Moreover, I would like to thank Long N., my friend, who gave me technical
support in collecting and decoding data. Also, many thanks to the 30
participants, all of whom were complete strangers to me in the first place, for their
Finally, I also want to send my love to my family for their indirect support and
EL046IU: Thesis
Author Note
eneniu18115@student.hcmiu.edu.vn
Table of contents
THESIS APPROVAL
APPROVED:
Abstract
Introduction
Literature Review
  Research gap
Methodology
  Study design
  Participants
  Materials
  Recording method
  Research instrument
    Perception test
    Production test
  Data collection procedures
  Data analysis
    Perception test
    Production test
Findings
  Perception test
  Production test
Discussion
Conclusion
References
Appendices
List of Tables
List of Figures
Abstract
In order to successfully communicate in a language, an individual must be able to
both perceive and produce that language. This study investigated how
initial aspirated plosives /p-t-k/. This research employed both quantitative and
participate in the study. The production and perception tests were used as the
instruments of this research. Also, the software Praat was employed to analyze the
recorded samples from production tests. In terms of perception test, the stimuli test
perceive the investigated sounds. The outcomes of both tests have revealed that the
bilabial /p/ sound was the most problematic while the alveolar /t/ would not be a
problem to Vietnamese students. Furthermore, the sound with the greatest mean VOT
value generated from isolated words was /t/ (85ms), followed by /k/ and /p/ with 83ms
and 65ms, respectively; however, in utterances, the maximum VOT can be found in
the velar stop /k/ (91ms), with 77ms and 60ms for /t/ and /p/. The findings of the
study are expected to contribute to the literature for the perception and production of
these English initial consonants in the Vietnamese context and further pedagogical
Introduction
Pronunciation makes a significant contribution to the success of a language
as well as present their speeches in a comprehensive way for the others (Gilakjani &
Sabouri, 2016). In other words, both perception and production aspects must
contribute to speech intelligibility. However, both perception and production remain
challenges that non-native speakers, especially non-major speakers, face
in their target language (Ben, 2005). Moreover, ‘pronunciation emerged as by far the
greatest factor in unintelligibility, and the difficulty tended to increase with the gap
languages can create a plethora of difficulties for learners in the process of second
language acquisition (Dost & Bohloulzadeh, 2017). In the case of English and
exist in Vietnamese, such as /θ/ or /ð/, poses challenges for certain Vietnamese
Not only are sounds that do not exist in the speaker’s native language, for instance /θ/
or /ð/, mispronounced by Vietnamese speakers, but even sounds that are
close to those existing in their first language can keep language learners from
pronouncing English Vietnamese and English share some consonants in the initial
position (Tang, 2007). According to Truong (2015), as exemplified in the study, the
English sounds /ʧ/ and /C/ (or /Ch/) are pronounced quite similarly by Vietnamese speakers
even though they are not shared sounds. In addition, Hoang (1970) noted that the
replacement of /ʧ/ by /C/ occurred the second most frequently among those due to the
confusion by the informants. As a result, not only are non-shared sounds between
languages confusing but sounds that "appear" to be shared sounds are also disruptive
Although fricative and affricate sounds have been studied in detail,
insufficient attention has been paid to plosives. With regard to the aspirated
plosives /p, t, k/, there is a vast difference between English and Vietnamese. In
English, initial plosive sounds /p, k/ may be aspirated depending on context. On the
other hand, according to Dinh & Nguyen (1998), /p/ and /k/ are unaspirated or
difficulties in pronouncing aspirated plosives since the students’ English accent has
sounds in both languages, the English aspirated plosives are chosen to be investigated
Literature Review
English plosive consonants
Generally, there are two types of plosives: voiced and voiceless. Each type is
buzz during articulatory closure phase indicates the voiced stops while the absence of
buzz throughout this time implies voiceless stops. Acoustically, these two types of
On the other hand, the closing interval for voiceless stops is virtually blank. However,
for English, those physical rules separating the two categories work only in part due
to the fact that in the initial position, both sets are commonly produced with silent
separates /p t k/ from /b d g/ (this attribute works in the initial position and medially
At the beginning of the syllable, these consonants are released with a small
explosion. Air escapes through the vocal cords at the post-release period, producing a
sound similar to /h/. This is referred to as aspiration. Then the vocal cords join
together and form a vowel. Phonological studies have shown that the listener
perceives an initial plosive as voiceless when there is a delay
between the plosion and the beginning of the vowel (Awoonor-Aziaku, 2021). In
short, aspiration is a short frication noise that occurs before vowel formants and lasts
around 30ms.
unaspirated voiceless stop is used. This research focuses on the stops appearing at the
beginning of a word. Voiceless stop consonants are frequently articulated with
this extra puff of air when they are in word-initial position. The aspirated sound
Roach (2009, pp. 26-27) lists out the properties, for instance, place of
               PLACE OF ARTICULATION
VOICING        Bilabial    Alveolar    Velar
Voiceless      p           t           k
(Roach, 2009, pp. 26-27)
The closure phase: the articulators move towards each other, make firm
The hold phase: the air stream is temporarily stopped at the place of
articulation (lips for /p/, alveolar ridge for /t/, soft palate for /k/), so air pressure builds up behind the
closure.
The release phase: the speech organs separate abruptly and release the closure,
thus allowing the compressed air to escape quickly with a slight plosion.
voiceless plosives.
(Roach, 2009)
In this study, we only focus on voiceless aspirated plosives in the initial position
Concerning the manner of articulation and voicing, /p/, /t/, /k/ in English and
Vietnamese are all considered voiceless stops. However, it should be noted that /p, k/
in Vietnamese are unaspirated. Moreover, slight differences are found in the place of
articulation of these sounds: the consonant /t/ in English is an alveolar sound,
while /t/ in Vietnamese is a dental sound. Meanwhile, /p/ is a
bilabial sound and /k/ is a velar sound in both English and Vietnamese.
categories of stops on each place of articulation, which are /b/-/p/ for labial sounds,
/d/-/t/ for alveolar sounds, and /g/-/k/ for velar sounds. Due to the fact that the
aspiration.
On the other hand, as regards Vietnamese, in addition to the /tʰ/ sound (as in
‘thôi’), which is relatively the same as the aspirated /t/ in English, there is also a
voiceless unaspirated phoneme /t/ as in ‘tôi’ (‘me’). Additionally, there are two
Vietnamese phonemes that are easily misunderstood as /k/ sound in English, which
are the velar voiceless unaspirated /c/ as in ‘kiến’ (‘ant’) and the voiceless fricative /x/
as in ‘không’ (‘no’). One related voiced sound is the post-velar fricative /ɣ/ as in ‘ghế’
(‘chair’), which will not be taken into account as it is not regarded as a stop.
Therefore, the voiceless aspirated plosive /k/ sound is not found in the Vietnamese
participants. Moreover, only in loanwords does the sound /p/ occur in the first
‘pin’ (derived from French ‘pile’). However, in some cases, it will be replaced by a
‘pháo’ (‘firework’).
Summarized in the table below are all the Vietnamese sounds relating to
English plosives:
(IPA/Vietnamese 2022)
the participants will struggle with voiceless aspirated /k/ and voiceless aspirated /p/
since the Vietnamese language does not consist of a sound with similar manner nor
interval (in milliseconds) between the release of a stop consonant and the onset of
A VOT of type 3 is considered “negative” when the start of the voicing occurs
prior to the release of the stop. This is typical for voiced stops, whose voice onset
times are typically less than zero (Kaur, 2015). In contrast, a “positive” VOT, which
indicates a voiceless stop, occurs when the voicing begins after the stop has been
released (which means after the burst), resulting in a “voice lag”. The length of this
voice lag may vary depending on whether the voiceless stop is produced with
aspiration (type 1) or without aspiration (type 2). Moreover, a short voice lag that
takes place simultaneously or just after the burst can be referred to as “zero VOT”,
voiceless unaspirated sounds have a VOT of around zero, whereas aspirated sounds
have a positive VOT (Styler, 2012). Fundamentally, VOT is a key feature in the
from Lisker and Abramson's (1964) cross-linguistic research of Voice Onset Time
First, aspirated stops such as /p/, /t/, and /k/ have a mean VOT in the range of
60-100ms, which indicates a long voice lag. In contrast, the mean VOT of /b/, /d/
and /ɡ/ shows some diversity. In most of the cases, they have the features of an
unaspirated voiceless stop with a very short voice lag, and VOT values ranging from
0 to 25ms (Auzou et al., 2000). Only in some cases are they fully voiced stops with
negative VOT values within the range of -125ms to -75ms. This is backed up by
Roach (2009) as he asserts that the key factor in differentiating between /p/, /t/, /k/
and /b/, /d/, /g/ in English is not voicing, but aspiration, especially when they are in
the initial position. According to Roach, it is unnatural to produce the initial plosive
/b/, /d/, /g/ in a fully voiced manner. As a result, this study separates the plosives into
two categories: unaspirated plosives, whose VOT ranges from below 0 to 25ms, and aspirated
Second, in terms of place of articulation, English velar stops have the highest
VOT, whereas English bilabial stops have the lowest. Auzou et al. (2000) believe that
an offset must be more than 15ms for /t/ and more than 30ms for /k/ to classify it as
Finally, in terms of the placements of the stops that are initial in isolated
words, initial in phrases, and medial in sentences, the VOT of the stops in sentences is
shorter than that of the stops in isolated words, demonstrating "temporal compression
Previous research
There are several studies that also investigated plosive sounds. Muis (2008)
conducted a study that focused solely on the voiced plosives /b, d, g/. Apparently, his
study was restricted to the ability to produce voiced plosives, as he only tested his
distinguish sounds in 63 minimal pairs including two pairs of plosives (/p/ vs /b/; /t/
vs /d/) in initial positions. The subjects were asked to do the perception test by
listening and choosing the words they hear. Then, in the production test, they were
asked to repeat after the recording. The results of both tests indicate that unvoiced
sounds are produced more accurately than their voiced counterparts, which was presumed
to be due to the difference in the sound systems of the L1 (Austrian German) and the
L2 (English). Also, the study showed that a highly accurate perception does not entail a
University of Languages and International Studies, who had finished four years of
English and took part in the final exam. The oral examination was taken as the
production test. Sound confusion is also detected in this study, with the percentage of
participants confusing /t/ with /ʧ/ and /p/ with /b/ standing at 25.5% and 17.6%,
respectively. Also, the researcher mentioned the similarities and differences between
plosives.
variety of data-gathering methods were utilized during the study, including recording
notes. Specifically, the replacement of /p/ with /b/ or /f/ occurred in 26% of
participants. Also, the researcher has argued that the Vietnamese phoneme system is
responsible for the mispronunciation of /p/ as /f/ or /b/ due to the fact that /p/ does not
Research gap
topics, research on both the perception and production of English sounds in Vietnam
incorporate acoustic analysis, which can provide a more objective perspective on the
scarcity. Therefore, this study, with an aim to cover the gap left by previous research,
will analyze both perception and production aspects. This study aims to seek answers
Methodology
Study design
In this study, participants complete two perception tests and two
production tests. This research takes a mixed approach, combining quantitative and
qualitative methods. The descriptive qualitative approach takes precedence over the other.
because it measures the test scores of both perception and production tests.
Participants
A total of thirty students from all majors enrolled in the 2022-2023 academic year
at International University are randomly chosen for this study. All members
from both groups are guaranteed to be at the same level of English proficiency by
intermediate level or at band 5.5 IELTS) in the first semester (based on the result of
Placement test held by International University). These criteria are set in order to
Materials
use- Elementary level (Marks, 2007) are the key sources for assessment tasks. Audio
files are extracted from the Pronunciation in Use and Cambridge online dictionary
Recording method
smartphone (iPhone only), and then analyzed with Praat at a sampling rate of 22,050 Hz,
Research instrument
The research instruments of this study are production and perception tests.
Perception test
In this study, the perception test is a listening exam. The purpose of the exam
The perception exam is divided into two parts. (See Appendix B.)
Part 2: Listen and circle the word the same as the last word.
discrimination test (Part 2), which followed the ABX format (Liberman, Harris,
participants were required to listen to the target word and respond to a multiple-choice
question about the consonant the word began with. Six minimal pairs are included in
the exam. Meanwhile, Part 2, the Categorical Discrimination Test (CDT), is adapted
from the ABX format (Liberman et al., 1957) and consists of 6 questions presenting a
two tokens in the triad relate to the same word, but the other refers to a separate
consonants occurring in the triads. Test takers have to decide whether the first
consonant of the third word in the sequence is (a) the same as the initial consonant of
the first word in the triad or (b) the same as the initial consonant of the second word in
the triad.
belonging to one of two categories, but listeners in discrimination tasks hear three
sounds and must decide whether the first or second word is the same as or different from
Production test
The production test consists of two tasks including the word read-aloud task in
which participants are asked to pronounce each isolated word three times, and each
1. The researcher asked for the permission from teachers for collecting data in the
3. The perception tests are handed out to students as the participants in the
classroom.
4. All the participants have to complete the perception tests. There will be a total of
12 items for two parts, and it will take around two minutes.
5. After finishing the perception tests, the participants are orally instructed to do a
production test. The students record themselves reading the words from the list
on their own phones (iPhone only). There are a total of nine target
sounds (including isolated words and words in utterances), and the test takes around
two minutes.
Data analysis
Perception test
There are 12 questions in total across the two tests. An answer is marked “correct” if it
matches the provided answer key; otherwise it is marked “incorrect”. Moreover,
the percentage of incorrect answers for the target sounds will be calculated.
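The scoring rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `error_rates` and the sample answers are hypothetical and are not taken from the actual answer key or from participant data.

```python
from collections import Counter


def error_rates(responses, answer_key, targets):
    """Return the percentage of incorrect answers for each target sound.

    responses  -- answers given by a participant, one per question
    answer_key -- the expected answer for each question
    targets    -- the target plosive that each question tests
    """
    if not (len(responses) == len(answer_key) == len(targets)):
        raise ValueError("all three lists must have the same length")
    totals = Counter()
    errors = Counter()
    for response, correct, target in zip(responses, answer_key, targets):
        totals[target] += 1
        if response != correct:  # marked "incorrect" when it differs from the key
            errors[target] += 1
    return {sound: 100.0 * errors[sound] / totals[sound] for sound in totals}


# Hypothetical six-item example: two questions per target sound.
answer_key = ["p", "b", "t", "d", "k", "g"]
targets = ["p", "p", "t", "t", "k", "k"]
responses = ["b", "b", "t", "d", "k", "g"]  # /p/ misheard as /b/ once

rates = error_rates(responses, answer_key, targets)
print(rates)
```

In this made-up example, one of the two /p/ items is answered incorrectly, so /p/ has a 50% error rate while /t/ and /k/ have none.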
Production test
After the data is collected, the participants’ audio recordings will be analyzed using
Praat, focusing on Voice Onset Time (VOT) indicators. The beginning and the end of
In order to obtain the VOT values of the target stops, the two time points were
located on their waveforms. The voicing onset was located at the beginning of the
periodic wave on the waveform, and the burst was determined as the presence of the
first spike which signals the sudden change of noise caused by the stop release.
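As a minimal sketch of the measurement just described, VOT can be expressed as the voicing onset time minus the burst time. The function name `vot_ms` and the time stamps below are hypothetical; in the study the two points were located manually on the waveform in Praat.

```python
def vot_ms(burst_time_s, voicing_onset_s):
    """Voice onset time in milliseconds.

    burst_time_s    -- time of the first spike of the stop release (seconds)
    voicing_onset_s -- time at which the periodic wave begins (seconds)
    A positive value means voicing starts after the burst (voice lag);
    a negative value means voicing starts before the burst.
    """
    return (voicing_onset_s - burst_time_s) * 1000.0


# Hypothetical time points read off two waveforms.
lagged = vot_ms(burst_time_s=0.512, voicing_onset_s=0.597)
prevoiced = vot_ms(burst_time_s=0.512, voicing_onset_s=0.489)

print(round(lagged), round(prevoiced))
```

The first token has a long positive lag, as expected of an aspirated stop, while the second has a negative VOT because voicing begins before the release.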
Findings
Perception test
The results of the two perception tests are shown in Figure 7 in which the
horizontal axis shows the mentioned plosives in the two tests while the vertical axis
Overall, the /p/ sound is by far the most confusing among the three targeted
plosives in the perception test. The /p/ sound accounted for 8 errors in Test 1
and 20 errors in Test 2. On the other hand, only one error was detected
for the /t/ sound in Test 1, with no errors occurring in Test 2. As a result, bilabial /p/
which has an anterior position of articulation shows the highest rate of error whereas
velar plosive /k/, the most posterior sound, shows a lower rate of error and alveolar
The nature of perceptual errors also reveals a phenomenon since the direction
voiceless plosive is misheard as a voiced one that has the same place of articulation. The /p/ sound
is systematically misheard as /b/, and the /k/ sound as /g/. However, there is hardly a
The perception test of plosives shows that there are individual differences in
perceptual ability. For instance, there were students who only made one mistake in the
entire test whereas there were students who made as many as five mistakes. Such
Production test
       /k/ aspirated   /k/ unaspirated   /t/ aspirated   /t/ unaspirated   /p/ aspirated   /p/ unaspirated
Av     83              23                85              -117 / 18         65              -23; 18
N      120             2                 122             3                 74              52
The findings for the initial stops of isolated words are presented in Table 2.
The results for each sound of /p/, /t/ and /k/ are divided into two groups, which are
aspirated and unaspirated. The first data row provides the average values of the VOT
for each category. The second row displays the range of value observed, and the third
As seen in Table 2, the mean VOTs of /k/ and /t/ are 83ms and 85ms,
respectively, while the mean VOT of /p/ is 65ms. Among the investigated voiceless
aspirated plosives, the alveolar stop /t/ has the highest VOT, while the bilabial stop /p/
displays the lowest. In regard to the VOT range, the voiceless aspirated /k/ exhibits a
wider range from 29ms to 156ms, compared to the aspirated /t/ and /p/, which present
each stop phoneme reveals that there are 52 cases of aspirated sounds /p/ that are
the number of mispronounced tokens for /t/ and /k/ are significantly lower, which are
       /k/ aspirated   /k/ unaspirated   /t/ aspirated   /t/ unaspirated   /p/ aspirated   /p/ unaspirated
R      34 : 151        ///               31 : 123        ///; ///          27 : 141        -107 : -4; 8 : 24
N      43              1                 42              2                 32              12
Table 3 presents the findings for the initial stops of words in utterances.
The values presented (consisting of Av, R, and N) closely resemble those of Table 2.
In particular, the mean VOTs of /k/, /t/, and /p/ are 91ms, 77ms, and 61ms, respectively.
The highest VOT can be found in the velar stop /k/,
whereas the lowest belongs to the bilabial stop /p/. The VOT range of the aspirated /k/
sound, with a range of 34-151ms, surpasses the VOT ranges of the aspirated /t/ and /p/
By examining the number of tokens for each stop phoneme in Table 3, it was
found that 12 of the 44 aspirated tokens of the /p/ sound were pronounced incorrectly
as unaspirated, which constitutes 27.3% of the total. Meanwhile, 2 out of the total 44
tokens of the /t/ sound were mispronounced (accounting for 4.5%) and 1 token of
With the data extracted from Table 2 and Table 3, Figure 3 provides the
utterances,
The data presented in the table indicate that the mean VOT of the plosive in
isolated words is greater than that of the same plosive in utterances, with the
exception of /k/. Specifically, while the mean VOT of the aspirated /k/ in isolated
The results of the test suggest that the /p/ sound is the most problematic for
value is recorded for the velar stop /k/ (followed by the alveolar stop /t/), whereas the
lowest value is seen in the bilabial stop /p/. Conversely, in isolated words, the alveolar
/t/ registers the highest mean score, with the bilabial /p/ maintaining the lowest figure.
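The comparison between the two contexts can be reproduced with a short Python sketch. The mean values are those reported for Table 2 and Table 3 in the text (the /p/ value in utterances is taken as 61ms, as stated for Table 3); the variable and function names are illustrative only.

```python
# Mean VOT (ms) of the aspirated tokens, as reported in the text.
isolated = {"p": 65, "t": 85, "k": 83}   # Table 2: isolated words
utterance = {"p": 61, "t": 77, "k": 91}  # Table 3: words in utterances


def rank_by_vot(means):
    """Order the sounds from the longest to the shortest mean VOT."""
    return sorted(means, key=means.get, reverse=True)


# Positive difference: longer VOT in utterances than in isolated words.
difference = {sound: utterance[sound] - isolated[sound] for sound in isolated}

print(rank_by_vot(isolated))   # order in isolated words
print(rank_by_vot(utterance))  # order in utterances
print(difference)
```

The sketch makes the two patterns explicit: /p/ is lowest in both contexts, and /k/ is the only sound whose mean VOT is longer in utterances than in isolated words.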
Additionally, during the data analysis process, several errors are recorded
With regard to the bilabial /p/, one typical error involves some students being
confused when they attempted to relate the phonemes in their mother tongue to the
voiceless plosive /p/ in English. Specifically, for those with a negative VOT value,
can be seen with the voicing bar at a frequency below 1000Hz, indicating the sound
released in this case was voiced. Moreover, the voicing point is identified when the
periodic waves are present. The burst can be detected following the voicing point,
hence a negative VOT value, which necessarily means that the sound recorded was
In several cases, the phoneme /p/ could also be produced without any
aspiration, shown when the VOT values were between 0 and 25ms, which creates
another mistake.
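The two error types described above (negative VOT and VOT between 0 and 25ms) can be summarized in a small classification sketch. The 25ms boundary follows the range used in this paper; treating every value above 25ms as aspirated is a simplifying assumption, and the sample VOT list is hypothetical.

```python
def classify_vot(vot):
    """Classify one token by its VOT in milliseconds."""
    if vot < 0:
        return "voiced"        # voicing begins before the burst
    if vot <= 25:
        return "unaspirated"   # short voice lag, no aspiration
    return "aspirated"         # assumption: anything above 25ms counts as aspirated


def misproduction_rate(vots):
    """Percentage of tokens that are not produced as aspirated."""
    labels = [classify_vot(v) for v in vots]
    wrong = sum(label != "aspirated" for label in labels)
    return 100.0 * wrong / len(labels)


# Hypothetical VOT values (ms) for eight /p/ tokens.
sample_vots = [65, 18, -23, 72, 12, 90, 58, 61]
rate = misproduction_rate(sample_vots)

# The same arithmetic applied to the counts reported for /p/ in utterances.
reported_rate = round(100.0 * 12 / 44, 1)

print(rate, reported_rate)
```

In the hypothetical sample, three of eight tokens are misproduced (37.5%); the reported counts of 12 out of 44 tokens give 27.3%.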
the sound released was voiceless. However, there is no short frication noise displayed
prior to the beginning of the vowel formants shown in the spectrograms. Frication is
normally synonymous with aspiration, which means the absence of this indicates that
Considering the sound /k/ - the most posterior one, it is expected to be the
most difficult sound out of the three plosives for the Vietnamese to generate. In
contrast to this prediction, however, the number of problems revolving around /k/ was
the lowest. In 54 cases, this phoneme is pronounced with a longer aspiration, forming
a clearer /h/ sound and generating VOT values that surpass the range of 60-100ms.
Discussion
The results of the production test are not entirely in line with previous
research.
On the one hand, the mean VOT values derived from the oral
production of utterances show the same order in VOT length of the three investigated
plosives as the findings from the VOT study by Lisker & Abramson (1964) regarding
the place of articulation. In particular, the velar stops have the highest mean VOT,
whereas the bilabial stops have the lowest. However, the mean VOT
values generated in the oral production of isolated words were incompatible with
Lisker & Abramson’s (1964) findings in that the alveolar stops have the highest VOT.
On the other hand, in regard to the contexts where the investigated sounds
appear, the production test did not generate entirely parallel outcomes to Lisker &
Abramson’s (1964). While the two scholars report that the VOT of stops in
isolated words is longer than that of stops in sentences,
the present results contradict this for velar stops, whose VOT is longer in
sentences.
However, the perception and production tests both showed that the /p/ sound is
the most challenging for participants, which suggests a relation between
There are several explanations that can be proposed regarding the differences
between the data collected and the theory suggested in the literature review.
Unaspirated /p/
As pointed out in Literature Review, English plosives are classified into two
categories which are not equivalent to those in Vietnamese (Thuật, 2000). In addition,
Vietnamese with no aspiration. When they encounter English consonants, they tend to
substitute Vietnamese consonants for English ones (Truong, 2015). This could have
prompted the participants into confusion between voiceless /p/ and voiced /b/, thus
replacing the former with the latter. Such a mistake might be caused by interference
‘sâm panh’ (‘champagne’) or ‘pin’ (‘battery’) where the initial plosive sound /p/ is
voiceless yet unaspirated, creating a short VOT value. For this reason, some
Vietnamese learners may find it difficult to pronounce /p/ accurately as they have a
tendency to omit the aspiration from this sound due to the influence of the unaspirated
/p/ in Vietnamese.
Unaspirated /t/
Unaspirated /t/ is a potential mistake that was rarely made in this production
test. This can be partly explained by the fact that in Vietnamese, there is an aspirated
voiceless stop, /tʰ/, that is produced in the same manner as the /t/ in English,
albeit with a more gentle explosion in the initial position. This phonemic similarity
can possibly allow most students to pronounce /t/ with the aspiration needed.
English and Vietnamese, the majority of inaccuracy was associated with the plosive
/p/, which has been anticipated. In contrast to this, the accuracy rate of /k/ was higher
than initial expectation, whereas no significant proportion of errors was observed in /t/
the analytical process, the undergraduate participants have pronounced the plosive
The research has enhanced our understanding of how students perceive and
produce initial voiceless aspirated English plosives to some extent. It is expected that
these findings will be beneficial and valuable in phonetics and phonology research in
Moreover, the following suggestion may be drawn from the present perception
and production test for International University students (IU students). The results of
the perceptual and production test conducted among IU students indicate that they had
difficulties in identifying and producing syllable-initial plosives, with the error rate
following the respective order of /p/, /k/, and /t/. Based on the findings, it should be
recommended that teachers act as good models for proper English pronunciation, pay
closer attention to how their students pronounce words and provide more instruction
on English phonemes. Also, aspirated /p/, the sound that students are most prone to
mispronounce, should be emphasized over the other two sounds, for example, by employing minimal
pairs or comparing the phonetic systems of the L1 and L2. As for the
students, they should practice and improve their pronunciation of English sounds,
However, despite the valuable outcomes, this study is limited by its small scale.
positions, which does not include word-middle and final plosives or plosives in
clusters throughout the paper. Secondly, the sample size of 30 participants is not
minimum number required to ensure that the study is valid. Moreover, objectively,
using mediums, which were iPhones, as the Praat software could not be directly
utilized for recording purposes. Therefore, technical problems stemming from this
type of mobile device were a possible factor that might affect or even distort the
recording quality. Moreover, since the testing procedure was performed in an active
classroom with the presence of other undergraduates, background noise was hardly
avoidable.
Conclusion
The conducted research has more or less made valuable contributions to the
addressed the existing research gap to an extent. The findings of this study indicate as
follows:
inconsistency between the outcomes of the production test and those reported in the
literature.
2. The sound /p/ is the most challenging, as anticipated based on the differences
between English and Vietnamese, while students face almost no difficulty recognizing
produced in various situations: in isolated words and in utterances. They should also
In addition, the current study only focuses on aspirated plosives, and thus,
future studies should delve more deeply into unaspirated sounds at the same place of
References
https://doi.org/10.4236/ojml.2021.113034
University Press.
Vietnamese phonetics]. Ho Chi Minh City, Vietnam: Nhà Xuất Bản Giáo Dục
https://doi.org/10.5750/bjll.v10i0.1482
https://en.wikipedia.org/w/index.php?title=Help%3AIPA
%2FVietnamese&oldid=1090870294#cite_note-p-2
Hoang, T. T. (2014). The interference of the mother tongue in the first year students’
Hualde, J., Simonet, M. & Nadeu, M. (2011). Consonant lenition and phonological
https://doi.org/10.1515/labphon.2011.011
University Prewa.
Kaur, J. (2015). Factors Influencing Voice Onset Time (VOT): Voice Recognition.
Technology.
https://doi.org/10.1080/00437956.1964.11659830
29
Liberman, A. M., Harris, K. S., Hoffman, H. S., & Griffith, B. C. (1957). The
https://doi.org/10.1037/h0044417
Ly Bui, T. T., Mai, T. H., & Diep, H. N. (2021). Common errors in pronouncing final
https://doi.org/10.46827/ejel.v6i3.3640
Thuật Đoàn Thiện. (2000). Ngữ Âm tiếng việt. Nhà xuất bản Đại học quốc gia Hà Nội.
Appendices
INFORMED CONSENT
Ms. Tran Ngoc Hong Phuc and the School of Languages – IU-VNU are conducting
your participation in this research is to help the researcher obtain authentic data.
You were selected as a possible participant in this study because your educational
background (e.g., English level and major) closely matches the criteria of
the study.
B. PROCEDURES
If you agree to participate in this research study, the following will occur:
1. First, you will sign this consent form to validate your participation in the
study.
2. Second, you will take a perception test on paper (5 minutes) in which you listen to
3. Third, you will take a production test (3-5 minutes) in which you pronounce
separate words and read sample sentences aloud. The whole session will be
recorded. Your work will be used for further analysis and your results (scores) will
be kept confidential.
C. RISKS
perceiving and producing words/sentences. We guarantee that it is ONLY for the
D. CONFIDENTIALITY
The records from this study will be kept as confidential as possible. No individual
identities will be used in any reports or publications resulting from the study. All
information will always be kept in locked files. Only research personnel will have
access to the files and only those with an essential need to see names or other
identifying information will have access to that particular file. After the study is
E. BENEFITS OF PARTICIPATION
There will be no direct benefit to you from participating in this research study. The
anticipated benefit of your participation in this study is to provide empirical data for
F. VOLUNTARY PARTICIPATION
Your decision whether or not to participate in this study is voluntary and will not
affect your relationship with the department of English. If you choose to participate
in this study, you can withdraw your consent and discontinue participation at any
G. QUESTIONS
If you have any questions about the study, please contact Ms. Tran Ngoc Hong
Phuc by calling (+84)0773772468. You can also direct any questions about the
CONSENT
TO KEEP.
Research Participant
Signature __________________________________Date
Interviewer
Perception test
currently working on my thesis this semester, and this small test will enable me to collect
Section 1: Listen to each question twice and choose the word you hear. There is no
Section 2: For each question, you will listen to three words, namely A, B, and C. You are
allowed to listen once and tick A, B, or both, depending on which word is the same as the
Section 1: Listen to the recording and choose the word you hear
1. bear pear
2. past bast
3. tie die
4. cold gold
5. game came
6. down town
Section 2: Listen to the recording and choose the word same as C (you can
Production test
Note: The whole session will be recorded for scoring. Your score will not be
officially announced; instead, it will only be kept as evidence for this research.
1. cold
2. coffee
3. to
4. talk
5. pack
6. passport