Evidence of Internal Consistency in the Spectrographic Analysis Protocol
*Leonardo Wanderley Lopes, *Allan Carlos França da Silva, *Itacely Marinho da Silva,
*Maxsuel Alves Avelino de Paiva, *Saulo Iordan do Nascimento Silva, †Larissa Nadjara Alves Almeida, and
Vanessa Veis Ribeiro, *†João Pessoa, and ‡Lagarto, Brazil

Summary: Objective. To verify the validity based on internal consistency of the spectrographic analysis protocol (SAP).
Material and Methods. Thirty-nine students of the Speech-Language Pathology graduate program and
38 speech-language pathologists, specialized in voice, participated in the study. The participants made visual
inspections of 10 spectrograms and marked the items of the SAP. For analysis of the internal consistency in the
SAP, the exploratory factor analysis (EFA) and confirmatory factor analysis were performed.
Results. Most items showed corrected item-total correlation above 0.3, indicating that the items have a good
relationship with each other and with the SAP as a whole. Six items presented values below the average, suggest-
ing the exclusion of these from the construct. However, three of these were maintained because they were judged
as important parameters in clinical practice, requiring the training of judges when using the SAP to properly
understand the items. The EFA regrouped the previous domains of the SAP into three factors. All items
presented a factor load above 0.4, suggesting the retention of all except the items previously indicated for
exclusion. The confirmatory factor analysis corroborated the EFA and its indexes.
Conclusion. The SAP has good internal consistency. All items have a good degree of relationship with each
other and contribute positively to the protocol as a whole. The final version of the SAP, at this stage, has 15 items
(from the 25 items of the initial SAP version), distributed among three domains.
Key Words: Acoustic−Protocols−Speech-Language Pathology−Validation studies−Voice disorders.

Accepted for publication July 13, 2020.
From the *Speech-Language Pathology Department, Universidade Federal da Paraíba - UFPB, Cidade Universitária, João Pessoa, Paraíba, Brazil; †Program of Decision and Health Models, Universidade Federal da Paraíba - UFPB, Cidade Universitária, João Pessoa, Paraíba, Brazil; and the ‡Speech-Language Pathology Department, Universidade Federal de Sergipe - UFS, Lagarto, Sergipe, Brazil.
Address correspondence and reprint requests to Leonardo Wanderley Lopes, Universidade Federal da Paraíba - UFPB, Cidade Universitária, Conjunto Presidente Castelo Branco III, João Pessoa, Paraíba 58051-900, Brazil.
Journal of Voice, Vol. &&, No. &&, pp. &&−&&
0892-1997
© 2020 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

INTRODUCTION

The vocal assessment, in the clinical context, aims to characterize the impact of voice disorders in all their dimensions and to provide a global view of the voice disorder, with sufficient information for decision making.1-3 For this, it is recommended to use information from a multidimensional evaluation, composed of visual evaluation of the larynx, auditory-perceptual evaluation of voice, acoustic analysis, aerodynamic assessment, and vocal self-assessment.4

Among the procedures that make up the multidimensional evaluation of the voice, acoustic analysis is one of the most studied, both in the context of diagnostic confirmation and in monitoring the effects of therapy.1 There are around 15 acoustic analysis methods for evaluating vocal signals recorded in the literature.5 These methods are divided into two large groups: qualitative analysis, through observation of the visual pattern; and quantitative analysis, through the extraction of quantitative measures, whether based on linear or nonlinear models.6

Most methods of acoustic analysis are based on the extraction of quantitative measurements of the sound waves.7 The main advantages of this type of acoustic analysis are the possibility of quantifying the results, comparing them with reference values, and high reproducibility of the algorithms used in the extraction of the measures.6-8 However, several of these measures have confounding factors that compromise their reliability, such as the intensity of the emission, the degree of aperiodicity of the sound wave, and the use of different algorithms across software programs.8-10

Spectrography is the main approach used in descriptive acoustic analysis, since its three-dimensional graph can be analyzed qualitatively.11 The spectrogram has the frequency measures on the vertical axis, the emission time on the horizontal axis, and the amplitude of the sound wave components by means of the color contrast in the spectrogram.12 Visual inspection of the spectrogram is a method widely used in the clinical context, such as vocal assessment and monitoring. Its main advantage is related to the possibility of analyzing signals with a wide deviation range.

There are two classification systems that are commonly used to describe spectrograms.13,14 First, Yanagihara13 classifies the spectrograms as types 1, 2, 3, and 4, according to the presence of noise in the spectrogram and the regularity of the harmonics. For the author,13 there is a relationship between the auditory-perceptual evaluation of voice and the spectrogram; the intensity of the vocal deviation is related to noise and the reduction of harmonics in the spectrogram. This classification is more used in the clinical context of vocal assessment.11 Titze,14 however, classifies the signal as types I, II, and III, based on the theory of nonlinear voice

production. For him,14 there is a continuum; the spectrogram becomes more periodic as the stability of the vibratory pattern of the vocal folds increases, and the spectrogram becomes more aperiodic as the vibration of the vocal folds becomes more chaotic. This classification schema assists in the decision of which type of analysis should be completed: linear or nonlinear, and extracting measurements or analyzing the pattern. This classification is more used for research.11

These authors proposed classifications but did not aim to develop a standardized protocol for spectrographic analysis and characterization of individuals with and without vocal deviation, for both clinical use and research. Thus, there is no valid instrument for this purpose. In research using spectrographic analysis, authors commonly use their own descriptors for the spectrogram, or use the Yanagihara and Titze classification.11

To elaborate and validate speech-language pathology testing instruments, some steps are necessary.15 The steps are validity based on the content of the test, validity based on the response processes, validity based on internal consistency, validity based on the relationship with other variables, reliability/precision, and fairness of the test and accuracy.

Recently, our research team has proposed a spectrographic analysis protocol (SAP) that seeks to characterize vocal deviation11 from temporal aspects, frequency of harmonics, and noise in the spectrogram. The protocol has already gone through the first validation stage, in which the content evidence, clarity, and relevance of the items in their domains were verified.11 We encourage readers to access the manuscript on the first stage of SAP validation. To proceed with the validation process, the validity based on the response processes step was carried out. In order to proceed with the validation of the instrument, it is necessary to assess the validity based on internal consistency.15 At this stage, the degree of relationship between the test items and how much they explain the variability in the outcome results should be analyzed from the application of the test to a sample of the population which will use the instrument in its final version, whether clinicians (as is the case of this study) or patients. Specifically, this is done in order to examine the possibility of reducing the number of items and the existence of different dimensions in the test.15

Considering the importance of spectrographic analysis in the clinical context, and the lack of standardization regarding descriptors, procedures, protocols, and recommendations, it is necessary to develop and validate an instrument that directs an evaluation toward specific and important aspects of spectrographic analysis. The validation of this protocol can assist professionals in the evaluation and therapeutic monitoring of patients with vocal disorders, standardize terms to facilitate communication and data sharing between professionals, and enable the comparison of research that analyzed this outcome.

Thus, the objective of the present study was to verify the validity of the internal consistency of the SAP.

MATERIAL AND METHODS

Design
This study is of a methodological design,16 evaluated and approved by the Research Ethics Committee of the Universidade Federal da Paraíba (number 508’200 / 2013).

Sample
For the analysis of the internal consistency in the SAP, speech-language pathologists (SLPs) and speech-language pathology students were recruited. Recruitment was carried out via email. To invite the SLPs, a list of names and emails of the associated SLPs was requested to recruit members of the Voice Department of the Brazilian Society of SLP. To invite the students of the speech-language pathology graduate program, a list of names and emails of the students enrolled in voice area subjects was requested to recruit speech-language pathology graduate students from the Universidade Federal da Paraíba.

The selection criteria for SLPs were as follows: graduation from a speech-language pathology program; specialization in the voice area; current job in the voice area; and signing the informed consent form. The selection criteria for speech-language pathology students were as follows: being active in the speech-language pathology graduate program at the Universidade Federal da Paraíba; being enrolled in voice subjects; and signing the informed consent form. Students who had not completed the voice course were excluded, in order to ensure that the students had already undergone spectrographic reading training, which occurs in this course. To qualify, the volunteers answered questions about their training and professional performance.

Thirteen students were excluded due to the aforementioned eligibility criteria. The final sample consisted of 38 SLPs and 39 speech-language pathology students who passed the predefined eligibility criteria for the research.

Material
To carry out this research, the latest version of the SAP was used, after validation based on the response processes step (Figure 1), containing 18 items.

The SAP was designed with the aim of analyzing a narrow band spectrogram of a sustained vowel in the context of a clinical evaluation of voice disorders. Further details on the conceptual aspects and the dimensionalities of the constructs involved in the preparation of the instrument can be consulted in Lopes, Leão, and Alves.11 Initially, five dimensions for the SAP were determined, considering the temporal aspects of the emission, the distribution of energy and noise according to the frequency bands, as well as the morphological description of the harmonics. In each of the SAP domains, items were selected which corresponded to more specific descriptions of information related to the domain. The initial version used in this research had 18 items, as previously mentioned.

1) Onset of emission

a) ( ) Presence of noise or irregularity at the onset of emission

2) Temporal aspects of emission


b) ( ) Change in the configuration of the spectrographic tracing in the time domain

c) ( ) Gradual definition / tracing loss

d) ( ) Presence of abrupt tracing interruptions

3) Distribution of energy in the tracing


e) ( ) Presence of harmonics with a low brightness

f) ( ) Decreased energy and number of harmonics above 4,000 Hz

g) ( ) Presence of harmonics above 4,000 Hz

h) ( ) Decreased energy or reduced number of harmonics up to 4,000 Hz

i) ( ) Energy increase between 1,000-3,000 Hz

j) ( ) Decreased energy level over the entire frequency range along the tracing

k) ( ) Increased energy level over the entire frequency band along the tracing

4) Description of harmonics

l) ( ) Presence of irregular horizontal striations between harmonics

m) ( ) Presence of poorly defined harmonics or harmonic sketches

n) ( ) Predominant presence of low amplitude harmonics

o) ( ) Presence of harmonics with irregular trajectory and morphology (non-rectilinear)

5) Distribution of noise in the tracing


p) ( ) Presence of noise between harmonics below 4,000 Hz

q) ( ) Presence of additional diffused noise above 4,000 Hz

r) ( ) Replacement of harmonics for noise in the spectrographic tracing

FIGURE 1. Spectrographic analysis protocol (SAP) version after validation of response processes.

To use the SAP, the clinician must initially make a visual inspection of the spectrogram of a sustained vowel. Then, the clinician must mark in the SAP the items observed in the spectrogram. So far, there is no defined score or cut-off point for the instrument. The SAP is currently going through all stages of validation regarding its psychometric properties and may undergo changes regarding its structure and score assignment until its validation is concluded. It is currently being used for training clinicians to adopt a consensus-based description of the spectrogram of dysphonic patients. At the end of the validation steps, it is expected to propose a score for use in clinical reports and cut-off points depending on the presence and degree of deviation in vocal quality.

Procedures
At first, spectrograms were selected to compose the internal consistency assessment questionnaire from the bank of the Voice Laboratory of the Universidade Federal da Paraíba. This database of the laboratory contains the registration of personal data, anamnesis, self-assessment protocols, data from the auditory-perceptual evaluation, and acoustic analysis of the voices of 1,400 patients evaluated at the laboratory in the institution between April 2012 and April 2019.

The spectrograms used were generated in a standardized way with the following materials: Fonoview software version 4.5 (CTS Informatics, Pato Branco, Paraná, Brazil), Dell all-in-one desktop (Eldorado do Sul, Rio Grande do Sul, Brazil), and a unidirectional cardioid microphone, Sennheiser model E-835 (Sennheiser, Hannover, Germany), located on a pedestal and coupled to a Behringer preamplifier, model U-Phoria UMC 204 (Behringer, Willich, Germany). The collection methods were also standardized, based on the following procedures: voice collection in a recording booth with acoustic treatment and noise below 50 dB SPL, with a sampling rate of 44,100 Hz, 40 ms windowing, update time of 2.5 ms, dynamic amplitude range of 60 dB, frequency limit of 7,500 Hz, and minimum time interval of 3 seconds. The sustained vowel / Ɛ / in habitual pitch and loudness was used as a sample of patients over 18 years old of both sexes.

Two researchers in the present study selected the spectrograms. The main criterion for selecting these spectrograms was the consensus among researchers regarding the image of the spectrogram and presence of at least one of the items evaluated, which included patients with or without a voice disorder. In each spectrogram, the judges were to judge the 25 items of the initial version of SAP. Several rounds of selection were carried out until we reached the total of 10 spectrograms in which there was a probability that each of the 25 SAP items would appear in at least 50% of these 10 spectrograms. Thus, 10 spectrograms were chosen, and the selected spectrograms were saved in image format (jpeg).

Previously, a third researcher specialized and active in the voice area, with experience in spectrographic analysis, was recruited and made a visual-perceptual analysis of the 10 spectrograms using SAP. The answers of this third researcher were used to confirm whether the 10 spectrograms selected by consensus contained the SAP items that would be evaluated by the judges in the later stage. Thus, this third researcher answered the same online questionnaire that would be sent to the judges.

Participants who passed the selection criteria were called judges and received an email with the link for remote access to the digital SurveyMonkey platform. On the platform, spectrograms and SAP items were available. On the first screen of the digital platform, the judges received all instructions on how to fill in the form, along with a link to a file (pdf) on the Dropbox platform to perform the anchor training, and to consult when necessary, recognizing what each item visually represents. These anchors were spectrograms with markings composed of arrows and blue lines under the spectrogram, which indicated the presence of each item. Figure 2 shows four examples of anchors used to train judges.

The judges evaluated the 10 spectrograms corresponding to the emission of the vowel / Ɛ /. For each spectrogram it was necessary to perform a visual inspection and mark with the cursor the SAP items identified in each spectrogram. The item's marking was computed as "presence" and the non-marking was recorded as "absence" of the item in the respective spectrogram. Considering that the current version of the instrument has 18 items, for each spectrogram the judges were able to mark between 1 and 18 items. Spectrographic evaluation lasted between 15 and 20 minutes. However, there was no time limit (Figures 3 and 4).

The judges' scores made on the digital platform were automatically transferred to an Excel spreadsheet, computing the individual frequency of each item and the total number of items marked by the evaluators for each spectrogram.

Statistical analysis
The Statistical Package for the Social Sciences (SPSS) version 18.0 (IBM Corporation, New York, NY) and the AMOS package were used for statistical analysis. To obtain validity based on internal consistency, the following procedures were performed:

1) Analysis of the corrected item-total correlation was used to verify how much the items are related to the others and to the evaluated construct. Items with corrected item-total correlation with values below 0.3 were excluded.17 The Kaiser-Meyer-Olkin (KMO) and Bartlett's test (P < 0.001) were also calculated to verify the adequacy of the sample to perform the exploratory factor analysis (EFA).
2) The EFA with principal component extraction method was used to verify the possibility of reducing the number of items in the SAP from the interrelationships between the items. Initially, the components were extracted following the orthogonal method with varimax rotation. The analysis of communality was performed, whose value indicates how much of the variance of each tested item is accounted for by the extracted factors. Values above 0.4 were considered acceptable. We opted to exclude factor loads below 0.4, an intermediate value, by which we reduce the risk of excluding an item that could be relevant, or including an item with a very low factor load that could compromise the instrument. The Kaiser criterion was used to select the number of factors, through the analysis of the sedimentation graph (scree plot) of factor components, that is, factors with eigenvalues above 1.0 were considered. From the analysis of the

FIGURE 2. Examples of four spectrograms used for the training and calibration of judges. Blue shapes highlight the SAP item that should
be observed in each spectrogram: (A) Presence of abrupt spectrogram interruptions (item “d” SAP); (B) Presence of harmonics above
4,000 Hz (item “g” SAP); (C) Decreased energy or reduced number of harmonics up to 4,000 Hz (item “h” SAP); (D) Increased energy level
over the entire frequency band along the spectrogram (item “k” SAP).

FIGURE 3. Sedimentation graph of factor components.

rotated component matrix, the components with factor loads above 0.4 were kept.17
3) Cronbach's alpha was used to verify the reliability and internal consistency of the SAP.
4) The confirmatory factor analysis (CFA) was performed to verify the agreement of the factors obtained by the EFA, and to confirm the adequacy of the model.18 To analyze the convergent validity, the criterion of Steenkamp and Van Trijp18 was used. The criterion suggests that the construct/factor is valid when the factor loads are strong (>0.50) and significant (critical region > t critical, α). The maximum likelihood method was used to test the confirmatory factor model. The following indices were evaluated: discrepancy function (χ2); standardized chi-square (χ2/df); goodness of fit index (GFI); adjusted goodness of fit index (AGFI); Tucker-Lewis index (TLI); comparative fit index (CFI); parsimony goodness of fit index (PGFI); root mean square error of approximation (RMSEA). These indices allowed the model to be checked and adjusted.

RESULTS

Table 1 shows the items of the SAP, the descriptive analysis of the sum of the answers given by the judges for each item, the corrected item-total correlation, and Cronbach's alpha if the item were excluded. The item-total correlation coefficient obtained for items “b,” “c,” “j,” “l,” “m,” and “r” was below 0.3. However, these items did not lower the instrument’s internal consistency, so they could either be excluded from or kept in the set of items. It was decided to exclude items “b,” “j,” and “r,” and to keep “c,” “l,” and “m.”

The KMO of 0.816 and the Bartlett's test (P < 0.001) indicated the adequacy of the sample, significant correlations between the items, and the possibility of performing the EFA.

Through the analysis of the sedimentation graph (scree plot) of factor components, using Kaiser's criterion of eigenvalues above 1, three factors were retained. Together, the three factors explained 52.04% of the total variance. Factor 1 represented 33.35% of the variance, Factor 2 represented 9.67%, and Factor 3 represented 9.01% (Table 2). The communality ranged from 0.699 to 0.312; however, only one item had a value below 0.4. There was no need to exclude the item, since its exclusion would not increase the instrument's internal consistency. The factorial loads of the items were related to one of the factors and the items were distributed as follows: Factor 1 (items "d," "e," "f," "i," "g," "h," "n," and "q"), Factor 2 (items "c," "m," "p," and "o"), Factor 3 (items "a," "k," and "l"). Regarding reliability of the instrument, the value of the Cronbach's alpha coefficient was 0.843, which shows good general internal consistency and allows us to infer that all items correspond to the same construct.

Tables 3 and 4 show the correlation between the factors and the adjustment indexes of the model produced by the CFA. It was observed that there was convergent validity in the three proposed factors, in which the factorial loads of the items were strong and significant and correlated with each other (Table 3). Using the maximum likelihood method, it was found that all indexes were within the suggested values for acceptance of the confirmatory factor model (Table 4).
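
As a minimal sketch of how a Cronbach's alpha coefficient of the kind reported above can be obtained, the Python function below applies the standard formula, alpha = k/(k - 1) x (1 - sum of item variances / variance of the total score), to a matrix of presence/absence marks. The function name and the toy data are illustrative assumptions; they are not the study's data and do not reproduce the SPSS procedure used by the authors.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of rows (one row per rating, one column per item)."""
    k = len(rows[0])                       # number of items
    columns = list(zip(*rows))             # transpose: one tuple of marks per item
    item_vars = [variance(col) for col in columns]      # sample variance of each item
    total_var = variance([sum(row) for row in rows])    # variance of the summed score
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical toy data: 6 ratings x 3 items (1 = item marked, 0 = item not marked).
toy = [
    [1, 1, 1],
    [1, 1, 0],
    [0, 0, 0],
    [1, 0, 1],
    [0, 0, 0],
    [1, 1, 1],
]

print(round(cronbach_alpha(toy), 3))
```

The same function, applied to the full judges-by-items matrix with one item left out at a time, would give the "alpha if the item is deleted" column of Table 1, under the assumption that the marks are coded 0/1 as described in the Procedures.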

1) Distribution of energy

g) ( ) Presence of harmonics above 4,000 Hz

f) ( ) Decreased energy and number of harmonics above 4,000 Hz

q) ( ) Presence of additional diffused noise above 4,000 Hz

i) ( ) Energy increase between 1,000-3,000 Hz

n) ( ) Predominant presence of low amplitude harmonics

e) ( ) Presence of harmonics with a low brightness

h) ( ) Decreased energy or reduced number of harmonics up to 4,000 Hz

d) ( ) Presence of abrupt tracing interruptions

2) Instability of the system


m) ( ) Presence of poorly defined harmonics or harmonic sketches

p) ( ) Presence of noise between harmonics below 4,000 Hz

o) ( ) Presence of harmonics with irregular trajectory and morphology (non-rectilinear)

c) ( ) Gradual definition / tracing loss

3) Asynchrony of glottic cycles


l) ( ) Presence of irregular horizontal striations between harmonics

a) ( ) Presence of noise or irregularity at the onset of emission

k) ( ) Increased energy level over the entire frequency band along the tracing

FIGURE 4. Spectrographic analysis protocol (SAP) version after validation based on internal consistency.

DISCUSSION

The internal consistency analysis step is necessary in validation studies of assessment instruments. Two of its objectives are as follows: to verify the degree of relationship between the items; and to verify the need to reduce the number of items and reorganize the protocol construct based on the application to the target audience. These procedures are carried out from the EFA and CFA.15

The items “b,” “c,” “j,” “l,” “m,” and “r” showed a weak relation with the protocol items as a whole as quantified by the corrected item-total correlation. Values below the standard (0.3) suggested the removal of items,17 which was analyzed through the observation and analysis of each item.

Item “b” (“change in the configuration of the spectrogram in the time domain”) is redundant with item “c” (“gradual definition / spectrogram loss”). Both are contained in the analysis of the temporal aspect of the emission. The first refers to the sustained emission, in which the vocal signal can start with stability and end unstable or the reverse. The second is more specific and indicates whether the vocal signal started stable and ended unstable. Dysphonic voices may have few harmonics and reduced loudness, which may be due to tension, asthenia, breathiness, low glottal resistance, and inadequate resonance.12,19,20 Throughout the emission, a voice with any of these altered parameters may gradually lose its brightness and reduce the number of harmonics in the spectrogram, especially if it is associated with pneumo-phono-articulatory incoordination, reduced vocal resistance, and inadequate glottal coaptation.12,19,20 Thus, it was decided to exclude item “b,” and to keep item “c,” because item “c” is more specific and allows inferring

TABLE 1. Analysis of the Corrected Item-Total Correlation and Cronbach’s Alpha of the Items of the Spectrographic Analysis Protocol
Item                                                                                Mean  SD    Corrected item-total correlation  Cronbach's alpha if the item is deleted
a) Presence of noise or irregularity at the onset of emission                       0.67  0.47  0.31                              0.84
b) Change in the configuration of the spectrogram in the time domain                0.77  0.41  0.25                              0.84
c) Gradual definition/spectrogram loss                                              0.84  0.36  0.27                              0.84
d) Presence of abrupt spectrogram interruptions                                     0.87  0.33  0.52                              0.83
e) Presence of harmonics with a low brightness                                      0.71  0.45  0.52                              0.83
f) Decreased energy and number of harmonics above 4,000 Hz                          0.76  0.42  0.55                              0.82
g) Presence of harmonics above 4,000 Hz                                             0.62  0.48  0.59                              0.82
h) Decreased energy or reduced number of harmonics up to 4,000 Hz                   0.55  0.49  0.61                              0.82
i) Energy increase between 1,000 and 3,000 Hz                                       0.61  0.49  0.58                              0.82
j) Decreased energy level over the entire frequency range along the spectrogram     0.63  0.48  0.26                              0.84
k) Increased energy level over the entire frequency band along the spectrogram      0.55  0.49  0.36                              0.83
l) Presence of irregular horizontal striations between harmonics                    0.62  0.48  0.21                              0.84
m) Presence of poorly defined harmonics or harmonic sketches                        0.50  0.50  0.29                              0.84
n) Predominant presence of low amplitude harmonics                                  0.53  0.50  0.51                              0.83
o) Presence of harmonics with irregular trajectory and morphology (nonrectilinear)  0.61  0.49  0.50                              0.83
p) Presence of noise between harmonics below 4,000 Hz                               0.66  0.47  0.63                              0.82
q) Presence of additional diffused noise above 4,000 Hz                             0.63  0.48  0.59                              0.82
r) Replacement of harmonics for noise in the spectrogram                            0.64  0.48  0.27                              0.84
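
The 0.3 screening rule described in the text can be illustrated with the coefficients of Table 1. The Python sketch below first shows one way to compute a corrected item-total correlation, taken here as the Pearson correlation between an item and the sum of the remaining items, and then flags the items whose reported coefficient falls below the threshold. The function names are hypothetical; the coefficients are copied from Table 1.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def corrected_item_total(rows, item):
    """Correlate one item with the sum of all the *other* items."""
    scores = [row[item] for row in rows]
    rest = [sum(row) - row[item] for row in rows]   # total score without the item itself
    return pearson(scores, rest)


# Corrected item-total correlations as reported in Table 1.
reported = {
    "a": 0.31, "b": 0.25, "c": 0.27, "d": 0.52, "e": 0.52, "f": 0.55,
    "g": 0.59, "h": 0.61, "i": 0.58, "j": 0.26, "k": 0.36, "l": 0.21,
    "m": 0.29, "n": 0.51, "o": 0.50, "p": 0.63, "q": 0.59, "r": 0.27,
}

THRESHOLD = 0.3  # cut-off used in the study
flagged = sorted(item for item, r in reported.items() if r < THRESHOLD)
print(flagged)
```

The flagged list contains the six items discussed in the text ("b," "c," "j," "l," "m," and "r"); the decision to keep three of them was clinical, not statistical, so it is not encoded here.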

about the aerodynamic and biomechanical properties underlying the analysis of a sustained voice emission.

Items “l” (“presence of irregular horizontal striations between harmonics”) and “m” (“presence of poorly defined harmonics or harmonic sketches”) also showed lower than expected values in the corrected item-total correlation. However, both items are of great importance for visual analysis in spectrography. The presence of irregular horizontal striations between harmonics, called subharmonics, has been considered since classical spectrographic analysis as an important parameter in the assessment.13,14 Its appearance in spectrography is due to the bifurcation of harmonics caused by the vibratory irregularity of the vocal fold mucosa, which in the auditory-perceptual evaluation of the voice can be observed as roughness and/or intense vocal deviation.21 The presence of poorly defined harmonics or harmonic sketches occurs in situations of alteration of the vibratory amplitude of the vocal folds or decrease in the amplification of the sound in the vocal tract.14,21 When the harmonics are seen in this way in the spectrogram, it can indicate the beginning of a voice disorder.22

TABLE 2. Exploratory Factor Analysis of the Spectrographic Analysis Protocol Items
Item                                                                                  Factor 1  Factor 2  Factor 3  h2
g) Presence of harmonics above 4,000 Hz                                               0.745*    0.000     0.149     0.577
f) Decreased energy and number of harmonics above 4,000 Hz                            0.732*    0.094     0.103     0.555
q) Presence of additional diffused noise above 4,000 Hz                               0.718*    0.293     0.150     0.624
i) Energy increase between 1,000 and 3,000 Hz                                         0.696*    0.040     0.255     0.551
n) Predominant presence of low amplitude harmonics                                    0.636*    0.039     0.319     0.508
e) Presence of harmonics with a low brightness                                        0.600*    0.137     0.202     0.419
h) Decreased energy or reduced number of harmonics up to 4,000 Hz                     0.587*    0.184     0.397     0.537
d) Presence of abrupt spectrogram interruptions                                       0.572*    0.263     0.089     0.404
m) Presence of poorly defined harmonics or harmonic sketches                          0.012     0.655*    0.262     0.498
p) Presence of noise between harmonics below 4,000 Hz                                 0.512     0.636*    0.029     0.667
o) Presence of harmonics with irregular trajectory and morphology (not rectilinear)   0.494     0.573*    0.249     0.634
c) Gradual definition/spectrogram loss                                                0.087     0.548*    0.060     0.312
l) Presence of irregular horizontal striations between harmonics                      0.165     0.109     0.709*    0.541
a) Presence of noise or irregularity at the onset of emission                         0.050     0.573     0.607*    0.699
k) Increased energy level over the entire frequency band along the spectrogram        0.361     0.117     0.369*    0.400
Explained variance (%)                                                                33.35     9.67      9.01
Cumulative explained variance (%)                                                     33.35     43.03     52.04
Cronbach’s alpha                                                                      0.846     0.624     0.427
Cronbach’s alpha (set of items)                                                       0.843
Exploratory factor analysis; Extraction method: Principal component analysis; Rotation method: Varimax with Kaiser normalization.
h2 = communality.
*Significant values.
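
For an orthogonal rotation such as varimax, the communality (h2) of an item equals the sum of its squared loadings on the retained factors. The short Python check below recomputes h2 for three rows of the table above; the loadings and reported values are copied from the table, while the helper name is an assumption made for illustration. Small differences in the third decimal place are expected because the published loadings are already rounded.

```python
def communality(loadings):
    """Sum of squared factor loadings for one item (orthogonal factors assumed)."""
    return sum(value ** 2 for value in loadings)


# (Factor 1, Factor 2, Factor 3) loadings and the reported h2, copied from the table.
table_rows = {
    "g": ((0.745, 0.000, 0.149), 0.577),
    "a": ((0.050, 0.573, 0.607), 0.699),
    "c": ((0.087, 0.548, 0.060), 0.312),
}

for item, (loadings, reported_h2) in table_rows.items():
    # Print the recomputed value next to the published one for comparison.
    print(item, round(communality(loadings), 3), reported_h2)
```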

The item “j” (“decreased energy level over the entire frequency range along the spectrogram”) and the item “r” (“replacement of harmonics for noise in the spectrogram”) presented a lower than expected value in the corrected item-total correlation. It was observed that the items have redundancy with the item “h” (“decreased energy or reduced number of harmonics up to 4,000 Hz”), and with the item “p” (“presence of noise between harmonics below 4,000 Hz”), respectively, which have a strong relationship with the protocol items. Thus, it was decided to exclude items “j” and “r.”

The analysis indicated the adequacy of the sample, significant correlations between the items, and the possibility of performing the EFA. Thus, we proceeded with the EFA of the 15 items selected to remain in the protocol in order to observe how the items are grouped and in how many factors. The factorial loads of the 15 items were above 0.4, the total variance of the protocol is well distributed for each item, and the inter-item relationship is good.

Item “g” had the highest factor load among all. The presence of energy above 4,000 Hz in a spectrogram may be due to phonation tension, to increased subglottic pressure,23 or to trained voice. In contrast, item “f” translates the acoustic correlate of voices that have decreased loudness,24 which may be present in normal untrained voices.

In general, item “q” is related to the presence of additive noise in the emission (perceived audibly as breathiness) and to voices with intense deviations, with a predominance of turbulent transglottic airflow.13,22

Item “i” is related to tension in the supraglottic region or laryngopharyngeal resonance. It is a parameter that characterizes the type of vocal adjustment performed, specifically regarding constriction in the oropharyngeal region in some cases of vocal hyperfunction.11,25

The poorly defined harmonics are related to the vibratory regularity of the vocal folds in association with the offered glottic resistance. The items “n” and “e” are related to this
Previously, the SAP had been divided into five dimen- characteristic and have associated physiological correlates.
sions of analysis: onset of the emission, temporal aspects of Item “n” presents harmonics with low amplitude that are
the emission, distribution of energy in the spectrogram, found in deviated voices, resulting from the decrease in the
description of harmonics, and distribution of noise in the vibratory amplitude of the vocal fold mucosa.26,27 Item “e”
spectrogram. After EFA, the SAP was reorganized into may be related to reduced vocal intensity, reduction in the
three factors. It was observed that most of the Factor 1 closed phase of glottic cycles, or inappropriate use of reso-
items came from the distribution of energy in the spectro- nance cavities, which results in harmonics with low bright-
gram domain of the previous classification; one item from ness.27,28 Items “d” and “h” are also related to the vocal
the temporal aspects of the emission domain, one item from fold vibratory regularity and glottal resistance, respectively.
the description of harmonics domain, and one item from Item “d” refers to the presence of abrupt spectrogram inter-
the distribution of noise in the spectrogram. By reorganizing ruptions caused by voice breaks, which are related to inter-
the items, the content of Factor 1 was composed of items ruptions in the closed phase of the glottic cycle.23 This refers
“d,” “e,” “f,” “I," “g,” “h,” “n,” and “q.” to how much a voice has a periodicity in its emission and,

Statistics of the Factor 1, Factor 2, and Factor 3 Constructs of the Items
Item Construct Estimate SE P Value
g) Presence of harmonics above 4,000 Hz F1 1.000
f) Decreased energy and number of harmonics above 4,000 Hz F1 0.676 0.112 <0.001*
q) Presence of additional diffused noise above 4,000 Hz F1 0.814 0.126 <0.001*
i) Energy increase between 1,000 and 3,000 Hz F1 0.820 0.129 <0.001*
n) Predominant presence of low amplitude harmonics F1 0.705 0.137 <0.001*
e) Presence of harmonics with a low brightness F1 0.699 0.123 <0.001*
h) Decreased energy or reduced number of harmonics up to 4,000 Hz F1 0.725 0.137 <0.001*
d) Presence of abrupt spectrogram interruptions F1 0.491 0.092 <0.001*
m) Presence of poorly defined harmonics or harmonic sketches F2 1.000
p) Presence of noise between harmonics below 4,000 Hz F2 0.909 0.133 <0.001*
o) Presence of harmonics with irregular trajectory and morphology (not rectilinear) F2 0.766 0.142 <0.001*
c) Gradual definition/spectrogram loss F2 0.324 0.114 <0.001*
l) Presence of irregular horizontal striations between harmonics F3 1.000
a) Presence of noise or irregularity at the onset of emission F3 0.472 0.169 <0.001*
k) Increased energy level over the entire frequency band along the spectrogram F3 0.601 0.164 <0.001*
t test.
F1 = Factor 1—Distribution of energy; F2 = Factor 2—Instability of the system; F3 = Factor 3—Asynchrony of glottic cycles.
*Significant values.
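
The table reports an estimate and a standard error (SE) for each freely estimated loading and labels the test as a t test. Assuming the test statistic is the usual ratio of estimate to SE (an assumption, since the source does not spell out the formula), the ratios can be recomputed as in the sketch below. Items “g,” “m,” and “l” have an estimate of 1.000 and no SE, presumably because their loadings were fixed to set the scale of each factor, so they are left out. The names `ESTIMATES` and `critical_ratio` are illustrative.

```python
# Estimate and standard error (SE) transcribed from the table above.
# Items g, m and l are omitted: the table reports no SE for them.
ESTIMATES = {
    "f": (0.676, 0.112),
    "q": (0.814, 0.126),
    "i": (0.820, 0.129),
    "n": (0.705, 0.137),
    "e": (0.699, 0.123),
    "h": (0.725, 0.137),
    "d": (0.491, 0.092),
    "p": (0.909, 0.133),
    "o": (0.766, 0.142),
    "c": (0.324, 0.114),
    "a": (0.472, 0.169),
    "k": (0.601, 0.164),
}


def critical_ratio(estimate: float, se: float) -> float:
    """Return the ratio of an estimate to its standard error."""
    return estimate / se


ratios = {item: critical_ratio(est, se) for item, (est, se) in ESTIMATES.items()}

# List the items from the largest to the smallest ratio.
for item, ratio in sorted(ratios.items(), key=lambda pair: pair[1], reverse=True):
    print(f"item {item}: estimate/SE = {ratio:.2f}")
```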

consequently, its analysis would benefit from the inspection of the emission as a whole, in a descriptive way.29 Item “h” is also related to reduced loudness due to a loss in the condition of resistance of the vocal folds.22,27

Based on this information, it is possible to infer that Factor 1 corresponds to the energy distribution in the spectrogram. Therefore, Factor 1 was called “distribution of energy.”

Factor 2 consists of four items: those corresponding to the description of harmonics in the domains previously used, one item from the temporal aspects of the emission, and one item from the distribution of noise in the spectrogram. Factor 2 is composed of the items “c,” “m,” “p,” and “o.”

Item “m” exhibited a high factor load in Factor 2. This item refers to the presence of poorly defined harmonics or harmonic sketches. Physiologically, this corresponds to the vibratory regularity of the vocal folds, glottal closure and sound amplification.27,28 Its analysis is important both for characterizing vocal deviation and for monitoring the effects of vocal exercises.30

In item “p,” the presence of noise between harmonics below 4,000 Hz characterizes the roughness present in the voice. The vibratory irregularity of the vocal folds generates the unpredictability of glottic cycles and can be perceived as roughness. This can happen when some aerodynamic and biomechanical parameters are altered. These parameters can be subglottic pressure, stiffness, mass and/or tension.31

Item “o” analyzes whether the harmonics have an irregular trajectory and morphology, so that their behavior along the spectrogram is nonrectilinear. This behavior may be associated with instability or phonatory strain.11,27

Item “c,” as already mentioned, refers to the difficulty of maintaining the closed phase of the glottic cycles, that is, of keeping the vocal folds at the midline. The emission results in a progressive decrease in loudness and an increase in noise, which generates a gradual loss of energy.11

When performing the analysis of each item in detail, it was observed that Factor 2 described the instability of the phonatory system. Factor 2 was called “instability of the system.”

Finally, Factor 3 grouped one item from each domain: start of emission, energy distribution, and description of harmonics. Factor 3 is made up of items “a,” “k,” and “l.”

Adjustment Indexes of the Model Tested According to the Confirmatory Factor Analysis
Model Index
2 2
Final model 206.895 87 2.378 0.917 0.909 0.967 0.908 0.679 0.645
Reference for goodness of fit - - 1-5 Above 0.90 0.60-0.80 0.05-0.10
Abbreviations: AGFI = adjusted goodness of fit index; χ² = discrepancy function; CFI = comparative fit index; χ²/gl = standardized chi-square (gl = degrees of freedom); GFI = goodness of fit index; PGFI = parsimony goodness of fit index; RMSEA = root mean square error of approximation; TLI = Tucker-Lewis Index.
Confirmatory factor analysis.
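
The standardized chi-square listed among the abbreviations is the discrepancy function divided by the degrees of freedom. The sketch below recomputes it from the first two values of the "Final model" row. Reading those two values as χ² and gl is an assumption, because the column headers did not survive in this copy of the table; it is supported by the third value in the row matching their ratio. The variable names and the `in_range` helper are illustrative.

```python
# First two values of the "Final model" row, read as chi-square and
# degrees of freedom (gl). This reading is an assumption.
CHI_SQUARE = 206.895
DEGREES_OF_FREEDOM = 87

# Reference interval for the standardized chi-square given in the table.
REFERENCE_RANGE = (1.0, 5.0)


def standardized_chi_square(chi_square: float, df: int) -> float:
    """Return chi-square divided by its degrees of freedom."""
    return chi_square / df


def in_range(value: float, bounds: tuple) -> bool:
    """Check whether a value lies inside a closed reference interval."""
    low, high = bounds
    return low <= value <= high


ratio = standardized_chi_square(CHI_SQUARE, DEGREES_OF_FREEDOM)
print(f"chi-square / gl = {ratio:.3f}")
print("within reference range:", in_range(ratio, REFERENCE_RANGE))
```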

Item “l” presented the highest factor load in Factor 3, being the most representative item in this factor. It represents the presence of horizontal striations between the harmonics, which has a physiological correlation with the irregularity of the vocal folds. This results in the bifurcation of harmonics, caused by the asymmetry of phase of the glottic cycles in the closed phase.13,14

Item “a” corresponds to a beginning of emission with vibratory irregularity of the vocal folds. At the initial moment of phonation, it is necessary to have a subglottic pressure of air sufficient to put the mucosa in vibration and keep the vocal folds at the midline. When any structure involved is altered, whether at the subglottic or glottic level, there is consequently an alteration in the synchrony of the glottic cycles, together with the escape of air.31,32

Item “k,” increased energy level over the entire frequency band along the spectrogram, has a perceptual correlate with a strained vocal quality and hyperfunctional emission.33 Physiologically, it occurs when there is an increase in the closed phase of the glottic cycle. Thinking of the two previous items, excessive adduction may also be related to changes in the level of subglottic pressure, mucosal and muscular strain, or benign mass lesions of the vocal fold and tension.31 To compensate for unsynchronized movement of the vocal folds, hyperadduction may be a strategy used by a dysphonic patient.

Thus, it appears that Factor 3 is a hybrid, containing items related to the domain of time and frequency. Factor 3 was called “asynchrony of glottic cycles.”

Item “c” presented a low value of communality, raising the possible need to exclude it. However, its retention is again suggested to enable the evaluation of the temporal aspects of the emission and for the reasons of physiological correlates that were discussed previously. This indicates that there is a need for training of judges who will use the SAP to better understand this item.

The value of Cronbach's alpha shows that the SAP is reliable. In addition, as it does not have a value above 0.9, it also suggests that the protocol items are not redundant.15

The CFA examined the convergence of factors in relation to the model previously presented. The results confirmed the adequacy of the EFA model performed for the validation of the SAP.

Further studies are suggested in order to analyze the evidence of validity based on the relationship with other variables, reliability/precision, equity, and accuracy15 of the SAP.

CONCLUSION
It is concluded that there is evidence that the SAP has good internal consistency. All items have a good degree of relationship with each other and contribute positively to the protocol as a whole. The final version of the SAP, at this stage, consists of 15 items, distributed among three factors.

DECLARATION OF INTEREST
There are no conflicts of interest to declare.

REFERENCES
1. Roy N, Barkmeier-Kraemer J, Eadie T, et al. Evidence-based clinical voice assessment: a systematic review. Am J Speech Lang Pathol.
2. Van Stan JH, Mehta DD, Hillman RE. Recent innovations in voice assessment expected to impact the clinical management of voice disorders. Perspect ASHA Spec Interes Groups. 2017;2:4–13. 10.1044/persp2.SIG3.4.
3. Patel RR, Awan SN, Barkmeier-Kraemer J, et al. Recommended protocols for instrumental assessment of voice: American Speech-Language-Hearing Association Expert Panel to develop a protocol for instrumental assessment of vocal function. Am J Speech Lang Pathol. 2018;27:887–905.
4. Dejonckere PH, Bradley P, Clemente P, et al. A basic protocol for functional assessment of voice pathology, especially for investigating the efficacy of (phonosurgical) treatments and evaluating new assessment techniques. Eur Arch Otorhinolaryngol. 2001;258:77–82. https://
5. Buder EH. Acoustic analysis of voice quality: a tabulation of algorithms 1902-1990. In: Kent RD, Ball MJ, eds. Voice Quality Measurement. San Diego, CA: Singular Publishing Group; 2000:119–244.
6. Sader RM, Hanayama EM. Considerações teóricas sobre a abordagem acústica da voz infantil. Rev CEFAC. 2004;6:312–318.
7. Barsties B, De Bodt M. Assessment of voice quality: current state-of-the-art. Auris Nasus Larynx. 2015;42:183–188. j.anl.2014.11.001.
8. Eadie TL, Doyle PC. Classification of dysphonic voice: acoustic and auditory-perceptual measures. J Voice. 2005;19:1–14. 10.1016/j.jvoice.2004.02.002.
9. Christmann MK, Brancalioni AR, Freitas CR De, et al. Uso do programa MDVP em diferentes contextos: revisão de literatura. Rev CEFAC. 2015;17:1341–1349. 021620151742914.
10. Brockmann-Bauser M, Drinnan MJ. Routine acoustic voice analysis: time to think again? Curr Opin Otolaryngol Head Neck Surg. 2011;19:165–170.
11. Lopes LW, Alves GÂ dos S, Melo ML de. Content evidence of a spectrographic analysis protocol. Rev CEFAC. 2017;19:510–528. https://
12. Núñez Batalla F, Corte Santos P, Señaris González B, et al. Evaluación espectral cuantitativa de la hipofunción vocal. Acta Otorrinolaringol Española. 2004;55:327–333. 6519(04)78531-3.
13. Yanagihara N. Significance of harmonic changes and noise components in hoarseness. J Speech Hear Res. 1967;10:531–541. https://doi.
14. Titze IR. Workshop on acoustic analysis. Summary Statement. Iowa: National Center for Voice and Speech; 1995:26–30.
15. Pernambuco L, Espelt A, Magalhães Junior HV, et al. Recommendations for elaboration, transcultural adaptation and validation process of tests in speech, hearing and language pathology. CoDAS. 2017;29:
16. Polit DF, Beck CT. The content validity index: are you sure you know what’s being reported? Critique and recommendations. Res Nurs Health. 2006;29:489–497.
17. Daniel WW, Cross CL. Biostatistics: A Foundation for Analysis in the Health Sciences. 11th ed Wiley; 2018.
18. Hair JF, Black WC, Babin BJ, et al. Análise Multivariada de Dados. 6th ed Bookman; 2009.
19. Hirano M. Clinical Applications of Voice Tests Assessment of Speech and Voice Production: Research and Clinical Application NICD Monograph Proceedings of a Conference of the National Institute of Health NICD. 1990.
20. Hammarberg B, Gauffin J. Perceptual and acoustical characteristics of quality differences in pathological voices as related to physiological aspects. In: Fujimura O, Hirano M, eds. Vocal Fold Physiology, Voice Quality Control. San Diego: Singular Publishing Group; 1995:283–303.

21. Rodríguez-Parra MJ, Adrián JA, Casado JC. Comparing voice-therapy and vocal-hygiene treatments in dysphonia using a limited multidimensional evaluation protocol. J Commun Disord. 2011;44:615–630.
22. Cielo CA, Ribeiro VV, Bastilha GR, et al. Quality of life in voice, perceptual-auditory assessment and voice acoustic analysis of teachers with vocal complaints. Audiol Commun Res. 2015;20:130–140. https://
23. Rees CJ, Blalock PD, Kemp SE, et al. Differentiation of adductor-type spasmodic dysphonia from muscle tension dysphonia by spectral analysis. Otolaryngol Neck Surg. 2007;137:576–581. 10.1016/j.otohns.2007.03.040.
24. Barrichelo VMO, Heuer RJ, Dean CM, et al. Comparison of singer’s formant, speaker’s ring, and LTA spectrum among classical singers and untrained normal speakers. J Voice. 2001;15:344–350. https://doi.org/10.1016/S0892-1997(01)00036-4.
25. Hanayama EM, Camargo ZA, Tsuji DH, et al. Metallic voice: physiological and acoustic features. J Voice. 2009;23:62–70. 10.1016/j.jvoice.2006.12.006.
26. Beber BC, Cielo CA. Características da espectrografia de banda larga e estreita da emissão vocal de homens com laringe sem afecções. Rev CEFAC. 2012;14:290–297. https://doi.org/10.1590/S1516-18462012005000008.
27. Pontes PAL, Vieira VP, Gonçalves MIR, et al. Características das vozes roucas, ásperas e normais: análise acústica espectrográfica comparativa. Rev Bras Otorrinolaringol. 2002;68:182–188. 10.1590/S0034-72992002000200005.
28. Barrichelo-lindstro V, Behlau M. Resonant voice in acting students: perceptual and acoustic correlates of the trained Y-Buzz by Lessac. J Voice. 2009;23:603–609.
29. Gama ACC, Behlau MS. Estudo da constância de medidas acústicas de vogais prolongadas e consecutivas em mulheres sem queixa de voz e em mulheres com disfonia. Rev da Soc Bras Fonoaudiol. 2009;14:8–14.
30. Côrtes MG, Gama ACC. Análise visual de parâmetros espectrográficos pré e pos-fonoterapia para disfonias. Rev da Soc Bras Fonoaudiol. 2010;15:243–249.
31. Zhang Y, Jiang JJ, Wallace SM, et al. Comparison of nonlinear dynamic methods and perturbation methods for voice analysis. J Acoust Soc Am. 2005;118:2551–2560.
32. Roark RM, Watson BC, Baken RJ, et al. Measures of vocal attack time for healthy young adults. J Voice. 2012;26:12–17. 10.1016/j.jvoice.2010.09.009.
33. Vieira VP, Biase N De, Pontes P. Análise acústica e perceptiva auditiva versus coaptação glótica em alteração estrutural mínima. Acta ORL. 2006;24:174–180.

