Lab Notebook

Sevilla, Laurence
Provisional Title: Setting word order in Tagalog: Voice, Agency, or Syntactic Prominence?
Research Questions and Hypotheses:
The research questions and hypotheses of this experiment are detailed below:
Does Tagalog follow the Agent First Preference (AFP) and/or the Subject First Preference (SFP)?
Does the choice of voice in a symmetrical voice system influence whether agents and/or subjects come first?
Do agents and subjects coincide in Tagalog?
Background
Word order studies are useful not only at the typological level such as for confirming universals, but also in
discovering differences in cross-linguistic processing. To that end we will be looking into whether preference
principles applicable in well-studied language families such as Indo-European are also being followed in a
language from a different family.
Tagalog is an ergative Austronesian language with a symmetrical voice system (Foley, 2008), where
agents and patients can both be mapped to the highest syntactic function without demotion of the agent
(Schachter & Otanes, 1972). Although the canonical word order is VSO, word order is relatively free,
allowing both SO and OS configurations, as well as an SVO order derived by topicalization. In canonical
transitive sentences, one argument is marked with ang and the other with ng. The ang argument is
considered the only syntactically privileged argument, as it is the only one that agrees with the verb
(Kroeger, 1993). This relationship is reflected by the voice morphology on the verb itself, as either a
prefix, an infix, or both. Although Tagalog has markings for paradigms other than actor voice (AV) and
patient voice (PV), we will focus on these two as they are semantically parallel to the active and
passive voice found in other languages.
When the agent or actor is the ang-marked argument, the verb carries the AV marker; if instead the
patient is ang-marked, the verb is inflected for PV. There is no single marker for each voice paradigm
because Tagalog also incorporates aspect into the same marker; the markers are thus a combination of
voice and aspect. For this experiment we will focus on two infixes, <um> for AV and <in> for PV. Both
of these affixes also denote [begun] aspect.
Dryer (2011) performed a thorough analysis of the word orders of 1377 languages and found that roughly
89% of languages are subject-first. Kemmerer (2012) explains that this is due to transitive sentences
being used to encode the “prototypical transitive action scenario”, where animate agents cause something
to happen to inanimate patients. This interaction, when placed in a chain, results in subjects being
placed before objects. He further says that in Broca’s area, certain neural mechanisms responsible for
the interpretation of human actions favor clauses where agents are mapped to the highest syntactic
function.
NOTE: Kemmerer mentions a lot of things about BA44 and how it relates to word order processing, this
could be useful for TFM
The current study will focus on two related hypotheses, the Subject First Preference (SFP) and the Agent
First Preference (AFP). The predictions of these two hypotheses coincide with Kemmerer’s (2012) findings,
though they are still fundamentally different.
The SFP posits that subjects will be preferentially uttered first because of their syntactic prominence
(Levelt, 1989). The general principle on which this is based is subject salience, which simply states
that subjects preferentially come before objects (Comrie, 1989). Typologically, this preference is
relevant as it reflects Greenberg’s (1963) Universal #1:
“In declarative sentences with nominal subject and object, the dominant order is
almost always one in which the subject precedes the object.”
The AFP, also known as the Agent Advantage, states that people preferentially mention Agents before
Patients because the former initiate the event representation. In a self-paced viewing paradigm using
comic strips instead of words, Cohn & Paczynski (2013) found that event predictions were elicited more
easily by Agents than by Patients and were processed more quickly when the character shown after the
action was the Agent.
In eye-tracking studies, it has been found that English speakers produce sentences while fixating on
objects sequentially. This means that they gaze from the first character mentioned to the next,
indicating that syntax interacts with word order. Tzeltal speakers, on the other hand, dedicate early
fixations to encoding events when they produce verb-first sentences. The following fixations then
proceed sequentially, also following character order (Norcliffe et al., 2015). Similar results were
found for Tagalog, where participants producing predicate-first sentences behaved in the same way as
Tzeltal speakers (Sauppe et al., 2013). This points towards dependency setting being the first step in
sentence processing for predicate-initial languages. This dependency is the interaction between the
agent and the patient, encoded linguistically by the predicate. Since the verb comes first in Tagalog
and is marked for Voice, we could observe any possible interactions between Voice and any agent/subject
preferences.
NOTE: Perhaps mention that this would be hard (if not impossible) to do with SOV or SVO languages - or
maybe that would be better suited for Relevance of the Study. Perhaps also mention Sauppe (2016), where
he found using eye-tracking that verbal semantics influenced which object listeners gazed at next:
participants gazed at the agent immediately after hearing the verb, whether it was the subject or
not. This means Sauppe’s findings would match H2a2’s predictions.
For the purposes of this experiment we will be using the self-paced reading moving window paradigm (Just,
Carpenter, & Woolley, 1982). Each critical item will have four variations, one for each of the four
conditions. Recall that AV = Actor Voice, PV = Patient Voice. For the sake of simplicity, we will label the
ang argument as the subject (SU) and the ng argument as the object (OB), each followed by its role in the
sentence (ag for agent; pt for patient).
NOTE: The last sentence is clunky and perhaps it might be worth noting that there is still no consensus on
whether Tagalog is even ergative at all, hence the use of subject and object. The last analysis I’ve read was
Aldridge (2012), and she did provide compelling evidence showing that Tagalog is ergative by analyzing AV
constructions as antipassives. Sauppe et al. (2013) used SPA and NSPA for syntactically preferred and
non-syntactically preferred.
(1) AV SU.ag OB.pt; (2) AV OB.pt SU.ag; (3) PV SU.pt OB.ag; (4) PV OB.ag SU.pt
The sentences will be divided in segments, as shown below:

1 2 3 4 5 6 7 8 9
B<um>asag // ang // mayor // ng // pinggan // sa // tindahan // noong // sabado
The last segment is intended to prevent any potential wrap-up effects and to counter-balance sentence length
of critical items with that of the fillers. All agents were animate, and all patients were inanimate to exclude
possible animacy effects on word order. Both verbs and nouns used were selected from the tlTenTen Tagalog
web corpus (Sketch Engine). To control for frequency and predictability effects, high frequency words were
counterbalanced with low frequency words. The critical segments are segments 2 and 3.
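The four versions of a critical item can be sketched in code. This is only an illustration under stated assumptions: the function names are hypothetical, and the infixation rule (insert the affix before the first vowel of the root) is a simplification that may not hold for every Tagalog root. The example root, nouns, and final segments are taken from the segmented sentence above.

```python
# Hypothetical helpers for illustration; not the script used to build the materials.
VOWELS = "aeiou"

def infix(root, affix):
    """Insert a voice/aspect affix before the first vowel of the root.

    Simplified rule (an assumption): 'basag' + 'um' gives 'bumasag'.
    """
    for i, ch in enumerate(root):
        if ch in VOWELS:
            return root[:i] + affix + root[i:]
    raise ValueError(f"no vowel in root: {root!r}")

def build_conditions(root, agent, patient, tail):
    """Return the four segmented versions of one critical item, keyed by condition."""
    av = infix(root, "um").capitalize()  # actor voice verb
    pv = infix(root, "in").capitalize()  # patient voice verb
    return {
        1: [av, "ang", agent, "ng", patient] + tail,   # AV SU.ag OB.pt
        2: [av, "ng", patient, "ang", agent] + tail,   # AV OB.pt SU.ag
        3: [pv, "ang", patient, "ng", agent] + tail,   # PV SU.pt OB.ag
        4: [pv, "ng", agent, "ang", patient] + tail,   # PV OB.ag SU.pt
    }

item = build_conditions("basag", "mayor", "pinggan",
                        ["sa", "tindahan", "noong", "sabado"])
for condition, segments in item.items():
    print(condition, " // ".join(segments))
```

In every version, segment 2 is the case marker and segment 3 is the first noun, which is why these two positions serve as the critical segments.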
Hypotheses and Predictions
The hypotheses and predictions are as follows:
H01: There is no significant difference in reading times when agents come first in a sentence compared to
when they do not
Reading times would be the same for segment 3 regardless of whether it contains the agent or the patient
HA1: There is a significant difference in reading times when agents come first in a sentence compared to
when they do not
If the AFP holds, then reading times should be lower for segment 3 when it contains the agent,
compared to when it contains the patient
H02: There is no significant difference in reading times when subjects come first in a sentence compared to
when they do not
Reading times would be the same for segment 2 regardless of the word being ang or ng
HA2: There is a significant difference in reading times when subjects come first in a sentence compared to
when they do not
If the SFP holds, then reading times should be lower for segment 2 when the word is ang, compared to
when it is ng, irrespective of the voice used
H03: There is no interaction of subject position on agent sentence position for reading times
Reading times for segment 3 would be the same whether it is preceded by ang or ng, both in the AV condition
and in the PV condition
HA3: There is an interaction of subject position on agent sentence position for reading times
If the AFP and SFP coincide (like in English), then reading times for segment 3 would be lower when it
is an agent preceded by ang in the AV condition, and when it is a patient in the PV condition
H04: There is no interaction of Voice chosen on subject sentence position for reading times
Reading times for segment 2 would not be different within the AV condition and the PV condition
HA4: There is an interaction of Voice chosen on subject sentence position for reading times
If the voice chosen affects the position of the subject in the sentence, then reading times for segment 2
should be different within the AV condition and the PV condition
H05: There is no interaction of Voice chosen on agent sentence position for reading times
Reading times for segment 3 would not be different within the AV condition and the PV condition
HA5: There is an interaction of Voice chosen on agent sentence position for reading times
If the voice chosen affects the position of the agent in the sentence, then reading times for segment 3
should be different within the AV condition and the PV condition
H06: There is no interaction between Voice chosen, subject sentence position and agent sentence position
Overall reading times for condition 1 would not be significantly lower compared to the other conditions
HA6: There is an interaction between Voice chosen, subject sentence position and agent sentence position
If there is indeed an anti-passive bias and both the SFP and AFP are respected, then condition 1 will be
easier to process and will have a lower overall reading time
NOTE: H6 is not discussed above but is also possible to measure. It’s related to the anti-passive bias (I read
it somewhere but can’t find the source!!!). Sauppe (2017) provides evidence for higher cognitive load for
passive sentences, ergo AV is preferred, but corpus studies show otherwise (Cooreman, Fox, & Givón, 1984).
Maybe there are too many hypotheses? Then again, this has more to do with statistical testing, so in the
paper this would be shorter. Maybe no need to mention the alt hypotheses.
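The segment-level predictions of the AFP and SFP can be tabulated per condition. The sketch below only restates the design described in this notebook (segment 2 is the marker, segment 3 is the first noun); the dictionary layout and function names are hypothetical.

```python
# Conditions as listed in the notebook, described by what appears in the critical segments.
CONDITIONS = {
    1: {"voice": "AV", "seg2": "ang", "seg3_role": "agent"},    # AV SU.ag OB.pt
    2: {"voice": "AV", "seg2": "ng",  "seg3_role": "patient"},  # AV OB.pt SU.ag
    3: {"voice": "PV", "seg2": "ang", "seg3_role": "patient"},  # PV SU.pt OB.ag
    4: {"voice": "PV", "seg2": "ng",  "seg3_role": "agent"},    # PV OB.ag SU.pt
}

def favoured_by_afp(condition):
    """AFP predicts lower segment 3 reading times when the first noun is the agent."""
    return CONDITIONS[condition]["seg3_role"] == "agent"

def favoured_by_sfp(condition):
    """SFP predicts lower segment 2 reading times when the first marker is ang."""
    return CONDITIONS[condition]["seg2"] == "ang"

for condition, info in CONDITIONS.items():
    print(condition, info["voice"],
          "AFP" if favoured_by_afp(condition) else "-",
          "SFP" if favoured_by_sfp(condition) else "-")
```

Only condition 1 is favoured by both preferences, which is the basis for the condition 1 prediction under H6.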
Experiment Design Hypothetical
Participants: 300 right-handed native speakers of Tagalog (150 males; M age = 23 y/o) from Metro Manila,
the capital region of the Philippines, participated in the web-based experiment. All participants had at
least university-level education, to control for possible literacy effects (e.g. Dehaene et al., 2010).
Participants had normal or corrected-to-normal vision. A post-test confirmed that the participants remained
naïve to the purposes of the experiment.
Materials: The linguistic stimuli consisted of sentences formed from a pool of 24 verbs, 24 animate
nouns, and 48 inanimate nouns, which were selected while controlling for frequency, imageability and word
length, as these are known factors that affect processing costs. Sentence targets were neither
sentence-final nor sentence-initial, to avoid start-up and wrap-up effects. In this experiment the critical
segments are segments 2 and 3. Care was taken in the selection and combination of the verbs and nouns to
avoid possible predictability effects. Before using the sentences in the experiment, we performed a norming
study by asking ten native Tagalog speakers to take part in an acceptability judgment task where they rated
the sentences from 1 to 7. Any sentence that got a score lower than 5 was replaced, along with the other
sentences under the same item.
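The replacement rule from the norming study can be sketched as follows. The notebook does not say how the ten ratings per sentence are combined, so the mean is used here as an assumption; the item labels and ratings are invented purely for illustration.

```python
from statistics import mean

def items_to_replace(ratings, threshold=5.0):
    """Return the items in which any sentence version has a mean rating below the threshold.

    ratings maps item id -> condition number -> list of 1-7 acceptability ratings.
    Using the mean across raters is an assumption, not something stated in the notebook.
    """
    flagged = []
    for item_id, by_condition in ratings.items():
        if any(mean(scores) < threshold for scores in by_condition.values()):
            flagged.append(item_id)  # the whole item goes, not just the weak sentence
    return sorted(flagged)

# Invented ratings (three raters shown for brevity).
ratings = {
    "item01": {1: [6, 7, 6], 2: [5, 6, 6], 3: [6, 6, 7], 4: [6, 5, 6]},
    "item02": {1: [6, 6, 7], 2: [4, 5, 4], 3: [6, 6, 6], 4: [5, 6, 6]},
}
print(items_to_replace(ratings))
```

Flagging the whole item keeps the four conditions of each item lexically matched after replacement.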
Comprehension questions were then created for both critical items and fillers. The comprehension questions
for the critical items were either “Who VERB the PATIENT?” or “What did the AGENT VERB?”. Participants
had to answer the questions by selecting one of two options presented to them. For the first question, the
choices were both agents; for the second, they were both patients. The questions were randomized across all
conditions for the critical items. The sentences and questions were then coded in IBEX Farm, a JavaScript
web-based tool for handling self-paced reading experiments (Drummond, 2019).
In total we had 192 experimental items divided into 4 lists in a Latin square design, with each verb appearing
twice. Each list had 102 fillers. The four conditions are repeated as follows:
(1) AV SU.ag OB.pt; (2) AV OB.pt SU.ag; (3) PV SU.pt OB.ag; (4) PV OB.ag SU.pt
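A Latin square assignment of this kind can be sketched as below. It assumes that the 192 figure counts sentence versions (24 verbs x 2 items x 4 conditions), so that each list holds 48 critical sentences; this reading is an assumption. The rotation scheme is a generic one and is not a description of how IBEX Farm implements its own counterbalancing.

```python
from collections import Counter

N_ITEMS = 48       # assumed: 24 verbs, each appearing in two items
N_CONDITIONS = 4   # the four voice/order conditions

def latin_square_lists(n_items=N_ITEMS, n_conditions=N_CONDITIONS):
    """Rotate conditions across items so each list sees every item in exactly one condition."""
    lists = []
    for list_index in range(n_conditions):
        assignment = {
            item: (item + list_index) % n_conditions + 1  # condition number 1..4
            for item in range(n_items)
        }
        lists.append(assignment)
    return lists

lists = latin_square_lists()
for list_index, assignment in enumerate(lists, start=1):
    print(list_index, sorted(Counter(assignment.values()).items()))
```

Under these assumptions each list contains 12 sentences per condition, and every item appears in all four conditions across the four lists.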
Procedure: Participants were sent the IBEX Farm link via e-mail, along with a short questionnaire regarding
their use and knowledge of Tagalog. The Latin squaring was done automatically by IBEX Farm, so each
participant using the same link would view a different version of the experiment. Upon opening the
experiment, the participants were shown the instructions, with a reminder not to do the experiment when
they were tired and to only start it when they were sure they could finish it in one sitting. The
instructions were followed by ten practice items, some with comprehension questions. The practice items
were syntactically different from the critical items. The experimental items along with the fillers were
then shown to the participants. Participants had to press the space bar to reveal the first segment, and
again to reveal the following segment. Each segment replaced the previous segment displayed, meaning there
was no way to go back once the space bar was pressed. After pressing the space bar on the last segment, a
comprehension question was shown, which had a 5000 ms time limit. On average, it took participants 25
minutes to finish the experiment. Participants who correctly answered fewer than 80% of the questions were
excluded. Likewise, critical items with no or incorrect answers to the corresponding question were also
excluded. The reading times for segments 2 and 3 were then analyzed in SPSS.
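The two exclusion steps can be sketched as a small filter. The record layout is hypothetical, and computing accuracy over all questions (critical items and fillers together) is an assumption; a missing answer after the time limit is represented as None.

```python
def apply_exclusions(trials, min_accuracy=0.80):
    """Drop low-accuracy participants, then keep only correctly answered critical trials.

    Each trial is a dict with 'participant', 'item', 'critical' (bool) and
    'correct' (True, False, or None for no answer). Field names are hypothetical.
    """
    by_participant = {}
    for trial in trials:
        by_participant.setdefault(trial["participant"], []).append(trial)

    kept = []
    for rows in by_participant.values():
        accuracy = sum(1 for row in rows if row["correct"] is True) / len(rows)
        if accuracy < min_accuracy:
            continue  # participant excluded entirely
        kept.extend(row for row in rows if row["critical"] and row["correct"] is True)
    return kept

# Invented example data.
trials = [
    {"participant": "P1", "item": 1, "critical": True,  "correct": True},
    {"participant": "P1", "item": 2, "critical": True,  "correct": False},
    {"participant": "P1", "item": 3, "critical": False, "correct": True},
    {"participant": "P1", "item": 4, "critical": False, "correct": True},
    {"participant": "P1", "item": 5, "critical": True,  "correct": True},
    {"participant": "P2", "item": 1, "critical": True,  "correct": True},
    {"participant": "P2", "item": 2, "critical": True,  "correct": None},
    {"participant": "P2", "item": 3, "critical": False, "correct": False},
]
kept = apply_exclusions(trials)
print([(row["participant"], row["item"]) for row in kept])
```

Only the trials returned by this filter would contribute segment 2 and segment 3 reading times to the analysis.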
Statistical Analysis
H1: T-test: segment 3 RT as dep. var.; agent/patient as indep. var.
Main effect of Agent/Patient on segment 3 RT: support for AFP
H2: T-test: segment 2 RT as dep. var.; subject/object as indep. var.
Main effect of Subject/Object on segment 2 RT: support for SFP
H3: Mixed-models ANOVA: segment 3 RT as dep. var.; subject/object as between groups, agent/patient as
within groups
Interaction between subject/object and agent/patient: supports hypothesis that subject/object influences
agent/patient
H4: Mixed-models ANOVA: segment 2 RT as dep. var.; AV/PV as between groups, subject/object as within
groups
Interaction between AV/PV and subject/object: supports hypothesis that voice affects subject/object position
H5: Mixed-models ANOVA: segment 3 RT as dep. var.; AV/PV as between groups, agent/patient as within
groups
Interaction between AV/PV and agent/patient: supports hypothesis that voice affects agent/patient position
H6: Mixed-models ANOVA: total RT as dep. var.; AV/PV as between groups, subject/object and agent/patient
as within groups
Interaction between AV/PV, agent/patient, subject/object: supports hypothesis that there is a preferred voice
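As a rough illustration of the H1 comparison, the t statistic can be computed by hand. The notebook only says "T-test"; treating it as a paired test on per-participant mean reading times is an assumption, since every participant reads both agent-first and patient-first sentences. The numbers are invented, and this is not the SPSS procedure itself.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(first, second):
    """Return the paired-samples t statistic and its degrees of freedom.

    first and second hold one mean RT per participant, in the same participant order.
    """
    if len(first) != len(second):
        raise ValueError("both samples need one value per participant")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))  # mean difference over its standard error
    return t, n - 1

# Invented per-participant mean segment 3 RTs in ms.
agent_first = [410, 395, 430, 402]
patient_first = [440, 420, 455, 431]

t, df = paired_t(agent_first, patient_first)
print(f"t({df}) = {t:.2f}")
```

A negative t here would mean shorter segment 3 reading times when the first noun is the agent, which is the direction the AFP predicts. The same function could be applied to segment 2 (ang vs. ng) for H2.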
Discussion
NOTE: Discussion would vary based on hypothetical results and, to be honest, because of the conflicting
results found, there is no way to say whether the null will be accepted or rejected for each of the 6
hypotheses. If we find that the SFP holds but not the AFP, then this could be interpreted as word order
being part of syntactic planning. If instead the AFP holds but not the SFP, then word order is generated
during thematic role assignment. If both hold, then we confirm that agents and subjects tend to coincide
due to the cause-chain effect of transitive sentences. If neither holds, then we would have to consider why.
Furthermore, if voice affects subject position and the SFP is followed, then dependency setting influences
subject position. If voice affects agent position and the AFP is followed but subject position does not
influence agent position, then dependency setting also influences agent position. Otherwise dependency
setting is just the planning of the relationship between the agent/subject and the patient/object, but
does not have anything to do with word order. Lastly, if there is an interaction between all three (AV/PV,
subject/object, and agent/patient), and condition 1 has the lowest RT, then speakers prefer canonical
transitive sentences with agents and subjects coming first.
Bibliography:
Cohn, N. & Paczynski, M. (2013). Prediction, events, and the advantage of Agents: The processing
of semantic roles in visual narrative. Cognitive Psychology, 67(3), 73-97.
Comrie, B. (1989). Language Universals and Linguistic Typology. Oxford: Blackwell. (2nd edition).
Drummond, A. (2019). IBEX Farm [Internet-based Experiment Tool]. (2014). Retrieved from
http://spellout.net/ibexfarm.
Dryer, M. S. (2011). Order of subject, object and verb. The world atlas of language structures
online, ed. by Matthew S. Dryer and Martin Haspelmath. Munich: Max Planck Digital Library.
[Online].
Foley, W. A. (2008). The place of Philippine languages in a typology of voice systems. In
Austin, P. K. & Musgrave, S. (Eds.), Voice and grammatical relations in Austronesian
languages (pp. 22–44). Palo Alto: CSLI Publications.
Greenberg, J. H. (1963). Some Universals of Grammar with Particular Reference to the Order of
Meaningful Elements. In Greenberg, Joseph H. (ed.), Universals of Human Language,
73-113. Cambridge, MA: MIT Press.
Just, M. A., Carpenter, P. A., & Woolley, J. D. (1982). Paradigms and processes in reading
comprehension. Journal of Experimental Psychology: General, 111, 228-238.
Kemmerer, D. (2012). The cross-linguistic prevalence of SOV and SVO word orders reflects the
sequential and hierarchical representation of action in Broca’s area. Language and
Linguistics Compass, 6(1), 50–66.
Kroeger, P. (1993). Phrase structure and grammatical relations in Tagalog. Palo Alto: CSLI
Publications.
Levelt, W. J. M. (1989). Speaking: From intention to articulation. MIT Press, Cambridge, MA.
Norcliffe, E., Konopka, A. E., Brown, P., & Levinson, S. C. (2015). Word order affects the time
course of sentence formulation in Tzeltal. Language, Cognition, and Neuroscience, 30(9),
1187-1208.
Sauppe, S. (2016). Verbal Semantics Drives Early Anticipatory Eye Movements during the
Comprehension of Verb-Initial Sentences. Frontiers in Psychology, 7(95).
Sauppe, S. (2017). Word Order and Voice Influence the Timing of Verb Planning in German
Sentence Production. Frontiers in Psychology, 8.
Sauppe, S., Norcliffe, E., Konopka, A. E., Van Valin Jr., R. D., & Levinson, S. C. (2013).
Dependencies First: Eye Tracking Evidence from Sentence Production in Tagalog. In M.
Knauff, M. Pauen, N. Sebanz, & I. Wachsmuth (Eds.), Proceedings of the 35th Annual
Meeting of the Cognitive Science Society (CogSci 2013) (pp. 1265-1270). Austin, TX:
Cognitive Science Society.
Schachter, P., & Otanes, F. T. (1972). Tagalog Reference Grammar. University of California Press.