Method Guidelines

Grammaticality Judgments as Linguistic Evidence
Methodological Guidelines for Measuring Grammaticality Brian Murphy

Language, Interaction and Computation Lab Centre for Mind/Brain Sciences (CIMeC) University of Trento
ESSLLI 2009, Bordeaux
Brian Murphy (CIMeC)
Grammaticality: Methods Guidelines
ESSLLI 2009
1 / 23
Course Outline
Notions of Grammaticality, and Current Practice in Linguistics Others Sources of Linguistic Evidence Scales for Measuring Grammaticality
Methodology for Eliciting Judgements of Grammaticality

Objectives of Experimentation Ideal and practical choices Software Materials Statistics
Theoretical Implications
ESSLLI 2009
2 / 23
Objectives of Experimentation I
Validity: are we measuring what we aim to measure? Reliability: are your data purely the result of chance? Replicability: do other researchers have enough information to `rerun' your experiment? Generalizability: do your results apply to other sentences/speakers than the ones you Otherwise, a eld is likely to spend a lot of time following up bad leads and chasing its own tail
ESSLLI 2009
3 / 23
Objectives of Experimentation II
Validity: are we measuring what we aim to measure?
as we have seen, intuitions are most often the best measure we have of grammaticality but we have to be aware of strategising, misinterpretation of instructions, subconscious biases [Armstrong et al., 1983]
ESSLLI 2009
4 / 23
Objectives of Experimentation III
Reliability: are your data purely the result of chance?
this is what inferential statistics are for
ESSLLI 2009
5 / 23
Objectives of Experimentation IV
Replicability: do other researchers have enough information to `rerun' your experiment?
ask journals about online appendices; or put supplementary materials on your website describe your methods very briey
ESSLLI 2009
6 / 23
Objectives of Experimentation V
Generalizability: do your results apply to other sentences/speakers than the ones you tested?
are your materials representative of the structure and language variety you're interested? are your speakers representative of the speech community you're interested in? how are you dealing with confounds?
ESSLLI 2009
7 / 23
The Ideal Experiment I

All other things being equal (and of course they never are) you would like your experiment to have the following characteristics: As many informants as possible, minimum 10 As many judgements for each sentence as possible, minimum 10, preferably 20 No single participant should see the same sentence twice The informant should not know your hypothesis You know the informant's age, profession, educational background, dialect, handedness, ... Test items should be interspersed with llers
ESSLLI 2009
8 / 23
The Ideal Experiment II
Materials should be authentic Context should be supplied to control interpretation Presentation order should be randomised Each informant should see a similar proportion of each type of sentence (counter-balancing) The experimenter should not be present during the experiment The informants should not be known to the experimenter ... and probably some other things I've forgotten about
ESSLLI 2009
9 / 23
What Matters More? I
Featherston 2009: Relax, lean back, and be a linguist plausibility of content of experimental materials sentence or phrasal length and complexity, for example ... ... referential abstractness (eg too many pronouns) unclear meaning low accessibility of intended interpretation
ESSLLI 2009
10 / 23
What Matters More? II
Brian's pet list: Quantity of sentences and informants Clear, naturalistic instructions Training/calibration
ESSLLI 2009
11 / 23
What Matters More? III
ESSLLI 2009
12 / 23
What Matters More? IV
The X axis represents the number of responses, the Y axis the mean score up to that point. Symbols represent the following sentence stimuli: squares: ... it may take a little longer diagonal crosses: I was surrounded by an endless sorrow diamonds: Did anyone order me a plain cheese triangles: Throw the idol me crosses: Stop thinking sex about
ESSLLI 2009
13 / 23
What Matters Less? I

Featherston social, educational and professional status of subjects dialect background of subjects (except ...) the sex, age, and handedness of subjects a degree of linguistics training among subjects (but ...) frequency in lexis (as long as extremes are avoided) speed of response required structure type being tested (statements, questions, dialogues) carrying out experiments over the net or on paper the precise methodology, as long as it asks subjects about their receptive responses to examples
ESSLLI 2009
14 / 23
Software I
Lab-based (roughly in order of ease of use):
E-Prime, Presentation: aimed at psychologists, windowing-based, can build your rst experiment in an hour with no programming skills, no particular support for linguistics, very accurate timing, no inbuilt stats; expensive (see if you can use a copy from you Psychology department) Psychotoolbox for Matlab: aimed at vision researchers, code/windowing based, need scripting skills, no particular support for linguistics, very accurate timing, statistics for free, kind-of-free (toolbox is free; many universities have campus licenses for Matlab) PyEPL, PsychoPy, both for Python: aimed at vision researchers, code based, need scripting skills, no particular support for linguistics, very accurate timing, statistics for free, free
ESSLLI 2009
15 / 23
Software II
Internet-based (work in labs too):
Linguist-GRID: aimed at linguists, web-form based, very easy, DEFUNCT (but on source-forge still)? LimeSurvey: aimed at social sciences, web-form/conguration-le based, user-access control, no timing, has to be installed on server, data/informant/session management, summary of results, free WebExp for Java: aimed at linguists and psychologists, perfectly adequate timing, conguration le based, has to be installed on webserver, data/informant/session management, not-quite-platform-independent, free
ESSLLI 2009
16 / 23
Software III
Paper/Mail-based, for your inner-Luddite: MiniJudge [Myers, 2007, Myers, 2009]: `small-scale experimental syntax':
Experimental sentences only (no llers) Only as many sentence sets as are needed for statistical validity (around 5) Only as many speakers as are needed for statistical validity (around 7) All speakers get all sentences (no counterbalancing of sentence lists) Binary YES/NO judgments Maximum of two binary factors Random sentence order, which is treated as a factor (helps control for some processing eects)
ESSLLI 2009
17 / 23
Materials
Nowadays, authentic materials are easy to nd Google: gave * the * Yahoo API: (gave * the *) OR (send * the *) OR (send * * a *) ... www.webcorp.org.uk: g[i|a]v[ing|e|es|en] * to * corpus.leeds.ac.uk (Wacky, [Baroni et al., 2009]): "give|gave|given" [pos="DT"] [pos="N.*"] "to" [pos="N.*"] lse.umiacs.umd.edu: (S1 (VP (VB give) NP (PP (TO to) NP)))
ESSLLI 2009
18 / 23
Statistics I
Download and install R, probably the most important software package in science (www.cran.org) Do a basic practical statistics course at your department of statistics or psychology Try to understand the t-test, and linear correlation, and their non-parametric alternatives Mann-Whitney and rank correlation Ignore theory of scales [Stevens, 1946]
normality of the sampling distribution only thing that matters so test more sentences to improve use parametric statistics
ESSLLI 2009
19 / 23
Statistics II
ESSLLI 2009
20 / 23
Software Wishlist I
ESSLLI 2009
21 / 23
What Comes Last
Notions of Grammaticality, and Current Practice in Linguistics Others Sources of Linguistic Evidence Scales for Measuring Grammaticality Methodology for Eliciting Judgements of Grammaticality
Theoretical Implications
ESSLLI 2009
22 / 23
References I
Armstrong, S. L., Gleitman, L. R., and Gleitman, H. (1983). What some concepts might not be. Cognition, 13:263308. Baroni, M., Bernardini, S., Ferraresi, A., and Zanchetta, E. (2009). The WaCky wide web: a collection of very large linguistically processed web-crawled corpora. Language Resources and Evaluation. Myers, J. (2007). MiniJudge: Software for small-scale experimental syntax. 12(2):175194.
International Journal of Computational Linguistics and Chinese LanguageProcessing,

Myers, J. (2009). The design and analysis of small-scale syntactic judgment experiments. Lingua, 119:425444. Stevens, S. S. (1946). On the theory of scales of measurement. Science, 103:677680.
ESSLLI 2009
23 / 23

Method Guidelines

Uploaded by

Document Information

Original Description:

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Method Guidelines

Uploaded by

Copyright:

Available Formats

Grammaticality Judgments as Linguistic Evidence

Methodological Guidelines for Measuring Grammaticality Brian Murphy

ESSLLI 2009, Bordeaux

Brian Murphy (CIMeC)

Grammaticality: Methods Guidelines

Methodology for Eliciting Judgements of Grammaticality

Brian Murphy (CIMeC)

Grammaticality: Methods Guidelines

Brian Murphy (CIMeC)

Grammaticality: Methods Guidelines

Validity: are we measuring what we aim to measure?

Brian Murphy (CIMeC)

Grammaticality: Methods Guidelines

Objectives of Experimentation III

Reliability: are your data purely the result of chance?

this is what inferential statistics are for

Brian Murphy (CIMeC)

Grammaticality: Methods Guidelines

Replicability: do other researchers have enough information to `rerun' your experiment?

Brian Murphy (CIMeC)

Grammaticality: Methods Guidelines

Brian Murphy (CIMeC)

Grammaticality: Methods Guidelines

The Ideal Experiment I

Brian Murphy (CIMeC)

Grammaticality: Methods Guidelines

The Ideal Experiment II

Brian Murphy (CIMeC)

Grammaticality: Methods Guidelines

What Matters More? I

Brian Murphy (CIMeC)

Grammaticality: Methods Guidelines

What Matters More? II

Brian Murphy (CIMeC)

Grammaticality: Methods Guidelines

What Matters More? III

Brian Murphy (CIMeC)

Grammaticality: Methods Guidelines

What Matters More? IV

Brian Murphy (CIMeC)

Grammaticality: Methods Guidelines

What Matters Less? I

Brian Murphy (CIMeC)

Grammaticality: Methods Guidelines

Lab-based (roughly in order of ease of use):

Brian Murphy (CIMeC)

Grammaticality: Methods Guidelines

Internet-based (work in labs too):

Brian Murphy (CIMeC)

Grammaticality: Methods Guidelines

Brian Murphy (CIMeC)

Grammaticality: Methods Guidelines

Brian Murphy (CIMeC)

Grammaticality: Methods Guidelines

Brian Murphy (CIMeC)

Grammaticality: Methods Guidelines

Brian Murphy (CIMeC)

Grammaticality: Methods Guidelines

Brian Murphy (CIMeC)

Grammaticality: Methods Guidelines

What Comes Last

Brian Murphy (CIMeC)

Grammaticality: Methods Guidelines