"a"
gmail "dot"
com
Here the data, graphs, and analysis scripts (R code) are provided for the paper:
Using context to resolve object pronouns, Jacolien van Rij, Bart Hollebrandse, and Petra Hendriks
Submitted for publication in:
Empirical perspectives on anaphora resolution: Information structural evidence in the race for salience, edited by Anke Holler, Christine Goeb and Katja Suckow.
Note: The code for the eyetracking GAMM analyses is outdated due to updates in the packages mgcv and itsadug. See Porretta, Kyröläinen, van Rij, & Järvikivi (2017) for a more recent description of a GAMM analysis of VWP gaze data.
Response data analysis \(\rightarrow\)
Gaze data analysis \(\rightarrow\)
41 children (4–6 years old) and 36 adult participants performed the experiment. Behavioral responses and gaze data were recorded.
Our study investigates whether and how children who do not have adult-like object pronoun interpretation make use of context to resolve Dutch object pronouns. We used a Picture Verification Task, in which participants were asked to indicate whether the sentence they heard was a correct description of the picture presented on the screen. We contrasted a single-referent context with two-referent contexts, and we varied the order of referent introduction within the two-referent contexts. An overview of the different conditions is given in the table below.
We tested a 2 x 3 x 2 design, defined by the predictors Image Type (self-oriented, other-oriented), Context (P, PA, AP), and Sentence Type (Reflexive, Pronoun).
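Image type      Context sentence                            Test sentence                      Congruency
self-oriented   P: Here you see a rabbit.                   The squirrel points at himself …   congruent
self-oriented   PA: Here you see a rabbit and a squirrel.   The squirrel points at himself …   congruent
self-oriented   AP: Here you see a squirrel and a rabbit.   The squirrel points at himself …   congruent
self-oriented   P: Here you see a rabbit.                   The squirrel points at him …       incongruent
self-oriented   PA: Here you see a rabbit and a squirrel.   The squirrel points at him …       incongruent
self-oriented   AP: Here you see a squirrel and a rabbit.   The squirrel points at him …       incongruent
other-oriented  P: Here you see a rabbit.                   The squirrel points at himself …   incongruent
other-oriented  PA: Here you see a rabbit and a squirrel.   The squirrel points at himself …   incongruent
other-oriented  AP: Here you see a squirrel and a rabbit.   The squirrel points at himself …   incongruent
other-oriented  P: Here you see a rabbit.                   The squirrel points at him …       congruent
other-oriented  PA: Here you see a rabbit and a squirrel.   The squirrel points at him …       congruent
other-oriented  AP: Here you see a squirrel and a rabbit.   The squirrel points at him …       congruent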
Examples of the presented visual stimuli:
Self-oriented image:  Other-oriented image:

Number of test items per participant (32 in total):
Context  Pronoun  Reflexive
P        8        8
PA       4        4
AP       4        4
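The design and the item counts above can be sketched in R as follows. This is only an illustration: the object name `design` and the column names are hypothetical, and the even split of items over the two image types is an assumption that is not stated in the table.

```r
# Factor levels as named in the design description.
design <- expand.grid(
  ImageType    = c("self-oriented", "other-oriented"),
  Context      = c("P", "PA", "AP"),
  SentenceType = c("Reflexive", "Pronoun"),
  stringsAsFactors = FALSE
)

# Items per Context x SentenceType cell (8 for P, 4 for PA and AP),
# ASSUMED here to be split evenly over the two image types.
design$nItems <- ifelse(design$Context == "P", 8, 4) / 2

nrow(design)        # number of cells in the 2 x 3 x 2 design
sum(design$nItems)  # total number of test items per participant
```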
All analyses were performed in R, using the packages mgcv and itsadug.
Installing the packages from CRAN:
install.packages('itsadug', repos="http://cran.us.r-project.org")
# and the same for other packages...
Load the packages for use, and check versions.
R.version.string
## [1] "R version 3.2.1 (2015-06-18)"
# For GAMMs:
library(mgcv)
## Loading required package: nlme
## This is mgcv 1.8-7. For overview type 'help("mgcv-package")'.
# For GAMM interpretation and visualization:
library(itsadug)
## Loaded package itsadug 1.0.1 (see 'help("itsadug")' ).
# Switch on the itsadug info messages
# for generating this R Markdown report:
infoMessages('on')
# load package MASS for calculating inverse later:
library(MASS)
# load package plyr for calculating averages:
library(plyr)
# For printable plot colors:
library(sp)
The answers of the child and adult participants were converted into two measures based on Signal Detection Theory (SDT; Macmillan and Creelman 2004; Stanislaw and Todorov 1999):
The sensitivity \(d'\) reflects how well participants can distinguish between congruent and incongruent trials, with a higher (positive) value of \(d'\) indicating more correct “yes” responses on congruent trials and fewer incorrect “yes” responses on incongruent trials (cf. Başkent et al. 2013).
The response bias \(C\) reflects the difference between the participant’s bias and an ideal observer bias. In other words, \(C\) reflects the participants’ answering strategy:
a value around zero indicates that participants are equally likely to give incorrect responses on congruent and on incongruent items (no bias),
a positive value indicates that participants are more likely to give incorrect responses on congruent items than to incongruent items (“no” bias),
and a negative value indicates that participants are more likely to give incorrect responses on incongruent items than to congruent items (“yes” bias).
In our experiment, we treated the congruent items as the ‘signal’ and the incongruent items as the ‘noise’. In other words, in the SDT analysis we are interested in whether participants responded differently to congruent items, i.e., a match between picture and referring expression (e.g., an other-oriented action in the picture presented with a pronoun), and incongruent items, i.e., a mismatch between picture and referring expression. The responses are relabeled following the classification of the SDT, as illustrated in the table below.




image type:     other-oriented  other-oriented     self-oriented      self-oriented
sentence type:  pronoun         reflexive          pronoun            reflexive
response “yes”  HIT             FALSE ALARM        FALSE ALARM        HIT
response “no”   MISS            CORRECT REJECTION  CORRECT REJECTION  MISS
The SDT differentiates four different response types (hit, miss, false alarm and correct rejection), rather than two (correct and incorrect). As a result, it can disentangle the participant’s sensitivity to the stimuli from potential response biases.
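The relabeling in the table above can be sketched as a small R function. The function name `sdtCategory`, the argument names, and the level labels are hypothetical and chosen for illustration; the actual data files may code these variables differently.

```r
# Hypothetical helper: classify trials following the SDT table.
# imageType:    "self-oriented" or "other-oriented"
# sentenceType: "reflexive" or "pronoun"
# response:     "yes" or "no"
sdtCategory <- function(imageType, sentenceType, response) {
  # A trial is congruent ("signal") when a reflexive is paired with a
  # self-oriented image, or a pronoun with an other-oriented image.
  congruent <- (imageType == "self-oriented"  & sentenceType == "reflexive") |
               (imageType == "other-oriented" & sentenceType == "pronoun")
  ifelse(congruent,
         ifelse(response == "yes", "hit", "miss"),
         ifelse(response == "yes", "false alarm", "correct rejection"))
}

# Example: a "yes" response in each of the four image x sentence cells.
trials <- data.frame(
  imageType    = c("other-oriented", "other-oriented",
                   "self-oriented", "self-oriented"),
  sentenceType = c("pronoun", "reflexive", "pronoun", "reflexive"),
  response     = "yes",
  stringsAsFactors = FALSE
)
trials$category <- with(trials, sdtCategory(imageType, sentenceType, response))
trials
```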
The sensitivity \(d'\) reflects how well participants can distinguish between congruent and incongruent trials, with a higher (positive) value of \(d'\) indicating more correct “yes” responses on congruent trials and fewer incorrect “yes” responses on incongruent trials.
The response bias \(C\) reflects the difference between the participant’s bias and an ideal observer bias. In other words, \(C\) reflects the participants’ response strategy:
a value around zero indicates that participants are equally likely to say “yes” or “no” across congruent and incongruent items,
a positive value indicates that participants are more likely to say “no” to congruent items than “yes” to incongruent items (“no” bias),
and a negative value indicates that participants are more likely to say “yes” to incongruent items than “no” to congruent items (“yes” bias).
The interactive plot below shows how the SDT measures \(d'\) and \(C\) relate to the accuracy of “signal” items (i.e., congruent items) and “noise” items (i.e., incongruent items). Note: The plot assumes an equal number of congruent and incongruent items.^{1}
The SDT measures are calculated from the count of responses in each of the response categories. Note that the number of items or responses per condition changes the granularity of these measures (i.e., the potential values these measures can take). In the interactive plot, the SDT measures can be calculated based on 10 items (default), 8 items, or 5 items: type the number in the field “Adjust number of items:” and press ENTER. The accuracy values might change, and with them the \(d'\) and \(C\) measures.
Our experiment included only 4 items in each of the two-referent contexts, but 8 items in the single-referent condition. To avoid differences due to granularity, we split the single-referent condition into two groups, and calculated the averages over these two groups. Additionally, we performed an analysis that collapsed the two-referent context conditions to compare the single-referent context with a two-referent context, ignoring the order of referent introduction. The results of the two analyses are highly similar.
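The effect of granularity can be illustrated with a short R sketch that lists the \(d'\) values that are possible for a given number of items. The function name `possibleDprime` is hypothetical. The sketch assumes `n` congruent and `n` incongruent items, and it assumes a log-linear correction (adding 0.5 to each count) to avoid rates of exactly 0 or 1; the correction used in the actual analysis may differ.

```r
# Hypothetical helper: all d' values that can occur with n congruent
# and n incongruent items.
possibleDprime <- function(n) {
  counts <- 0:n
  # ASSUMPTION: log-linear correction, so that rates of 0 and 1
  # do not produce infinite z-scores.
  rate <- (counts + 0.5) / (n + 1)
  grid <- expand.grid(hit = rate, fa = rate)
  sort(unique(round(qnorm(grid$hit) - qnorm(grid$fa), 6)))
}

# Number of distinct d' values for 5, 8 and 10 items:
sapply(c(5, 8, 10), function(n) length(possibleDprime(n)))
```

With fewer items, fewer distinct values are available, so conditions with different numbers of items are not directly comparable.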
The “yes”-bias observed in language acquisition studies (e.g., Chien and Wexler 1990; van Rij, van Rijn, and Hendriks 2010) is characterized by a high accuracy on congruent items, but low accuracy on incongruent items (i.e., saying “yes” all the time). This translates to a negative \(C\) measure.
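As a worked illustration, assume the standard SDT definitions described in Stanislaw and Todorov (1999), with \(H\) the hit rate and \(F\) the false-alarm rate (the symbols \(H\) and \(F\) are introduced here only for this example):

\[
d' = \Phi^{-1}(H) - \Phi^{-1}(F), \qquad C = -\frac{1}{2}\left[\Phi^{-1}(H) + \Phi^{-1}(F)\right]
\]

For a hypothetical participant who says “yes” on 95% of the congruent items and on 80% of the incongruent items (i.e., 95% and 20% accuracy), this gives:

\[
d' \approx 1.645 - 0.842 = 0.803, \qquad C \approx -\frac{1}{2}\left(1.645 + 0.842\right) = -1.243
\]

The negative \(C\) reflects the “yes”-bias, while the small positive \(d'\) shows that the participant still distinguishes somewhat between congruent and incongruent items.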
Relabel the responses. Depending on the visual context (i.e., the image on the screen), the pronouns and reflexives are treated as signal or noise; see the table above.
The number of responses in each of these categories is counted per participant per condition.
Calculate the hit rate and the false-alarm rate, and convert these to z-scores using the normal quantile function \(\Phi^{-1}\).
Calculate the SDT measures \(d'\) and \(C\).
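A minimal sketch of these steps in base R is given below. It is not the original analysis script: the data frame `dat`, its column names, and the function `sdtMeasures` are hypothetical, and the log-linear correction for rates of 0 or 1 is an assumption.

```r
# Hypothetical trial-level data: s01 always says "yes",
# s02 answers all items correctly.
dat <- data.frame(
  Subject   = rep(c("s01", "s02"), each = 8),
  Context   = "PA",
  congruent = rep(c(TRUE, FALSE), times = 8),
  response  = c(rep("yes", 8), rep(c("yes", "no"), 4)),
  stringsAsFactors = FALSE
)

sdtMeasures <- function(d) {
  # Relabel and count the four response categories.
  nHit  <- sum(d$congruent  & d$response == "yes")
  nMiss <- sum(d$congruent  & d$response == "no")
  nFA   <- sum(!d$congruent & d$response == "yes")
  nCR   <- sum(!d$congruent & d$response == "no")
  # Hit and false-alarm rates. ASSUMPTION: log-linear correction,
  # so that rates of exactly 0 or 1 do not give infinite z-scores.
  hitRate <- (nHit + 0.5) / (nHit + nMiss + 1)
  faRate  <- (nFA  + 0.5) / (nFA  + nCR   + 1)
  # Convert to z-scores with the normal quantile function.
  zHit <- qnorm(hitRate)
  zFA  <- qnorm(faRate)
  # Sensitivity and response bias.
  data.frame(dprime = zHit - zFA, C = -(zHit + zFA) / 2)
}

# One row of measures per participant per condition.
perCell <- split(dat, list(dat$Subject, dat$Context), drop = TRUE)
result  <- do.call(rbind, lapply(perCell, sdtMeasures))
result
```

In this example the first participant shows no sensitivity and a negative \(C\) (a “yes”-bias), whereas the second participant shows a high \(d'\) and a \(C\) of about zero.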
All analyses were performed with Generalized Additive Mixed-effects Models (GAMM; Lin and Zhang 1999) as implemented in the R package mgcv (Wood 2006; Wood 2011). In contrast to linear models, such as linear regression or linear mixed-effects models, GAMMs do not assume that the relation between the predictors and the dependent variable (the measure, e.g., \(d'\) or accuracy) is linear. Thus, GAMMs are a nonlinear regression method. This is useful because many psycholinguistic data sets, including the data analyzed in this paper, show nonlinear patterns. For example, we have no reason to assume that the sensitivity \(d'\) changes linearly with age, and we also have no reason to assume that the gaze position changes linearly over time within a trial.
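To give an impression of what such a model looks like, here is a minimal sketch on simulated data. The data, the variable names (`sim`, `Subject`, `Age`, `dprime`), and the model structure are hypothetical and much simpler than the models reported in the paper; only standard mgcv functionality is used.

```r
library(mgcv)

# Simulated (hypothetical) data: d' rises with age and then levels off,
# with three measurements per participant.
set.seed(123)
nSubj <- 60
sim <- data.frame(
  Subject = factor(rep(seq_len(nSubj), each = 3)),
  Age     = rep(runif(nSubj, min = 4, max = 7), each = 3)
)
subjEffect <- rnorm(nSubj, sd = 0.3)
sim$dprime <- 2.5 * plogis(3 * (sim$Age - 5)) +
  subjEffect[as.integer(sim$Subject)] +
  rnorm(nrow(sim), sd = 0.3)

# s(Age): nonlinear smooth over age;
# s(Subject, bs = "re"): random intercepts for participants.
m <- gam(dprime ~ s(Age) + s(Subject, bs = "re"),
         data = sim, method = "REML")
summary(m)
```

In the summary, an estimated degrees of freedom (edf) value for `s(Age)` that is clearly above 1 would suggest that the relation between age and \(d'\) is not linear.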
The next sections provide the R scripts for the response data analysis (with SDT measures), and for the gaze data analysis.
Başkent, Deniz, Jacolien van Rij, Zheng Yen Ng, Rolien Free, and Petra Hendriks. 2013. “Perception of Spectrally Degraded Reflexives and Pronouns by Children.” The Journal of the Acoustical Society of America 134 (5): 3844–52.
Lin, X., and D. Zhang. 1999. “Inference in Generalized Additive Mixed Models by Using Smoothing Splines.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 61: 381–400.
Macmillan, Neil A., and C. Douglas Creelman. 2004. Detection Theory: A User’s Guide. Psychology Press.
Stanislaw, Harold, and Natasha Todorov. 1999. “Calculation of Signal Detection Theory Measures.” Behavior Research Methods, Instruments, & Computers 31 (1): 137–49.
Wood, Simon N. 2006. Generalized Additive Models: An Introduction with R. Chapman and Hall/CRC.
———. 2011. “Fast Stable Restricted Maximum Likelihood and Marginal Likelihood Estimation of Semiparametric Generalized Linear Models.” Journal of the Royal Statistical Society (B) 73 (1): 3–36.
Refresh if plot is inactive. In case the interactive plot is not loaded, please inspect the interactive plot at http://jacolienvanrij.shinyapps.io/SDTmeasures.