Winter, B. (2013). Linear models and linear mixed effects models in R with linguistic applications. arXiv:1308.5499.
Is there a difference in frequencies for male and female voices? If so, by how much?
Also – does a person’s “Attitude” change the frequency? If so, by how much?
There were 10 women and 10 men. Each voice was measured twice in each of 7 different scenarios, once with a “Polite” attitude and once with an “Informal” attitude.
The difference in politeness level is represented in the column called “attitude”. In that column, “pol” stands for polite and “inf” for informal. Sex is represented as “F” and “M” in the column “gender”. The dependent measure is “frequency”, which is the voice pitch measured in Hertz (Hz). To remind you, higher values mean higher pitch.
Use this data to complete the following lab. At the end of the lab are some final notes and comments provided by the author of this work, Bodo Winter at the University of California, Merced.
Part 0.
For a start, we need to install the R package lme4 (Bates, Maechler & Bolker, 2012).
install.packages("lme4")
After installation, load the lme4 package into R with the following command:
library(lme4)
Now you have the function lmer() available to you, which is the mixed-model equivalent of the function lm().
Load the data set and save the object as a data frame called “politeness”.
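One possible way to do this is sketched below. The file name politeness_data.csv is an assumption, not something the lab specifies, so replace it with the path to your own copy of the data. The sketch checks that the file exists before reading it.

```r
# Hypothetical file name (an assumption): point this at your own copy of the data.
data_path <- "politeness_data.csv"

if (file.exists(data_path)) {
  # stringsAsFactors = TRUE stores text columns such as attitude and gender as factors
  politeness <- read.csv(data_path, stringsAsFactors = TRUE)
  str(politeness)  # quick check of column names and types
} else {
  message("File not found: ", data_path)
}
```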
Part 1.
Now, you have a data frame called politeness in your R environment. You can familiarize yourself with the data by using head(), tail(), summary(), str(), colnames()… or whatever commands you commonly use to get an overview of a dataset.
Also, it is always good to check for missing values: which(is.na(politeness$frequency))
Or, alternatively, you can use the following: which(!complete.cases(politeness))
Part 2. Some EDA
boxplot(frequency ~ attitude*gender, col=c("white","lightgray"), data=politeness)
What other plots may be useful at this point? Consider univariate and bivariate graphs and summaries of the variables.
A) Include 2 figures in this section, and comment on what you observe from the figures.
Part 3. Model Building
politeness.model = lmer(frequency ~ attitude + (1|subject) + (1|scenario), data=politeness)
The last command created a model that used the fixed effect “attitude” (polite vs. informal) to predict voice pitch, controlling for by-subject and by-item variability. We saved this model in the object politeness.model. Use summary() to display the full result:
summary(politeness.model)
B) Explain briefly why attitude is modeled using a fixed effect (and not random).
C) Explain briefly why subject and scenario are modeled as random effects (and not fixed).
D) Calculate the intra-class correlation coefficients (ICCs) for subject and scenario.
E) Add Gender as a Fixed Effect to your model. How did adding “gender” change the amount of variability associated with the random effects?
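For questions D and E, it may help to recall that in a random intercept model each ICC is the share of the total variance attributed to one grouping factor: the variance for that factor divided by the sum of all variance components, residual included. The sketch below computes these shares. The function name icc_from_variances and the example variances are hypothetical and are not results from the lab data. The commented lines show one way the variances could be extracted from a fitted lme4 model.

```r
# Sketch: intra-class correlations from the variance components of a
# random intercept model. `vc` is a named numeric vector of variances;
# the residual variance must be named "Residual".
icc_from_variances <- function(vc) {
  stopifnot(is.numeric(vc), all(vc >= 0), "Residual" %in% names(vc))
  shares <- vc / sum(vc)                 # each component as a share of the total
  shares[names(shares) != "Residual"]    # keep the grouping factors only
}

# Hypothetical variances, for illustration only (NOT output from the lab data).
example_vc <- c(subject = 400, scenario = 100, Residual = 500)
icc_example <- icc_from_variances(example_vc)
print(icc_example)

# With a fitted random intercept model, the variances could be obtained with:
# vc_df <- as.data.frame(VarCorr(politeness.model))
# vc <- setNames(vc_df$vcov, vc_df$grp)
# icc_from_variances(vc)
```

Running the same calculation before and after adding gender shows how the variance attributed to subject changes.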
Part 4. Testing
P-values:
“Unfortunately, p-values for mixed models aren’t as straightforward as they are for the linear model. There are multiple approaches, and there’s a discussion surrounding these, with sometimes wildly differing opinions about which approach is the best.” See page 160 of your BYSH text for comments on significance and the use of REML.
F) Conduct a Likelihood Ratio Test for the effect of Attitude:
politeness.null = lmer(frequency ~ gender + (1|subject) + (1|scenario), data=politeness, REML=FALSE)
politeness.model = lmer(frequency ~ attitude + gender + (1|subject) + (1|scenario), data=politeness, REML=FALSE)
anova(politeness.null,politeness.model)
G) Provide a formal statement of your conclusion to the question: Does attitude affect frequency of the voice? (Your statement should include the test statistic, p-value, and estimate of change with the standard error).
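To see what anova() reports in F, the likelihood ratio test can be reproduced by hand: the test statistic is twice the difference in log-likelihood between the full and the null model, compared against a chi-square distribution whose degrees of freedom equal the difference in the number of parameters. The sketch below is a minimal illustration. The function name lrt_by_hand and the log-likelihood values are hypothetical and are not results from the lab data.

```r
# Sketch: likelihood ratio test computed by hand.
lrt_by_hand <- function(loglik_null, loglik_full, df_diff) {
  chisq <- 2 * (loglik_full - loglik_null)               # test statistic
  p <- pchisq(chisq, df = df_diff, lower.tail = FALSE)   # upper-tail p-value
  c(chisq = chisq, df = df_diff, p.value = p)
}

# Hypothetical log-likelihoods, for illustration only (NOT from the lab data).
lrt_example <- lrt_by_hand(loglik_null = -405, loglik_full = -400, df_diff = 1)
print(lrt_example)

# With the fitted models from F, the inputs could be obtained with:
# logLik(politeness.null); logLik(politeness.model)
# attr(logLik(politeness.model), "df") - attr(logLik(politeness.null), "df")
# The estimate and standard error for G are in the fixed-effects table:
# coef(summary(politeness.model))["attitudepol", c("Estimate", "Std. Error")]
```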
Part 5. Random Intercepts versus Slopes
Let’s have a look at the coefficients of the model by subject and by item:
coef(politeness.model)
The fixed effects (attitude and gender) are the same for all subjects and items. Our model is what is called a random intercept model. In this model, we account for baseline differences in pitch, but we assume that whatever the effect of politeness is, it’s going to be the same for all subjects and items. But is that a valid assumption? Often it’s not – it is quite expected that some items would elicit more or less politeness. That is, the effect of politeness might be different for different items. Likewise, the effect of politeness might be different for different subjects. For example, it might be expected that some people are more polite, others less. So, what we need is a random slope model, where subjects and items are not only allowed to have differing intercepts, but where they are also allowed to have different slopes for the effect of politeness.
This is how we would do this in R:
politeness.model = lmer(frequency ~ attitude + gender + (1+attitude|subject) + (1+attitude|scenario), data=politeness, REML=FALSE)
Note that the only thing that we changed is the random effects, which now look a little more complicated. The notation “(1+attitude|subject)” tells the model to expect differing baseline levels of frequency (the intercept, represented by 1) as well as differing responses to the main factor in question, which is “attitude” in this case. You then do the same for items.
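The sketch below shows how a random slope model could be inspected and tested. So that it runs on its own, it uses simulated stand-in data (the data frame sim and all numbers in the simulation are assumptions, not the lab data); with the real data you would use politeness instead. The null model keeps the same random slopes and drops only the fixed effect of attitude, so the likelihood ratio test isolates that effect. With simulated data lmer() may print a singular-fit or convergence warning, which does not stop the code from running.

```r
library(lme4)

# Simulated stand-in data (NOT the lab data), only so the sketch is runnable.
set.seed(2013)
sim <- expand.grid(subject = 1:20, scenario = 1:7, attitude = c("inf", "pol"))
sim$gender <- ifelse(sim$subject <= 10, "F", "M")
subj_int   <- rnorm(20, sd = 25)   # by-subject baseline shifts
subj_slope <- rnorm(20, sd = 8)    # by-subject shifts in the attitude effect
scen_int   <- rnorm(7,  sd = 10)   # by-scenario baseline shifts
scen_slope <- rnorm(7,  sd = 5)    # by-scenario shifts in the attitude effect
is_pol <- as.numeric(sim$attitude == "pol")
sim$frequency <- 250 - 100 * (sim$gender == "M") +
  subj_int[sim$subject] + scen_int[sim$scenario] +
  (-20 + subj_slope[sim$subject] + scen_slope[sim$scenario]) * is_pol +
  rnorm(nrow(sim), sd = 20)
sim$subject  <- factor(sim$subject)
sim$scenario <- factor(sim$scenario)

# Null and full model share the same random-effects structure.
slope.null <- lmer(frequency ~ gender +
                     (1 + attitude | subject) + (1 + attitude | scenario),
                   data = sim, REML = FALSE)
slope.model <- lmer(frequency ~ attitude + gender +
                      (1 + attitude | subject) + (1 + attitude | scenario),
                    data = sim, REML = FALSE)

# The attitudepol column can now differ from subject to subject.
head(coef(slope.model)$subject)

# Likelihood ratio test for the fixed effect of attitude.
slope.lrt <- anova(slope.null, slope.model)
print(slope.lrt)
```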