MATH38172 Generalised Linear Models
Instructions. Attempt the questions below and submit your work online via Blackboard by the deadline
of Thursday 5th May at 11am. Your submission must be a single file. It may contain any sensible mix
of word-processed and (scanned) handwritten parts, for example using LaTeX, RMarkdown or Microsoft
Word. You should include any R code used. A complete solution is possible in 6 or 7 pages in 10-point font;
please limit your response to 12 pages at most. The coursework may take up to 10 hours to complete. The
submitted work MUST be your own. Plagiarism will not be tolerated and will result in serious consequences if
discovered.
Background. The file birthweight.csv on Blackboard contains data on 189 births collected at a medical
centre in the USA. The data concern several variables relating to the mother and the pregnancy, and the outcome
of interest, which is whether the baby is born with a low birth weight (defined as less than 2500 grams). The
variables contained in the data are as follows.
Variable name   Meaning
Low             Indicator for low birth weight
Age             Mother's age in years
MotherWeight    Mother's weight before the pregnancy (in pounds)
Race            Mother's race
Smoke           Indicator for whether the mother smoked during pregnancy
Premature       Number of times the mother has previously given birth prematurely
Hypertension    History of hypertension (1 - Yes; 0 - No)
UI              Uterine irritability (1 - Yes; 0 - No)
Visits          Number of physician visits during the first trimester
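
As a minimal sketch of how a file with this layout might be read and prepared in R, the example below writes a few made-up rows to a temporary CSV and reads them back. The rows themselves, the 0/1 coding of Low and Smoke, and the Race labels "white", "black" and "other" are illustrative assumptions, not the contents of birthweight.csv.

```r
# Synthetic stand-in rows; NOT taken from birthweight.csv.
toy <- data.frame(
  Low          = c(0, 1, 0, 1),                 # assumed coding: 1 = low birth weight
  Age          = c(24, 31, 19, 27),
  MotherWeight = c(140, 118, 125, 102),
  Race         = c("black", "other", "white", "white"),  # labels are assumed
  Smoke        = c(0, 0, 1, 1),                 # assumed coding: 1 = smoked
  Premature    = c(0, 0, 0, 1),
  Hypertension = c(0, 1, 0, 0),
  UI           = c(1, 0, 0, 1),
  Visits       = c(0, 3, 1, 2)
)
path <- tempfile(fileext = ".csv")
write.csv(toy, path, row.names = FALSE)

# Read the file back, converting text columns to factors.
births <- read.csv(path, stringsAsFactors = TRUE)

# Factor levels are alphabetical by default, so "black" would be the
# baseline; relevel() moves "white" to the first (reference) position.
births$Race <- relevel(births$Race, ref = "white")

str(births)
levels(births$Race)
```

With the real file, only the read.csv() call and the relevel() step would be needed; it is worth checking the actual column types and level labels with str() before fitting anything.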
Questions
(a) Read the dataset into R and fit a logistic regression model to explain the probability of low birth weight
using all explanatory variables, treating “white” as the baseline level for race. Present the summary for
your fitted model. (2 marks)
(b) For which variables is there significant evidence of an association with low birth weight? Explain your
methodology. (3 marks)
(c) Fit a simplified model that only includes the significant explanatory variables and the intercept. (1
mark)
(d) Write down the simplified model from part (c) in equation form, and interpret the parameters and their
estimated values. What characteristics of a mother are associated with the highest probability of low
birth weight? (5 marks)
(e) Using the model from part (c), estimate the probability of low birth weight for a 28-year-old black
mother who weighed 130 lbs prior to pregnancy, does not smoke, is not experiencing any medical
conditions and had no physician visits. (1 mark)
(f) Using the model from part (c), estimate the weight at which a mother has a 20% chance of having
a baby with low birth weight, given that she is a white non-smoker with uterine irritability, but no
hypertension or physician visits. Give a 95% Wald confidence interval for this weight. What do you
notice? (4 marks)
(g) Do your conclusions in part (b) about which variables are significantly associated with the probability of
low birth weight change if the number of previous premature births (Premature) and the number of physician
visits (Visits) are treated as categorical variables? Explain your working and methodology, and posit
reasons for any differences you observe. (4 marks)
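
The R tools that questions of this kind typically call on can be sketched on simulated data. The example below is not a solution and uses none of the coursework data: the data frame d, its coefficients, and the choice to pool counts of two or more into a "2+" category are all invented for illustration. It shows Wald tests from summary(), a likelihood ratio test via anova(), a predicted probability, and an inverse prediction w = (logit(p0) - b0) / b1 with a delta-method Wald interval.

```r
set.seed(1)
n <- 500
# Simulated stand-in data; names follow the variable table, values are invented.
d <- data.frame(
  MotherWeight = round(rnorm(n, mean = 130, sd = 30)),
  Smoke        = rbinom(n, 1, 0.4),
  Visits       = rpois(n, 0.8)
)
eta   <- 1 - 0.02 * d$MotherWeight + 0.7 * d$Smoke   # invented true model
d$Low <- rbinom(n, 1, plogis(eta))

# Full and reduced logistic regressions.
full <- glm(Low ~ MotherWeight + Smoke + Visits, family = binomial, data = d)
summary(full)$coefficients             # Wald z-tests, one coefficient at a time
reduced <- glm(Low ~ MotherWeight + Smoke, family = binomial, data = d)
anova(reduced, full, test = "Chisq")   # likelihood ratio test for the dropped term

# Estimated probability for one covariate pattern.
new   <- data.frame(MotherWeight = 130, Smoke = 0)
p_hat <- predict(reduced, newdata = new, type = "response")

# Inverse prediction: weight at which a non-smoker has probability p0.
p0    <- 0.2
b     <- coef(reduced)
V     <- vcov(reduced)
w_hat <- unname((qlogis(p0) - b["(Intercept)"]) / b["MotherWeight"])

# Delta method: gradient of w_hat with respect to (intercept, weight, smoke).
g    <- c(-1 / b["MotherWeight"], -w_hat / b["MotherWeight"], 0)
se_w <- sqrt(drop(t(g) %*% V %*% g))
ci   <- w_hat + c(-1, 1) * qnorm(0.975) * se_w   # 95% Wald interval

# Treating a count as categorical; sparse high counts are pooled first
# (an illustrative choice, not a requirement).
d$VisitsCat <- factor(pmin(d$Visits, 2), levels = 0:2, labels = c("0", "1", "2+"))
cat_fit <- glm(Low ~ MotherWeight + Smoke + VisitsCat, family = binomial, data = d)
anova(reduced, cat_fit, test = "Chisq")
```

A factor with k levels contributes k - 1 coefficients, so its overall significance is better judged by a single likelihood ratio test than by the individual Wald z-values in the summary table.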