STAT 231 Assignment
Online Assignment
Problem 1: Binomial distribution
In a very large population 1% of the people have a certain genetic mutation. Suppose 1200 people are
selected at random. Define the random variable Y = number of people with the genetic mutation in
the sample.
(a) What are the assumptions for a Binomial model? Explain, with reasons, whether or not these
assumptions might hold in this context. Your answer must be written in sentences.
(b) Use the Normal approximation to the Binomial with continuity correction and the Normal table in
the Course Notes to approximate the following probabilities.
P(Y ≤ 8), P(Y ≥ 16), and P(|Y – 12| < 7)
You must show your work for full marks.
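The continuity correction can be sketched in R as follows. This is only an illustration under assumed values: the Binomial parameters (n = 400, p = 0.1) and the cut-offs are made up and are not the assignment's, and pnorm stands in for the Normal table lookup.

```r
# Hypothetical illustration values (NOT the assignment's): X ~ Binomial(400, 0.1)
n <- 400
p <- 0.1
mu <- n * p                      # mean of the Binomial
sigma <- sqrt(n * p * (1 - p))   # standard deviation of the Binomial

# P(X <= 35): push the upper endpoint up by 0.5 before standardizing
z_le <- (35 + 0.5 - mu) / sigma
approx_le <- pnorm(z_le)

# P(X >= 48): push the lower endpoint down by 0.5 before standardizing
z_ge <- (48 - 0.5 - mu) / sigma
approx_ge <- 1 - pnorm(z_ge)

# P(|X - 40| < 5) = P(36 <= X <= 44): correct both endpoints
approx_mid <- pnorm((44 + 0.5 - mu) / sigma) - pnorm((36 - 0.5 - mu) / sigma)
```

The key step is rewriting each event in terms of the integers it contains before adding or subtracting 0.5.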
(c) Type help(pbinom) in R to see the syntax for the R functions pbinom, qbinom, dbinom, and
rbinom. Use the appropriate R functions to obtain values for:
P(Y ≤ 8), P(Y ≥ 16), and P(|Y – 12| < 7)
Include the R statements that you used in your submitted answer.
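As a rough guide to the syntax, the sketch below computes three probabilities of the same shapes with pbinom. The parameters (n = 400, p = 0.1) and cut-offs are assumed purely for illustration and are not the values in this problem.

```r
# Hypothetical illustration values (NOT the assignment's): X ~ Binomial(400, 0.1)
n <- 400
p <- 0.1

# P(X <= 35): pbinom gives the cumulative probability directly
exact_le <- pbinom(35, size = n, prob = p)

# P(X >= 48) = 1 - P(X <= 47); lower.tail = FALSE returns P(X > 47)
exact_ge <- pbinom(47, size = n, prob = p, lower.tail = FALSE)

# P(|X - 40| < 5) = P(36 <= X <= 44) = P(X <= 44) - P(X <= 35)
exact_mid <- pbinom(44, size = n, prob = p) - pbinom(35, size = n, prob = p)
```

Because the Binomial is discrete, the upper-tail and interval probabilities need the cut-off shifted by one, as shown in the comments.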
(d) For each of the probabilities in (b) and (c) determine the percent relative error
100 × |p_approx − p_R| / p_R,
where p_approx is the approximate probability and p_R is the probability calculated using R. Explain
why each pair of values is in good agreement or not.
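A small helper for the percent relative error might look like the sketch below. It assumes the error is measured relative to the value calculated using R; the function name and the example inputs are hypothetical.

```r
# Percent relative error of an approximation against a reference value.
# Assumption: the reference (denominator) is the probability computed in R.
percent_relative_error <- function(approx, exact) {
  100 * abs(approx - exact) / exact
}

# Made-up inputs for illustration only: the results are about 5 and about 2 percent
errors <- percent_relative_error(c(0.105, 0.098), c(0.100, 0.100))
```

The function is vectorized, so all three pairs of probabilities can be passed in a single call.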
(e) Suppose the proportion of people with the genetic mutation is an unknown value equal to θ.
Suppose n people are selected at random where n is large. Approximate the probability:
P( nθ − 2.17√(nθ(1 − θ)) ≤ Y ≤ nθ + 2.17√(nθ(1 − θ)) )
You may ignore the continuity correction. You must show your work for full marks.
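The idea that an interval of the form mean ± c standard deviations standardizes to P(−c ≤ Z ≤ c) can be checked numerically. The sketch below uses assumed values throughout (n = 2000, θ = 0.3, multiplier c = 1.5), none of which come from this problem, and compares the Normal approximation with the exact Binomial probability of the same event.

```r
# Hypothetical illustration values (NOT the assignment's)
n <- 2000
theta <- 0.3
c_mult <- 1.5                            # multiplier on the standard deviation

mu <- n * theta                          # mean of Y ~ Binomial(n, theta)
sigma <- sqrt(n * theta * (1 - theta))   # standard deviation of Y

# Normal approximation, no continuity correction: P(-c <= Z <= c) = 2 * pnorm(c) - 1
approx <- 2 * pnorm(c_mult) - 1

# Exact Binomial probability of mu - c*sigma <= Y <= mu + c*sigma
lower <- ceiling(mu - c_mult * sigma)    # smallest integer inside the interval
upper <- floor(mu + c_mult * sigma)      # largest integer inside the interval
exact <- pbinom(upper, n, theta) - pbinom(lower - 1, n, theta)
```

Note that the approximation does not depend on n or θ once the interval is written in units of the standard deviation.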
Problem 3: Normal or Gaussian distribution
Suppose it is reasonable to assume that the heights in centimeters of second year female Math
students at the University of Waterloo have a G(160,9) = N(160, 81) distribution. Define the random
variable Y = height of a female Math student chosen at random.
(a) Use the Normal table in the Course Notes to determine P(Y ≥ 169).
You must show your work for full marks.
(b) Type help(pnorm) in R to see the syntax for the R functions pnorm, qnorm, dnorm, and rnorm. Use
the appropriate R function to obtain the value for P(Y ≥ 169).
Include the R statement that you used in your submitted answer.
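As a syntax illustration, the sketch below computes an upper-tail Normal probability for an assumed G(100, 15) distribution with cut-off 120; these numbers are made up and are not the ones in this problem. Note that pnorm expects the standard deviation, matching the second parameter of the G(μ, σ) notation, not the variance used in N(μ, σ²).

```r
# Hypothetical illustration values (NOT the assignment's): X ~ G(100, 15)
mu <- 100
sigma <- 15    # standard deviation, not variance

# P(X >= 120) using the upper tail directly
upper_tail <- pnorm(120, mean = mu, sd = sigma, lower.tail = FALSE)

# Equivalent: standardize first, then use the standard Normal cdf
z <- (120 - mu) / sigma
upper_tail_std <- 1 - pnorm(z)
```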
(c) Find the percent relative error 100 × |p_table − p_R| / p_R, where p_table is the probability
determined in (a) using the Normal table and p_R is the probability determined in (b) using R.
Explain why the answers are in good agreement or not.
(d) Determine a such that P(Y ≥ a) = 0.83 using the inverse Normal cumulative distribution table in the
Course Notes.
You must show your work for full marks.
(e) Use the appropriate R function to obtain the value for a such that P(Y ≥ a) = 0.83.
Include the R statement that you used in your submitted answer.
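Since qnorm inverts the lower-tail probability by default, an upper-tail condition has to be converted first. The sketch below shows two equivalent ways to do this for an assumed G(100, 15) distribution and an assumed upper-tail probability of 0.70; neither value comes from this problem.

```r
# Hypothetical illustration values (NOT the assignment's): X ~ G(100, 15)
mu <- 100
sigma <- 15
upper_prob <- 0.70   # want a such that P(X >= a) = 0.70

# Option 1: P(X >= a) = 0.70 is the same as P(X <= a) = 0.30
a1 <- qnorm(1 - upper_prob, mean = mu, sd = sigma)

# Option 2: ask qnorm for the upper-tail quantile directly
a2 <- qnorm(upper_prob, mean = mu, sd = sigma, lower.tail = FALSE)
```

Plugging the result back into pnorm with lower.tail = FALSE is a quick way to confirm that the quantile was taken from the correct tail.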
(f) Are the answers in (d) and (e) in good agreement or not?
(g) Suppose 64 female Math students are chosen at random. Determine the probability that their
average height lies between 159 and 162. Use R to find the probability, not the Normal table in the
Course Notes.
You must show your work for full marks.
Include the R statement that you used in your submitted answer.
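The average of n independent G(μ, σ) observations has a G(μ, σ/√n) distribution, so an interval probability for the average is a difference of two pnorm values. The sketch below illustrates this with assumed values (n = 25 draws from G(100, 15), interval 98 to 103) that are not the ones in this problem.

```r
# Hypothetical illustration values (NOT the assignment's)
mu <- 100
sigma <- 15
n <- 25

# Standard deviation of the sample mean: sigma / sqrt(n)
se <- sigma / sqrt(n)

# P(98 <= sample mean <= 103) as a difference of two cumulative probabilities
prob_between <- pnorm(103, mean = mu, sd = se) - pnorm(98, mean = mu, sd = se)
```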