MSIN0010 Data Analytics I
Homework 2
1. An executive placement service needs to estimate the average salary of executives in
the tech industry. A survey is sent out to 100 randomly sampled executives. The
sample returned an average of x̄ = $225,000 and a sample standard deviation of s = $20,000.
a. (4 points) In words, describe the differences between the population,
sample, parameter, and statistic in the context of this problem.
b. (6 points) Construct a 95% confidence interval for the true average salary of
executives in the tech industry.
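The interval in part (b) can be sketched in R from the summary statistics alone. This is a minimal sketch, not a model answer: it assumes a t-based interval with n - 1 degrees of freedom (a z-based interval would use qnorm(0.975) instead), and the object names n, xbar, s, se, t_crit, and ci are illustrative choices rather than anything prescribed by the assignment.

```r
# Summary statistics as stated in Question 1
n    <- 100      # number of sampled executives
xbar <- 225000   # sample mean salary
s    <- 20000    # sample standard deviation

# Standard error of the sample mean
se <- s / sqrt(n)

# Assumption: t critical value with n - 1 degrees of freedom
t_crit <- qt(0.975, df = n - 1)

# 95% confidence interval: xbar plus/minus critical value times standard error
ci <- c(lower = xbar - t_crit * se, upper = xbar + t_crit * se)
print(ci)
```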
2. (5 points) For a sample of size n, suppose that the margin of error is 10. How much
more data would you have to collect to achieve a margin of error equal to 5? Be sure
to show your work.
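One way to check the reasoning numerically is to look at how the margin of error changes with the sample size. The sketch below assumes a z-based margin of error, z * s / sqrt(n), with the standard deviation held fixed; the values s = 50 and n = 100 are hypothetical and chosen only for illustration, and moe is an illustrative helper name.

```r
# Hypothetical values, not given in the problem
s <- 50              # assumed (fixed) standard deviation
z <- qnorm(0.975)    # 95% critical value, assuming a z-based interval

# Margin of error as a function of the sample size
moe <- function(n) z * s / sqrt(n)

n <- 100
# Ratio of margins of error when the sample size is multiplied by k
ratios <- sapply(c(1, 2, 4), function(k) moe(k * n) / moe(n))
print(ratios)
```

Because the margin of error is proportional to 1 / sqrt(n), the ratio depends only on the multiplier k and not on the particular values of s or n.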
3. Conduct the following experiment in R.
a. (5 points) First, create a vector of length 100 where each element is equal to
zero. Store this vector in an object called check. Next, write a “for loop”
where the index i goes from 1:100. Inside the “for loop”, include the
following code:
x = [generate 1000 standard normal random variables]
lower = [compute the lower bound of a 95% confidence interval for the population mean, centered at x̄]
upper = [compute the upper bound of a 95% confidence interval for the population mean, centered at x̄]
check[i] = [check if lower < 0 and upper > 0 (TRUE or FALSE)]
Note that you should replace [text] with R code to do what is written inside
the brackets. Copy and paste your completed R code below (there should be
roughly 6-10 lines in total, including the four lines above).
b. (2 points) Based on your experiment, how often did the true value of the
population mean fall inside your confidence interval (between the lower and
upper bounds)?
c. (3 points) Comment on whether your answer to 3b) seems reasonable based
on what you know about the theory of confidence intervals.
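The experiment in Question 3 might be sketched as follows. This is one possible completion under stated assumptions, not the expected answer: it uses a t-based interval, fixes a seed with set.seed(1) purely for reproducibility, and introduces the helper names se, t_crit, and coverage, none of which appear in the assignment.

```r
set.seed(1)  # assumption: fixed seed so the run is reproducible

# Vector of 100 zeros that will record whether each interval covers 0
check <- rep(0, 100)

for (i in 1:100) {
  x <- rnorm(1000)                          # 1000 standard normal draws
  se <- sd(x) / sqrt(length(x))             # standard error of the mean
  t_crit <- qt(0.975, df = length(x) - 1)   # assumption: t critical value
  lower <- mean(x) - t_crit * se            # lower bound of the 95% interval
  upper <- mean(x) + t_crit * se            # upper bound of the 95% interval
  check[i] <- (lower < 0 & upper > 0)       # TRUE is stored as 1, FALSE as 0
}

# Proportion of intervals that contain the true mean of 0
coverage <- mean(check)
print(coverage)
```

The true population mean is 0 because the draws come from a standard normal distribution, so coverage is the share of the 100 intervals that captured it.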
4. (5 points) According to a financial planner, individuals should save 10% of their
income over their working life to secure a comfortable retirement. An agency wants
to test whether this actually happens with people in the UK, suspecting the overall
savings rate may be lower. A random sample of 40 individuals revealed an average
savings rate of 8% and a sample standard deviation of 3%. Carry out a hypothesis
test to determine if the savings rate is actually lower than 10%. Use a 1% significance
level and be sure to follow the five steps of hypothesis testing.
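The computational steps of the test in Question 4 can be sketched from the summary statistics. The sketch assumes a one-sample, lower-tailed t test with n - 1 degrees of freedom, with rates expressed in percentage points; the names mu0, t_stat, p_value, and reject are illustrative. It covers only the arithmetic, not the written hypotheses or the conclusion in context.

```r
# Summary statistics as stated in Question 4 (in percentage points)
n     <- 40    # sample size
xbar  <- 8     # sample mean savings rate
s     <- 3     # sample standard deviation
mu0   <- 10    # savings rate under the null hypothesis
alpha <- 0.01  # significance level

# Test statistic for H0: mu = mu0 against Ha: mu < mu0
t_stat <- (xbar - mu0) / (s / sqrt(n))

# Lower-tail p-value, assuming a t distribution with n - 1 degrees of freedom
p_value <- pt(t_stat, df = n - 1)

# Decision rule: reject H0 when the p-value is below alpha
reject <- p_value < alpha
print(c(t_stat = t_stat, p_value = p_value))
print(reject)
```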
5. In this question you will use the Auto data set from the ISLR package in R to
determine if there is a difference in average MPG between Ford and Dodge cars.
a. (3 points) First, create a new variable called brand which is equal to the first
word of each entry in the variable called name. (Note: there are many ways to
do this!) Show your R code below.
b. (2 points) Does it seem like one brand has a higher MPG than the other?
Provide evidence with a graph.
c. (5 points) Follow the five steps of hypothesis testing to formally test for a
difference in the average MPG between Ford and Dodge cars. Use a 1%
significance level. Show your R code below.
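A rough sketch of the workflow for Question 5 follows. It assumes that the ISLR package is installed, that entries of Auto$name have the form "brand model" in lower case (for example "ford torino"), and that a Welch two-sample t test is acceptable; first_word, example_names, auto, and fd are hypothetical names introduced here. The string step is shown on a small made-up vector first so it can be checked without the package.

```r
# Hypothetical example strings in the assumed "brand model" format
example_names <- c("ford torino", "dodge challenger se", "chevrolet chevelle malibu")

# Keep everything before the first space (the first word)
first_word <- function(x) sub(" .*$", "", as.character(x))
print(first_word(example_names))

# The remainder runs only if the ISLR package is available
if (requireNamespace("ISLR", quietly = TRUE)) {
  auto <- ISLR::Auto
  auto$brand <- first_word(auto$name)   # part (a): new brand variable

  # Keep only the two brands being compared
  fd <- subset(auto, brand %in% c("ford", "dodge"))

  # Part (b): side-by-side boxplots of MPG by brand
  boxplot(mpg ~ brand, data = fd, ylab = "MPG")

  # Part (c): two-sided Welch t test (R's default for two samples)
  print(t.test(mpg ~ brand, data = fd, conf.level = 0.99))
}
```

With a 1% significance level, the p-value reported by t.test would be compared against 0.01; the remaining steps of the hypothesis test still need to be written out.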