QBUS2810 Statistical Modelling for Business
Individual Assignment Task 1
This individual assignment will contribute 20% towards your final result in
the unit. The deadline is Friday March 31st, 2023 by 5:00pm (Sydney time).
Submission is via Turnitin on Canvas.
Key requirements:
• Complete your entire assignment in Jupyter Notebook, including your code and
markdown sections for your written answers. Use LaTeX in markdown sections
where needed.
• Submit the resulting downloaded HTML file as your entire assignment. Care must
be taken with presentation in this file; however, unavoidable error messages and
page formatting issues will be ignored in marking.
• Only relevant analysis outputs (graphs, tables, etc.) should appear in the
assignment file, and all output should appear together with the discussion of that
output.
TASK 1
Business problem:
This assignment is about the predictive relationship between house price and living
area size in square meters. You will also assess whether the house price is affected by
having a swimming pool (0: no swimming pool (NSP), 1: swimming pool (SP)).
Specifically, you will assess whether investing in the construction of a swimming pool
increases the profitability of the sale.
Data:
The data file for the analysis is “house data.csv”.
Questions:
1. Conduct an appropriate exploratory analysis of the house prices, for all houses.
Discuss any cleaning of the data you did, including why and how you did it, or why
you didn’t do it. Explore the distribution of house prices in the two subgroups with and
without a swimming pool. (4 marks)
2. Conduct (with $\alpha = 0.05$) the appropriate t-test, median and Mann-Whitney tests,
to assess whether house prices are typically higher for houses with a swimming pool
(the alternative of unequal prices is fine for the median test). Assess all assumptions
made. (9 marks)
3. Which test’s result from Question 2 do you believe the most? Discuss and explain. (2
marks)
4. Conduct an appropriate exploratory analysis to assess whether there may be a linear
relationship between houses’ prices and living area size in square meters. (3 marks)
5. Conduct a simple linear regression analysis, first using OLS and then using LAD
estimation, for houses’ prices on living area size in square meters. Fully assess all
assumptions. (10 marks)
6. Write a brief (e.g. 0.5 page) report summarising and discussing your findings and
conclusions. Include a discussion of whether, based on your findings, you would
recommend adding a swimming pool to a property with a view to increasing its sale
value. (4 marks)
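The tests in Question 2 and the two estimators in Question 5 can be sketched in Python as follows. This is a minimal illustration on synthetic data, not a model answer: the variable names `price`, `area` and `pool`, the simulated values, and the helper `lad_fit` are all hypothetical, and the real column names in the data file may differ. The choice of a Welch (unequal-variance) t-test is an assumption here; which t-test is appropriate depends on your own assessment of the data. LAD is written as a linear program solved with `scipy.optimize.linprog`, which is one of several ways to obtain it.

```python
import numpy as np
from scipy import stats
from scipy.optimize import linprog

rng = np.random.default_rng(0)

# Synthetic stand-in for the real data (all names and values are hypothetical).
n = 200
area = rng.uniform(50, 300, size=n)    # living area in square meters
pool = rng.integers(0, 2, size=n)      # 0 = NSP, 1 = SP
noise = rng.standard_t(df=3, size=n) * 50_000
price = 100_000 + 3_000 * area + 40_000 * pool + noise

sp, nsp = price[pool == 1], price[pool == 0]

# One-sided Welch t-test (assumed choice): is the mean price higher with a pool?
t_res = stats.ttest_ind(sp, nsp, equal_var=False, alternative="greater")

# One-sided Mann-Whitney test.
mw_res = stats.mannwhitneyu(sp, nsp, alternative="greater")

# Mood's median test (two-sided); index 1 of the result is the p-value.
med_p = stats.median_test(sp, nsp)[1]

# OLS fit of price on area.
ols = stats.linregress(area, price)


def lad_fit(x, y):
    """LAD line via a linear program: minimise sum(u + v) subject to
    b0 + b1 * x_i + u_i - v_i = y_i with u_i, v_i >= 0."""
    m = len(x)
    cost = np.concatenate([np.zeros(2), np.ones(2 * m)])
    a_eq = np.hstack([np.ones((m, 1)), x.reshape(-1, 1), np.eye(m), -np.eye(m)])
    bounds = [(None, None)] * 2 + [(0, None)] * (2 * m)
    res = linprog(cost, A_eq=a_eq, b_eq=y, bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[0], res.x[1]


lad_b0, lad_b1 = lad_fit(area, price)

print(f"Welch t p-value:      {t_res.pvalue:.4f}")
print(f"Mann-Whitney p-value: {mw_res.pvalue:.4f}")
print(f"Median test p-value:  {med_p:.4f}")
print(f"OLS: intercept={ols.intercept:.1f}, slope={ols.slope:.1f}")
print(f"LAD: intercept={lad_b0:.1f}, slope={lad_b1:.1f}")
```

With the real data, the synthetic arrays would be replaced by the corresponding columns of the loaded file, and each test would still need its assumptions checked as the questions require.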
TASK 2
Consider the population SLR model:
$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$$
and an observed, random sample of data $(y_1, x_1), \ldots, (y_n, x_n)$ from that model. Suppose
you ran an OLS regression on this data but made a mistake and did not include the
intercept in the regression.
Questions:
1. Is the mean of the estimated residuals from your OLS regression still equal to 0, i.e.
$\bar{e} = 0$? How does your answer relate to LSA 2? (2 marks)
2. Is the correlation between the estimated residuals and the observed x’s still equal
to 0? (5 marks)
Hint: look at the second equation found when differentiating the residual sum of squares
with respect to $\beta_1$.
3. What form does the OLS estimator of the slope coefficient in your regression have?
(2 marks)
Hint: derive the solution from the second equation found when differentiating the residual
sum of squares with respect to $\beta_1$.
4. Show whether or not the OLS estimator of your regression is still an unbiased
estimator of the true $\beta_1$. (3 marks)
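A small simulation can serve as a sanity check on the algebra in Task 2. The sketch below draws one sample from the SLR model, fits least squares without an intercept, and computes the quantities the questions refer to. The parameter values, sample size and seed are hypothetical choices, and a single simulated sample only illustrates a result; it does not replace the derivations the questions ask for.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical population parameters and sample size.
n = 500
beta0, beta1 = 2.0, 0.5
x = rng.uniform(1, 10, size=n)
y = beta0 + beta1 * x + rng.normal(0, 1, size=n)

# Least squares with no intercept: the design matrix is the x column alone.
coef, *_ = np.linalg.lstsq(x.reshape(-1, 1), y, rcond=None)
b1_hat = coef[0]
resid = y - b1_hat * x

# Quantities to compare against your analytical answers.
mean_resid = resid.mean()
sum_x_resid = np.sum(x * resid)
corr_x_resid = np.corrcoef(x, resid)[0, 1]

print(f"slope estimate:    {b1_hat:.4f}")
print(f"mean of residuals: {mean_resid:.4f}")
print(f"sum of x * resid:  {sum_x_resid:.2e}")
print(f"corr(x, resid):    {corr_x_resid:.4f}")
```

Repeating the simulation over many seeds and averaging `b1_hat` gives a numerical impression of bias to set beside the proof in Question 4.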