ECMT1020 Introduction to Econometrics
Group Assignment
Academic Dishonesty and Plagiarism
Academic honesty is a core value of the University, and all students are required to act honestly,
ethically and with integrity. The consequences of engaging in plagiarism and academic dishonesty,
along with the process by which they are determined and applied, are set out in the Academic
Honesty in Coursework Policy 2015. Under the same policy, as the unit coordinator, I must report
any suspected plagiarism or academic dishonesty.
Instructions
1. This group assignment accounts for 15% of your final grade. You can sign up for
a group yourselves to complete this assignment. The maximum group size is 2. The marking
of the group assignment is based on the final submission of the group, and group
members will receive the same mark for this assignment.
2. There are 12 questions in this assignment and the full mark of the assignment is 55.
The breakdown marks are indicated in the questions. Please attempt all questions.
3. This group assignment entails the use of econometric models and statistical tools in
an economic application. You will use statistical software to analyze a cross-sectional
data set containing annual household expenditure, by expenditure category, reported
in 2013.
4. The dataset your group will use is in the Excel spreadsheet CES#.xlsx, where #
is the last digit of the sum of the last digits of group members’ University of Sydney
SIDs. For example, suppose student A and student B form a group. If the last digit of
student A’s SID is 3 and the last digit of student B’s SID is 8, then 3 + 8 = 11 and the
last digit of 11 is 1. So this group of student A and student B should use data set CES1.
5. Please use your assigned data set to answer the questions and write your data set
number and the SIDs of group members on the front page of your work. Using the
wrong data set will be reviewed as a potential case of Academic Dishonesty.
6. In your submitted work, please round all numerical answers to 2 decimal places if
necessary. When you are asked to “perform a test”, you should write down explicitly
the null hypothesis of the test, and state clearly how you make testing decisions and
conclusions. Please carry out all tests using a 5% level of significance.
7. If you are asked to make a plot, please make sure you have a proper title, x-axis
label and y-axis label on your figures.
8. You should include Stata procedures and outputs[1] in your answers, and your own
interpretations and explanations are necessary for earning marks. Please type your
answer in a document. We do not accept handwritten solutions.
[1] You do not need to submit a separate Stata do-file.
9. When answering the questions, please keep your statements concise as well as
accurate. Excessively long responses indicate a lack of understanding and will be
penalized accordingly.
10. Please submit a pdf file[2] named CES# SID1 SID2.pdf where # is your assigned data
set number, and SID1 and SID2 are the 9-digit SIDs of the group members. Do not put
your names in your submission. Do not include a cover sheet.
11. Submit one pdf file through Turnitin under the Canvas module “Assignment”. Late
submission is subject to a penalty of 5% of the total 55 marks, which is 2.75 marks,
per calendar day. Work submitted more than 10 calendar days after the due date
will receive a mark of zero. These penalties are in accordance with 7A in the University
Assessment Procedures 2011.
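The arithmetic in instructions 4 and 11 can be checked with a short script. The following is only an illustrative sketch in Python, not part of the required Stata work. The function names and the example SIDs are made up for illustration. It assumes lateness is counted in whole calendar days, and that a mark cannot fall below zero.

```python
# Illustrative sketch only: helper names and example SIDs are hypothetical.

TOTAL_MARKS = 55        # full mark of the assignment (instruction 2)
PENALTY_PERCENT = 5     # late penalty per calendar day, as % of the total marks
MAX_LATE_DAYS = 10      # work later than this receives a mark of zero


def dataset_number(sids):
    """Last digit of the sum of the last digits of the group members' SIDs."""
    return sum(int(str(sid)[-1]) for sid in sids) % 10


def mark_after_late_penalty(raw_mark, days_late):
    """Apply the late penalty, assuming days_late is a whole number of calendar days."""
    if days_late > MAX_LATE_DAYS:
        return 0.0
    penalty = TOTAL_MARKS * PENALTY_PERCENT * days_late / 100  # 2.75 marks per day
    # Assumption: the penalized mark is floored at zero.
    return max(raw_mark - penalty, 0.0)


if __name__ == "__main__":
    # Made-up SIDs ending in 3 and 8, matching the example in instruction 4.
    group = ["500000003", "500000008"]
    print(f"Use data set CES{dataset_number(group)}")
    print(mark_after_late_penalty(50, 2))
```

With SIDs ending in 3 and 8 the helper returns 1, which agrees with the CES1 example in instruction 4.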
Data Description
Your assigned data set is a subset of the Consumer Expenditure Survey (CES2013) data
set. The description of the data set and contained variables can be found in Appendix B
on pp. 570–572 of the textbook (also provided in a separate pdf file).
In the data set, there are 23 variables of categorical household expenditure, such as FDHO,
indicating food and nonalcoholic beverages consumed at home, and HOUS, indicating
housing expenditure.
The category of household expenditure you will be interested in for answering the following
questions (variable Y in the questions) depends on which data set you use. Find your
variable Y from the following table.
Data set    Expenditure variable (Y)
CES0        DOM
CES1        EDUC
CES2        ELEC
CES3        FURN
CES4        GASO
CES5        HEAL
CES6        HOUS
CES7        LIFE
CES8        READ
CES9        TOB
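The table can also be written as a simple lookup. This is a minimal sketch in Python, given only as a convenience check. The names Y_BY_DATASET and expenditure_variable are hypothetical and do not appear in the assignment.

```python
# Illustrative sketch only: the dictionary mirrors the table above.
Y_BY_DATASET = {
    0: "DOM",
    1: "EDUC",
    2: "ELEC",
    3: "FURN",
    4: "GASO",
    5: "HEAL",
    6: "HOUS",
    7: "LIFE",
    8: "READ",
    9: "TOB",
}


def expenditure_variable(dataset_digit):
    """Return the expenditure variable Y for data set CES<dataset_digit>."""
    return Y_BY_DATASET[dataset_digit]


if __name__ == "__main__":
    print(expenditure_variable(8))
```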
Questions
In the following questions, the variable Y is an expenditure variable that you will focus
on in the analysis. Please first determine your variable Y based on your assigned data
set and the above table, and then replace the variable Y in all the questions by the
corresponding expenditure variable name.[3]
[2] You may type your answers in a Word document and then save it as a pdf file.
[3] For example, if you are assigned data set CES8 and your expenditure variable is READ, then you can
replace Y in all the questions by READ.
1. (5pt) In your data set, the variable SIZE is the number of persons in the household.
Make a scatter plot of Y (on the y-axis) and SIZE (on the x-axis), and fit a simple
linear regression of Y on SIZE. Please write down the fitted regression and carefully
interpret the regression intercept and slope.
2. (5pt) Is the slope coefficient in your fitted model in Question 1 significantly different
from zero at the 5% level? Please explain how you could perform a hypothesis test to
draw your conclusion. Be explicit about your null hypothesis and explain in detail
how you would make your testing decision by reading either
(i) the test statistic, or
(ii) the p-value of the test, or
(iii) the confidence interval
in your regression output.
3. (5pt) Please explain why methods (i), (ii) and (iii) are equivalent for making your
testing decision in Question 2. In your explanation, you should be explicit about
how the p-value of the test is defined and how the confidence interval is constructed.
4. (5pt) Define a new variable LGY as the log transformation of Y. Make a scatter plot
of LGY (on the y-axis) and SIZE (on the x-axis), and fit a simple linear regression
of LGY on SIZE. Please write down the fitted regression and carefully interpret the
regression coefficients. Please also explain why the interpretation of the regression
coefficients here is different from that in Question 1.