MTH161 Linear Regression Analysis
For this project, you will need to gather data in order to make inferences about the relationship between
variables. After you gather your data, you will describe the variables using appropriate graphs and statistics.
Then, you will use the following methods to perform statistical operations on the data you have collected.
These methods may include:
Inferences about Rho (ρ)
Linear Regression Analysis
Inferences about the slope of the least squares regression equation
Confidence and Prediction intervals
Regression Analysis with addition of dichotomous (dummy) variable
Step 1: Topic Assignment and Data Collection
You may choose a topic that involves collecting data from a website. You will need to collect at least 36
groups of data for this project. You will use a sampling method from chapter 1 of module 1 to gather your
data. Note: within your project, you will have to explain your sampling method and reference any websites you
used to gather the data. Data should be as RANDOM as possible. Do not just choose the first 36 groups you see.
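One way to avoid taking the first 36 rows is a simple random sample from a numbered listing. The sketch below is only an illustration: the frame of 500 rows, the seed, and the function name `draw_simple_random_sample` are assumptions, not part of the assignment, and any sampling method from the course materials may be used instead.

```python
import random

def draw_simple_random_sample(population, n=36, seed=None):
    """Return n distinct items chosen uniformly at random from population."""
    rng = random.Random(seed)          # a fixed seed makes the draw reproducible
    return rng.sample(list(population), n)

# Hypothetical sampling frame: row numbers 1..500 of a website's listing.
frame = range(1, 501)
sample_rows = draw_simple_random_sample(frame, n=36, seed=161)
print(sorted(sample_rows))             # the rows to look up and record
```

Recording the seed and the size of the frame makes it easy to explain the sampling method in the introduction.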
Please choose a project topic from the following grid. (You may also choose an alternative
project topic, but you must email me for approval first!)
Population | Predictor Variable | Response Variable | Dichotomous Variable
ATP Tennis Players | Points (2023) | Earnings (2023) | Tour (Male vs. Female)
MLB Hitters | At Bats (2023 season) | Hits (2023 season) | Position (Infield vs. Outfield)
NBA Players | Minutes Played (current season) | Points (current season) | Position (Guard vs. Forward)
Desserts | Calories | Fat Grams | Type (Candy vs. Not Candy)
2023 Movies | Number of Theatres | Gross (in thousands) | Genre (Drama vs. Comedy)
Vehicles | Weight | Miles per Gallon | Type (Truck vs. Car)
Humans | Age (must include individuals in all decades of life from teens to 70s) | Hours of Average Nightly Sleep | Sex (Male vs. Female)
Your Choice | You Pick! | You Pick! | MUST BE APPROVED FIRST!
Step 2: Data Collection
You must have a minimum of 36 groups of data. The data must be listed in an appendix at the end of your
project. Data should be listed similarly to the following example:
Dog       Age (years)   Weight (pounds)   Sex
Dobby     4.2           62.4              F
Luca      7.5           75.5              M
Samson    2.1           66.3              M
Note: data sources/websites must be listed in this section. Without them, I will assume you made up the data,
and you will fail this project!
Step 3: The project itself
The project itself consists of the following parts:
A. Introduction – Discuss your project topic, your sampling methodology and the tests you will
perform. Be thorough – and write it as if we have never discussed the project. Also, explain any
preconceived ideas you had about your topic.
B. Describe the data
• Include the graph(s) of your choice to describe the distribution of data for each numeric
variable. (You can include histograms, boxplots, etc.). You must include at least one graph for
each numeric variable. You must include at least one graph for the dichotomous variable (i.e. pie
chart or bar chart).
• Include a 95% confidence interval for both mean and standard deviation for EACH numeric
variable. Interpret the results accordingly.
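For reference, the two intervals requested above usually take the following form. This is a sketch under the assumption that each numeric variable is roughly normally distributed. The notation is an assumption of this sketch, so check it against the course materials: $n$ is the sample size, $\bar{x}$ the sample mean, $s$ the sample standard deviation, and $t_{p,\,n-1}$ and $\chi^2_{p,\,n-1}$ are critical values with area $p$ to their right. For 95% confidence, $\alpha = 0.05$.

```latex
% Confidence interval for the population mean
\bar{x} - t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}
  \;<\; \mu \;<\;
\bar{x} + t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}

% Confidence interval for the population standard deviation
\sqrt{\frac{(n-1)\,s^{2}}{\chi^{2}_{\alpha/2,\,n-1}}}
  \;<\; \sigma \;<\;
\sqrt{\frac{(n-1)\,s^{2}}{\chi^{2}_{1-\alpha/2,\,n-1}}}
```

The interval for the standard deviation is not symmetric about $s$, because the chi-square distribution is skewed.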
C. Regression Analyses
• Create a scatterplot and discuss the visual relationship between the two numeric variables.
• Conduct a hypothesis test about Pearson correlation. Discuss the implication of the results –
and proceed with the remaining tasks regardless of outcome.
• Find the regression equation for the two numeric variables. (We will include the dichotomous
variable in the last bullet point!) Interpret slope and y-intercept.
• Test for usefulness and create a 99% confidence interval for slope. Discuss/interpret the
meaning behind the slope and its confidence interval.
• Choose a value for the predictor variable within the sample domain and create 98%
Confidence and Prediction intervals for the response variable. Discuss your results.
• Find the regression equation that includes all three variables. How does the equation change
with the addition of the dichotomous variable? Choose a value for the predictor variable within
the sample domain and discuss how the prediction changes for each value of the dichotomous
variable.
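To make the quantities in part C concrete, here is a minimal sketch of what sits behind the correlation test and the simple regression output. It is for understanding only: all calculations submitted with the project must still come from StatCrunch. The function name `simple_regression` and the five data points are hypothetical, and the sketch assumes the data are not perfectly linear (otherwise the test statistic is undefined).

```python
import math

def simple_regression(x, y):
    """Least squares fit of y on x, plus the t statistics for rho and slope."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y need the same length, with at least 3 pairs")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    r = sxy / math.sqrt(sxx * syy)             # Pearson correlation
    slope = sxy / sxx                          # least squares slope
    intercept = y_bar - slope * x_bar          # least squares y-intercept

    # Residual standard error with n - 2 degrees of freedom
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s_e = math.sqrt(sse / (n - 2))
    se_slope = s_e / math.sqrt(sxx)

    return {
        "r": r,
        "slope": slope,
        "intercept": intercept,
        "se_slope": se_slope,
        "t_rho": r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2),  # H0: rho = 0
        "t_slope": slope / se_slope,                            # H0: slope = 0
        "df": n - 2,
    }

# Hypothetical data, not from any of the project topics.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
fit = simple_regression(x, y)
for name, value in fit.items():
    print(name, value)
```

With one predictor, the t statistic for the correlation test and the t statistic for the slope are the same number, which is why the two tests always agree. The confidence interval for the slope is the slope plus or minus a t critical value (with `df` degrees of freedom) times `se_slope`; the critical value is left to StatCrunch here.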
D. Conclusion – Briefly summarize your results. Discuss any shortcomings of the methods you used to
gather data. Did you discover anything surprising? Do you think your results would have been
different with a larger sample size? If you had to do the project again, what would you do differently?
Note 1: StatCrunch MUST be used for ALL statistical calculations, with output listed either neatly within the
project or as part of the appendix. Manual calculations or any other software output will not be permitted.
Projects without data and/or StatCrunch output will receive an automatic zero.
Note 2: Neatness/organization counts for 10% of this project. Please make it look like you care!