ETM5900
Assignment 1
Question 1 [Total 23 Marks]
A group of researchers are interested in studying the prevalence of obesity, diabetes, and other
cardiovascular risk factors in Subang Jaya, Selangor. To gain more insight into this question,
1150 subjects were interviewed and some of the results obtained are compiled in the data file
A1 S2 2023.xls. The columns provide the following information:
Column A: the patient ID
Column B: the level of stabilised glucose
Column C: the total level of cholesterol
Column D: the level of high-density-lipoprotein (“good” cholesterol)
Column E: the weight of the patient
Column F: the gender of the patient
Column G: the type of body frame (small, medium, large)
The data is available in the “A1 S2 2023.xls” file on Moodle. You must use your subsample
of the survey data. Your sample will consist of 200 observations starting from the respondent
whose ID is the same as the last three digits of your student number. For example, if your
student number is 20275749, you would use individuals 749 to 948.
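
The subsample rule above can be sketched in Python as a quick check of the ID range. This is a minimal sketch: the function name `subsample_id_range` is hypothetical, and it does not handle a range that would run past the last respondent, since the assignment does not say what to do in that case.

```python
def subsample_id_range(student_number: str, size: int = 200) -> tuple[int, int]:
    """Return (first_id, last_id) of the subsample, both inclusive."""
    first_id = int(student_number[-3:])  # last three digits, e.g. "749" -> 749
    last_id = first_id + size - 1        # 200 observations means first_id + 199
    return first_id, last_id


first_id, last_id = subsample_id_range("20275749")
print(first_id, last_id)  # 749 948
```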
All tables, graphs and comments for this question should be placed in the designated spaces in
the Worksheet Results.
(a) Complete Table (a). Use Countif or another method to find the frequencies for the
number of male and female patients in the sample and hence complete Table (a).
[2 marks]
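
For readers who want to cross-check a Countif result outside the spreadsheet, the same category counting can be sketched in Python. The gender values below are made up for illustration and are not taken from the data file.

```python
from collections import Counter

# Hypothetical values for the gender column (Column F); the real values
# come from your own 200-observation subsample.
gender = ["female", "male", "female", "female", "male"]

# Counter tallies each distinct label, which is the same idea as running
# Countif once per category.
counts = Counter(gender)
total = sum(counts.values())

for category, frequency in sorted(counts.items()):
    print(category, frequency)
print("total", total)
```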
(b) Display the data in Table (a) using an appropriate chart to be placed in the Graph (b)
Textbox. [2 marks]
(c) Using Countif or any other appropriate method, complete Table (c) by filling in the
frequencies of male and female patients according to their type of body frame.
[2 marks]
(d) Display the data in Table (c) using an appropriate chart to be placed in the Graph (d)
Textbox. [2 marks]
(e) Complete Table (e) containing the summary statistics for the HDL (high-density-
lipoprotein or “good” cholesterol) variable according to the patient gender.
[2 marks]
(f) Complete the grouped frequency Table for the HDL (“good” cholesterol) for female and
male patients [Table (f)]. Find the frequency and hence calculate the percentage
frequency and cumulative percentage frequency for female and male patients. [2 marks]
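
The calculation in part (f) can be sketched as follows. The HDL readings and class boundaries are hypothetical, and the sketch assumes each class includes its lower boundary and excludes its upper boundary; use whichever convention Table (f) specifies.

```python
# Hypothetical HDL readings for one gender group, for illustration only.
hdl = [32, 41, 45, 47, 52, 55, 58, 63, 71, 88]

# Assumed class boundaries: 30 to under 40, 40 to under 50, and so on.
edges = [30, 40, 50, 60, 70, 80, 90]

n = len(hdl)
cumulative = 0.0
rows = []
for lower, upper in zip(edges[:-1], edges[1:]):
    frequency = sum(lower <= x < upper for x in hdl)  # count values in the class
    percent = 100 * frequency / n                     # percentage frequency
    cumulative += percent                             # cumulative percentage frequency
    rows.append((lower, upper, frequency, percent, cumulative))

for lower, upper, frequency, percent, cum in rows:
    print(f"{lower} to <{upper}: f={frequency}, {percent:.1f}%, cumulative {cum:.1f}%")
```

The cumulative percentage in the last class should reach 100%, which is a useful check that no observation fell outside the chosen classes.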
(g) Is the level of “good” cholesterol (HDL) different for the two groups? Use figures from
Table (e) to help you explain any differences. [3 marks]
(h) Construct percentage frequency polygons for the HDL for female and male patients as
one chart, Graph (h). [3 marks]
(i) Discuss the shape of the percentage frequency polygons for the HDL levels for female
and male patients. [3 marks]
(j) List the four measures of variability from the summary statistics. Which group's HDL
(female or male patients) shows more variability? You are required to use your
sample results to answer this question. [4 marks]
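
A comparison of variability between two groups can be sketched as below. The assignment does not name the four measures here, so this sketch assumes a commonly used set (range, sample variance, sample standard deviation, and coefficient of variation); the measures in your summary statistics table may differ. The data values are hypothetical.

```python
import statistics

# Hypothetical HDL samples, for illustration only.
hdl_female = [38, 45, 52, 60, 75]
hdl_male = [40, 44, 48, 50, 58]


def variability(values):
    """Return an assumed set of four variability measures for a sample."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)                 # sample standard deviation (n - 1)
    return {
        "range": max(values) - min(values),
        "variance": statistics.variance(values),  # sample variance (n - 1)
        "std_dev": sd,
        "cv_percent": 100 * sd / mean,            # spread relative to the mean
    }


for label, values in [("female", hdl_female), ("male", hdl_male)]:
    print(label, variability(values))
```

The coefficient of variation is often preferred when the two groups have different means, because it expresses the spread relative to each group's mean.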
Question 2 [Total 13 Marks]
a. Based on your sample, construct a contingency table between the gender of the
patient and the type of body frame. [1.5 marks]
b. Which patients form the majority, and what is the corresponding probability? [1.5 marks]
c. What is the probability that a randomly selected patient has a medium body frame?
[2 marks]
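
The contingency table and a marginal probability of the kind asked for above can be sketched in Python. The (gender, body frame) pairs are hypothetical and only illustrate the arithmetic: a marginal probability is the column total divided by the sample size.

```python
from collections import Counter

# Hypothetical (gender, body frame) pairs, for illustration only.
patients = [
    ("female", "small"), ("female", "medium"), ("female", "medium"),
    ("female", "large"), ("male", "medium"), ("male", "large"),
    ("male", "small"), ("female", "medium"),
]

cells = Counter(patients)  # joint frequency of each (gender, frame) pair
n = len(patients)
genders = sorted({g for g, _ in patients})
frames = ["small", "medium", "large"]

# Contingency table: rows are genders, columns are body frames.
table = {g: {f: cells[(g, f)] for f in frames} for g in genders}

# Marginal probability of a medium frame: column total over sample size.
p_medium = sum(table[g]["medium"] for g in genders) / n

for g in genders:
    print(g, table[g])
print("P(medium) =", p_medium)
```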