COMM5000 DATA LITERACY
Seminar 5 Week 6
Seminar Solutions
1. Truth in advertising. A garden centre wants to store leftover packets of vegetable
seeds for sale the following spring, but the centre is concerned that the seeds may
not germinate at the same rate a year later. The manager finds a packet of last
year’s green bean seeds and plants them as a test. Although the packet claims a
germination rate of 92%, only 171 of 200 test seeds sprout. Is this evidence that
the seeds have lost viability during a year in storage? Test an appropriate
hypothesis and state your conclusion. Be sure the appropriate assumptions and
conditions are satisfied before you proceed.
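No worked solution is given for this question. As a rough sketch only, assuming a lower-tail one-proportion z-test (H0: p = 0.92 against H1: p < 0.92) is the appropriate choice, the mechanics could be checked as follows. The function name `one_prop_z_test_lower` is hypothetical, introduced here for illustration.

```python
from math import sqrt
from statistics import NormalDist


def one_prop_z_test_lower(successes, n, p0):
    """Lower-tail one-proportion z-test: H0: p = p0 against H1: p < p0."""
    p_hat = successes / n
    sd = sqrt(p0 * (1 - p0) / n)       # standard deviation of p-hat under H0
    z = (p_hat - p0) / sd
    p_value = NormalDist().cdf(z)      # P(Z < z)
    return p_hat, z, p_value


n, p0 = 200, 0.92

# Success/failure condition under H0: both expected counts should be at least 10.
expected_ok = n * p0 >= 10 and n * (1 - p0) >= 10

p_hat, z, p_value = one_prop_z_test_lower(171, n, p0)
print(f"conditions ok: {expected_ok}")
print(f"p_hat = {p_hat:.3f}, z = {z:.2f}, p-value = {p_value:.4f}")
```

Under these assumptions the z-score is roughly -3.39, so the p-value is far below any conventional significance level. The independence and randomization conditions still have to be argued from the context, since they cannot be checked numerically.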
2. It is known that 80% of people suffering from a particular disease are cured by a
certain standard medication. Test the claim of developers of a new medication
that their product is more effective than the standard medication in curing the
disease, using a 5% significance level, given a random sample of 400 people with
the disease of whom 330 are cured by using the new medication. (Hint: Use the
normal approximation)
$H_0: p = 0.80$, $H_1: p > 0.80$, $n = 400$, $\alpha = 0.05$ and $\hat{p} = 330/400 = 0.825$
We can use the normal approximation for the sample proportion. Under the null:
$\hat{P} \sim N\left(p, \frac{p(1-p)}{n}\right)$, or $\hat{P} \sim N\left(0.8, \frac{0.8 \times 0.2}{400}\right)$, approximately.
The p-value: $P(\hat{P} > 0.825) = P\left(Z > \frac{0.825 - 0.8}{\sqrt{(0.8 \times 0.2)/400}}\right) = P(Z > 1.25) = 0.1056$
Because the p-value > 0.05, we do not reject the null $H_0$ and instead we conclude
that there is not enough evidence to support the developers’ claim of a more
effective cure.
Alternatively, the rejection region is given by $z > 1.645$ or $\hat{p} > 0.8329$.
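The calculation for this question can be reproduced with a short script. This is a minimal sketch using only the Python standard library; the variable names are illustrative, and the exact normal quantile is used in place of the table value 1.645.

```python
from math import sqrt
from statistics import NormalDist

p0, n, cured, alpha = 0.80, 400, 330, 0.05

p_hat = cured / n                          # sample proportion
sd = sqrt(p0 * (1 - p0) / n)               # standard deviation of p-hat under H0
z = (p_hat - p0) / sd                      # test statistic
p_value = 1 - NormalDist().cdf(z)          # upper-tail p-value, P(Z > z)

z_crit = NormalDist().inv_cdf(1 - alpha)   # critical value on the Z scale
p_hat_crit = p0 + z_crit * sd              # same boundary on the p-hat scale
reject = p_value < alpha

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print(f"reject H0 if z > {z_crit:.3f} or p_hat > {p_hat_crit:.4f}")
print(f"reject H0: {reject}")
```

Both forms of the decision rule agree: the observed proportion 0.825 lies below the boundary of about 0.8329, so the null is not rejected.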
3. Perform the following hypothesis tests of the population mean. In each case, draw
a picture to illustrate the rejection regions on both the $Z$ and $\bar{X}$ distributions, and
calculate the p-value of the test.
(a) $H_0: \mu = 50$, $H_1: \mu > 50$, $n = 100$, $\bar{x} = 55$, $\sigma = 10$, $\alpha = 0.05$
Rejection region: $z = \frac{\bar{x} - 50}{10/\sqrt{100}} > z_{0.05} = 1.645$
Alternatively, $\bar{x} > \bar{x}_U = \mu_0 + z_{0.05} \frac{\sigma}{\sqrt{n}} = 50 + 1.645 \left(\frac{10}{\sqrt{100}}\right) = 51.645$
Since $z = \frac{55 - 50}{10/\sqrt{100}} = 5 > z_{0.05} = 1.645$,
we can reject $H_0$ at the 5% significance level and conclude that the population mean
is greater than 50.
p-value $= P(Z > 5) \approx 0.0000$.
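[Figure: rejection regions for part (a). On the $\bar{X}$ distribution, centred at 50, reject for $\bar{x} > 51.645$ (upper-tail area 0.05). On the $Z$ distribution, centred at 0, reject for $z > 1.645$ (upper-tail area 0.05).]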
(b) $H_0: \mu = 25$, $H_1: \mu < 25$, $n = 100$, $\bar{x} = 24$, $\sigma = 5$, $\alpha = 0.1$
Rejection region: $z = \frac{\bar{x} - 25}{5/\sqrt{100}} < -z_{0.1} \approx -1.28$
Alternatively, $\bar{x} < \bar{x}_L = \mu_0 - z_{0.1} \frac{\sigma}{\sqrt{n}} = 25 - 1.28 \left(\frac{5}{\sqrt{100}}\right) = 24.36$
Since $z = \frac{24 - 25}{5/\sqrt{100}} = -2 < -z_{0.1} = -1.28$,
we can reject $H_0$ at the 10% significance level and conclude that the population
mean is less than 25.
p-value $= P(Z < -2) = 0.0228$
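(c) $H_0: \mu = 80$, $H_1: \mu \neq 80$, $n = 100$, $\bar{x} = 80.5$, $\sigma = 4$, $\alpha = 0.05$
Rejection region: $\frac{\bar{x} - 80}{4/\sqrt{100}} < -z_{0.025} \approx -1.96$ or $> 1.96$
Alternatively: $\bar{x} < \bar{x}_L = \mu_0 - z_{0.025} \frac{\sigma}{\sqrt{n}} = 80 - 1.96 \left(\frac{4}{\sqrt{100}}\right) = 79.216$
Or, $\bar{x} > \bar{x}_U = \mu_0 + z_{0.025} \frac{\sigma}{\sqrt{n}} = 80 + 1.96 \left(\frac{4}{\sqrt{100}}\right) = 80.784$
Since $z = \frac{80.5 - 80}{4/\sqrt{100}} = 1.25$
is neither less than -1.96 nor greater than 1.96, we do not reject $H_0$ at the 5%
significance level.
p-value $= 2P(Z > 1.25) = 2 \times 0.1056 = 0.2112$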
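The three tests in this question share the same mechanics, so they can be checked with one small helper. The sketch below assumes a known population standard deviation, as in the question; the function `z_test_mean` and its `tail` argument are hypothetical names introduced for illustration.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, N(0, 1)


def z_test_mean(x_bar, mu0, sigma, n, alpha, tail):
    """z-test for a mean with known sigma; tail is 'upper', 'lower' or 'two'."""
    se = sigma / sqrt(n)                 # standard deviation of the sample mean
    z = (x_bar - mu0) / se
    if tail == "upper":
        z_crit = STD_NORMAL.inv_cdf(1 - alpha)
        p_value = 1 - STD_NORMAL.cdf(z)
        reject = z > z_crit
        x_limits = (None, mu0 + z_crit * se)
    elif tail == "lower":
        z_crit = STD_NORMAL.inv_cdf(1 - alpha)
        p_value = STD_NORMAL.cdf(z)
        reject = z < -z_crit
        x_limits = (mu0 - z_crit * se, None)
    elif tail == "two":
        z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
        p_value = 2 * (1 - STD_NORMAL.cdf(abs(z)))
        reject = abs(z) > z_crit
        x_limits = (mu0 - z_crit * se, mu0 + z_crit * se)
    else:
        raise ValueError("tail must be 'upper', 'lower' or 'two'")
    # x_limits holds the (lower, upper) rejection boundaries on the x-bar scale.
    return {"z": z, "p_value": p_value, "reject": reject, "x_limits": x_limits}


part_a = z_test_mean(55, 50, 10, 100, 0.05, "upper")
part_b = z_test_mean(24, 25, 5, 100, 0.10, "lower")
part_c = z_test_mean(80.5, 80, 4, 100, 0.05, "two")

for label, result in (("a", part_a), ("b", part_b), ("c", part_c)):
    print(label, result)
```

Because the script uses exact normal quantiles rather than rounded table values (for example about 1.2816 instead of 1.28), the boundaries and p-values may differ from the hand calculations in the third or fourth decimal place.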
4. Case Study: Credit Card Promotion.
A credit card company plans to offer a special incentive program to customers
who charge at least $500 next month. The marketing department has pulled a
sample of 500 customers from the same month last year and noted that the mean
amount charged was $478.19 and the median amount was $216.48. The finance
department says that the only relevant quantity is the proportion of customers who
spend more than $500. The program will lose money if that proportion is not more
than 25%.
Among 500 customers, 148 or 29.6% of them charged $500 or more. Has the goal
of more than 25% of all customers charging at least $500 been met?
Setup:
State the problem and discuss the variables and the context
We want to know whether 25% or more of the customers will spend $500 or more
in the next month and qualify for the special program. We will use the data from the
same month a year ago to estimate the proportion and see whether the proportion
was at least 25%.
The statistic is $\hat{p} = 0.296$, the proportion of customers who charged $500 or more.
Hypotheses:
$H_0: p = 0.25$
$H_1: p > 0.25$
Model:
Check the conditions.
Independence Assumption. Customers are not likely to influence one another when
it comes to spending on their credit cards.
Randomization Condition. This is a random sample from the company’s database.
Success/failure Condition. We expect 125 successes and 375 failures, both at least
10. The sample is large enough.
10% condition. The sample of 500 customers is less than 10% of all our customers.
Under these conditions, the sampling model is Normal. We’ll compute a one-
proportion z-test.
Mechanics: Write down the given information and determine the sample
proportion. Find the test statistic and its p-value.
$n = 500$, $\hat{p} = \frac{148}{500} = 0.296$ and $SD(\hat{p}) = \sqrt{\frac{(0.25)(0.75)}{500}} = 0.01936$
So, the test statistic is
$z = \frac{0.296 - 0.25}{0.01936} = 2.38$
From Excel or from the N(0,1) table, we find that the probability of a z-score ≥ 2.38
is 0.0087, so that is our p-value.
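As a cross-check on the mechanics, the success/failure condition and the test statistic can be computed directly. This is a sketch with illustrative variable names; the 10% condition is not checked because the total number of customers is not stated.

```python
from math import sqrt
from statistics import NormalDist

n, successes, p0 = 500, 148, 0.25

# Success/failure condition: expected counts under H0 should both be at least 10.
expected_successes = n * p0
expected_failures = n * (1 - p0)
success_failure_ok = min(expected_successes, expected_failures) >= 10

p_hat = successes / n
sd_p_hat = sqrt(p0 * (1 - p0) / n)       # SD of p-hat under H0
z = (p_hat - p0) / sd_p_hat
p_value = 1 - NormalDist().cdf(z)        # upper-tail p-value, P(Z >= z)

print(f"expected successes = {expected_successes:.0f}, "
      f"expected failures = {expected_failures:.0f}")
print(f"SD(p_hat) = {sd_p_hat:.5f}, z = {z:.2f}, p-value = {p_value:.4f}")
```

The unrounded z-score is slightly below 2.38, so the p-value computed this way can differ from the table value of 0.0087 in the fourth decimal place.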
Report:
Conclusion. Link the p-value to the decision about the null hypothesis, then state your
conclusion in context.
Re: Credit card promotion
If the true proportion of customers charging $500 or more were actually 25%, the
probability of seeing a success rate at least as large as the 29.6% that we did
observe is about 0.0087. This is strong evidence that the true proportion is greater
than our target of 25%. However, business judgement is called for to determine
whether to go ahead with the new promotion.
5. Case study: Bank Loans
Before lending someone money, banks must decide whether they believe the
applicant will repay the loan. One strategy used is a point system. Loan officers
assess information about the applicant, totaling points they award for the
person’s income level, credit history, current debt burden, and so on. The higher
the point total, the more convinced the bank is that it’s safe to make the loan. Any
applicant with a lower point total than a certain cutoff score is denied a loan.
We can think of this decision as a hypothesis test. Since the bank makes its profit
from the interest collected on repaid loans, their null hypothesis is that the
applicant will repay the loan and therefore should get the money. Only if the
person’s score falls below the minimum cutoff will the bank reject the null and
deny the loan. This system is reasonably reliable, but, of course, sometimes there
are mistakes.
(a) When a person defaults on a loan, which type of error did the bank
make?
The bank's null hypothesis is that the person will repay the loan. If the person
in fact defaults on the loan, the null hypothesis was false but the bank did not
reject it. In other words, the bank failed to reject a false null. That is what we
defined as a Type II error.
(b) Which kind of error is it when the bank misses an opportunity to
make a loan to someone who would have repaid it?
If the bank misses this opportunity, the point system rejected a person who
would have repaid the loan. In this case the null is true, but the rule rejects
it. This is what we refer to as a Type I error. The probability of a Type I error
is the size of the test, or significance level.
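The two answers can be summarised as a small decision table in code. This is only an illustrative sketch; the function `classify_loan_outcome` and its arguments are hypothetical names, with the null hypothesis taken to be "the applicant will repay" as in the case study.

```python
def classify_loan_outcome(applicant_would_repay: bool, loan_granted: bool) -> str:
    """Classify the bank's decision, with H0: the applicant will repay the loan."""
    null_is_true = applicant_would_repay
    null_rejected = not loan_granted      # denying the loan means rejecting H0
    if null_is_true and null_rejected:
        return "Type I error"             # rejected a true null
    if not null_is_true and not null_rejected:
        return "Type II error"            # failed to reject a false null
    return "correct decision"


# (a) The loan was granted but the applicant defaults.
print(classify_loan_outcome(applicant_would_repay=False, loan_granted=True))
# (b) The loan was denied to someone who would have repaid it.
print(classify_loan_outcome(applicant_would_repay=True, loan_granted=False))
```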