MAST20005/MAST90058: Week 10 Solutions
1. The sign test is the only test we can carry out with this summary data. Let Z = freq(X ≤ 40),
so that under H0 : m = 40 (against H1 : m ≠ 40) we have Z ∼ Bi(100, 0.5) ≈ N(50, 25). Using the
exact Binomial against a two-sided alternative, with α not allowed to exceed 0.05, we note
that 2*pbinom(39,100,0.5)=0.0352 but 2*pbinom(40,100,0.5)=0.0569, so we reject H0 unless
40 ≤ Z ≤ 60. The observed value is z = 37, so we reject H0.
For the normal approximation with continuity correction the p-value is
2*pnorm(37.5,50,5)=0.0124, so again we reject at the nominal α = 0.05 level.
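The figures in this answer can be cross-checked outside R. The sketch below uses only the Python standard library; `binom_cdf` is a hypothetical helper standing in for `pbinom`, and the exact two-sided p-value `p_exact` is an extra quantity that the solution itself does not quote.

```python
from math import comb
from statistics import NormalDist

def binom_cdf(k, n, p=0.5):
    """P(Z <= k) for Z ~ Bi(n, p); plays the role of R's pbinom(k, n, p)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

n, z_obs = 100, 37

# Size of the two-sided test for two candidate rejection boundaries
size_39 = 2 * binom_cdf(39, n)   # reject when Z <= 39 or Z >= 61
size_40 = 2 * binom_cdf(40, n)   # reject when Z <= 40 or Z >= 60

# Exact two-sided p-value for the observed count (not quoted in the solution)
p_exact = 2 * binom_cdf(z_obs, n)

# Normal approximation N(50, 25) with continuity correction
p_normal = 2 * NormalDist(mu=50, sigma=5).cdf(z_obs + 0.5)

print(round(size_39, 4), round(size_40, 4), round(p_exact, 4), round(p_normal, 4))
```

Only the first boundary keeps the size at or below 0.05, which is why the acceptance region is 40 ≤ Z ≤ 60.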
2. In a truly random sequence of numbers, the probability of the next digit being the same
as the preceding one is 1/10 and the probability of the next one differing by 1 from the
preceding is 2/10 (for all digits as we ensured that 0 and 9 each had two neighbours as
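well by linking them together). Writing p1, p2, p3 for the probabilities that the next digit
is the same, differs by 1, or is anything else, the null hypothesis is:
H0 : p1 = 0.1, p2 = 0.2, p3 = 0.7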
Suppose we obtain the following observed counts:
              Observed   Expected
Same              0      50 × 0.1 = 5
Differ by 1       8      50 × 0.2 = 10
Other            42      50 × 0.7 = 35
The chi-squared statistic is:
$$\frac{(0-5)^2}{5} + \frac{(8-10)^2}{10} + \frac{(42-35)^2}{35} = 6.8 > 5.991 \quad (\text{0.95 quantile of } \chi^2_2)$$
Thus we conclude that the string of 51 digits is unlikely to have been randomly generated.
In your groups it is likely, judging from past years' experience, that you will have a low
count for the 'same' category, as people are reluctant to repeat digits when attempting to
mimic randomness. Because this category has a small expected count, even small deviations
give a large contribution, so this cell is likely to have contributed most to your test
statistic. Most past attempts at mimicking randomness have failed mainly for this reason.
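The goodness-of-fit calculation for the example counts can be sketched as follows. This is an illustrative Python version, not part of the original solution. It relies on the fact that a chi-squared distribution with 2 degrees of freedom has upper tail exp(−x/2), so the critical value and the p-value have closed forms. The dictionary names are assumptions made for the example.

```python
from math import exp, log

observed = {"same": 0, "differ by 1": 8, "other": 42}
null_probs = {"same": 0.1, "differ by 1": 0.2, "other": 0.7}
n = sum(observed.values())  # 50 transitions between 51 digits

expected = {k: n * p for k, p in null_probs.items()}
# Per-cell contributions (O - E)^2 / E
contributions = {k: (observed[k] - expected[k]) ** 2 / expected[k] for k in observed}
chi_sq = sum(contributions.values())

# Chi-squared with 2 df: P(X > x) = exp(-x / 2)
critical = -2 * log(0.05)   # 0.95 quantile
p_value = exp(-chi_sq / 2)

for cell, value in contributions.items():
    print(cell, round(value, 2))
print(round(chi_sq, 2), round(critical, 3), round(p_value, 4))
```

Printing the per-cell contributions makes it easy to see which category drives the statistic.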
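3. (a) Note $W^+ + W^- = \sum_{i=1}^{n} i = \frac{1}{2}n(n+1)$ and $W = W^+ - W^- = V - W^-$. So
$$\begin{aligned}
W &= V - W^- \\
  &= V - \left[\tfrac{1}{2}n(n+1) - V\right] \\
  &= 2V - \tfrac{1}{2}n(n+1)
\end{aligned}$$
Hence
$$V = \frac{W + \frac{1}{2}n(n+1)}{2}$$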
(b) Define the sum we seek as $S_n \equiv \sum_{i=1}^{n} i^2$.
i. First note that $\sum_{i=1}^{n} [(i+1)^3 - i^3] = (2^3 - 1) + (3^3 - 2^3) + \dots + (n^3 - (n-1)^3) + ((n+1)^3 - n^3) = (n+1)^3 - 1$ by cancellation of telescoping terms. Secondly:
$$\begin{aligned}
\sum_{i=1}^{n} [(i+1)^3 - i^3] &= \sum_{i=1}^{n} [i^3 + 3i^2 + 3i + 1 - i^3] \\
  &= 3S_n + 3 \cdot \tfrac{1}{2}n(n+1) + n
\end{aligned}$$
Equating these two expressions gives:
$$(n+1)^3 - 1 = 3S_n + \tfrac{3}{2}n(n+1) + n$$
and elementary algebra gives the required formula.
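For readers who want the final algebra spelled out, one possible route (a sketch that uses only the identity just obtained) is:

$$\begin{aligned}
3S_n &= (n+1)^3 - 1 - \tfrac{3}{2}n(n+1) - n \\
     &= (n+1)^3 - (n+1) - \tfrac{3}{2}n(n+1) \\
     &= (n+1)\left[(n+1)^2 - 1 - \tfrac{3}{2}n\right] \\
     &= (n+1)\left[n^2 + \tfrac{1}{2}n\right] \\
     &= \tfrac{1}{2}n(n+1)(2n+1)
\end{aligned}$$

so that $S_n = \frac{1}{6}n(n+1)(2n+1)$.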
ii. By induction: we first verify that the formula for $S_n$ holds for $n = 1$, namely $S_1 = 1^2 = \frac{1}{6} \cdot 1(1+1)(2+1)$, which is true. We then assume the formula holds for $n$ and prove it holds for $n+1$ as follows:
$$\begin{aligned}
1^2 + 2^2 + \dots + n^2 + (n+1)^2 &= S_n + (n+1)^2 \\
  &= \tfrac{1}{6}n(n+1)(2n+1) + (n+1)^2 \\
  &= \tfrac{1}{6}(n+1)[n(2n+1) + 6(n+1)] \\
  &= \tfrac{1}{6}(n+1)[2n^2 + 7n + 6] \\
  &= \tfrac{1}{6}(n+1)[(n+2)(2n+3)] \\
  &= \tfrac{1}{6}(n+1)((n+1)+1)(2(n+1)+1) = S_{n+1}
\end{aligned}$$
So by induction the formula holds for all $n$.
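(c) Finally, since $E(W) = 0$ and $\operatorname{var}(W) = \sum_{i=1}^{n} i^2 = \frac{1}{6}n(n+1)(2n+1)$ under H0, we have:
$$E(V) = \frac{E(W) + \frac{1}{2}n(n+1)}{2} = \tfrac{1}{4}n(n+1)$$
$$\operatorname{var}(V) = \tfrac{1}{4}\operatorname{var}(W) = \tfrac{1}{24}n(n+1)(2n+1)$$
V can be approximated by a normal distribution, just as W can, because it is a linear function of W.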
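The relation between V and W and the two moment formulas can be checked by brute force for small n. The sketch below is an illustration rather than part of the original solution. It assumes that under H0 each of the 2^n sign patterns on the ranks 1..n is equally likely, and `signed_rank_moments` is a hypothetical helper name.

```python
from fractions import Fraction
from itertools import product

def signed_rank_moments(n):
    """Exact mean and variance of V and W under H0, enumerating all 2^n sign patterns."""
    total = n * (n + 1) // 2
    ranks = range(1, n + 1)
    v_values, w_values = [], []
    for signs in product((1, -1), repeat=n):
        w = sum(s * r for s, r in zip(signs, ranks))      # signed rank sum W
        v = sum(r for s, r in zip(signs, ranks) if s > 0)  # positive rank sum V
        assert 2 * v == w + total                          # V = (W + n(n+1)/2) / 2
        v_values.append(v)
        w_values.append(w)

    def mean(xs):
        return Fraction(sum(xs), len(xs))

    def var(xs):
        m = mean(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    return mean(v_values), var(v_values), mean(w_values), var(w_values)

n = 6
mean_v, var_v, mean_w, var_w = signed_rank_moments(n)
print(mean_v, var_v, mean_w, var_w)
```

Exact fractions are used so that the comparison with n(n+1)/4 and n(n+1)(2n+1)/24 involves no rounding.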
4. (a) Given $X \stackrel{d}{=} Y$ we have $P(X > Y) = P(Y > X) = P(X < Y) = 1 - P(X > Y)$, which implies $P(D > 0) = P(X - Y > 0) = P(X > Y) = \frac{1}{2}$, so $m_D = 0$.
(b) Given $X \stackrel{d}{=} Y$ we have $D = X - Y \stackrel{d}{=} Y - X = -(X - Y) = -D$, so the pdf of D must be symmetric around 0. Alternatively, $F_D(d) = P(D < d) = P(-D < d) = P(D > -d) = 1 - F_D(-d)$, so differentiating gives $f_D(d) = -f_D(-d) \times (-1) = f_D(-d)$, which is a definition of symmetry.
5. Let our sign test statistic be N = number of Xi < 0 ∼ Binomial(10, 0.5). Note
2 * pbinom(2, 10, 0.5) = 0.109 > 0.05 and 2 * pbinom(1, 10, 0.5) = 0.0215 < 0.05. So we fail
to reject with the sign test if 2 ≤ N ≤ 8.
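The acceptance region for the sign test can be found mechanically by doubling the lower binomial tail for each candidate cutoff. A minimal Python sketch, with `binom_cdf` as a hypothetical stand-in for `pbinom(k, n, 0.5)`:

```python
from math import comb

def binom_cdf(k, n):
    """P(N <= k) for N ~ Binomial(n, 0.5), like R's pbinom(k, n, 0.5)."""
    return sum(comb(n, j) for j in range(k + 1)) / 2**n

n, alpha = 10, 0.05

# Size of the two-sided test that rejects when N <= k or N >= n - k
two_sided = {k: 2 * binom_cdf(k, n) for k in range(4)}

# Largest cutoff whose doubled tail probability stays within alpha
cutoff = max(k for k, size in two_sided.items() if size <= alpha)
acceptance = (cutoff + 1, n - cutoff - 1)

for k, size in two_sided.items():
    print(k, round(size, 4))
print(acceptance)
```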
The signed rank test will reject if |W| ≥ c. To find an extreme value of W for a data set
for which the sign test accepts H0, let's make the 8 largest ranks positive and the only
negative ranks −1 and −2.
This gives W = 49. We need to convert to V to use the psignrank command in R:
$V = \frac{1}{2}\left(W + \frac{1}{2}n(n+1)\right) = \frac{1}{2}(49 + 55) = 52$. So:
p-value = 2Pr(V ≥ 52)
= 2(1− Pr(V ≤ 51))
= 2(1− psignrank(51, 10))
= 0.009765
To verify, note that all $2^{10}$ possible allocations of signs to the ranks are equally likely,
each with probability $2^{-10}$. So we just need to count up all sign allocations resulting in
a value of |W | ≥ 49. The sets for W ≥ 49 are as follows (where we just list the negative
ranks in the set and the corresponding W value):
{none negative} with W=55
{-1} with W=53
{-2} with W=51
{-3} with W=49
{-1,-2} with W=49
Clearly W is symmetric around 0 so there are another 5 outcomes with W ≤ −49, hence
p-value $= 2 \times \frac{5}{2^{10}} = 0.009765$ as before.
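The counting argument can be confirmed by enumerating every sign allocation directly. This is a brute-force Python sketch for illustration only. It is feasible here because n = 10 gives just 1024 patterns.

```python
from itertools import product

n = 10
ranks = range(1, n + 1)

extreme = []          # (W, negative ranks) for every pattern with W >= 49
count_two_sided = 0   # number of patterns with |W| >= 49
for signs in product((1, -1), repeat=n):
    w = sum(s * r for s, r in zip(signs, ranks))
    if abs(w) >= 49:
        count_two_sided += 1
    if w >= 49:
        extreme.append((w, [-r for s, r in zip(signs, ranks) if s < 0]))

p_value = count_two_sided / 2**n

for w, negatives in sorted(extreme, reverse=True):
    print(w, negatives)
print(count_two_sided, p_value)
```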
The signed rank test is rejecting at better than the 1% level, so we can probably afford
to add in more small negative ranks. If we have {−1,−2,−3}, then V = 49 with
p-value = 2 * (1 − psignrank(48, 10)) = 0.0273. Trying a fourth gives {−1,−2,−3,−4} with
V = 45 and p-value = 2 * (1 − psignrank(44, 10)) = 0.0839, which finally fails to reject.
So {−1,−2,−3} gives rejection for the signed rank test with p-value = 0.0273 but acceptance
for the sign test with p-value = 2 * pbinom(3, 10, 0.5) = 0.34.
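The search over how many small negative ranks to include can be sketched as a loop. The Python below is an illustration under the same null model (all sign patterns equally likely). It builds the exact distribution of V with a subset-sum recursion instead of calling R's `psignrank`, and the helper names are hypothetical.

```python
from math import comb

def signrank_counts(n):
    """counts[v] = number of sign patterns on ranks 1..n whose positive-rank sum is v."""
    total = n * (n + 1) // 2
    counts = [1] + [0] * total
    for rank in range(1, n + 1):
        # Go downwards so each rank is used at most once
        for v in range(total, rank - 1, -1):
            counts[v] += counts[v - rank]
    return counts

def signrank_upper(v, n):
    """P(V >= v) under H0, i.e. 1 - psignrank(v - 1, n) in R."""
    return sum(signrank_counts(n)[v:]) / 2**n

def sign_test_two_sided(k, n):
    """2 * P(N <= k) for N ~ Binomial(n, 0.5), i.e. 2 * pbinom(k, n, 0.5) in R."""
    return 2 * sum(comb(n, j) for j in range(k + 1)) / 2**n

n = 10
total = n * (n + 1) // 2
results = []
for m in range(2, 5):                       # negative ranks are -1, ..., -m
    v = total - sum(range(1, m + 1))        # V when only ranks 1..m are negative
    results.append((m, v, 2 * signrank_upper(v, n), sign_test_two_sided(m, n)))

for m, v, p_signed_rank, p_sign in results:
    print(m, v, round(p_signed_rank, 4), round(p_sign, 4))
```

Each row gives the number of negative ranks, the resulting V, and the two-sided p-values for the signed rank test and the sign test, so the disagreement between the two tests can be read off directly.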
6. (a) Given the dependence between the before and after observations, we must work with
the differences. If the distributions before and after are the same, then the median of the
differences will be 0. One pair is tied, and we must have only two categories (positive or
negative differences) as we are modelling the data with a Binomial distribution. So we
ignore that pair, giving n = 9 and H0 : mD = 0. The test statistic Z is the number of positive
differences, which is expected to be high if the treatment works; under H0, Z ∼ Bi(9, 0.5).
The observed value is z = 7, so the one-sided p-value is 1 − pbinom(6, 9, 0.5) = 0.0898 and
we do not reject H0.
(b) The null distribution of the signed rank test assumes that differences are either
positive or negative, so we again ignore the tie and reduce n to 9. We know from Q4 that,
under H0, the differences have a symmetric distribution. The differences in order are:
(2,-3,6,8,5,7,4,9,-1) with V = 41. The p-value = 1 − psignrank(40, 9) = 0.0137, so the
signed rank test rejects at the α = 0.05 level.
(c) The signed rank test is more likely than the sign test to reject H0 when it is false, as
it uses not just the sign but also the magnitude of the differences. Using more information
gives it more power, provided the additional assumption of symmetry is met.
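Both tests in this question can be recomputed from the nine non-zero differences. The Python sketch below is illustrative only. It assumes the tied pair has already been dropped and that the absolute differences contain no ties, which holds for this data set.

```python
from math import comb

differences = [2, -3, 6, 8, 5, 7, 4, 9, -1]   # tied pair already removed
n = len(differences)

# Sign test: one-sided p-value P(Z >= z) with Z ~ Bi(n, 0.5)
z = sum(d > 0 for d in differences)
p_sign = sum(comb(n, j) for j in range(z, n + 1)) / 2**n

# Signed rank statistic: rank the |d| values, sum the ranks of positive differences
ordered = sorted(differences, key=abs)
v = sum(rank for rank, d in enumerate(ordered, start=1) if d > 0)

# Exact null distribution of V via a subset-sum count over ranks 1..n
total = n * (n + 1) // 2
counts = [1] + [0] * total
for rank in range(1, n + 1):
    for s in range(total, rank - 1, -1):
        counts[s] += counts[s - rank]
p_signed_rank = sum(counts[v:]) / 2**n        # one-sided P(V >= v)

print(z, round(p_sign, 4))
print(v, round(p_signed_rank, 4))
```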