Statistics
STAT 131: Quiz 3 (40 points total)
Bayes’s Theorem is the only known approach to learning from data that satisfies two
important properties: (a) it’s logically internally consistent (meaning that it cannot produce
contradictory conclusions such as {An unknown quantity θ of interest to me cannot be
negative, but Bayes’s Theorem says that my best estimate of θ is −2.3}), and (b) it combines
information external and internal to your dataset in such a way that (I) no extraneous
information is inadvertently smuggled into your answer (consistency) and (II) all relevant
and useful information is properly accounted for (completeness). However, it’s possible to
use Bayes’s Theorem in a way that defeats its ability to help you learn from the world
around you, which is obviously undesirable. The result we’ll explore here, which illustrates
this cautionary tale, was called Cromwell’s Rule by the superb British Bayesian statistician
Dennis Lindley (1923–2013).
Let U be a true-false proposition whose truth status is Unknown to you, and let D be
another true-false proposition (representing Data) whose truth status is known to you;
an example would be U = (person P really is infected with COVID–19) and D = (this
screening test says that person P is infected with COVID–19) (note that U and D are not
the same; the test could be wrong). Recall that Bayes’s Theorem in this situation says,
assuming that P (D) > 0, that
$$P(U \mid D) = \frac{P(U) \, P(D \mid U)}{P(D)}, \tag{1}$$
in which P (U) is your prior (external) information about the truth of U (in the example
above, this would be the prevalence of COVID–19 among people similar to person P in all
relevant ways).
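As a numerical illustration of equation (1), here is a minimal Python sketch for the screening-test example. All of the numbers are hypothetical and chosen only for illustration: a prevalence P(U) of 1%, a sensitivity P(D | U) of 95%, and a false-positive rate P(D | not U) of 5%. The function name `posterior` and its parameter names are likewise assumptions, not part of the quiz. The sketch expands P(D) with the Law of Total Probability and deliberately restricts itself to priors strictly between 0 and 1, where all of the conditional probabilities involved are well-defined.

```python
from fractions import Fraction


def posterior(prior, sensitivity, false_positive_rate):
    """Return P(U | D) from equation (1).

    prior               -- P(U), assumed here to satisfy 0 < P(U) < 1
    sensitivity         -- P(D | U)
    false_positive_rate -- P(D | not U)
    """
    if not 0 < prior < 1:
        raise ValueError("this sketch assumes 0 < P(U) < 1")
    # Law of Total Probability: P(D) = P(U) P(D | U) + P(not U) P(D | not U)
    p_d = prior * sensitivity + (1 - prior) * false_positive_rate
    if p_d == 0:
        raise ValueError("equation (1) requires P(D) > 0")
    return prior * sensitivity / p_d


# Hypothetical inputs (not taken from any real screening test)
prior = Fraction(1, 100)
sensitivity = Fraction(95, 100)
false_positive_rate = Fraction(5, 100)

result = posterior(prior, sensitivity, false_positive_rate)
print(result)         # exact value as a fraction
print(float(result))  # decimal approximation
```

With these made-up inputs the posterior probability is 19/118, or about 0.16: the positive test result moves you from a prior of 1% to a posterior of roughly 16%. Exact fractions are used so that no floating-point rounding enters the arithmetic.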
(a) Show that if you assume that P (U) = 0, this breaks Bayes’s Theorem, in the sense
that it makes the right-hand side of equation (1) undefined. 10 points
(b) Show that if you assume that P (U) = 1, then you would have to conclude that
P (U |D) = 1, no matter how the data information D came out. 10 points
(c) Implications of the above results (20 points for this part of the problem):
(i) Briefly explain in what sense your findings in (a) and (b) imply that
Putting prior probability 0 or 1 on anything renders Bayes’s
Theorem incapable of helping you learn from new data.
10 points
(ii) What practical conclusion should we draw about the assignment of prior
probabilities? Explain briefly. 10 points