ST305 Statistical Inference
Why study statistics?
Collecting, analyzing, and interpreting data is growing
more important every year in nearly every field.
Many important real-world decisions hinge on conflicting
claims about “what the data show,” such as:
Does raising the minimum wage increase unemployment?
Is a new cancer drug more effective than the current
standard of care?
What are the most likely ecological or agricultural effects of
climate change, and how do we mitigate them?
Interpreting data will also increasingly become part of your
personal decision making.
Knowledge of statistics will empower us to be active
participants in the data-driven arguments that drive
decisions and shape our understanding of the world.
Chapter 1 Probability Theory
“You can, for example, never foretell what any one man will
do, but you can say with precision what an average
number will be up to. Individuals vary, but percentages
remain constant. So says the statistician.”
Sherlock Holmes
The Sign of Four
Overview
Probability theory
the foundation upon which all of statistics is built
has a long and rich history, dating back at least to the
17th century
The aims of this chapter:
to outline some of the basic ideas that are fundamental to
the study of statistics
not to give a thorough introduction to probability theory
Set theory
Sample space S: the set of all possible outcomes of an
experiment
Example 1: An experiment consists of flipping two coins.
The sample space
S = {(h,h), (h,t), (t,h), (t,t)},
where t = tail and h = head.
Event: any subset of the sample space
An event is a set consisting of some possible outcomes of
the experiment.
Example 1: An experiment consists of flipping two coins.
The sample space
S = {(h,h), (h,t), (t,h), (t,t)},
E = {(h,h), (h,t)} is an event (the event that a head
appears on the first coin).
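The two-coin example can be enumerated directly. The sketch below is only an illustration: the variable names S and E mirror the slide, and encoding each outcome as a tuple of the characters 'h' and 't' is an assumption made here.

```python
from itertools import product

# Sample space for flipping two coins: all ordered pairs of 'h' and 't'.
S = set(product("ht", repeat=2))

# Event "a head appears on the first coin": outcomes whose first entry is 'h'.
E = {outcome for outcome in S if outcome[0] == "h"}

print(sorted(S))
print(sorted(E))
print(E <= S)  # an event is a subset of the sample space
```

Any other subset of S, including the empty set and S itself, is an event in the same sense.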
Exercise: An experiment consists of flipping one coin. What is
the sample space? List one event.
Union
Union of events E and F : E ∪ F
E ∪ F : the event that consists of all outcomes that are
either in E
or in F
or in both E and F
Venn diagram
Example 1: Suppose E = {(h,h), (h,t)} and
F = {(t,h), (h,h)}.
Then E ∪ F = {(h,h), (h,t), (t,h)}.
The union $\bigcup_{n=1}^{\infty} E_n$: the event that consists of all outcomes
that are in at least one of the events $E_1, E_2, \ldots$
Intersection
Intersection of events E and F : E ∩ F or EF
E ∩ F : the event that consists of all outcomes that are
both in E and in F
Venn diagram
Example 1: Suppose E = {(h,h), (h,t)} and
F = {(t,h), (h,h)}.
Then E ∩ F = {(h,h)}.
The intersection $\bigcap_{n=1}^{\infty} E_n$: the event that consists of those
outcomes that are in all of the events $E_1, E_2, \ldots$
If E ∩ F does not contain any outcomes, it is the null event
and is denoted by ∅
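Union and intersection of events map directly onto Python's built-in set operators. The sketch below reuses E and F from Example 1; the extra event G is a hypothetical addition, introduced only to show a null intersection.

```python
# Events from Example 1, with outcomes encoded as tuples (an assumption).
E = {("h", "h"), ("h", "t")}
F = {("t", "h"), ("h", "h")}

union = E | F         # outcomes in E, in F, or in both
intersection = E & F  # outcomes in both E and F

# Hypothetical extra event, used only to illustrate the null event.
G = {("t", "t")}
null_event = E & G    # empty set: E and G share no outcomes

print(sorted(union))
print(sorted(intersection))
print(null_event == set())
```

Because sets hold each element once, the outcome (h,h) appears a single time in the union even though it belongs to both events.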
Complementation
Complementation: A^c, the complement of A, is the set of
all elements of S that are not in A:
$A^c = \{x \in S : x \notin A\}$
Exercise
Exercise: Suppose A = {(h,h), (h,t)}, B = {(t,h)} and
C = {(h,h), (h,t), (t,t)}. Find A ∩ B, A ∪ C, A ∩ C and C^c.
Some rules for the operations of events
Commutative laws
E ∪ F = F ∪ E (Exercise: show this using Venn diagrams.)
E ∩ F = F ∩ E (Exercise: show this using Venn diagrams.)
Associative laws
(E ∪ F) ∪ G = E ∪ (F ∪ G)
(E ∩ F) ∩ G = E ∩ (F ∩ G)
Distributive laws:
(E ∪ F) ∩ G = (E ∩ G) ∪ (F ∩ G) (Exercise: show this using Venn diagrams.)
(E ∩ F) ∪ G = (E ∪ G) ∩ (F ∪ G) (Exercise: show this using Venn diagrams.)
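As a complement to the Venn-diagram exercises, the laws can be checked by brute force on a small sample space. The sketch below tests the commutative and associative laws and the two distributive laws in their standard form, (E ∪ F) ∩ G = (E ∩ G) ∪ (F ∩ G) and (E ∩ F) ∪ G = (E ∪ G) ∩ (F ∪ G), over every event of the two-coin sample space. The helper power_set is a hypothetical name introduced here, and a check on one sample space is an illustration, not a proof.

```python
from itertools import combinations, product

# Sample space for flipping two coins (as in Example 1).
S = list(product("ht", repeat=2))

def power_set(outcomes):
    """Return every subset of `outcomes` as a frozenset (hypothetical helper)."""
    return [frozenset(c)
            for r in range(len(outcomes) + 1)
            for c in combinations(outcomes, r)]

events = power_set(S)  # all 2**4 = 16 events

commutative = all(
    (E | F) == (F | E) and (E & F) == (F & E)
    for E, F in product(events, repeat=2)
)
associative = all(
    ((E | F) | G) == (E | (F | G)) and ((E & F) & G) == (E & (F & G))
    for E, F, G in product(events, repeat=3)
)
distributive = all(
    ((E | F) & G) == ((E & G) | (F & G))
    and ((E & F) | G) == ((E | G) & (F | G))
    for E, F, G in product(events, repeat=3)
)

print(commutative, associative, distributive)
```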
DeMorgan’s laws
DeMorgan’s laws:
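a) $\left(\bigcup_{i=1}^{n} E_i\right)^c = \bigcap_{i=1}^{n} E_i^c$
b) $\left(\bigcap_{i=1}^{n} E_i\right)^c = \bigcup_{i=1}^{n} E_i^c$
Proof of a):
Suppose $x \in \left(\bigcup_{i=1}^{n} E_i\right)^c$
⟹ $x \notin \bigcup_{i=1}^{n} E_i$
⟹ $x \notin E_i$ for all $i = 1, 2, \ldots, n$
⟹ $x \in E_i^c$ for all $i = 1, 2, \ldots, n$
⟹ $x \in \bigcap_{i=1}^{n} E_i^c$
Each step is reversible, so the two sets contain the same elements and are equal.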
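Both laws can also be checked numerically. The sketch below assumes a small sample space S = {1, 2, 3, 4}, chosen only for illustration, and tests laws a) and b) for every triple of events (n = 3). The function names complement, law_a and law_b are hypothetical.

```python
from itertools import combinations, product

S = frozenset({1, 2, 3, 4})  # assumed small sample space, for illustration only

def complement(A):
    """Complement of A relative to the sample space S."""
    return S - A

# Every subset of S, i.e. every event.
events = [frozenset(c)
          for r in range(len(S) + 1)
          for c in combinations(sorted(S), r)]

def law_a(Es):
    """(union of E_i)^c == intersection of E_i^c."""
    union = frozenset().union(*Es)
    return complement(union) == S.intersection(*(complement(E) for E in Es))

def law_b(Es):
    """(intersection of E_i)^c == union of E_i^c."""
    intersection = S.intersection(*Es)
    return complement(intersection) == frozenset().union(*(complement(E) for E in Es))

all_hold = all(law_a(Es) and law_b(Es) for Es in product(events, repeat=3))
print(all_hold)
```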
Disjoint (mutually exclusive) and partition
If E ∩ F = ∅, then E and F are said to be disjoint or
mutually exclusive
The events E1, E2, . . . are pairwise disjoint (or mutually
exclusive) if Ei ∩ Ej = ∅ for all i ≠ j.
If the events E1, E2, . . . are pairwise disjoint (or mutually
exclusive) and their union E1 ∪ E2 ∪ · · · equals S, then the
collection E1, E2, . . . forms a partition of S.
Partitions are useful because they allow us to divide the sample
space into small, non-overlapping pieces.
Example: Define Ei = [i, i + 1) for i = 0, 1, 2, . . .
Then E0, E1, . . . are pairwise disjoint and their union is [0,∞), so
E0, E1, . . . form a partition of [0,∞).
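For a finite sample space, the partition property can be checked mechanically. The sketch below assumes the usual definition, under which a partition must be pairwise disjoint and must also cover the whole of S. The function name is_partition and the die-face sample space are illustrative choices, not part of the slides.

```python
def is_partition(blocks, S):
    """Return True if `blocks` are pairwise disjoint and their union is S."""
    blocks = [set(b) for b in blocks]
    pairwise_disjoint = all(
        blocks[i].isdisjoint(blocks[j])
        for i in range(len(blocks))
        for j in range(i + 1, len(blocks))
    )
    covers_S = set().union(*blocks) == set(S)
    return pairwise_disjoint and covers_S

S = {1, 2, 3, 4, 5, 6}  # assumed sample space: faces of a die

print(is_partition([{1, 2}, {3, 4}, {5, 6}], S))  # disjoint and covers S
print(is_partition([{1, 2}, {3, 4}], S))          # disjoint, but misses 5 and 6
print(is_partition([{1, 2, 3}, {3, 4, 5, 6}], S)) # covers S, but 3 is shared
```

The second call shows why disjointness alone is not enough: the blocks do not overlap, yet they leave part of S uncovered.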
Sigma algebra (or Borel field)
Let B be a collection of subsets of S. It is a sigma algebra
(or Borel field) if all of the following hold:
a. ∅ ∈ B
b. B is closed under complementation: If A ∈ B, then A^c ∈ B
c. B is closed under countable unions: If A1, A2, . . . ∈ B, then
$\bigcup_{i=1}^{\infty} A_i \in B$.
Example 1: {∅, S}
Example 2: Let S = (−∞,∞) and let B be the smallest sigma algebra that
contains all sets of the form [a,b], (a,b], (a,b) and [a,b), where a and b
are real numbers.
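When S is finite, the three conditions can be verified directly. The sketch below is a minimal check under that assumption: for a finite collection, closure under pairwise unions already implies closure under countable unions, because any countable union of its members involves only finitely many distinct sets. The function name is_sigma_algebra and the sample space {1, 2, 3} are hypothetical.

```python
from itertools import combinations

def is_sigma_algebra(B, S):
    """Check the three sigma-algebra conditions for a finite collection B."""
    S = frozenset(S)
    B = {frozenset(A) for A in B}
    if frozenset() not in B:               # a. contains the empty set
        return False
    if any(S - A not in B for A in B):     # b. closed under complementation
        return False
    # c. closed under unions (pairwise closure suffices for a finite collection)
    return all((A1 | A2) in B for A1 in B for A2 in B)

S = {1, 2, 3}  # assumed small sample space, for illustration only
trivial = [set(), S]
power_set = [set(c) for r in range(len(S) + 1) for c in combinations(sorted(S), r)]

print(is_sigma_algebra(trivial, S))                 # the collection {∅, S}
print(is_sigma_algebra(power_set, S))               # all subsets of S
print(is_sigma_algebra([set(), {1}, {2, 3}, S], S)) # generated by {1}
print(is_sigma_algebra([set(), {1}, S], S))         # fails: {2, 3} is missing
```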
Probability function
Given a sample space S and an associated sigma algebra
B, a probability function P is a function with domain B
such that
Axiom 1: P(E) ≥ 0 for all E ∈ B
Axiom 2: P(S) = 1
Axiom 3: For any sequence of pairwise disjoint events E1,
E2, . . . (EiEj = ∅ when i ≠ j),
\[
P\left(\bigcup_{i=1}^{\infty} E_i\right) = \sum_{i=1}^{\infty} P(E_i)
\]
The three properties given above are also referred to as
the Kolmogorov Axioms (after A. Kolmogorov, one of the
fathers of probability theory).
Any function P that satisfies the Axioms of Probability is
called a probability function.
Example
Suppose a die (with 6 sides) is rolled and that all 6 sides are
equally likely to appear. What is the probability of rolling an
even number?
Answer.
The event of rolling an even number: {2,4,6}
We have
P({1}) = P({2}) = P({3}) = P({4}) = P({5}) = P({6}) = 1/6.
Note that {2}, {4} and {6} are mutually exclusive
events.
So by Axiom 3, the probability of rolling an even number is
P({2,4,6}) = P({2}) + P({4}) + P({6}) = 3 × 1/6 = 1/2.
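The same calculation can be reproduced with exact rational arithmetic. This is only a sketch: the dictionary p and the variable names are illustrative, and Fraction is used so that no rounding occurs.

```python
from fractions import Fraction

# Each of the six faces is equally likely.
p = {face: Fraction(1, 6) for face in range(1, 7)}

# The event "roll an even number" is the disjoint union {2} ∪ {4} ∪ {6}.
even = {2, 4, 6}

# By Axiom 3, add the probabilities of the mutually exclusive pieces.
prob_even = sum(p[face] for face in even)

print(prob_even)  # 1/2
```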
A common method of defining a legitimate
probability function
Theorem: Let S = {s1, . . . , sn} be a finite set. Let B be any
sigma algebra of subsets of S. Let p1, . . . , pn be
nonnegative numbers with $\sum_{i=1}^{n} p_i = 1$. For any A ∈ B,
define P(A) by
\[
P(A) = \sum_{\{i : s_i \in A\}} p_i .
\]
(The sum over an empty set is defined to be 0.)
Then P is a probability function on B.
This remains true if S = {s1, s2, . . .} is a countable set.
(See next slide for a proof)
Proof:
For any A ∈ B, $P(A) = \sum_{\{i : s_i \in A\}} p_i \ge 0$
⟹ Axiom 1 is true.
\[
P(S) = \sum_{\{i : s_i \in S\}} p_i = \sum_{i=1}^{n} p_i = 1
\]
⟹ Axiom 2 is true.
Let A1, . . . , Ak denote pairwise disjoint events. (B contains only
a finite number of sets, so we need consider only finite disjoint
unions.) Then,
\[
P\left(\bigcup_{i=1}^{k} A_i\right)
= \sum_{\{j : s_j \in \bigcup_{i=1}^{k} A_i\}} p_j
= \sum_{i=1}^{k} \sum_{\{j : s_j \in A_i\}} p_j
= \sum_{i=1}^{k} P(A_i)
\]
⟹ Axiom 3 is true.
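The construction in the theorem is easy to try out on a small example. The sketch below assumes a three-point sample space with unequal weights 1/2, 1/3 and 1/6 (hypothetical values that sum to 1), takes B to be the collection of all subsets, and checks the three axioms by enumeration. Additivity is tested for pairs of disjoint events, from which the finite case follows by induction.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical weights p_i attached to the outcomes s_i; they must sum to 1.
weights = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}
S = frozenset(weights)

def P(A):
    """P(A) = sum of p_i over outcomes s_i in A (the empty sum is 0)."""
    return sum((weights[s] for s in A), Fraction(0))

# Take B to be every subset of S.
events = [frozenset(c)
          for r in range(len(S) + 1)
          for c in combinations(sorted(S), r)]

axiom1 = all(P(A) >= 0 for A in events)
axiom2 = P(S) == 1
axiom3 = all(P(A | B) == P(A) + P(B)
             for A in events for B in events if A.isdisjoint(B))

print(axiom1, axiom2, axiom3)
```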
Some properties
P(∅) = 0
P(E) ≤ 1
P(E^c) = 1 − P(E), or equivalently, P(E) + P(E^c) = 1.
If E ⊂ F , then P(E) ≤ P(F ).
Proof (of the last property).
Since E ⊂ F, we have EF = E, so F = (E ∪ E^c)F = (EF) ∪ (E^cF) = E ∪ (E^cF)
(Exercise: use a Venn diagram to show this.)
Note that E and E^cF are mutually exclusive.
By Axiom 3:
P(F) = P(E) + P(E^cF) ≥ P(E)
(since P(E^cF) ≥ 0 by Axiom 1)
Some properties
P(E ∩ F^c) = P(E) − P(E ∩ F).
Proof.
E = (E ∩ F) ∪ (E ∩ F^c) (Exercise: show this using Venn
diagrams.)
E ∩ F and E ∩ F^c are mutually exclusive, so by Axiom 3
P(E) = P(E ∩ F) + P(E ∩ F^c).
Rearranging gives the result.
Some properties
P(E ∪ F ) = P(E) + P(F )− P(EF ).
Proof.
E ∪ F = S(E ∪ F) = (E ∪ E^c)(E ∪ F) = E ∪ (E^cF)
E and E^cF are mutually exclusive, so by Axiom 3
P(E ∪ F) = P(E ∪ (E^cF)) = P(E) + P(E^cF)
Note F = (EF) ∪ (E^cF), with EF and E^cF mutually exclusive, so by Axiom 3
P(F) = P(EF) + P(E^cF), i.e., P(E^cF) = P(F) − P(EF)
Substituting into the previous line gives P(E ∪ F) = P(E) + P(F) − P(EF).
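The properties on these slides can be sanity-checked on the fair-die model. The sketch below assumes equally likely faces, so P(A) is the number of outcomes in A divided by 6. The events E, F and G are hypothetical choices, with G picked so that E ⊂ G.

```python
from fractions import Fraction

S = frozenset(range(1, 7))  # faces of a fair die

def P(A):
    """Equally likely outcomes: P(A) = |A| / |S|."""
    return Fraction(len(A), len(S))

E = frozenset({1, 2, 3})
F = frozenset({2, 4, 6})
G = frozenset({1, 2, 3, 4})  # chosen so that E is a subset of G

complement_rule = P(S - E) == 1 - P(E)           # P(E^c) = 1 - P(E)
monotone = P(E) <= P(G)                          # E ⊂ G implies P(E) <= P(G)
difference_rule = P(E - F) == P(E) - P(E & F)    # P(E ∩ F^c) = P(E) - P(E ∩ F)
addition_rule = P(E | F) == P(E) + P(F) - P(E & F)

print(complement_rule, monotone, difference_rule, addition_rule)
print(P(E | F))  # 5/6
```

Here E and F overlap in the single outcome 2, so adding P(E) and P(F) counts it twice; subtracting P(EF) = 1/6 corrects the total to 5/6.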
Some properties
$P(A) = \sum_{i=1}^{\infty} P(A \cap C_i)$ for any partition C1, C2, . . . of S
Proof.
\[
A = A \cap S = A \cap \left(\bigcup_{i=1}^{\infty} C_i\right) = \bigcup_{i=1}^{\infty} (A \cap C_i)
\]
where the last equality follows from the Distributive Law.
Note that the Ci are disjoint ⟹ the sets A ∩ Ci are also disjoint
(Exercise: show this using Venn diagrams.)
Thus, by Axiom 3,
\[
P(A) = P\left(\bigcup_{i=1}^{\infty} (A \cap C_i)\right) = \sum_{i=1}^{\infty} P(A \cap C_i)
\]
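A small numerical instance makes the decomposition concrete. The sketch below again assumes the fair-die model. The partition {1,2}, {3,4}, {5,6} and the event A = {1,2,3} are hypothetical choices for illustration.

```python
from fractions import Fraction

S = frozenset(range(1, 7))  # faces of a fair die

def P(A):
    """Equally likely outcomes: P(A) = |A| / |S|."""
    return Fraction(len(A), len(S))

# A partition of S into three blocks, and an event A to decompose.
partition = [frozenset({1, 2}), frozenset({3, 4}), frozenset({5, 6})]
A = frozenset({1, 2, 3})

# The pieces A ∩ C_i are disjoint and their union is A.
pieces = [A & C for C in partition]
total = sum(P(piece) for piece in pieces)

print([sorted(piece) for piece in pieces])
print(total == P(A))
```

The third piece is empty and contributes probability 0, which is consistent with P(∅) = 0.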
Boole’s inequality
Boole’s inequality: $P(E_1 \cup E_2 \cup \cdots \cup E_n) \le \sum_{i=1}^{n} P(E_i)$.
Proof.
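Write A1, A2, . . . for the events; the finite case in the statement is
recovered by taking Ai = Ei for i ≤ n and Ai = ∅ for i > n.
Construct a disjoint collection $A_1^*, A_2^*, \ldots$ with the property that
$\bigcup_{i=1}^{\infty} A_i^* = \bigcup_{i=1}^{\infty} A_i$ in the following way:
\[
A_1^* = A_1, \qquad A_i^* = A_i \setminus \left(\bigcup_{j=1}^{i-1} A_j\right), \quad i = 2, 3, \ldots
\]
Notation: $A \setminus B = A \cap B^c$.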
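The construction above can be sketched for a finite list of events: each starred set keeps only the outcomes not already seen in earlier sets. The function name disjointify, the three example events and the fair-die probabilities are all assumptions made for illustration.

```python
from fractions import Fraction

def disjointify(sets):
    """Return A*_1, A*_2, ... where A*_i = A_i minus the union of A_1..A_{i-1}."""
    seen = set()      # union of the sets processed so far
    starred = []
    for A in sets:
        starred.append(set(A) - seen)
        seen |= set(A)
    return starred

A = [{1, 2, 3}, {2, 3, 4}, {4, 5}]   # overlapping events (hypothetical)
A_star = disjointify(A)

same_union = set().union(*A_star) == set().union(*A)
pairwise_disjoint = all(
    A_star[i].isdisjoint(A_star[j])
    for i in range(len(A_star))
    for j in range(i + 1, len(A_star))
)

# Fair-die probabilities, to compare the two sides of Boole's inequality.
S = set(range(1, 7))

def P(E):
    return Fraction(len(E), len(S))

lhs = P(set().union(*A))        # probability of the union
rhs = sum(P(Ai) for Ai in A)    # sum of the individual probabilities

print(A_star)
print(same_union, pairwise_disjoint, lhs <= rhs)
```

In this example the union has probability 5/6 while the individual probabilities sum to 4/3, so the bound holds with room to spare because the overlaps are counted more than once on the right-hand side.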