Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: THEend8_
COMP4141
Theory of Computation
Lecture 7: Algorithms for CFLs
Administrivia
Assignment 1 now overdue
Assignment 2 available: due Monday, 21 March at 12 noon
(AEDST)
2
Outline
Decision Problems
Algorithms for Regular Languages
Algorithms for Context-Free Languages
3
Common questions about languages
We have seen a number of ways of representing languages in a
finite way (automata, regular expresions, grammars). This lets us
address several questions algorithmically:
Emptiness: Given a representation, is the associated language
empty?
Universality: Given a representation, is the associated
language universal?
Word/Acceptance problem: Given a word w and a
representation of a language L, is w ∈ L?
Language equivalence: Given representations of two
languages, L1 and L2, is L1 = L2?
Language inclusion: Given representations of two languages,
L1 and L2, is L1 ⊆ L2?
4
Decision Problems
When we can phrase a question we want answered as a Yes/No
question this is known as a Decision Problem.
For example:
NFA Emptiness
Input: An NFA N
Question: Is L(N) = ∅
5
Outline
Decision Problems
Algorithms for Regular Languages
Algorithms for Context-Free Languages
6
Deciding emptiness of a regular language
How to decide emptiness of a regular language L depends on its
representation, of which we’ve met a few.
NFA Emptiness
Input: An NFA N
Question: Is L(N) = ∅?
RegExp Emptiness
Input: A regular expression R
Question: Is L(R) = ∅?
7
NFA Emptiness
When L is given as L(A) of an NFA (Q,Σ, δ, q0,F ) (or DFA,
-NFA) is an exercise in graph reachability: Is there a final state
that can be reached from the initial state?
This can be done by a depth-first search in time linear in the
number of edges and vertices. (But note there could be |Q|2
edges.)
8
RegExp Emptiness
When L is given as L(R) of a regular expression R then we can
abstract inductively as follows:
Base: L(∅) = ∅, L() 6= ∅, and L(a) 6= ∅.
Induction:
L(R1 ∪ R2) = ∅ iff L(R1) = L(R2) = ∅.
L(R1 · R2) = ∅ iff L(R1) = ∅ or L(R2) = ∅.
L(R∗1 ) 6= ∅.
L((R1)) = ∅ iff L(R1) = ∅.
This implies that L(R) = ∅ can be decided in O(|R|) time.
9
The word problem for regular languages
Like the emptiness problem, the word problem depends on the
representation.
DFA Acceptance
Input: A DFA D and a word w
Question: Is w ∈ L(D)?
NFA Acceptance
Input: An NFA N and a word w
Question: Is w ∈ L(N)?
10
Word problems
DFA: easy, just feed the word to the automaton.
NFA: marking algorithm:
On input w = a1 . . . a|w |
1 Set of marks M := {q0}.
2 For i = 1 to |w | do
M :=
⋃
q∈M δ(q, ai )
3 Return whether F ∩M 6= ∅.
Others: translate to an equivalent DFA.
11
The inclusion problem for regular languages
DFA Language Inclusion
Input: Two DFAs A1 and A2
Question: Is L(A1) ⊆ L(A2)?
Algorithm:
1 Construct a DFA recognising L(A1) ∩ L(A2)
2 Check for emptiness
Other represntations: Translate to equivalent DFAs.
12
The inclusion problem for regular languages
DFA Language Inclusion
Input: Two DFAs A1 and A2
Question: Is L(A1) ⊆ L(A2)?
Algorithm:
1 Construct a DFA recognising L(A1) ∩ L(A2)
2 Check for emptiness
Other represntations: Translate to equivalent DFAs.
12
The inclusion problem for regular languages
DFA Language Inclusion
Input: Two DFAs A1 and A2
Question: Is L(A1) ⊆ L(A2)?
Algorithm:
1 Construct a DFA recognising L(A1) ∩ L(A2)
2 Check for emptiness
Other represntations: Translate to equivalent DFAs.
12
The equivalence problem for regular languages
DFA Language Equivalence
Input: Two DFAs A1 and A2
Question: Is L(A1) = L(A2)?
Algorithm:
1 Check if L(A1) ⊆ L(A2)
2 Check if L(A2) ⊆ L(A1)
Other representations: translate to equivalent DFAs.
13
The equivalence problem for regular languages
DFA Language Equivalence
Input: Two DFAs A1 and A2
Question: Is L(A1) = L(A2)?
Algorithm:
1 Check if L(A1) ⊆ L(A2)
2 Check if L(A2) ⊆ L(A1)
Other representations: translate to equivalent DFAs.
13
The equivalence problem for regular languages
DFA Language Equivalence
Input: Two DFAs A1 and A2
Question: Is L(A1) = L(A2)?
Algorithm:
1 Check if L(A1) ⊆ L(A2)
2 Check if L(A2) ⊆ L(A1)
Other representations: translate to equivalent DFAs.
13
Outline
Decision Problems
Algorithms for Regular Languages
Algorithms for Context-Free Languages
14
Emptiness of CFLs
CFG Emptiness
Input: A CFG G = (V ,Σ,P,S)
Question: Is L(G ) = ∅?
The emptiness problem for CFGs can be decided as follows:
1 Mark the terminals and , as generating
2 Mark as generating all those non-terminals that have a
production with only generating symbols in the RHS.
3 Repeat step 2 until nothing new is marked generating.
4 Check whether the start symbol is marked as generating.
15
Emptiness of CFLs
CFG Emptiness
Input: A CFG G = (V ,Σ,P,S)
Question: Is L(G ) = ∅?
The emptiness problem for CFGs can be decided as follows:
1 Mark the terminals and , as generating
2 Mark as generating all those non-terminals that have a
production with only generating symbols in the RHS.
3 Repeat step 2 until nothing new is marked generating.
4 Check whether the start symbol is marked as generating.
15
Chomsky normal form
For many of the remaining questions, it is convenient to introduce
a particularly simple class of CFGs that is still as powerful as CFGs
in general.
Definition
A context free (type 2) grammar is in Chomsky normal form if
every production is of one of the forms
S → , or
A→ BC where B and C are not S , or
A→ a.
Theorem
Any CFL is generated by a CFG in Chomsky normal form.
16
Constructing the Chomsky normal form
Let G = (V ,Σ,P, S) be a CFG.
Step 0 Remove S and from RHS
Step 1 Remove terminals from RHS
Step 2 Remove singleton rules (A→ B)
Step 3 Reduce RHS to pairs of non-terminals
Step 0: Define an equivalent CFG G0 = (V ∪ {S0},Σ,P0,S0) with a fresh start
variable S0 and the production S0 → S . We also remove all productions of
the form A → and patch this up by introducing new productions for every
occurrence of A in a body with that occurrence removed.
(E.g., A→ | aAbA becomes A→ ab | abA | aAb | aAbA.)
Productions B → A are replaced by B → unless that is one we had removed
earlier.
Repeat until occurs at most in S0 → .
17
Constructing the Chomsky normal form
Let G = (V ,Σ,P, S) be a CFG.
Step 0 Remove S and from RHS
Step 1 Remove terminals from RHS
Step 2 Remove singleton rules (A→ B)
Step 3 Reduce RHS to pairs of non-terminals
Step 1: Define an equivalent CFG G1 = (V1,Σ,P1,S0) with fresh non-terminals
for terminals V1 = V ∪ {S0} ∪ { Xa | a ∈ Σ }. Here P1 is derived from P0 by
replacing all occurrences of a terminal a ∈ Σ in the body of a production by the
corresponding non-terminal Xa and then adding productions Xa → a.
17
Constructing the Chomsky normal form
Let G = (V ,Σ,P, S) be a CFG.
Step 0 Remove S and from RHS
Step 1 Remove terminals from RHS
Step 2 Remove singleton rules (A→ B)
Step 3 Reduce RHS to pairs of non-terminals
Step 2: Define an equivalent CFG G2 = (V1,Σ,P2, S0). To generate P2 from
P1, eliminate productions of the form A→ B by
1 determining all derivations A1 ⇒G1 . . .⇒G1 Ak ⇒G1 α /∈ V1 not
containing repetitions of non-terminals,
2 drop all productions A→ B, and
3 introduce A1 → α for each of the derivations determined before.
17
Constructing the Chomsky normal form
Let G = (V ,Σ,P, S) be a CFG.
Step 0 Remove S and from RHS
Step 1 Remove terminals from RHS
Step 2 Remove singleton rules (A→ B)
Step 3 Reduce RHS to pairs of non-terminals
Step 3: Define an equivalent CFG G3 = (V3,Σ,P3, S0). To generate P3 from
P2, replace all productions A→ B1 . . .Bn with n > 2 by A→ B1C1, C1 → B2C2,
. . . , Cn−2 → Bn−1Bn for fresh non-terminals Ci . (These are distinct for each
such production.)
17
Constructing the Chomsky normal form
Let G = (V ,Σ,P, S) be a CFG.
Step 0 Remove S and from RHS
Step 1 Remove terminals from RHS
Step 2 Remove singleton rules (A→ B)
Step 3 Reduce RHS to pairs of non-terminals
S → ASA | aB
A → B | S
B → b |
17
Constructing the Chomsky normal form
Let G = (V ,Σ,P, S) be a CFG.
Step 0 Remove S and from RHS
Step 1 Remove terminals from RHS
Step 2 Remove singleton rules (A→ B)
Step 3 Reduce RHS to pairs of non-terminals
S0 → S
S → ASA | aB | a
A → B | S | b |
B → b
17
Constructing the Chomsky normal form
Let G = (V ,Σ,P, S) be a CFG.
Step 0 Remove S and from RHS
Step 1 Remove terminals from RHS
Step 2 Remove singleton rules (A→ B)
Step 3 Reduce RHS to pairs of non-terminals