Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: THEend8_
COMP4141
Theory of Computation Lecture 7: Algorithms for CFLs Administrivia Assignment 1 now overdue Assignment 2 available: due Monday, 21 March at 12 noon (AEDST) 2 Outline Decision Problems Algorithms for Regular Languages Algorithms for Context-Free Languages 3 Common questions about languages We have seen a number of ways of representing languages in a finite way (automata, regular expresions, grammars). This lets us address several questions algorithmically: Emptiness: Given a representation, is the associated language empty? Universality: Given a representation, is the associated language universal? Word/Acceptance problem: Given a word w and a representation of a language L, is w ∈ L? Language equivalence: Given representations of two languages, L1 and L2, is L1 = L2? Language inclusion: Given representations of two languages, L1 and L2, is L1 ⊆ L2? 4 Decision Problems When we can phrase a question we want answered as a Yes/No question this is known as a Decision Problem. For example: NFA Emptiness Input: An NFA N Question: Is L(N) = ∅ 5 Outline Decision Problems Algorithms for Regular Languages Algorithms for Context-Free Languages 6 Deciding emptiness of a regular language How to decide emptiness of a regular language L depends on its representation, of which we’ve met a few. NFA Emptiness Input: An NFA N Question: Is L(N) = ∅? RegExp Emptiness Input: A regular expression R Question: Is L(R) = ∅? 7 NFA Emptiness When L is given as L(A) of an NFA (Q,Σ, δ, q0,F ) (or DFA, -NFA) is an exercise in graph reachability: Is there a final state that can be reached from the initial state? This can be done by a depth-first search in time linear in the number of edges and vertices. (But note there could be |Q|2 edges.) 8 RegExp Emptiness When L is given as L(R) of a regular expression R then we can abstract inductively as follows: Base: L(∅) = ∅, L() 6= ∅, and L(a) 6= ∅. Induction: L(R1 ∪ R2) = ∅ iff L(R1) = L(R2) = ∅. L(R1 · R2) = ∅ iff L(R1) = ∅ or L(R2) = ∅. L(R∗1 ) 6= ∅. L((R1)) = ∅ iff L(R1) = ∅. This implies that L(R) = ∅ can be decided in O(|R|) time. 9 The word problem for regular languages Like the emptiness problem, the word problem depends on the representation. DFA Acceptance Input: A DFA D and a word w Question: Is w ∈ L(D)? NFA Acceptance Input: An NFA N and a word w Question: Is w ∈ L(N)? 10 Word problems DFA: easy, just feed the word to the automaton. NFA: marking algorithm: On input w = a1 . . . a|w | 1 Set of marks M := {q0}. 2 For i = 1 to |w | do M := ⋃ q∈M δ(q, ai ) 3 Return whether F ∩M 6= ∅. Others: translate to an equivalent DFA. 11 The inclusion problem for regular languages DFA Language Inclusion Input: Two DFAs A1 and A2 Question: Is L(A1) ⊆ L(A2)? Algorithm: 1 Construct a DFA recognising L(A1) ∩ L(A2) 2 Check for emptiness Other represntations: Translate to equivalent DFAs. 12 The inclusion problem for regular languages DFA Language Inclusion Input: Two DFAs A1 and A2 Question: Is L(A1) ⊆ L(A2)? Algorithm: 1 Construct a DFA recognising L(A1) ∩ L(A2) 2 Check for emptiness Other represntations: Translate to equivalent DFAs. 12 The inclusion problem for regular languages DFA Language Inclusion Input: Two DFAs A1 and A2 Question: Is L(A1) ⊆ L(A2)? Algorithm: 1 Construct a DFA recognising L(A1) ∩ L(A2) 2 Check for emptiness Other represntations: Translate to equivalent DFAs. 12 The equivalence problem for regular languages DFA Language Equivalence Input: Two DFAs A1 and A2 Question: Is L(A1) = L(A2)? Algorithm: 1 Check if L(A1) ⊆ L(A2) 2 Check if L(A2) ⊆ L(A1) Other representations: translate to equivalent DFAs. 13 The equivalence problem for regular languages DFA Language Equivalence Input: Two DFAs A1 and A2 Question: Is L(A1) = L(A2)? Algorithm: 1 Check if L(A1) ⊆ L(A2) 2 Check if L(A2) ⊆ L(A1) Other representations: translate to equivalent DFAs. 13 The equivalence problem for regular languages DFA Language Equivalence Input: Two DFAs A1 and A2 Question: Is L(A1) = L(A2)? Algorithm: 1 Check if L(A1) ⊆ L(A2) 2 Check if L(A2) ⊆ L(A1) Other representations: translate to equivalent DFAs. 13 Outline Decision Problems Algorithms for Regular Languages Algorithms for Context-Free Languages 14 Emptiness of CFLs CFG Emptiness Input: A CFG G = (V ,Σ,P,S) Question: Is L(G ) = ∅? The emptiness problem for CFGs can be decided as follows: 1 Mark the terminals and , as generating 2 Mark as generating all those non-terminals that have a production with only generating symbols in the RHS. 3 Repeat step 2 until nothing new is marked generating. 4 Check whether the start symbol is marked as generating. 15 Emptiness of CFLs CFG Emptiness Input: A CFG G = (V ,Σ,P,S) Question: Is L(G ) = ∅? The emptiness problem for CFGs can be decided as follows: 1 Mark the terminals and , as generating 2 Mark as generating all those non-terminals that have a production with only generating symbols in the RHS. 3 Repeat step 2 until nothing new is marked generating. 4 Check whether the start symbol is marked as generating. 15 Chomsky normal form For many of the remaining questions, it is convenient to introduce a particularly simple class of CFGs that is still as powerful as CFGs in general. Definition A context free (type 2) grammar is in Chomsky normal form if every production is of one of the forms S → , or A→ BC where B and C are not S , or A→ a. Theorem Any CFL is generated by a CFG in Chomsky normal form. 16 Constructing the Chomsky normal form Let G = (V ,Σ,P, S) be a CFG. Step 0 Remove S and from RHS Step 1 Remove terminals from RHS Step 2 Remove singleton rules (A→ B) Step 3 Reduce RHS to pairs of non-terminals Step 0: Define an equivalent CFG G0 = (V ∪ {S0},Σ,P0,S0) with a fresh start variable S0 and the production S0 → S . We also remove all productions of the form A → and patch this up by introducing new productions for every occurrence of A in a body with that occurrence removed. (E.g., A→ | aAbA becomes A→ ab | abA | aAb | aAbA.) Productions B → A are replaced by B → unless that is one we had removed earlier. Repeat until occurs at most in S0 → . 17 Constructing the Chomsky normal form Let G = (V ,Σ,P, S) be a CFG. Step 0 Remove S and from RHS Step 1 Remove terminals from RHS Step 2 Remove singleton rules (A→ B) Step 3 Reduce RHS to pairs of non-terminals Step 1: Define an equivalent CFG G1 = (V1,Σ,P1,S0) with fresh non-terminals for terminals V1 = V ∪ {S0} ∪ { Xa | a ∈ Σ }. Here P1 is derived from P0 by replacing all occurrences of a terminal a ∈ Σ in the body of a production by the corresponding non-terminal Xa and then adding productions Xa → a. 17 Constructing the Chomsky normal form Let G = (V ,Σ,P, S) be a CFG. Step 0 Remove S and from RHS Step 1 Remove terminals from RHS Step 2 Remove singleton rules (A→ B) Step 3 Reduce RHS to pairs of non-terminals Step 2: Define an equivalent CFG G2 = (V1,Σ,P2, S0). To generate P2 from P1, eliminate productions of the form A→ B by 1 determining all derivations A1 ⇒G1 . . .⇒G1 Ak ⇒G1 α /∈ V1 not containing repetitions of non-terminals, 2 drop all productions A→ B, and 3 introduce A1 → α for each of the derivations determined before. 17 Constructing the Chomsky normal form Let G = (V ,Σ,P, S) be a CFG. Step 0 Remove S and from RHS Step 1 Remove terminals from RHS Step 2 Remove singleton rules (A→ B) Step 3 Reduce RHS to pairs of non-terminals Step 3: Define an equivalent CFG G3 = (V3,Σ,P3, S0). To generate P3 from P2, replace all productions A→ B1 . . .Bn with n > 2 by A→ B1C1, C1 → B2C2, . . . , Cn−2 → Bn−1Bn for fresh non-terminals Ci . (These are distinct for each such production.) 17 Constructing the Chomsky normal form Let G = (V ,Σ,P, S) be a CFG. Step 0 Remove S and from RHS Step 1 Remove terminals from RHS Step 2 Remove singleton rules (A→ B) Step 3 Reduce RHS to pairs of non-terminals S → ASA | aB A → B | S B → b | 17 Constructing the Chomsky normal form Let G = (V ,Σ,P, S) be a CFG. Step 0 Remove S and from RHS Step 1 Remove terminals from RHS Step 2 Remove singleton rules (A→ B) Step 3 Reduce RHS to pairs of non-terminals S0 → S S → ASA | aB | a A → B | S | b | B → b 17 Constructing the Chomsky normal form Let G = (V ,Σ,P, S) be a CFG. Step 0 Remove S and from RHS Step 1 Remove terminals from RHS Step 2 Remove singleton rules (A→ B) Step 3 Reduce RHS to pairs of non-terminals