Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: THEend8_
CS3342 – Assignment
!!! Submit all your responses for Q1 – Q5 as a single pdf file to Assignment 1 (Q1-Q5)
and your Python code for Q6 to Assignment 1 Q6 (code). Solutions should be typed but
high-quality hand-written solutions are acceptable.
1. (10pt) Consider a programming language that has comments as follows:
- comments start with a line beginning with three consecutive equal signs, ”===”, followed by
any characters until the end of the line (and a newline ’\n’ at the end)
- comments end with a line consisting of equal signs only, at least three (and a newline ’\n’ at
the end)
Example of a comment (newlines are shown for clarity):
======= this is a comment =======\n
*** some stuff ***
================\n
Write a regular expression that describes precisely the above described comments. You can use the
notation Σ− {a} to describe the set of all characters different from a.
2. (5pt) A scanner is built for a language where the identifiers start with a letter followed by any
number of letters, digits, or underscores, such that there are no consecutive underscores. Draw
a deterministic finite automaton (DFA) with fewest states that accepts all identifiers and nothing
else.
3. (10pt) Consider the simple calculator language (slide 16 of scanning) and add to it ranges:
range −→ digit digit∗.. digit digit∗
Modify the scanner table and DFA on slide 28 to reflect this change. (Note that the DFA already
requires some small changes to match the table, as discussed in class.) To save time, you are allowed
to cut-and-paste the table and DFA (as figures) and only indicate the changes.
4. (10pt) Keywords fit the definition of identifiers and scanners identify them as such, and then look
them up a table of keywords, because otherwise the DFA is unnecessarily larger. This question
addresses the size of this DFA. Assume that the identifiers and the (made-up) keywords are defined
as follows:
identifier −→ (_ | letter)(_ | letter | digit)∗
letter −→ a | b | · · · | z
digit −→ 0 | 1 | · · · | 9
keyword −→ float | floats | flop | top | tops
Draw a DFA with fewest states that recognizes the above keywords directly. That is, there will be
separate accepting states for keywords and different ones for identifiers. You can use any meaningful
notation such as letter− {a, b, · · · } or 6= a to denote various subsets of characters.
1
5. (15pt) Consider the following grammar, G:
1. P −→ S $$
2. S −→ if (e) then M O
3. S −→ other
4. M −→ if (e) then M else M
5. M −→ other
6. O −→ else T
7. O −→ ε
8. T −→ if (e) then T
9. T −→ other
(a) (5pt) Compute the sets first(X) and follow(X), for all non-terminals X, and the sets
predict(p) for all production rules p.
(b) (5pt) Prove that G an LL(1) grammar.
(c) (5pt) We said that there is no top-down grammar for if..then..else statements. Does G
contradict this? Prove your answer.
6. (50pt) Write a Python program comm rm.py to remove all comments from a Python program. The
program should work as follows:
> python comm_rm.py input_file.py
where input_file.py is any (correct) Python program. The program with comments removed will
be saved in the file input_file_rm.py.
You are not allowed to use the regular expression capabilities of Python, as that would defeat the
purpose of the question.
Notes: JFLAP: You are allowed to use JFLAP to help you solve the assignment. You still need to explain
clearly your solution. Also, make sure you understand what it does; JFLAP will not be available
during exams!
LLMs: You are allowed to use LLMs (Large Language Models), such as ChatGPT, but, again,
they will not be available during exams.