CS318 Programming Assignment 2
Assembler
Upload your completed Assembler.java file to the Blackboard page for this assignment.
Instructions:
• Download the ZIP compressed file from this assignment page in Blackboard (CS318_Prog2.zip), and
extract the files.
• The starter code files are Assembler.java, LabelOffset.java, and Opcode.java.
• The solution from Assignment 1 is provided as a Java Archive (JAR) file: prog1.jar. This JAR contains the
completed classes from Assignment 1. The Binary class will be useful for Assignment 2.
• The Javadoc documentation for the Binary, ALU, LabelOffset, and Opcode classes is included in the
Prog2Doc folder in the zip file.
• Complete the two methods in the Assembler class, as described below and indicated by the comments
(pass1 and pass2). You must follow the instructions for the actions that each of these methods must
perform.
• Do not change the assemble method in the Assembler class; do not change the LabelOffset or Opcode
classes.
• You may write additional private methods in the Assembler class.
• You are provided with a Java test program (TestAssembler.java) that runs the assembler over four
assembly code programs, and compares your program’s output with the correct output files. The tests
increase in complexity. Work on passing Test 1, then Test 2, etc.
Guidelines for working with others on this assignment:
It is strongly recommended that you work with a partner on this assignment. Your partner is the only person
with whom you may share your code solution.
You may discuss ideas about how to do things in Java with any of your classmates, tutors, and other students.
This assignment requires the use of files, Strings, and other Java concepts in ways that may be new to you. You
may communicate with others about how to handle these Java concepts.
Overview:
This is the second of four assignments in which we build a simulation of a simple computer. In this
assignment, you will write the Assembler that translates an assembly language program into machine code. The
input to the Assembler is a file with assembly language code. The Assembler has two output files: the binary
data segment (.data file) and the binary code segment (.code file).
Our assembler supports an assembly language that is a restricted version of the A64 language. It supports the
eight A64 instructions listed below, as well as a halt (HLT) instruction that indicates the end of the program. In
order to differentiate our simulated computer from the ARMv8 emulator, the registers in our simulated
computer will be named R0 through R31.
We also support a restricted version of the data segment. The only type is a 32-bit word type, and the values will
always be signed decimal integers. The data segment will not contain any labels. References to the data segment
from within the code segment assume that the first address of the data segment is offset zero.
Instructions supported by our simulated computer
• Add values in Rm and Rn, put result in Rd: ADD Rd,Rm,Rn
• Subtract the value in Rn from the value in Rm, put result in Rd: SUB Rd,Rm,Rn
• Logical AND of values in Rm and Rn, put result in Rd: AND Rd,Rm,Rn
• Logical OR of values in Rm and Rn, put result in Rd: ORR Rd,Rm,Rn
• Load value from memory address (value in Rm plus literal offset) into Rd: LDR Rd,[Rm,#N]
• Store value to memory address (value in Rm plus literal offset) from Rd: STR Rd,[Rm,#N]
• Branch to label if contents of Rm are zero: CBZ Rm,label
• Unconditional branch to label: B label
• End of program (halt): HLT
Machine Language format for the Instructions
This is the human-readable format with the MSB on the left and LSB (bit at index zero) on the right.
The letter ‘B’ indicates a bit that will be filled in with a 0 or a 1.
Instr.     opcode          Source Reg. 2   Shift Amount   Source Reg. 1   Dest. Reg.
bit range  [31-21]         [20-16]         [15-10]        [9-5]           [4-0]
ADD        100 0101 1000   BBBBB           000000         BBBBB           BBBBB
SUB        110 0101 1000   BBBBB           000000         BBBBB           BBBBB
AND        100 0101 0000   BBBBB           000000         BBBBB           BBBBB
ORR        101 0101 0000   BBBBB           000000         BBBBB           BBBBB

Instr.     opcode          Immediate       Bits 11-10     Base Reg.       Value Reg.
bit range  [31-21]         [20-12]         [11-10]        [9-5]           [4-0]
LDR        111 1100 0010   B BBBB BBBB     00             BBBBB           BBBBB
STR        111 1100 0000   B BBBB BBBB     00             BBBBB           BBBBB

Instr.     opcode          Immediate                      Register
bit range  [31-24]         [23-5]                         [4-0]
CBZ        1011 0100       BBB BBBB BBBB BBBB BBBB        BBBBB

Instr.     opcode          Immediate
bit range  [31-26]         [25-0]
B          000101          BB BBBB BBBB BBBB BBBB BBBB BBBB

Instr.     opcode          not used
bit range  [31-21]         [20-0]
HLT        110 1010 0010   0 0000 0000 0000 0000 0000
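As an illustration of how the bit ranges above combine into one 32-bit word, here is a minimal sketch that packs the fields with shifts and masks. The class and method names (EncodingSketch, encodeR, encodeD, toBits) are hypothetical and are not part of the starter code. The sketch uses plain int arithmetic rather than the Binary class, whose API is not shown in this handout. Because the handout does not state which of Rm and Rn is Source Reg. 1 and which is Source Reg. 2, the methods take the fields by table name and leave that mapping to the caller.

```java
final class EncodingSketch {
    // Opcode bit patterns copied from the format tables (bits 31-21).
    static final int ADD_OPCODE = 0b10001011000;
    static final int LDR_OPCODE = 0b11111000010;

    /** ADD/SUB/AND/ORR layout: opcode [31-21], src2 [20-16], shift 000000 [15-10], src1 [9-5], dest [4-0]. */
    static int encodeR(int opcode, int src2, int src1, int dest) {
        return (opcode << 21) | ((src2 & 0x1F) << 16) | ((src1 & 0x1F) << 5) | (dest & 0x1F);
    }

    /** LDR/STR layout: opcode [31-21], immediate [20-12], 00 [11-10], base [9-5], value [4-0]. */
    static int encodeD(int opcode, int immediate, int base, int value) {
        // Masking with 0x1FF keeps the low 9 bits, which is the
        // two's complement form of a negative offset that is in range.
        return (opcode << 21) | ((immediate & 0x1FF) << 12) | ((base & 0x1F) << 5) | (value & 0x1F);
    }

    /** Human-readable 32-character string, bit 31 on the left and bit 0 on the right. */
    static String toBits(int word) {
        return String.format("%32s", Integer.toBinaryString(word)).replace(' ', '0');
    }

    public static void main(String[] args) {
        // Field values chosen for illustration only: src2 = 3, src1 = 2, dest = 1.
        System.out.println(toBits(encodeR(ADD_OPCODE, 3, 2, 1)));
        // Immediate -8, base register 4, value register 5.
        System.out.println(toBits(encodeD(LDR_OPCODE, -8, 4, 5)));
    }
}
```

The same shift-and-mask pattern extends to the CBZ, B, and HLT layouts by changing the field widths and positions.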
Implementation Overview:
Assembler pass1 method:
Read the assembly code file and determine the size (in bytes) of the data and code segments. Also create a list
of the labels in the code segment and their relative offsets. At the end of pass 1, the integer size of the data
and code segments must be written to their respective output files.
• Data segment: the only data type is the 4-byte (32-bit) word type. The size of the data segment is the
number of values multiplied by 4-bytes per value. The .word directive will be the first token on each
line. The remaining tokens on each line are a comma separated list of signed decimal values. The data
segment has an arbitrary number of lines, and each line has an arbitrary number of values. There are
no labels in the data segment.
• Code segment: Each instruction will be stored in 4-bytes (32-bits). The size of the code segment is the
number of instructions multiplied by 4-bytes per instruction.
• Code segment labels: A label is a string of letters that ends with a colon. The relative offset of a label is
(4 x number of instructions before label). That is, the relative offset is the number of bytes in memory
from the beginning of the program to the instruction that follows the label. Use the LabelOffset class to
store the text of the label and its offset value.
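The pass 1 bookkeeping described above can be sketched as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical, the lines of each segment are assumed to have been separated already (how the input file marks the two segments is not described here), each label is assumed to sit on its own line, and a Map stands in for the LabelOffset class because its constructor and methods are not shown in this handout.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class Pass1Sketch {
    static final int BYTES_PER_WORD = 4;

    /** Data segment size: 4 bytes for every value on every .word line. */
    static int dataSegmentBytes(List<String> dataLines) {
        int values = 0;
        for (String raw : dataLines) {
            String line = raw.trim();
            if (!line.startsWith(".word")) {
                continue; // skip blank or unrelated lines
            }
            String list = line.substring(".word".length()).trim();
            if (!list.isEmpty()) {
                values += list.split(",").length;
            }
        }
        return values * BYTES_PER_WORD;
    }

    /** Label offsets: 4 x number of instructions seen before the label. */
    static Map<String, Integer> labelOffsets(List<String> codeLines) {
        Map<String, Integer> offsets = new LinkedHashMap<>();
        int instructions = 0;
        for (String raw : codeLines) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue;
            }
            if (line.endsWith(":")) {
                // Assumption: a label occupies its own line.
                String name = line.substring(0, line.length() - 1);
                offsets.put(name, instructions * BYTES_PER_WORD);
            } else {
                instructions++;
            }
        }
        return offsets;
    }

    /** Code segment size: 4 bytes for every non-label, non-blank line. */
    static int codeSegmentBytes(List<String> codeLines) {
        int instructions = 0;
        for (String raw : codeLines) {
            String line = raw.trim();
            if (!line.isEmpty() && !line.endsWith(":")) {
                instructions++;
            }
        }
        return instructions * BYTES_PER_WORD;
    }
}
```

For instance, two data lines holding three values and one value give a data segment of 16 bytes, and a label that follows four instructions gets offset 16.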
Assembler pass2 method:
Read the assembly code file and write the binary data segment and binary machine language code segment to
their respective files. Each line of the binary files contains 1 byte (8 bits), where bit 0 is on the left and bit 7 is
on the right. The 4 bytes of each number or instruction are written in little-endian order: the least significant
byte is written first.
For example, the 32-bit string (shown in human-readable format with bit 0 on the right):
1000 0101 1010 1100 0011 0101 1010 1100
is written to the binary file with the least significant byte first, and each byte written with bit 0 on the left:
           bit 0   bit 1   bit 2   bit 3   bit 4   bit 5   bit 6   bit 7
1st byte   0       0       1       1       0       1       0       1
2nd byte   1       0       1       0       1       1       0       0
3rd byte   0       0       1       1       0       1       0       1
4th byte   1       0       1       0       0       0       0       1
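The reordering in this example can be sketched in a few lines. This is an illustration only: it assumes the 32-bit value is held as a String of '0' and '1' characters in human-readable order (bit 31 first), which may differ from the representation used by the Binary class, and the names ByteOrderSketch and toFileLines are hypothetical.

```java
final class ByteOrderSketch {
    /**
     * Splits a 32-character bit string (bit 31 at index 0, bit 0 at index 31)
     * into the four lines written to the output file: least significant byte
     * first, and each byte with bit 0 on the left.
     */
    static String[] toFileLines(String bits32) {
        if (bits32.length() != 32) {
            throw new IllegalArgumentException("expected 32 bits, got " + bits32.length());
        }
        String[] lines = new String[4];
        for (int b = 0; b < 4; b++) {
            int end = 32 - 8 * b; // exclusive end index of byte b
            String msbFirst = bits32.substring(end - 8, end);
            // Reversing puts bit 0 of the byte on the left.
            lines[b] = new StringBuilder(msbFirst).reverse().toString();
        }
        return lines;
    }

    public static void main(String[] args) {
        // The 32-bit string from the example above, spaces removed.
        for (String line : toFileLines("10000101101011000011010110101100")) {
            System.out.println(line);
        }
    }
}
```

Running main on the example string prints 00110101, 10101100, 00110101, and 10100001, one per line, matching the four rows of the table.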
Assumptions:
• Assume that the assembly language code files used for testing will have the correct format.
• The literal values in the assembly language code (data segment and code segment) are always in
signed decimal representation, and are within the valid range. You may use the conversion methods
from the Assignment 1 Binary class. When you need a binary value that is less than 32 bits (such as a
5-bit register number), you may assume that the low-order bits of the 32-bit representation hold the
correct value.
• The register numbers in the assembly language code are valid (between 0 and 31, inclusive).
• Valid file pathnames are provided. The Assembler class may throw FileNotFoundException,
IOException, and any other exceptions that are related to file I/O problems.
Suggestions:
• In pass 1, write the size of the data and code segments to their respective files. In pass 2, make sure to
open these files in append mode so that you do not overwrite what was written during the first pass.
The FileWriter and PrintWriter classes allow for opening a file in append mode.
• One approach for reading the input file is to use the Scanner class and read one line at a time into a
String object using the nextLine method. Use the String class trim method to remove leading and
trailing whitespace from the string. Then use other String class methods to extract the information you
need from the string. Some String methods that may be useful are split and substring.
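The two suggestions above can be sketched as follows. This is one possible approach, not a required one: the class and method names are hypothetical, and splitting on a regular expression that treats whitespace, commas, brackets, and '#' as separators is a convenience chosen for this sketch. The FileWriter constructor that takes a boolean second argument opens the file in append mode.

```java
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;

final class ParsingSketch {
    /**
     * Splits one instruction line into its mnemonic and operands.
     * Example: "  LDR R1,[R2,#-8] " gives {"LDR", "R1", "R2", "-8"}.
     */
    static String[] tokens(String rawLine) {
        return rawLine.trim().split("[\\s,\\[\\]#]+");
    }

    /** Converts a register token such as "R17" to its number. */
    static int registerNumber(String token) {
        return Integer.parseInt(token.substring(1));
    }

    /** Appends one line of text to a file without overwriting its contents. */
    static void appendLine(String path, String text) throws IOException {
        // The second constructor argument (true) selects append mode.
        try (PrintWriter out = new PrintWriter(new FileWriter(path, true))) {
            out.println(text);
        }
    }

    public static void main(String[] args) {
        String[] t = tokens("  LDR R1,[R2,#-8] ");
        System.out.println(t[0] + " value=" + registerNumber(t[1])
                + " base=" + registerNumber(t[2])
                + " offset=" + Integer.parseInt(t[3]));
    }
}
```

The try-with-resources statement closes the writer even if an exception is thrown, so output written in pass 1 is flushed to disk before pass 2 reopens the file.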