The objective of this exercise is to reconstruct (unzip) a message archived with a
binary-tree-based algorithm. The program should ask for a single filename at the start
with the prompt “Please enter filename to decode: ”, decode the message in the file, and
print it to the console. The name of the compressed message file ends in .arch, e.g.
“monalisa.arch”. The file consists of two or three lines: the first one or two lines
contain the encoding scheme, and the last line contains the archived message.
2. Encoding
The archival algorithm uses a binary tree. The edges of the tree represent bits, and the leaf
nodes contain one character each. Internal nodes are empty. An edge to a left child
always represents a 0, and an edge to a right child always represents a 1. Characters are
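encoded by the sequence of bits along the path from the root to a particular leaf. The
example tree used here (given in string form in section 3) yields the following codes:
a = 0, ! = 100, d = 1010, c = 1011, r = 110, b = 111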
With the above encoding, the bit string:
10110101011101101010100 is parsed as
1011|0|1010|111|0|110|1010|100
which is decoded as:
cadbard!
With this encoding, we can automatically infer where one character ends and the next
begins, because no character code is a prefix of another character code. For example,
if a character has the code 111, no character can have the code 1 or 11, as those
paths end at internal nodes.
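The prefix property can be checked mechanically. The sketch below is illustrative only: the code table is read off the example tree as described by the string in section 3, and the helper name is_prefix_free is made up for this example.

```python
# Codes read off the example tree (left edge = 0, right edge = 1).
CODES = {"a": "0", "!": "100", "d": "1010", "c": "1011", "r": "110", "b": "111"}


def is_prefix_free(codes):
    """Return True if no code is a prefix of another code."""
    words = sorted(codes.values())
    # In sorted order, a code that is a prefix of any other code is
    # also a prefix of its immediate successor, so checking neighbours suffices.
    return not any(nxt.startswith(cur) for cur, nxt in zip(words, words[1:]))
```

For the example table the check succeeds, while a table containing both 1 and 11 fails it, matching the argument above.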
The following steps decode one character from the bit string:
    Start at root
    Repeat until at leaf
        Scan one bit
        Go to left child if 0; else go to right child
    Print leaf payload
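The steps above, repeated for every character, might be sketched in Python as follows. This is one possible approach, not a required design: the nested-tuple tree representation and the names TREE and decode are assumptions made for illustration, and the tree is written out by hand to match the example. The sketch also assumes the tree has at least one internal node.

```python
# Hand-built example tree: a leaf is a one-character string,
# an internal node is a (left, right) tuple.
TREE = ("a", (("!", ("d", "c")), ("r", "b")))


def decode(tree, bits):
    """Decode a string of '0'/'1' characters by walking the tree."""
    out = []
    node = tree                      # start at root
    for bit in bits:                 # scan one bit at a time
        node = node[0] if bit == "0" else node[1]
        if isinstance(node, str):    # reached a leaf
            out.append(node)         # collect the leaf payload
            node = tree              # restart at root for the next character
    return "".join(out)


print(decode(TREE, "10110101011101101010100"))  # cadbard!
```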
3. Input Format
The archive file normally consists of two lines: the first line contains the encoding
scheme, and the second line contains the compressed string. For ease of development and to
make the archive file human-readable, each bit is represented as the character ‘0’ or ‘1’,
rather than as an actual bit in a binary file.
The encoding scheme can be represented as a string. For example, the tree from section 2
can be represented as:
^a^^!^dc^rb
where ^ indicates an internal node. The above code represents a preorder traversal of
the tree.
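Because the string is a preorder traversal, the tree can be rebuilt with a short recursive routine: read one symbol, and if it is ^, build the left subtree and then the right subtree. The sketch below is a minimal illustration; the function name parse_tree and the nested-tuple representation are assumptions, and it assumes ^ never occurs as a leaf character.

```python
def parse_tree(scheme):
    """Rebuild the tree from its preorder string; '^' marks an internal node.

    A leaf is returned as a one-character string and an internal node
    as a (left, right) tuple.
    """
    pos = 0

    def build():
        nonlocal pos
        symbol = scheme[pos]
        pos += 1
        if symbol == "^":
            left = build()    # preorder: left subtree comes first
            right = build()   # then the right subtree
            return (left, right)
        return symbol         # any other character is a leaf payload

    tree = build()
    if pos != len(scheme):
        raise ValueError("unexpected characters after the end of the tree")
    return tree


print(parse_tree("^a^^!^dc^rb"))
```

Any character other than ^ is treated as a leaf, so a space or newline inside the scheme is handled the same way as a letter.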
The cadbard! message is encoded in the following file (“cadbard.arch”):
^a^^!^dc^rb
10110101011101101010100
There are four test files in HW4S2021_Test_Files.zip. Note: the encoding scheme may contain
a space character and a newline character as leaf characters. A newline in the scheme breaks
the tree string across two lines, so the archive file then has three lines in total. In that
case the newline must be parsed as part of the encoding scheme, not as a line separator.
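One way to handle both layouts is to treat the last line as the message and everything before it as the encoding scheme, keeping any embedded newline. The sketch below is illustrative only: the names split_archive and read_archive are made up, and it assumes the message itself contains no newline and that any newline after the message is just a line terminator.

```python
def split_archive(text):
    """Split archive contents into (scheme, message).

    The message is the last line; the scheme is everything before it,
    including a newline if the tree string spans two lines.
    """
    lines = text.rstrip("\n").split("\n")
    message = lines[-1]
    scheme = "\n".join(lines[:-1])
    return scheme, message


def read_archive(path):
    """Read a .arch file and return (scheme, message)."""
    # Text mode with default settings converts Windows line endings to "\n".
    with open(path) as handle:
        return split_archive(handle.read())
```

For a two-line file the scheme comes back unchanged; for a three-line file the scheme contains the newline at the position where the tree string was broken.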