Monitor the OFFICIAL A10 Piazza Post for any corrections, updates, or frequently asked questions.
The assignment files are available here (they are also provided in seashell). You do not have to complete this assignment sequentially.
The content for question 1 is covered in section 12 of the notes.
The content for questions 2 and 3a are covered at the end of section 11 (trees). The content for question 3b, 3c and 4 has already been covered.
BLACK QUESTION (moderate collaboration allowed)
IMPORTANT: For this question, most of the code can be “borrowed” from the course notes. You only need to modify the provided code to meet the ADT specifications. For practice, you may want to implement the code from scratch, but you MUST ensure that the add/remove behaviour is the same as discussed in the notes. When inserting a duplicate key, do NOT do a remove followed by an insert — replace (free) the existing key/value pair with the new key/value.
For this question, you must complete the Dictionary ADT implementation (dictionary.c). There are only two primary differences between this ADT and the version in the notes:
Because of the first point above, you will need to provide functions to the dict_create function.
The dict_print function requires you to print out the underlying BST structure in one of the following orders:
INORDER
POSTORDER
Pre-order: F, B, A, D, C, E, G, I, H.
In-order: A, B, C, D, E, F, G, H, I.
Post-order: A, C, E, D, B, H, I, G, F.
These images are blatantly stolen from Tree_traversal page on Wikipedia, which can be consulted if you are not familiar with tree traversal terminology.
For this question, the Marmoset basic tests (“public” tests) will be identical to the simple
(assertion) test provided.
We have ALSO provided an additional advanced test, which demonstrates how you can truly have ANYTHING as a key or value. For this test, we have nested dictionaries: the values of the master dictionary are themselves a dictionary.
GOLD QUESTION (no collaboration allowed)
In the notes (here), we discuss how it can be confusing because there is section of memory and a data structure known as a “heap”. In this question you will implement a (min) heap data structure.
A minheap, like a binary search tree, is a form of binary tree. To be considered a minheap, the tree must satisfy two requirements:
The following are examples of a minheap:
The following violate at least one of the two requirements for a minheap:
Of course, in addition to the minheap data structure there is also a maxheap, where the order property specifies that the largest value is at the root of the tree.
Pseudocode for Adding an Element:
Notice that Step 2 will stop when the new node reaches the root or when the new node’s parent has a value smaller than or equal to the new node’s value.
Psuedocode for Removing the Root:
Notice that Step 3 will stop when the out-of-place node reaches a leaf or it has a value that is smaller than or equal to all its children.
Instead of storing a minheap as a set of linked nodes, it is far more common to encode the tree in an array. This is space efficient (there is no need to store pointers at every node) and makes it substantially easier to access the node that is farthest right on the deepest
level of the tree. Encoding the tree into an array works as follows:
The root node’s value is stored at index 0.
The left child of the node at index i is stored at index i*2+1. The right child of the node at index i is stored at index i*2+2.
The parent of the node at index i is stored at index (i-1)/2 [using int division].
Using this encoding, a complete binary tree will leave no “holes” in the array. In other words, if there are n nodes in the tree, then the node that is farthest right on the deepest level of the tree is stored at index n-1. Additionally, the next available spot in the tree that retains the completeness of the tree is index n.
A minheap can be easily used to sort numbers. Simply add all of the numbers to a heap, and then when they are removed they will be removed in ascending order.
For your implementation, you must use the data structure we have provided. You should use the array doubling scheme, but you do not have to shrink the array. Instead of doubling the length of the array, increase the array by (2 * maxlen + 1), as this will always ensure your array can store a completely full tree (e.g., 1, 3, 7, 15, …)
You MUST follow the procedures for adding and removing from a minheap described above. In addition, your heapsort function must use a heap as described above.
The following trie stores the words “a”, “ace” and “zoo”.
To see how the word “ace” is stored, consider that the first letter of the word is ‘a’, which is also the first letter of the alphabet. From the root node, we follow the first child link to the next node. This node is part of words that start with the letter ‘a’. Because the string “a” is contained in the trie, the value of end_word at this node is TRUE (filled in). You can think of this as corresponding to the null terminator for the word “a”. The next letter in the word “ace” is ‘c’ (third letter of the alphabet) and so we follow the third child to the next node. The word “ac” is not a valid word, so the value of end_word here is false.
Because the next letter ‘e’ is the fifth letter of the alphabet, we follow the fifth child.
There are no additional letters in the word “ace”, so the end_word at this node is TRUE to indicate that word “ace” ends there.
As an exercise, review how “zoo” is similarly stored.
We can now make a few observations about the structure of the tree. Each leaf in the tree must have end_word be true, and all of its children must be NULL, otherwise it would not be a valid leaf. Also, if the value of end_word is true at the root, then the empty string “” is contained in the trie. If there are no words in the trie, then the value of root would be NULL.
For this question, the Marmoset basic tests (“public” tests) will be identical to the simple
(assertion) test provided.
The only new wrinkle is that the wordlist ADT provides a wordlist_count operation which is O(1) (whereas, trie_count is O(n)).
We have provided an I/O-based client that will allow you to test your wordlist ADT.
For each possible change, we have shown what words would be generated for the word
“keyword”.
single deletions: each letter removed from the word. For a keyword of length n, this will be n suggestions.
“eyword”, “kyword”, … “keywor”
single insertions: every possible letter (a..z) inserted before each letter in the keyword (or at the end of the keyword). For a keyword of length n, this will be 26 * (n + 1) suggestions.
“akeyword”, “bkeyword”, … “zkeyword”, “kaeyword”, “kbeyword”, …
“kzeyword”, … “keyworzd”, “keyworda”, … “keywordz”
single substitutions: each letter replaced with every possible letter. For a keyword of length n, this will be 26 * n suggestions.
“aeyword”, “beyword”, … “zeyword”, “kayword”, … “keyworz”
adjacent swaps (transpositions): each adjacent pair of letters in the word swapped. For a keyword of length n, this will be n-1 suggestions. “ekyword”, “kyeword”, … “keywodr”
As described in suggest.h, suggest will not return a wordlist that contains the original keyword or the empty string (“”). You can simply remove those two words from the wordlist before returning the wordlist. Furthermore, if the wordlist to be returned is empty, no wordlist is returned and NULL is returned instead.
While hypothetically it might be possible to accommodate a pool_realloc (e.g., there is enough space before the original allocation but not after), pool_realloc is unsuccessful (returns NULL) if a new contiguous block of the required size cannot be accommodated.
freed by a single free within pool_destroy (and no reallocs). All addresses returned from successful calls to pool_alloc (and pool_realloc) must be addressess from within that pool.
In addition to the large pool itself, the ADT must use a data structure (e.g., a linked list or an array or a tree — your choice!) to maintain the pool and keep track of which areas of the pool have been allocated and which areas are still free. To maintain this data structure, you may use malloc, realloc, and free.
The following rules must be followed for allocating memory from within the pool:
pool_alloc returns the lowest possible address from within the pool that can accommodate the allocation size.
If the size argument of pool_realloc is less than or equal to the original allocation, the return address is the same as the address of the original allocation (addr).
If the size argument of pool_realloc is larger than the original allocation, and there is enough available (unallocated) space after the original allocation for the increase, the return address is the same as the address of the original allocation (addr).
Otherwise, the return address for pool_realloc is the same as what would be returned from a pool_alloc for the new size. Note that pool_realloc behaves the same as the “real” realloc where it does a new allocation, moves the data, and then frees the old allocation.
Note that while hypothetically you could use your pool to provide arrays of other types (e.g.,
ints) you should only use char arrays in your testing to avoid hardware alignment errors.
For this question, the MARMOSET basic tests (“public” tests) will be the same as the simple I/O test client provided.
Note that the test client can only have at most 26 active allocations at once (a…z), but your implementation must support an arbitrary number of allocations.
We have also provided a “DEMO” that you can use to further test your ADT, which may inspire you to create your own tests.