MIE245 Data Structures and Algorithms
Data Structures and Algorithms
Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: THEend8_
MIE245 Data Structures and Algorithms
Assignment
Submission instructions:
● The assignment must be typed.
● Your submission should consist of three files: a PDF file named A1.pdf and two .py
file named grouping.py and fast_grouping.py
● Upload your submission files on Quercus. (Only submissions received on
Quercus, will be accepted.)
● Only one member of each team should submit the assignment.
● The cover page of each assignment should include the last name, first name, and
student number of all the team members. (Teams can consist of up to 4 members.)
Question (10 marks)
Prove or disprove each of the following using the methodology presented in class:
1. !() ∈ ("())
2. !"#$"#% ⊆ (!)
3. +!(), ⊆ ()
4. Ω+ !(), ⊆ Ω()
5. Ω+ !(), ⊆ O( √)
2
Question 2 (13 marks)
Consider the algorithm Alg1 described below in pseudo-code. Alg1 takes as input an array
(whose elements are either 0 or 1) and an integer (the size of the array). The index of the first
element of the array is 1.
1. Alg1(A, n)
2.
3. num1 ß 0
4. num0 ß 0
5. i ß 1
6.
7. while i <= n do
8. if A[i] = 1 then
9. num1 ß num1 + 1
10. if num1 >= n/2 then
11. return TRUE
12. endif
13. else
14. num0 ß num0 + 1
15. if num0 > n/2 then
16. return FALSE
17. endif
18. endif
19. i ß i +1
20. endfor
When estimating the run-time complexity of Alg1 as a function of the size of the input array
consider only the number of array comparisons performed when the algorithm is run on the
input array. (Note that array comparisons only occur on line 8 of Alg1.)
1. (3 marks) State the best-case running time complexity of Alg1 as a function of the size of
the input array. Do not use the asymptotic notation. Explain your answer (show your
calculations).
2. (3 marks) State the best-case running time complexity of Alg1 as a function of the size of
the input array. Do not use the asymptotic notation. Explain your answer (show your
calculations).
3. (7 marks) State the average case complexity of Alg1 as a function of the size of the input
array. Explain your answer using the methodology presented in class, i.e., define an appropriate
sample space, state a probability distribution function (assume a uniform distribution), define
the necessary random variables, etc. Show your calculations.
.
3
Question 3 (10 marks)
Let SortedList be a class whose instances are sorted lists. The index of the first element in
a SortedList instance is 1. (The index of the last element is equal to the number of elements
in the sorted list.)
The implementation of SortedList is such that the elements of a SortedList instance are
directly accessible only by the methods of this class. Of particular interest are the methods
length and check. Method check takes as input two integers and returns an integer. Method
length takes no input and returns an integer.
Let my_list by an instance of SortedList.
• The call my_list.length() returns the number of elements in the list.
• The call my_list.check(a,b) returns:
o -1, or 0, or 1 if the element with index a in my_list is strictly smaller than, or
equal to, or strictly larger than the element with index b, respectively;
o -2, if min{a,b} is strictly greater than the number of elements in my_list.
Assume that another team was asked to implement an algorithm with the following properties:
• P1: The input of the algorithm is a sorted list.
• P2: The output of the algorithm is one pair of integers (i, j) such that either (a) 1 £ i, 1 £ j, i
¹ j, and they are the indices of input list elements of equal value, or (b) i = j = -1, if no two
such elements are in the list. (No further restrictions are placed on i and j, i.e., they can
correspond to any pair of input list elements that are equal.)
Their implementation must take as input an instance of SortedList and no advanced
techniques, e.g., hashing, randomization, etc. are allowed. The implementation must be based
on examining pairs of elements from the input list (and must use method check to examine
them), but there are no restrictions related to the number of pairs checked and the order in
which they are considered.
Assuming that the implementation is sound and complete with respect to the specifications,
prove that n -1 is a tight lower bound for the number of calls to check the implementation must
make, where n is the size of the input list. (This lower bound holds for any implementation that
fulfills the requirements described above.)
4
Question 4 (6 marks)
Let position() be a function that returns the position of a letter in the Latin alphabet, e.g.,
position(I) = 9
Consider the following two hash functions that map each letter of the alphabet to an integer,
i.e., a hash table index:
• h(letter) = (7 ´ position(letter)) mod M, where M is an integer (the size of the hash table)
• h2(letter) = (position(letter) mod 3 ) + 1
Example: To get the index of letter I in a hash table of size 5, using h, we calculate h(I).
h(I) = (7 ´ 9) mod 5 = 63 mod 5 = 3. (3 is the hash table index.)
1. (2 marks) Consider an initially empty hash table T of size M = 5, that uses separate chaining
with unordered lists to handle collisions. Describe the contents of T after you insert items with
the keys A, L, G, O, R, I, T, H, M, S, in that order.
Use the hash function h(letter) = (7 ´ position(letter)) mod M to map each letter of the
alphabet to a hash table index.
2. (2 marks) Consider an initially empty hash table T of size M = 11, that uses linear probing to
handle collisions. Describe the contents of T after you insert items with the keys A, L, G, O, R,
I, T, H, M, S, in that order.
Use the hash function h(letter) = (7 ´ position(letter)) mod M to map each letter of the
alphabet to a hash table index.
3. (2 marks) Consider an initially empty hash table T of size M = 11, that uses double hashing.
Describe the contents of T after you insert items with the keys A, L, G, O, R, I, T, H, M, S, in
that order.
Use the hash function h(letter) = ( 7 ´ position(letter) ) mod M for the initial probe and the
hash function h2(letter) = ( position(letter) mod 3 ) + 1 for the search increment.
5
Question 5 (21 marks)
1. (6 marks) Provide a Python implementation for a function grouping(points_set) that
takes as input a set of points in the first quadrant of the Cartesian plane, points_set, and
returns a list with all the “line sections” that contain five or more of the points in the input set,
i.e., all the points in a “line section” must lie on the same line in the Cartesian plane. Each point
is represented as a tuple (x, y) of integers, where both x and y are greater or equal to 0, e.g.,
(2, 5). Your function must accept as input a set of points of arbitrary size and each “line section”
found should be returned as a set of points, e.g., {(1, 2), (1, 4), (1, 7), (1, 5), (1, 3)}.
For example, the call grouping ( { (1, 2), (1, 4), (1, 7), (1, 5), (1, 77), (1, 3), (4, 2), (3, 2), (100,
2), (55, 2), (31, 2) } ) should return [ { (1, 2), (1, 4), (1, 7), (1, 5), (1, 77), (1, 3) }, { (4, 2), (3, 2),
(100, 2), (55, 2), (31, 2) } ]
Your implementation should be inefficient, e.g., in Q(n2) or worse, where n is the number of
points in the input set (e.g., it could consider groups of five points at a time and check if they all
lie on the same line)
Do not use helper functions.
Save your implementation of grouping in a file called grouping.py. (The only function defined
in grouping.py must be grouping.)
2. (8 marks) Provide a Python implementation for a function called fast_grouping(
points_set) that, like grouping, takes as input a set of points in the first quadrant of the
Cartesian plane, and returns a list of all the “line sections” that contain five or more of the points
in the input set. The algorithm you design for this implementation should be efficient, i.e., in O(n
log n) or better, where n is the number of points in the input set.