COMP10002 Foundations of Algorithms
Foundations of Algorithms
Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: THEend8_
COMP10002 Foundations of Algorithms
1 Learning Outcomes
In this assignment you will demonstrate your understanding of dynamic memory allocation, linked data
structures, and search algorithms. You will further extend your skills in program design and implementation.
2 The Story...
Mining is the largest industry in Australia. It delivered 10.4% (i.e., $202 billion) of the Australian econ-
omy in 2019-2020 and a total of 1.1 million jobs.1. The era of Big Data and Industry 4.0 call for a new
kind of mining – data mining, or in more popular and generic terms, data science. Data science is an
interdisciplinary field that focuses on mining patterns from data to support decision making and optimi-
sation. In this assignment, you will gain an initial experience on how data science is impacting our daily lives.
The Victorian Dining and Entertainment Program offers 25% cash back on eligible dining and entertainment
purchases across Victoria. 2 Through this program, the Victorian government can harvest millions of dining
and entertainment transactions, which are a gold mine of opportunities. Now, the government is hiring you
as a data scientist to build a program to mine this dataset3, with the aim to identify Victorian’s dining
patterns so as to promote the restaurants and help with the recovery of the local business. Here, we focus
just on dining to simplify the assignment setting.
3 Your Task
You are given a list of restaurant records and a list dining transactions in the following format. See the next
page for a sample input.
The input starts with a list of at least 1 and up to 99 restaurant records sorted by the restaurant IDs
(e.g., ABNs) in ascending order. Each restaurant record takes one line with the following six fields:
• The restaurant ID, which is a unique 6-digit integer.
• The x-coordinate of the restaurant, which is a real number.
• The y-coordinate of the restaurant, which is a real number. For simplicity, all x- and y-coordinates
have been normalised into the range of (0, 100).
• The average price per head, which is a positive integer.
• The cuisine type, which is a string of at least 1 and up to 20 lower-case letters.
• The restaurant name, which is a string of at least 1 and up to 100 (lower-case letters or ‘_’) characters.
1https://www.australianmining.com.au/news/mining-industry-holds-largest-slice-of-australian-economy/
2https://www.vic.gov.au/victorian-dining-and-entertainment-program
3Assume that we have user permission to analyse the data.
1
The list of restaurant records ends with a line of five ‘#’s which serves as a separator line. Then, the input
continues with a non-preknown number (at least 1) of dining transactions sorted in ascending order of the
(unique) transaction time. Each dining transaction takes one line with the following four fields:
• The transaction date and time, which uses the following format: Year:Month:Day:Hour:Minute:Second.
• The customer ID, which is a string of 6 lower-case letters (each unique ID represents a different
customer).
• The restaurant ID, which is a 6-digit integer (and has appeared in the list of restaurants).
• The transaction amount, which is a positive integer.
You may assume that there are at least 3 unique customer IDs in the list of dining transactions.
You may assume that all test input follows the format described above. No input validity checking is needed.
Below is a sample set of input data following this format.
190947 34.2 37.6 50 aussie captain_cook
359210 31.7 07.4 30 british queen_of_the_fries
626944 97.4 84.8 40 chinese whale_fin_inn
680328 83.9 98.1 160 chinese tim_ho_wonton
699229 19.3 55.4 30 indian curry_palace
726136 94.8 67.2 10 italian tiramisu_pizza
789499 12.4 93.9 60 aussie seventeen_seeds
883358 37.9 72.4 110 italian lasagnes_kitchen
901093 33.3 92.6 140 british the_sherlock_home
987328 70.5 16.1 200 aussie mount_de_vue
#####
2022:04:05:17:05:55 jxrpfj 190947 42
2022:04:06:18:08:08 migbyt 359210 129
2022:04:07:13:50:45 syzbls 190947 170
2022:04:08:14:16:20 ftddia 680328 186
2022:04:08:19:39:00 ozhodc 883358 121
2022:04:15:09:06:49 ftddia 883358 195
2022:04:15:10:09:50 jxrpfj 190947 99
2022:04:15:13:45:29 migbyt 901093 173
2022:04:22:18:30:43 syzbls 680328 234
2022:04:23:17:26:20 syzbls 359210 45
2022:04:29:18:03:00 jxrpfj 789499 89
2022:04:29:20:07:00 jxrpfj 901093 192
3.1 Stage 1 - Read Restaurant Records (Up to 3 Marks)
Your first task is to read the restaurant records from the input data (using standard input functions with
input redirection, just like in Assignment 1; no file operations needed such as fopen). You should start with
designing a struct to represent a restaurant and then create an array to host the restaurant records.
When this stage is done, your program should output: the total number of restaurant records and the name
of the restaurant with the smallest average price per head. If there is a tie, print the name of the restaurant
with the smallest ID among the tied ones. The output of this stage based on the sample input is as follows.
==========================Stage 1==========================
Number of restaurants: 10
Restaurant with the smallest average price: tiramisu_pizza
3.2 Stage 2 - Read Dining Transactions (Up to 9 Marks)
Next, create a data structure to store the customer dining records. You need to create a struct to represent a
customer and a customer linked list (using a modification of listops.c provided in the assignment package)
to represent all unique customers in the dining transactions. The customer struct has two components:
1. A string to represent the customer ID, and
2
2. A restaurant visiting array to record the number of times that the customer has visited each restaurant
in the restaurant list. This array needs to be of the same size of the array of restaurants created in
Stage 1. The i-th element in this array stores the number of times that the customer has visited the
i-th restaurant in the array of restaurants.
For example, given the sample input above, the restaurant visiting array of customer jxrpfj should
be {2, 0, 0, 0, 0, 0, 1, 0, 1, 0}. Here, customer jxrpfj has visited restaurant #190947 (the
0-th restaurant) twice, restaurant #789499 (the 6-th restaurant) once, and restaurant #901093 (the
8-th restaurant) once. Thus, the 0-th element of the restaurant visiting array is 2, the 6-th and the
8-th elements of the array are both 1, and the rest of the array elements are all 0’s.
Your program needs to populate data into the customer linked list by reading the dining transactions from
the input. When a transaction is read:
• If the customer ID in this transaction is seen for the first time (for example, when reading the line
2022:04:05:17:05:55 jxrpfj 190947 42), create a new node to represent the customer, fill the
customer ID and initialise the restaurant visiting array with all 0’s except for a 1 at the element
corresponding to the restaurant in the transaction (that is, the first recorded visit to the restaurant),
and append the node to the end of the customer linked list.
• If the customer ID in this transaction has been seen in the input before (for example, when reading
the line 2022:04:15:10:09:50 jxrpfj 190947 99), find the node corresponding to this customer
ID in the customer linked list, and update its restaurant visiting array by increasing the element
corresponding to the restaurant in the transaction by 1 (that is, one more visit to the restaurant).
When this stage is done, your program should output the customer linked list in the form of a matrix, where
each row is the restaurant visiting record of a customer, and the i-th numeric column represents customers’
visiting records to the i-th restaurant in the array of restaurants.
==========================Stage 2==========================
jxrpfj: 2 0 0 0 0 0 1 0 1 0
migbyt: 0 1 0 0 0 0 0 0 1 0
syzbls: 1 1 0 1 0 0 0 0 0 0
ftddia: 0 0 0 1 0 0 0 1 0 0
ozhodc: 0 0 0 0 0 0 0 1 0 0
Note: If you are not confident with using linked data structures, you may use an array of customer structs
instead, assuming a maximum of 20 unique customers. However, if you do so, the full mark of this stage
will be reduced by 3 marks.
3.3 Stage 3 - Recommend Based on Restaurant Similarity (Up to 12 Marks)
Your third task is to identify restaurants that a customer may want to visit next time, based on the similarity
between the restaurants (a.k.a. content-based recommendation). In particular, for each customer C and each
restaurant R that customer C has visited before (that is, its corresponding element in the restaurant visiting
array of customer C is greater than zero), your program should identify all other unvisited restaurants that
• share the same cuisine with R, or
• are within a distance of 30 distance units (inclusive) from R, or
• have an average price per head which differs from that of R by no more than 20 (can be either higher
or lower).
For each restaurant identified, your program should change the corresponding element in the restaurant
visiting array of customer C to -1.
Here, we use the Euclidean distance to measure the distance between two restaurants. Given two restaurants
R1 and R2 with coordinates 〈x1, y1〉 and 〈x2, y2〉, respectively, their distance is calculated by:
distance(R1, R2) =
√
(x1 − x2)2 + (y1 − y2)2 (1)
The output of this stage is the updated customer linked list, again in the form of a matrix. For example,
given the sample input above, the output of this stage should be as shown in the next page.