Database Systems
COSC 2406/2407
Assignment 1
Assessment Type: Individual assignment. Submit online via Canvas→Assignments→Assignment 1. Marks awarded
for meeting requirements as closely as possible. Clarifications/updates may be made via
announcements/relevant discussion forums.
Marks: 100 points (20% of the overall assessment)
1. Overview
You will use the AWS Linux instance assigned to you and the data from a public source to complete the following tasks:
1. implement in Java a heap file to store the data;
2. implement in Java a query on the data;
3. store and query the data in an Apache Derby relational database that you create; and
4. store and query the data in a MongoDB database that you create.
In the second assignment, you will extend your solution developed in this assignment and conduct further timing
experiments on your AWS Linux instance with indexes.
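As a rough illustration of the heap file idea in task 1, the following sketch packs fixed-length records into fixed-size pages and writes each full page to an output stream in arrival order. The class name HeapFileSketch, the page size, the record size and the pad-or-truncate rule are all assumptions made for illustration; none of them is prescribed by this specification.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch: an unordered (heap) file of fixed-length records.
class HeapFileSketch {
    private final OutputStream out;
    private final int pageSize;    // assumed page size in bytes
    private final int recordSize;  // assumed fixed record length in bytes
    private final byte[] page;     // buffer for the page currently being filled
    private int recordsInPage = 0;
    private int pagesWritten = 0;

    HeapFileSketch(OutputStream out, int pageSize, int recordSize) {
        if (recordSize <= 0 || recordSize > pageSize) {
            throw new IllegalArgumentException("a record must fit in one page");
        }
        this.out = out;
        this.pageSize = pageSize;
        this.recordSize = recordSize;
        this.page = new byte[pageSize];
    }

    int recordsPerPage() { return pageSize / recordSize; }

    int pagesWritten() { return pagesWritten; }

    // Appends one record. The text is zero-padded or truncated to recordSize
    // bytes (truncation may split a multi-byte character; fine for a sketch).
    void append(String text) {
        byte[] raw = text.getBytes(StandardCharsets.UTF_8);
        byte[] fixed = Arrays.copyOf(raw, recordSize);
        System.arraycopy(fixed, 0, page, recordsInPage * recordSize, recordSize);
        recordsInPage++;
        if (recordsInPage == recordsPerPage()) {
            flushPage();
        }
    }

    // Writes the last, partially filled page (if any) and flushes the stream.
    void close() {
        if (recordsInPage > 0) {
            flushPage();
        }
        try {
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes the whole page, including unused trailing bytes, then resets it.
    private void flushPage() {
        try {
            out.write(page, 0, pageSize);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Arrays.fill(page, (byte) 0);
        recordsInPage = 0;
        pagesWritten++;
    }

    public static void main(String[] args) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        HeapFileSketch heap = new HeapFileSketch(bytes, 64, 20);
        for (int i = 0; i < 7; i++) {
            heap.append("record-" + i);
        }
        heap.close();
        // 3 records fit per 64-byte page, so 7 records need 3 pages.
        System.out.println(heap.pagesWritten() + " pages, " + bytes.size() + " bytes");
    }
}
```

Writing whole pages, rather than individual records, means a later scan can read the file one page at a time; a real solution would write to a file stream instead of the in-memory stream used here.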
2. Learning Outcomes
This assessment relates to the following learning outcomes of the course:
• CLO 1: Explain and critique data structures and algorithms used to efficiently store and retrieve information in
database systems,
• CLO 2: Evaluate, critically analyse and compare alternative designs for implementation of database systems,
including data models, file structures, index schemes, and query evaluation, and
• CLO 4: Design, implement and report on significant software components of a database system (such as file
structures and index schemes) according to analysis of requirements and specified constraints.
Submit on Canvas: Assignments > Assignment 1. You MUST submit:
• a zip file of your code for tasks 1 and 2 (all Java source files including your git log); and
• your report (a single PDF file) that explains your approach and answers for each task (1, 2, 3 and 4), including
description of any scripts for data pre-processing, queries you used, and output.
Progress submission of your Java code in Week 4: You must make a progress submission of your Java code and your git
log. Failing to do this will result in a penalty of 10 points.
Progress submission due date: Week 4, Thursday 24 March 2022, 23:59.
Late submission:
• After the due time, you will have 7*24 hours to submit your assignment as a late submission. Late submissions
follow the same procedure but will be penalised by 10 points for each 24-hour period, or part thereof, that they
are late. For assignments that are more than 7*24 hours late, zero points will be awarded.
• Tasks (Section 4) in this assignment require you to run database systems and timing experiments in a cloud (AWS)
Linux instance. All experiments and database systems must be shut down before you log out of your Linux
instance. Otherwise your instance may run out of memory. Other precautions are provided via
announcements on Canvas. Ignoring such precautions will NOT be grounds for extension to due time.
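The late-submission arithmetic above can be worked through with a small sketch. The class and method names are hypothetical, and two details are assumptions rather than stated rules: that the penalty is deducted from the mark out of 100, and that the result cannot fall below zero.

```java
// Hypothetical helper illustrating the late-penalty rule described above.
class LatePenaltySketch {
    static final int POINTS_PER_PERIOD = 10;   // 10 points per (up to) 24 hours late
    static final int MAX_LATE_HOURS = 7 * 24;  // later than this scores zero

    // Returns the mark after the late penalty; hoursLate <= 0 means on time.
    static int markAfterPenalty(int rawMark, double hoursLate) {
        if (hoursLate <= 0) {
            return rawMark;
        }
        if (hoursLate > MAX_LATE_HOURS) {
            return 0;
        }
        // Any part of a 24-hour period counts as a full period.
        int periodsLate = (int) Math.ceil(hoursLate / 24.0);
        // Assumption: the mark is floored at zero.
        return Math.max(0, rawMark - POINTS_PER_PERIOD * periodsLate);
    }

    public static void main(String[] args) {
        System.out.println(markAfterPenalty(80, 0.5));  // 70: under 24 hours late
        System.out.println(markAfterPenalty(80, 25));   // 60: into the second period
        System.out.println(markAfterPenalty(80, 169));  // 0: more than 7*24 hours late
    }
}
```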
4. Academic integrity and plagiarism (standard warning)
Academic integrity is about honest presentation of your academic work. It means acknowledging the work of others while
developing your own insights, knowledge and ideas. You should take extreme care that you have:
• Acknowledged words, data, diagrams, models, frameworks and/or ideas of others you have quoted (i.e. directly
copied), summarised, paraphrased, discussed or mentioned in your assessment through the appropriate
referencing methods,
• Provided a reference list of the publication details so your reader can locate the source if necessary. This includes
material taken from Internet sites.
If you do not acknowledge the sources of your material, you may be accused of plagiarism because you have passed off
the work and ideas of another person without appropriate referencing, as if they were your own.
RMIT University treats plagiarism as a very serious offence constituting misconduct. Plagiarism covers a variety of
inappropriate behaviours, including:
• Failure to properly document a source
• Copyright material from the internet or databases
• Collusion between students
5. Marking Guidelines
Task 1 Heap file implementation: 40/100
Task 2 Range query implementation: 30/100
Task 3 Derby database and queries: 15/100
Task 4 MongoDB database and queries: 15/100
Assessment details
Data: The data that you are going to use in this assignment is available from:
and you can download the file on another machine and use scp to copy to your AWS Linux instance (you may need to
temporarily store files in the temp directory on titan if you are doing this outside of RMIT).
The first four lines are not data but headers for you to create and name fields in the databases for Tasks 3 and 4.
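When loading the file in Java, the header lines need to be skipped before records are processed. The sketch below shows one way this might be done; the class name, the method name and the sample input are hypothetical, and the header count of four is taken from the description above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: read a text source and keep only the data lines.
class HeaderSkipSketch {
    static final int HEADER_LINES = 4; // header lines stated in the description above

    // Returns every line after the header lines, in file order.
    static List<String> readDataLines(Reader source) {
        List<String> rows = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            int lineNo = 0;
            while ((line = reader.readLine()) != null) {
                lineNo++;
                if (lineNo <= HEADER_LINES) {
                    continue; // header line, not data
                }
                rows.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Made-up sample input: four header lines followed by two data lines.
        Reader sample = new StringReader("h1\nh2\nh3\nh4\na,1\nb,2\n");
        System.out.println(readDataLines(sample)); // prints [a,1, b,2]
    }
}
```

For the real data file, a java.io.FileReader (or Files.newBufferedReader) would replace the StringReader used here.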
Run database systems and timing experiments
Timing experiments are only reliable when running one program at a time. You should only run one database system,
Derby or MongoDB, at a time and shut down one database system before starting the other. All database systems must
be shut down before you run your own program and before you log out of your Linux instance. Otherwise, your instance
may run out of memory. Other precautions are provided via announcements on Canvas. Ignoring such precautions will
NOT be grounds for extension to due time.
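For timing your own Java program, a minimal sketch of measuring elapsed wall-clock time is given below. The class and method names are hypothetical, and the sleeping task merely stands in for the work being measured. System.nanoTime() is used because it is intended for measuring elapsed intervals.

```java
// Hypothetical sketch: measure the elapsed wall-clock time of one task.
class TimingSketch {
    // Runs the task once and returns the elapsed time in milliseconds.
    static long timeMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        long elapsedNanos = System.nanoTime() - start;
        return elapsedNanos / 1_000_000;
    }

    public static void main(String[] args) {
        // Stand-in workload; replace with the load or query being measured.
        long ms = timeMillis(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        System.out.println("Elapsed: " + ms + " ms");
    }
}
```

A single run can be affected by other activity on the instance, which is one reason to run only one program at a time as described above.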
Walkthrough of your Java code in Week 4: You must undertake a walkthrough during a scheduled lab class in week 4,
explaining your Java code for Task 1 and answering questions about it. Failing to do this will result in a penalty of 10 points.
Code and git log: You must use git to track your assignment code. You need to set up your git repository so that each
commit identifies you with your full name as per course enrolment and your student email address. This sets an expectation
of professionalism. You must submit a text file of your git log. Do not include the git repository in your code submission.
Report: Create a file called report.pdf (various software, including word processors, can export to PDF). Use this file to
report on the following four tasks. Each task should be reported under a separate heading with the task name and
description; for example, for the first task use the heading: Task 1: Heap file implementation. Limit your report to three pages.
• Description of scripts for data preprocessing for loading MongoDB and Derby databases, rather than the scripts
themselves, should be included in the report. Scripts are submitted with your code in the zip file to enable markers
to test your scripts.
• Description of query output, rather than a long list of output, should be included in the report.