AMME4710 Computer Vision and Image
Processing
Assignment
• This assignment contributes 20% to your final mark.
• Late assignments will have 5% (5 marks out of a possible 100) deducted for each
day late, starting on the day the assignment is due and including weekends.
Assignments submitted more than 10 days late will receive zero.
• Any special consideration requires you to complete the Special
Consideration form process through Sydney Student. Only major
illnesses/misadventures will be considered.
• Incidents of academic dishonesty or plagiarism will be referred to the
academic honesty coordinator, and could result in a zero mark being awarded
for this assignment, or automatic failure of the entire subject.
• This assignment should take the average student 12 hours to complete.
Objectives
• In this assignment, you will implement and test algorithms for stereo vision
and image classification
• For each of these problems, you will develop MATLAB code that implements
your algorithms. You will write a report examining existing approaches to
your problem, detailing your design process, and presenting and examining
your results using the provided image datasets.
Submission
• You will submit the assignment by 11:59pm on 24th September 2023.
• You will provide one report file for the assignment. The report will include the
sections “Question 1” and “Question 2”, plus an appendix with subsections for
each question. Each of the first two sections will contain the subsections
“Introduction”, “Methodology” and “Results and Discussion”.
• For each question you will submit working MATLAB code, and you must make
sure there is a main file named “mainQN.m”, where N is the question number.
This script file will be run by the tutors during marking and should run all of
your code and produce all of your plots for that question.
• There are TWO submission portals for this assignment, one for the report and
one for your MATLAB code, both via Turnitin on the course Canvas site:
o The report should be submitted to the “Report Submission” portal
o Your MATLAB code is to be zipped into a single file (i.e.
“SID_Assignment1.zip” where SID is your student number) and
submitted via the “Code Submission” portal.
o You will need to submit both your report AND MATLAB code to
receive marks for the assignment.
• All MATLAB code must be commented so that the code can be read and
understood without the aid of the report.
• If your code does not work, you will receive 0 marks for the code section of
the marking unless you provide an explanation in the report as to why your
code didn’t work.
Question 1: 3D Reconstruction using Underwater Stereo Vision (50%)
• In this question, you will work with a dataset of stereo images captured using
an underwater diver-operated stereo camera rig (the “diver-rig”, see above
figure) over a shallow water coral reef.
• The “diver-rig” contains a downwards-facing stereo pair with one colour
camera (left) and one monochrome camera (right). Both cameras are placed in
a waterproof housing. The “diver-rig” also carries a combination of other
sensors (GPS, magnetic heading and tilt sensing) that allow for the position
and orientation of the stereo camera pair to be measured at each camera frame
(i.e. the rotation matrix and translation vector representing the left camera
coordinate system relative to a world-fixed coordinate system). Processing of
the stereo imagery allows one to construct a 3D model of the reef using
collected imagery and sensor data.
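Note: the exact pose convention should be checked against the week 5 slides. Under one common reading (an assumption, not stated explicitly above), a point P_cam expressed in the left-camera coordinate frame maps into the world frame as

    P_world = R * P_cam + t

where R and t are the rotation matrix and translation vector supplied for that image frame.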
• On the Canvas site, download and unzip “assignment2_stereodata.zip”. This
file contains a collection of 49 stereo image pairs and three mat files that
contain the calibration parameters of the stereo camera system, the camera
trajectory parameters for each captured stereo pair and a reference terrain
model, generated from a slightly larger series of images. Left camera images
are stored in the folder “images_left” and right camera images in the folder
“images_right”. The file “stereo_calib.mat” contains a single MATLAB
variable corresponding to the stereo camera calibration parameters. The file
“camera_pose_data.mat” contains a MATLAB structure “camera_poses”
which has four fields:
o R: a 3 x 3 x 49 array of rotation matrices, one 3 x 3 matrix for each of
the 49 camera images (see week 5 slide 23 for the definition of these
parameters)
o t: a 3 x 49 array of translation vectors, one 3 x 1 vector for each of the
49 camera images (see week 5 slide 23 for the definition of these
parameters)
o Left_camera: a 49 x 1 cell array of corresponding left camera
filenames
o Right_camera: a 49 x 1 cell array of corresponding right camera
filenames
• The file “terrain.mat” contains reference terrain data in three variables:
o height_grid: a 2D array of terrain height values
o X: a vector of X-position values, corresponding to the rows of
height_grid
o Y: a vector of Y-position values, corresponding to the columns of
height_grid
o You can produce a plot of the reference terrain using the MATLAB
function mesh (i.e. mesh(flip(X),Y,height_grid)); a short loading and
plotting sketch is shown below.
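As a rough illustration only (not a required deliverable), the sketch below loads the three provided .mat files and reproduces the reference-terrain plot described above. The name of the variable stored inside “stereo_calib.mat” is not specified here, so it is listed with whos before use.

    % Minimal loading-and-plotting sketch (illustrative only)
    whos('-file', 'stereo_calib.mat')        % check the name of the calibration variable
    calib   = load('stereo_calib.mat');      % stereo camera calibration parameters
    poses   = load('camera_pose_data.mat');  % contains the struct camera_poses
    terrain = load('terrain.mat');           % contains height_grid, X and Y

    camera_poses = poses.camera_poses;
    fprintf('%d stereo pairs listed\n', numel(camera_poses.Left_camera));

    % Reference terrain plot, using the mesh call given in the brief
    figure;
    mesh(flip(terrain.X), terrain.Y, terrain.height_grid);
    xlabel('X'); ylabel('Y'); zlabel('Height');
    title('Reference terrain model');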
• Use the stereo pairs and corresponding information on the rotation and
translation of the left camera with respect to a fixed-world reference system
for each of the 49 image pairs to build a 3D pointcloud of the underwater
terrain, corresponding to the surfaces seen in the camera images (one
possible pipeline is sketched after this list):
o For each stereo pair, extract interest points of your choice (e.g. SURF,
Harris corners, etc.) and match these across the left and right images,
making sure to use an outlier-rejection process to remove bad matches
o Use these matched feature points to produce a set of corresponding 3D
points in space with respect to the left camera’s frame of reference, for
each of the 49 stereo pairs
o Produce a single pointcloud in the world reference frame using the
pointclouds from each of the 49 stereo pairs. For each set of points,
translate and rotate the points into the world reference frame using the
camera rotation and translation data contained in
“camera_pose_data.mat”.
o Compare your reconstructed pointcloud to the provided reference
terrain model
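One possible pipeline is sketched below, purely as a hedged illustration. It assumes the calibration variable in “stereo_calib.mat” is a stereoParameters object (named stereoParams here) with camera 1 corresponding to the left camera, uses SURF features with fundamental-matrix RANSAC as one example of matching with outlier rejection, and assumes the pose convention P_world = R*P_cam + t. All of these names and choices are assumptions to be checked; other detectors, matching strategies and conventions are equally acceptable.

    % Sketch of one possible reconstruction pipeline (illustrative, not prescriptive)
    % Assumed (not stated in this brief): the calibration variable is a stereoParameters
    % object named "stereoParams", and points map to the world frame as P_world = R*P_cam + t
    S = load('stereo_calib.mat');      stereoParams = S.stereoParams;   % assumed variable name
    P = load('camera_pose_data.mat');  camera_poses = P.camera_poses;

    images_left_dir  = 'images_left';    % folder paths (the brief requires these two variables)
    images_right_dir = 'images_right';

    allWorldPoints = [];
    for k = 1:numel(camera_poses.Left_camera)
        IL = imread(fullfile(images_left_dir,  camera_poses.Left_camera{k}));
        IR = imread(fullfile(images_right_dir, camera_poses.Right_camera{k}));
        if size(IL,3) == 3, ILg = rgb2gray(IL); else, ILg = IL; end    % left camera is colour
        IRg = IR;                                                      % right camera is monochrome

        % 1. Detect and match interest points (SURF chosen here as one example;
        %    for accuracy you may first want to undistort the images or points)
        [fL, vL] = extractFeatures(ILg, detectSURFFeatures(ILg));
        [fR, vR] = extractFeatures(IRg, detectSURFFeatures(IRg));
        idx = matchFeatures(fL, fR, 'Unique', true);
        mL = vL(idx(:,1));  mR = vR(idx(:,2));

        % 2. Outlier rejection, e.g. RANSAC on the fundamental matrix
        [~, inliers] = estimateFundamentalMatrix(mL, mR, 'Method', 'RANSAC');
        mL = mL(inliers);   mR = mR(inliers);

        % 3. Triangulate the matches into 3D points in the left-camera frame
        camPoints = triangulate(mL, mR, stereoParams);     % N x 3

        % 4. Rotate/translate into the world frame using the supplied pose data
        %    (assumed convention: P_world = R*P_cam + t; check week 5 slide 23)
        R = camera_poses.R(:,:,k);
        t = camera_poses.t(:,k);
        allWorldPoints = [allWorldPoints; (R*camPoints' + t)']; %#ok<AGROW>
    end

    % 5. Plot the combined pointcloud for comparison with the reference terrain
    figure; pcshow(pointCloud(allWorldPoints));
    xlabel('X'); ylabel('Y'); zlabel('Z'); title('Reconstructed pointcloud');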
• You should provide a MATLAB script in your code submission called
“mainQ1.m” which, when run, implements your stereo vision processing
algorithms and produces a plot of the reconstructed pointcloud. You do not
need to include the stereo images themselves in your code submission. At the
top of your script, have two MATLAB variables images_left_dir and
images_right_dir which represent the paths to the folders containing the
left and right image sets; these will be changed by the tutor based on the
computer that your code is run on.
• In your report, under “Question 1”, detail your approach to producing the
pointcloud and present your results. This section of the report (Question 1)
should have the following subsections:
o Introduction: Describe how stereo vision works and, from your own
research, some of the applications in which stereo vision has been
used and the algorithms involved.
o Methodology: Describe your approach to producing a 3D pointcloud
of the imaged terrain from stereo image pairs, calibration data and
camera pose information.
o Results and Discussion: Use plots from your MATLAB code to
discuss your results, focussing on the feature extraction and matching
process for image pairs and the final 3D pointcloud reconstruction.
o This section of the report has a 4 page limit (not including the
appendix).