AMME4710 Computer Vision and Image
Processing
Assignment 1 (20%), 2023
• This assignment is due on Sunday 27th August, 2023 at 11:59pm, to be
submitted using the Turnitin links on the course’s Canvas website (see
instructions below).
• This assignment contributes 20% to your final mark.
• Late assignments will be deducted 5% (5 marks out of a possible 100) for each
day late, starting on the day the assignment is due and including weekends.
Assignments submitted more than 10 days late will receive zero.
• Any special consideration requires you to go through the Special
Consideration form process through Sydney Student.
• Incidents of academic dishonesty or plagiarism will be referred to the
academic honesty coordinator and could result in a zero mark being awarded
for this assignment, or automatic failure of the entire subject.
• This assignment should take the average student 12 hours to complete.
Objectives
• In this assignment you will develop computer vision algorithms for solving
problems involving light, shading and colour.
• You will develop an algorithm for recovering the 3D shape of a face using
photometric stereo and develop a lego brick tracking algorithm using colour
information.
• For each of these problems, you will develop MATLAB code that implements
your algorithms and write a report examining existing approaches to your
problem, detailing your design process, and presenting and examining your
results using one or more different image datasets.
Submission
• You will submit the assignment by 27th August 2023 11:59pm.
• You will provide one report file for the assignment. The report will include the
Sections “Question 1”, “Question 2” and an appendix with subsections for
each question. Each of the first two sections will contain the subsections
“Introduction”, “Methodology” and “Results and Discussion”.
• For each question you will submit working MATLAB code. For each
question make sure there is a main file named “mainQN.m” where N is the
question number. This script file will be run by the tutors during marking and
should run all of your code and produce all of your plots for the questions.
• There are TWO submission portals for this assignment, one for the report and
one for your MATLAB code, both via Turnitin on the course Canvas site:
o The report should be submitted to the “Report Submission” portal
o Your MATLAB code is to be zipped into a single file (i.e.
“SID_Assignment1.zip” where SID is your student number) and
submitted via the “Code Submission” portal.
o You will need to submit both your report AND MATLAB code to
receive marks for the assignment.
• All MATLAB code must be commented so that the code can be read and
understood without the aid of the report.
• If your code does not work, you will receive 0 marks for the code component of
the marking scheme unless you provide an explanation in the report as to why your
code didn’t work.
Question 1: Photometric Stereo (50 %)
• This question is an extension of the photometric stereo tutorial activity
performed during week 2. You should refer to the week 2 tutorial sheet on
instructions for loading and viewing the datasets described here, and you
should have successfully completed tutorial activities 1 and 2 from week 2
before attempting this question.
• Using your recovered surface normals for each of the face datasets, develop an
algorithm for recovering a 3D height image of each subject’s face using
numerical integration across the image in the x and y directions.
o Your approach should set the height of the top left pixel to be zero as a
starting point for each path integral.
o You should implement three different integration strategies:
§ (a) integrate in the horizontal direction along the top row first,
then down each column
§ (b) integrate in the vertical direction first along the first
column, then across each row
§ (c) the average of the heights computed by (a) and (b)
o Your integrations will essentially be cumulative sums of the gradient
values, treating dx and dy as a single pixel distance (i.e. equal to 1)
o You should try to implement this without using two nested for loops over
each pixel: have a look at the cumsum function in MATLAB (a minimal
sketch is given below).
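o For reference, a minimal sketch of the three strategies using cumsum is
given below. The gradient variable names p and q (the x and y gradient
images from your week 2 code) and the sign conventions are assumptions
and should be mapped onto your own implementation.

    % Assumed inputs (names are assumptions, adapt to your own code):
    %   p : H-by-W array of x-gradients (dz/dx)
    %   q : H-by-W array of y-gradients (dz/dy)
    % NOTE: the signs of p and q depend on your gradient and lighting
    % conventions; flip them if the reconstructed face appears inverted.
    [H, W] = size(p);

    % Strategy (a): integrate along the top row first, then down each column.
    top_row  = cumsum([0, p(1, 2:end)], 2);              % row-1 heights, top-left = 0
    height_a = repmat(top_row, H, 1) + cumsum([zeros(1, W); q(2:end, :)], 1);

    % Strategy (b): integrate down the first column first, then across each row.
    first_col = cumsum([0; q(2:end, 1)], 1);             % column-1 heights, top-left = 0
    height_b  = repmat(first_col, 1, W) + cumsum([zeros(H, 1), p(:, 2:end)], 2);

    % Strategy (c): average of the two path integrals.
    height_c  = 0.5 * (height_a + height_b);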
• Use the provided function ‘display_face_model.m’ in the week 2 tutorial files
to display a 3D rendered image of the face (you will also need to provide this
function with a recovered albedo image of the face, as performed in the week
2 tutorial activity)
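o For example, once you have a height map and an albedo image, the rendering
call might look like the line below; the argument order here is an assumption,
so check the function header in display_face_model.m for the actual signature.

    % Hypothetical call - confirm the argument order in display_face_model.m
    display_face_model(albedo, height_c);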
• Once you have working reconstructions for each of the faces in the datasets
‘facedata_yaleB01.mat’, ‘facedata_yaleB02.mat’ and ‘facedata_yaleB05.mat’,
try to improve the accuracy of your approach by developing an outlier
detection and rejection system in your code for recovering the surface
normals:
o For each pixel (x,y), compute the residual between the measured pixel
brightness and the pixel brightness predicted by your values for albedo,
surface normal and lighting direction (see week 2 tutorial, activity 1)
o Produce a plot for each face image (try using montage) that highlights
the image regions for which the residual is greater than two times the
standard deviation of the residuals at that pixel.
o Try to re-implement your surface normal and/or 3D height computing
algorithms so that pixel brightness data above this threshold is not used
(a minimal sketch of the residual computation and masking is given below).
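o A minimal sketch of the residual computation and outlier masking is given
below. The variable names (an H-by-W-by-K image stack I, a K-by-3 matrix L of
lighting directions, H-by-W-by-3 unit normals N and an H-by-W albedo image)
are assumptions and should be mapped onto your own week 2 code.

    % Assumed inputs (names are assumptions, adapt to your own code):
    %   I      : H-by-W-by-K stack of face images (double), one per light
    %   L      : K-by-3 matrix of unit lighting directions
    %   N      : H-by-W-by-3 recovered unit surface normals
    %   albedo : H-by-W recovered albedo image
    [H, W, K] = size(I);

    % Predicted brightness at every pixel under every light: albedo * (n . l)
    n_flat    = reshape(N, H*W, 3);                  % one normal per row
    predicted = reshape((n_flat * L') .* albedo(:), H, W, K);

    % Residual between measured and predicted brightness
    residual = I - predicted;

    % Flag measurements more than two standard deviations from the
    % per-pixel residual spread
    sigma   = std(residual, 0, 3);                   % H-by-W std across the K images
    outlier = abs(residual) > 2 * repmat(sigma, 1, 1, K);

    % Visualise the flagged regions, one binary panel per lighting image
    montage(reshape(outlier, H, W, 1, K));

The flagged measurements can then be excluded when re-solving for the surface
normal and albedo at each pixel, for example by using only the un-flagged rows
of the per-pixel linear system.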
• In your report, under “Question 1”, detail your algorithm and present and
compare results. This section of the report (Question 1) should have the
following subsections:
o Introduction: discuss the basic principles behind how photometric
stereo works, and from your own research describe some of the
existing approaches and applications of photometric stereo in the
research literature
o Methodology: Describe how your algorithms for 3D reconstruction
and outlier detection and rejection were developed and implemented.
o Results and Discussion: Use plots from your MATLAB code to
demonstrate results on each of the face datasets provided. Include key
figures and/or tables in the main body of your report and provide
additional plots in an appendix at the end of your report and reference
them in the body of your report.
o This section of the report has a 4-page limit (not including the
appendix).
Question 1 Marking Scheme
Code Implementation (20/50)
o Code doesn’t work: 0 marks awarded for code.
o Code works, is readable and commented, and the implementation is
correct: Has the 3D height map been properly reconstructed? Has an
outlier detection and rejection mechanism been implemented into the
surface normal/albedo computation? (/20 Marks)
Report (30/50)
o Introduction: discuss the basic principles behind how photometric stereo
works, and from your own research describe some of the existing
approaches and applications of photometric stereo in the research
literature. (/5 Marks)
o Methodology: Have the methods implemented been clearly explained in
the report? (/10 Marks)
o Discussion and Analysis: How has the algorithm performed over the
different face datasets? Where obvious errors are present, what is the
reasoning for this in terms of the assumptions that go into the method?
What effect has the outlier rejection system had on the results? (/15 Marks)
Question 2: Lego Block Colour-based Tracking (50%)
• In this question you will develop and evaluate your own algorithm for tracking
differently coloured lego blocks in images, based on the colour of the block
• There are two datasets for which you will create two different tracking
algorithms:
o “lego_brick_images.zip” (herein referred to as the “Lego Bricks”
dataset) containing several images of lego bricks under various lighting
sources. There are six different colours of bricks: red, orange, yellow,
light green, dark green and blue
o “lego_bricks_joined_images.zip” (herein referred to as the “Lego
Bricks Joined” dataset) containing several images of a set of bricks
connected to each other in the same arrangement
• For each problem, your algorithm should take as an input an image and output
the pixel locations and dimensions of bounding boxes surrounding each block and
a corresponding label based on the colour of the block.
o For the “lego bricks joined” dataset and algorithm, your code should
output a single bounding box for the entire joined shape with the label
“all”
• Your algorithm should primarily use the colours of the bricks to track these
properties, but you may also choose to use morphological characteristics in the
image data to help refine detection decisions made based on colour, such as
the size and shape of segmented regions.
o Any colour thresholding may have to be robust to the different ambient
lighting conditions in the different images
o For the “lego bricks joined” dataset there is a lot of other colour “clutter” in
the scene: think about what makes the joined lego brick distinct in
terms of the spatial arrangement of colours
• You should implement your algorithms as two MATLAB functions, one for
each tracking task, that each take a colour image array as input and output a
list of detected bricks including their colour type (i.e. red, orange, yellow, light
green, dark green, blue or all), pixel location (centroid) and bounding box
dimensions (a minimal sketch of one possible function structure is given below).
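• A minimal sketch of one possible structure for the single-brick tracking
function is given below, using HSV thresholding followed by regionprops. The
function name, hue ranges, saturation/value thresholds and minimum area are
illustrative assumptions rather than tuned values, and the “Lego Bricks Joined”
function would follow a similar pattern but merge the detections into a single
bounding box labelled “all”.

    function bricks = detect_lego_bricks(rgb_img)
    % Sketch of a colour-based brick detector (illustrative only).
    % Input : rgb_img - H-by-W-by-3 colour image (uint8 or double in [0,1])
    % Output: bricks  - struct array with fields Colour, Centroid, BoundingBox
        hsv = rgb2hsv(rgb_img);
        hue = hsv(:,:,1);  sat = hsv(:,:,2);  val = hsv(:,:,3);

        % Illustrative hue ranges (0..1) - these would need tuning per dataset,
        % and may need widening to cope with different ambient lighting
        colours = {'red',         [0.95 0.05];
                   'orange',      [0.05 0.10];
                   'yellow',      [0.10 0.20];
                   'light green', [0.20 0.33];
                   'dark green',  [0.33 0.45];
                   'blue',        [0.55 0.70]};

        bricks = struct('Colour', {}, 'Centroid', {}, 'BoundingBox', {});
        for c = 1:size(colours, 1)
            range = colours{c, 2};
            if range(1) <= range(2)
                hue_mask = hue >= range(1) & hue <= range(2);
            else                               % hue range wrapping around 0 (red)
                hue_mask = hue >= range(1) | hue <= range(2);
            end
            % Require reasonable saturation and brightness to reject background
            mask = hue_mask & sat > 0.4 & val > 0.2;
            mask = imopen(mask, strel('disk', 5));     % remove small speckle

            % Morphological refinement: keep only regions large enough to be a brick
            stats = regionprops(mask, 'Centroid', 'BoundingBox', 'Area');
            for s = 1:numel(stats)
                if stats(s).Area > 500                 % illustrative minimum area
                    bricks(end+1) = struct('Colour', colours{c, 1}, ...
                        'Centroid', stats(s).Centroid, ...
                        'BoundingBox', stats(s).BoundingBox); %#ok<AGROW>
                end
            end
        end
    end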