COMP 9517 Computer Vision
Assignment 1 Specification
The objective of this assignment is to familiarise students with basic image processing
methods. It also introduces students to common image processing and analysis tasks using
OpenCV. After completing this assignment, you will know how to:
1. Open and read image files;
2. Perform simple mathematical operations on images;
3. Construct and manipulate image pyramids;
4. Carry out translation-based alignment;
5. Perform image adjustment and restoration.
Sergei Prokudin-Gorsky (1863-1944) was a Russian photographer and chemist whose
collection of colour photographs is the oldest surviving to this date. He used a camera that took
a sequence of three black and white exposures using blue, red and green filters. By projecting
the three images using coloured light, it was then possible to recover the original colours. See
here for more details. At the beginning of the 20th century, Prokudin-Gorsky embarked on a
multi-year project to systematically document the life of the Russian Empire by means of the
new colour imaging technology. He took many of the resulting negatives with him when he
emigrated following the revolution of 1917, and they were eventually purchased and digitized
by the US Library of Congress.
The objective of this assignment is to produce high-quality colour reconstructions from
Prokudin-Gorsky’s negatives using simple image processing techniques.
You will need to extract images for the individual colour channels, align them and form a single
colour image. For this assignment, it is sufficient to use an (x, y) translation-based transform, but
feel free to implement other methods. It is recommended that you use the OpenCV library
either in C++ or Python.
Download both high and low-resolution negatives from Webcms3 (Course Work →
Assessments → Assessment 1 → Data Samples). Write a program that takes any one of these
files as an input and produces a corresponding colour image as output. To do this you should
divide the original image into three parts and then align the second and third channels to the
first, displaying the resulting offsets for each channel.
Figure 1: High quality colour reconstruction example.
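The splitting and recombining steps described above can be sketched as follows. This is a minimal illustration and not a reference solution. It assumes that the three exposures are stacked vertically in the scanned negative and that each occupies exactly one third of the height; the function names split_plate and stack_bgr are hypothetical. Which third corresponds to which colour filter is not stated in this specification, so check it against the data before stacking. The sketch uses NumPy only; in your program the negative would be loaded with OpenCV (for example cv2.imread in greyscale mode).

```python
import numpy as np

def split_plate(plate):
    """Split a vertically stacked plate into three equal-height parts.

    Assumption: the exposures are stacked top to bottom and share the
    same height; leftover rows at the bottom are discarded.
    """
    height = plate.shape[0] // 3
    return [plate[i * height:(i + 1) * height] for i in range(3)]

def stack_bgr(blue, green, red):
    """Stack three aligned channels in BGR order (OpenCV's default colour order)."""
    return np.dstack([blue, green, red])

# Synthetic 9x4 "negative": each third is filled with a different grey level.
plate = np.repeat(np.array([10, 20, 30], dtype=np.uint8), 3).reshape(9, 1).repeat(4, axis=1)
first, second, third = split_plate(plate)

# The channel order used here is only a placeholder for the demonstration.
colour = stack_bgr(first, second, third)
print(first.shape, colour.shape)
```

Real scans also have borders around each exposure, so an exact split into thirds is only a starting point for the alignment step.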
A simple way to perform the alignment is by searching through all possible offsets in some
suitable range (e.g. 20 pixels for low resolution images) and computing for each a score
measuring the quality of the match. Three suitable metrics are the sum of squared differences
(SSD), the sum of absolute differences (SAD) and the normalized cross correlation (NCC).
For SSD and SAD the best match has the lowest score; for NCC it has the highest.
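The exhaustive search and the three metrics can be sketched as below. This is one possible arrangement under stated assumptions, not the required implementation. The function names (ssd, sad, ncc, align_exhaustive) and the radius parameter are hypothetical. Shifting is done with np.roll, which wraps pixels around the image edge, so the score is computed only on the interior region that no shift in the search range can contaminate. The search minimises a cost, so NCC is passed in negated.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences: lower means a better match."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sum(diff * diff))

def sad(a, b):
    """Sum of absolute differences: lower means a better match."""
    return float(np.sum(np.abs(a.astype(np.float64) - b.astype(np.float64))))

def ncc(a, b):
    """Normalized cross correlation in [-1, 1]: higher means a better match."""
    a = a.astype(np.float64) - a.mean()
    b = b.astype(np.float64) - b.mean()
    denom = np.sqrt(np.sum(a * a) * np.sum(b * b))
    return float(np.sum(a * b) / denom) if denom > 0 else 0.0

def align_exhaustive(reference, moving, radius=20, cost=ssd):
    """Return the (dy, dx) shift of `moving` that minimises `cost`.

    Every offset in [-radius, radius] is tried on both axes (radius >= 1).
    Only the interior, `radius` pixels away from each edge, is scored.
    """
    ref_core = reference[radius:-radius, radius:-radius]
    best_cost, best_offset = None, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.roll(moving, (dy, dx), axis=(0, 1))
            c = cost(ref_core, shifted[radius:-radius, radius:-radius])
            if best_cost is None or c < best_cost:
                best_cost, best_offset = c, (dy, dx)
    return best_offset

# Synthetic check: displace a random image by a known amount and recover it.
rng = np.random.default_rng(0)
reference = rng.integers(0, 256, size=(40, 40)).astype(np.uint8)
moving = np.roll(reference, (-3, 5), axis=(0, 1))

offset_ssd = align_exhaustive(reference, moving, radius=6)
offset_sad = align_exhaustive(reference, moving, radius=6, cost=sad)
offset_ncc = align_exhaustive(reference, moving, radius=6,
                              cost=lambda a, b: -ncc(a, b))
print(offset_ssd, offset_sad, offset_ncc)  # each should be (3, -5)
```

The cost of this search grows with the square of the radius multiplied by the number of pixels, which is why it is practical only for the low-resolution images.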
Searching through all offsets can become computationally expensive for high-resolution
images. To speed up the search procedure you can use a so-called image pyramid. An image
pyramid represents the image at multiple scales, with successive levels differing in scale by a
factor of two. Alignment can then be done sequentially, starting at the highest (coarsest) level
and incrementally updating your estimates as you go down the pyramid.
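A coarse-to-fine search over such a pyramid might look like the sketch below. It is an illustration under stated assumptions and not the required design. The names downsample, search and align_pyramid, and the parameters levels, coarse_radius and refine_radius, are hypothetical. Downsampling here averages 2x2 blocks in NumPy to keep the example self-contained; OpenCV's cv2.pyrDown, which blurs before subsampling, is the more usual choice. SSD is used as the score.

```python
import numpy as np

def downsample(img):
    """Halve each dimension by averaging 2x2 blocks (stand-in for cv2.pyrDown)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w].astype(np.float64)
    return (img[0::2, 0::2] + img[1::2, 0::2] +
            img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def search(reference, moving, centre, radius):
    """Exhaustive SSD search over offsets within `radius` of `centre`."""
    cy, cx = centre
    # Exclude every row and column that np.roll could wrap around.
    pad = max(abs(cy), abs(cx)) + radius
    ref_core = reference[pad:-pad, pad:-pad].astype(np.float64)
    best_cost, best = None, centre
    for dy in range(cy - radius, cy + radius + 1):
        for dx in range(cx - radius, cx + radius + 1):
            shifted = np.roll(moving, (dy, dx), axis=(0, 1))
            core = shifted[pad:-pad, pad:-pad].astype(np.float64)
            cost = float(np.sum((ref_core - core) ** 2))
            if best_cost is None or cost < best_cost:
                best_cost, best = cost, (dy, dx)
    return best

def align_pyramid(reference, moving, levels=2, coarse_radius=4, refine_radius=1):
    """Estimate (dy, dx) coarse-to-fine over `levels` halvings of the image."""
    if levels == 0:
        # Coarsest level: full search around zero offset.
        return search(reference, moving, (0, 0), coarse_radius)
    coarse = align_pyramid(downsample(reference), downsample(moving),
                           levels - 1, coarse_radius, refine_radius)
    # An offset found at half resolution doubles at this level; refine around it.
    centre = (2 * coarse[0], 2 * coarse[1])
    return search(reference, moving, centre, refine_radius)

# Synthetic check. The displacement is a multiple of 4 so that it is an
# exact whole-pixel shift at every level of this two-level toy pyramid.
rng = np.random.default_rng(1)
reference = rng.integers(0, 256, size=(128, 128)).astype(np.uint8)
moving = np.roll(reference, (-12, 8), axis=(0, 1))
offset = align_pyramid(reference, moving, levels=2)
print(offset)  # should be (12, -8)
```

With two levels and a coarse radius of 4, the search covers offsets of roughly 16 pixels at full resolution while scoring only 81 candidates on the small image and 9 at each finer level. Real photographs rarely shift by exact multiples of the pyramid factor, so a refine radius of 1 or 2 at each level absorbs the rounding.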
Note that you should first implement an algorithm that can perform reconstruction on low
resolution images (Task 1) and only then modify your code to handle large images (Task 2);
it should be easy to reuse much of the code.
For Task 3, try to improve the visual quality of the results of the basic algorithm. Some possibilities
include colour and contrast adjustments, using a more sophisticated alignment procedure and
automatically removing borders.
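As one example of a contrast adjustment, the sketch below applies a percentile-based linear stretch to a single channel. It is only one of many possible enhancements and is not prescribed by this assignment. The function name stretch_contrast and the 1% and 99% cut-offs are assumptions chosen for illustration.

```python
import numpy as np

def stretch_contrast(channel, low_pct=1.0, high_pct=99.0):
    """Linearly map the [low_pct, high_pct] percentile range to [0, 255].

    Values outside that range are clipped, so a few very dark or very
    bright pixels do not dictate the stretch.
    """
    lo, hi = np.percentile(channel, [low_pct, high_pct])
    if hi <= lo:
        # Flat image: nothing to stretch.
        return channel.copy()
    scaled = (channel.astype(np.float64) - lo) / (hi - lo)
    return (np.clip(scaled, 0.0, 1.0) * 255).round().astype(np.uint8)

# Low-contrast synthetic channel with grey levels between 100 and 150.
channel = np.linspace(100, 150, 256).reshape(16, 16).astype(np.uint8)
stretched = stretch_contrast(channel)
print(channel.min(), channel.max(), stretched.min(), stretched.max())
```

Applying the stretch to each channel separately changes the colour balance as well as the contrast, so compare the result with a stretch that uses one shared range for all three channels.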
Evaluation: Several new images will be released on the day the assignment is marked. You
will take your marker through the steps showing the output for both low and high-resolution
negatives before and after contrast correction. For Task 3, you will need to choose images
from the Library of Congress collection that best demonstrate your enhancements.
This assignment is worth 15% of the course total. All three tasks (1, 2 and 3) must be completed
for the assignment to count as complete; they will be marked against the maximum mark achievable.
Deliverables: In addition to demonstrating your work, you will also submit a short report (2
pages maximum). This report should briefly explain the approach you have taken to
align colour channels for Task 1, the modifications you performed to deal with large resolution
images in Task 2, and the adjustments you made for Task 3. You may also include the details
of any enhancements you have implemented.
The instructions on uploading the report will be released before the submission date.
Deadline: Demo and report submission on Thursday of week 4 (August 17th), during the lab
time, 5-6 PM.
Software: Download OpenCV and read the guided tutorials: http://opencv.org/