Homework 1 CSCE 633

Instructions for homework submission

a) For the math problems, please typeset your answers in LaTeX, or handwrite your solution very clearly and scan it. Illegible solutions will not be graded: we wouldn't like our TA to have to guess what you are writing 🙂

b) For the experimental problems, please write a brief report and include your code.

c) Create a single PDF and submit it on eCampus. Please do not submit .zip files or Colab notebooks.

d) Please start early 🙂

e) The maximum grade for this homework, excluding bonus questions, is 10 points (out of 100 total for the class). There is 1 bonus point.

Question 1 (4 points)

Linear Perceptron Algorithm: The goal of this problem is to run the linear perceptron algorithm with paper and pencil. Assume that you have three training samples in the 2D space:

1. Sample x1 with coordinates (1, 3) belonging to Class 1 (y1 = 1)

2. Sample x2 with coordinates (3, 2) belonging to Class 2 (y2 = −1)

3. Sample x3 with coordinates (4, 1) belonging to Class 2 (y3 = −1)

The linear perceptron is initialized with weight w(0) = [2, −1, 1]^T, or equivalently the line 2 − x + y = 0.

In contrast to the example that we have done in class, in this problem we will include the intercept term w0.

(0.5 points) (i) Plot x1, x2, and x3 in the given 2D space. Plot the line corresponding to weight w(0), as well as the direction of the weight w(0) on the line.

(1 point) (ii) Using the rule sign(w(t)^T xn), please indicate the class in which samples x1, x2, and x3 are classified using the weight w(0). Which samples are not correctly classified based on this rule?

Note: You have to compute the inner product w(0)^T xn, n = 1, 2, 3, and see if it is greater or less than 0.

(1.5 points) (iii) Using the weight update rule from the linear perceptron algorithm, please find the value of the new weight w(1) based on the misclassified sample from question (ii). Find and plot the new line corresponding to weight w(1) in the 2D space, as well as the direction of the weight w(1) on the line. Indicate which samples are correctly classified and which samples are not correctly classified.

Note: The update rule is w(t + 1) = w(t) + ys xs, where xs is the feature vector and ys ∈ {−1, 1} is the class label of the misclassified sample s.

Hint: The line corresponding to a vector w = [w0, w1, w2] can be written as w0 + w1 x + w2 y = 0. Make sure that you get the direction of the vector w correctly based on the signs of w1 and w2.

(1 point) (iv) Using the rule sign(w(t)^T xn), run the linear perceptron algorithm, and find and plot the weight w(2) and the corresponding line. Please indicate which samples are classified correctly and which samples are not classified correctly.
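If you would like to sanity-check your paper-and-pencil steps afterwards, the sign rule and a single perceptron update can be sketched in a few lines of NumPy (this sketch is our own illustration, not part of the required solution):

```python
import numpy as np

# Training samples with a leading 1 for the intercept term w0,
# so each augmented sample is [1, x, y].
X = np.array([[1, 1, 3],   # x1, class +1
              [1, 3, 2],   # x2, class -1
              [1, 4, 1]])  # x3, class -1
y = np.array([1, -1, -1])

w = np.array([2, -1, 1])   # w(0) = [2, -1, 1]^T

# Classify each sample with sign(w^T xn).
preds = np.sign(X @ w)
print(preds)  # predicted side of the line for x1, x2, x3

# One perceptron update on the first misclassified sample:
# w(t+1) = w(t) + ys * xs
for xs, ys in zip(X, y):
    if np.sign(w @ xs) != ys:
        w = w + ys * xs
        break
print(w)      # w(1)
```

The same loop, run repeatedly until no sample is misclassified, gives the subsequent weights (w(2) and so on) asked for in part (iv).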


Question 2 (6 points)

Classifying flowers: We will use the IRIS dataset to classify three types of flowers (i.e., Setosa, Versicolour, Virginica) based on their attributes (i.e., sepal length/width, petal length/width).

We use data from the following UCI Machine Learning Repository: https://archive.ics.uci.edu/ml/datasets/Iris.

Inside the "Homework 1" folder on Piazza you can find three files containing the train, development, and test data (named "iris train.csv", "iris dev.csv", and "iris test.csv") for our experiments. The rows of those files refer to the data samples, while the columns denote the features (columns 1-4) and the class variable (column 5), as described below:

1. sepal length in cm

2. sepal width in cm

3. petal length in cm

4. petal width in cm

5. class: Iris Setosa, Iris Versicolour, Iris Virginica

(a.i) (0.5 points) Data exploration: Using the training data, compute the number of samples belonging to each class. Are the classes equally distributed?

(a.ii) (0.5 points) Data exploration: Using the training data, plot the histogram of each feature (i.e., 4 total histograms). How are the features distributed (e.g., unimodal, bimodal, uniform distributions)?

(a.iii) (1 point) Data exploration: Using the training data, plot scatter plots of all pairs of features (i.e., 6 total scatter plots). Use a color-coding to indicate the class to which the samples belong (e.g., blue for Setosa, red for Versicolour, green for Virginica). What do you observe? How separable do the classes look? Are there feature combinations for which the three classes are more separable?

(b.i) (2 points) Classification: Implement a K-Nearest Neighbor (K-NN) classifier using the Euclidean distance (l2-norm) as a distance measure to classify between the three classes. Please implement K-NN and do not use available libraries. In the report, please show your code.
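As a rough starting point, a from-scratch K-NN prediction can be as short as the sketch below (function and variable names are our own, and the toy arrays are not the IRIS data):

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_query, k):
    """Predict the label of x_query by majority vote among the k
    training samples closest in Euclidean (l2) distance."""
    dists = np.linalg.norm(X_train - x_query, axis=1)  # l2 distance to each train sample
    nearest = np.argsort(dists)[:k]                    # indices of the k closest samples
    votes = Counter(y_train[i] for i in nearest)       # count class labels among them
    return votes.most_common(1)[0][0]                  # majority class

# Tiny toy example:
X_train = np.array([[0.0, 0.0], [0.1, 0.1], [5.0, 5.0]])
y_train = np.array(["a", "a", "b"])
print(knn_predict(X_train, y_train, np.array([0.2, 0.0]), k=3))
```

For the actual homework you would load the CSV files yourself and loop this prediction over every dev/test sample.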

(b.ii) (1 point) Explore different values of K = 1, 3, 5, 7, . . . , 19. Train one model for each of the ten values of K using the train data and compute the classification accuracy (Acc) of each model on the development set. Plot the Acc metric on the dev set against the different values of K. Please report the best hyper-parameter K∗ based on the Acc metric. Please implement this procedure from scratch and do not use available libraries.

Hint: Acc = (# correctly classified samples) / (# samples)
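The accuracy formula and the K-sweep can be sketched as follows (a minimal illustration with our own names; `predict_all` stands in for your K-NN predictions on the dev set):

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Acc = (# correctly classified samples) / (# samples)."""
    return np.mean(np.asarray(y_true) == np.asarray(y_pred))

def best_k(ks, y_dev, predict_all):
    """Evaluate each candidate K on the dev set and return the best one.

    predict_all(k) should return the predicted labels for the whole
    dev set using a K-NN model with that k."""
    accs = {k: accuracy(y_dev, predict_all(k)) for k in ks}
    k_star = max(accs, key=accs.get)
    return k_star, accs

# Toy check of the accuracy formula:
print(accuracy([0, 1, 1, 0], [0, 1, 0, 0]))  # 3 of 4 correct -> 0.75
```

The `accs` dictionary returned by `best_k` is exactly what you need for the Acc-versus-K plot.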

(b.iii) (1 point) Report the Acc metric on the test set using K∗. What do you observe?

(b.iv) (Bonus, 1 point) Instead of using the Euclidean distance, experiment with different types of distances or distance combinations (e.g., l0-norm, l1-norm, cosine similarity). Report your findings.

Note: You do not need to explore all aforementioned distances.
