MP6: Dynamic Time Warping



ECE/CS 434 | MP6: Dynamic Time Warping

In this MP, you will:

Implement the Dynamic Time Warping Algorithm
Use the DTW algorithm to identify missing windows in time series data
import numpy as np
import random
import scipy

# This function is used to format test results. You don't need to touch it.
def display_table(data):
    from IPython.display import HTML, display

    html = "<table>"
    for row in data:
        html += "<tr>"
        for field in row:
            html += "<td><h4>{}</h4></td>".format(field)
        html += "</tr>"
    html += "</table>"
    display(HTML(html))
Problem Setting
Many off-the-shelf smart watches today can take the user’s electrocardiogram (also called an ECG or EKG), a test that records the timing and strength of the electrical signals that make the heart beat. However, wearing the watch too loosely, or having sweat or dirt between the watch and wrist, can result in poor readings. In this MP, we will simulate this scenario. You will be given a ground truth reading X and a watch reading Y. Compared to the ground truth reading, the watch reading has 1 missing time window and added noise. In other words, Y = X∖W + N, where W is the missing time window and N is random noise. Your task is to identify the start of the missing time window.

For example, let X = [1, 2, 3, 4, 5] and Y = [0.9, 2.1, 5.2]. We can see that the values 3 and 4 in X are missing from Y. Your program should then return 2, which is the index of the first missing value in X, that is, the index of 3 in X.
Note that we are only simulating an ECG missing data scenario. The data used is not actually ECG data. However, this should not make a difference in your implementation.
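The relationship Y = X∖W + N can be made concrete with a small simulation. This is only a sketch: the helper name simulate_watch_reading, the Gaussian noise model, and the noise_std and seed parameters are assumptions for illustration, not how the course datasets were generated.

```python
import numpy as np

def simulate_watch_reading(x, start, length, noise_std=0.1, seed=0):
    """Hypothetical helper: drop x[start:start+length], then add noise.

    The Gaussian noise model is an assumption made for this sketch.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    # Remove the missing window W from the ground truth signal.
    kept = np.delete(x, np.arange(start, start + length))
    # Add random noise N to what remains.
    return kept + rng.normal(0.0, noise_std, size=kept.shape)

X = np.array([1, 2, 3, 4, 5], dtype=float)
Y = simulate_watch_reading(X, start=2, length=2)
print(Y)  # three noisy values close to 1, 2 and 5
```

Here the window starts at index 2 of X, so a correct answer for this toy signal would be 2.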

Hint 1: The DTW algorithm essentially constructs a cost matrix, $C$, between the two signals, based on the distance between points in the two signals. The value of every cell in this matrix can be calculated as $C(c_j, r_j) = \mathrm{distance}(c_j, r_j) + \min\{C(c_{j-1}, r_{j-1}),\, C(c_{j-1}, r_j),\, C(c_j, r_{j-1})\}$. Refer to the lecture notes for more details.
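The recurrence in Hint 1 can be sketched as follows. This is one possible layout, not the required one: the function name dtw_cost_matrix is hypothetical, the absolute difference is assumed as the distance, and rows are assumed to index X while columns index Y. A border of infinities keeps the first row and column from needing special cases.

```python
import numpy as np

def dtw_cost_matrix(x, y):
    """Hypothetical helper: cumulative DTW cost, rows = X, columns = Y."""
    n, m = len(x), len(y)
    # Padded matrix: D[0, 0] = 0 and the rest of the border is infinite,
    # so the min() below never picks a cell outside the real matrix.
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Assumed distance: absolute difference between the two samples.
            d = abs(x[i - 1] - y[j - 1])
            D[i, j] = d + min(D[i - 1, j - 1], D[i - 1, j], D[i, j - 1])
    return D[1:, 1:]  # drop the padding

X = [1, 2, 3, 4, 5]
Y = [0.9, 2.1, 5.2]
C = dtw_cost_matrix(X, Y)
print(C.shape)    # (5, 3)
print(C[-1, -1])  # total alignment cost, approximately 2.5
```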
Hint 2: The missing time window can be identified by examining the cost matrix: 1 time point in Y will match to multiple time points in X. You can approach this by starting at (0, 0) of the cost matrix and traveling towards (len(X), len(Y)). At every step, decide whether to go right, down, or diagonal by choosing the cell with minimal cost. If the decision is to go down for multiple steps in a row, you might have encountered the missing time window, since 1 point in Y is being matched to multiple points in X.
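The walk described in Hint 2 might look like the sketch below. The helper names dtw_cost_matrix, greedy_path and down_runs are hypothetical, rows are assumed to index X and columns Y (so "down" means advancing in X only), and ties are broken in favour of the diagonal. Turning the runs of down steps into up to 3 candidate indices is left open.

```python
import numpy as np

def dtw_cost_matrix(x, y):
    """Hypothetical helper: cumulative DTW cost, rows = X, columns = Y."""
    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])
            D[i, j] = d + min(D[i - 1, j - 1], D[i - 1, j], D[i, j - 1])
    return D[1:, 1:]

def greedy_path(C):
    """Walk from (0, 0) to the last cell, always taking the cheapest neighbour."""
    n, m = C.shape
    i = j = 0
    path = [(0, 0)]
    while (i, j) != (n - 1, m - 1):
        moves = []
        if i + 1 < n and j + 1 < m:
            moves.append((C[i + 1, j + 1], i + 1, j + 1))  # diagonal
        if i + 1 < n:
            moves.append((C[i + 1, j], i + 1, j))          # down
        if j + 1 < m:
            moves.append((C[i, j + 1], i, j + 1))          # right
        # min() keeps the first minimum, so the diagonal wins ties.
        _, i, j = min(moves, key=lambda move: move[0])
        path.append((i, j))
    return path

def down_runs(path):
    """Return (first_row, length) for every run of consecutive down steps."""
    runs, run_start, run_len = [], None, 0
    for (pi, pj), (ci, cj) in zip(path, path[1:]):
        if ci == pi + 1 and cj == pj:
            if run_len == 0:
                run_start = ci
            run_len += 1
        else:
            if run_len:
                runs.append((run_start, run_len))
            run_len = 0
    if run_len:
        runs.append((run_start, run_len))
    return runs

X = [1, 2, 3, 4, 5]
Y = [0.9, 2.1, 5.2]
path = greedy_path(dtw_cost_matrix(X, Y))
print(path)             # [(0, 0), (1, 1), (2, 1), (3, 2), (4, 2)]
print(down_runs(path))  # [(2, 1), (4, 1)]
```

In this toy example the two missing values are split between their neighbours: 3 is matched to 2.1 and 5 shares 5.2 with 4, so the walk goes down twice, once into row 2 and once into row 4, rather than in one long run. On longer, noisier signals a longer run of down steps is the more useful signal.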
Your Implementation
Implement your algorithm in the function find_missing_window(X, Y, N). Do NOT change its function signature. You are, however, free to define and use helper functions. You are NOT allowed to use existing Python DTW packages.

def find_missing_window(X, Y, N):
    """Identifies where data is missing.

    X: The ground truth signal
    Y: Signal with 1 time window missing from X and added noise
    N: Approximate length of the missing time window
    Returns: Candidate indices of the missing time window in X. See section above for an example.
        You may return up to 3 candidate results. You will receive full points as long as 1 falls within the grading criteria.
        For example, if you think the missing time window starts at index 3 but indices 8 and 40 are also possible,
        then return [3, 8, 40].
    """
    # Placeholder: replace with your implementation.
    return [0, 0, 0]
Running and Testing
Use the cell below to run and test your code, and to get an estimate of your grade.

def calculate_score(groundtruth, candidates, threshold):
    # Full credit if any candidate is strictly within the threshold of the ground truth.
    for candidate in candidates:
        if groundtruth - threshold < candidate < groundtruth + threshold:
            return 1
    return 0

import matplotlib.pyplot as plt
if __name__ == '__main__':
    output = [['Test', 'Correct Index', 'Calculated Indices', 'Grade']]
    windows = [3, 5, 56, 32] # 20
    N = [2, 2, 4, 5]
    for i in range(4):
        X = np.loadtxt(open('{}_X.csv'.format(i), "rb"), delimiter=",", skiprows=1)
        Y = np.loadtxt(open('{}_Y.csv'.format(i), "rb"), delimiter=",", skiprows=1)
        student_answer = find_missing_window(X, Y, N[i])
        score = calculate_score(windows[i], student_answer, max(4, N[i] * 0.5))
        output.append([i, windows[i], student_answer, "{:0.0f} / 10".format(score * 10)])
    output.append(['<i>👻 Hidden test 0 👻</i>', '<i>???</i>', '<i>???</i>', '<i>???</i> / 10'])
    output.append(['<i>👻 Hidden test 1 👻</i>', '<i>???</i>', '<i>???</i>', '<i>???</i> / 10'])
    output.append(['<i>👻 Hidden test 2 👻</i>', '<i>???</i>', '<i>???</i>', '<i>???</i> / 10'])
    output.append(['<i>👻 Hidden test 3 👻</i>', '<i>???</i>', '<i>???</i>', '<i>???</i> / 10'])
    display_table(output)
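Test | Correct Index | Calculated Indices | Grade
0 | 3 | [0, 0, 0] | 10 / 10
1 | 5 | [0, 0, 0] | 0 / 10
2 | 56 | [0, 0, 0] | 0 / 10
3 | 32 | [0, 0, 0] | 0 / 10
👻 Hidden test 0 👻 | ??? | ??? | ??? / 10
👻 Hidden test 1 👻 | ??? | ??? | ??? / 10
👻 Hidden test 2 👻 | ??? | ??? | ??? / 10
👻 Hidden test 3 👻 | ??? | ??? | ??? / 10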
You will be graded on the four datasets provided to you (10 points each) and four additional datasets. We will use the same code from the Running and Testing section above to grade all 8 traces of data. As long as 1 of your 3 candidate outputs is within the grading threshold, max(4, window_size × 0.5), of the correct index, you will receive 10 points. No partial credit is awarded since we are essentially allowing 3 guesses.
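To make the threshold concrete, the sketch below applies the same strict comparison as calculate_score to a single candidate. The helper name is_within_threshold is hypothetical and exists only for this illustration.

```python
def is_within_threshold(groundtruth, candidate, window_size):
    """Hypothetical helper mirroring the comparison in calculate_score."""
    threshold = max(4, window_size * 0.5)
    # Strict inequalities: a candidate exactly `threshold` away does not count.
    return groundtruth - threshold < candidate < groundtruth + threshold

# Correct index 56, window size 4: the threshold is max(4, 2) = 4.
print(is_within_threshold(56, 59, 4))   # True  (3 away)
print(is_within_threshold(56, 60, 4))   # False (exactly 4 away)
# Correct index 3, window size 2: a guess of 0 is accepted.
print(is_within_threshold(3, 0, 2))     # True
# A window size of 10 widens the threshold to 5.
print(is_within_threshold(20, 24, 10))  # True
```

The third case shows why the placeholder return value [0, 0, 0] already scores 10 / 10 on test 0 while failing the other provided tests.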

Submission Guidelines
This Jupyter notebook (MP6.ipynb) is the only file you need to submit on Gradescope. Since this is the last MP, you can expect the grade you see on Gradescope to be the final grade of this MP. Regrade requests are not accepted.

Make sure any code you add to this notebook, except for import statements, is either in a function or guarded by __main__ (which won’t be run by the autograder). Gradescope will give you immediate feedback using the provided test cases. It is your responsibility to check the output before the deadline to ensure your submission runs with the autograder.
