MGMTMFE 407: Empirical Methods in Finance

Homework 1

Please use R to solve these problems. Each group can hand in one set of solutions that lists the names of all contributing students. Use the electronic drop box to submit your answers. Submit the R file and the file with a short write-up of your answers separately.

[The quality of the write-up matters for your grade. Please imagine that you're writing a report for your boss at Goldman when drafting answers to these questions. Try to be clear and precise.]

Problem 1: Building a simple autocorrelation-based forecasting model

Fama and French (2015) propose a five-factor model for expected stock returns. One of the factors is based on cross-sectional sorts on firm profitability. In particular, the factor portfolio is long firms with high profitability (high earnings divided by book equity; high ROE) and short firms with low profitability (low earnings divided by book equity; low ROE). This factor is called RMW (Robust Minus Weak).

1. Go to Ken French's Data Library (google it) and download the Fama/French 5 Factors (2×3) in CSV format. Denote the time series of value-weighted monthly factor returns for the RMW factor from 196307 to 201911 as “rmw.” Plot the time series and report the annualized mean and standard deviation of this return series.

2. Plot the 1st to 60th order autocorrelations of rmw. Also plot the cumulative sum of these autocorrelations (that is, the 5th observation is the sum of the first 5 autocorrelations, the 11th observation is the sum of the first 11 autocorrelations, etc.). Describe these plots. In particular, do the plots hint at predictability of the factor returns? What are the salient patterns, if any?

3. Perform a Ljung-Box test of the hypothesis that the first 6 autocorrelations are jointly zero. Write out the form of the test and report the p-value. What do you conclude from this test?

4. Based on your observations in (2) and (3), propose a parsimonious forecasting model for rmw. That is, for the prediction model

$$rmw_{t+1} = \alpha + \beta' x_t + \varepsilon_{t+1}, \tag{1}$$

choose the variables in $x_t$; it could be a single variable or a $K \times 1$ vector. While this analysis is in-sample, I do want you to argue for your variables by attaching a “story” to your model that makes it more ex ante believable. (PS: This question is purposefully a little vague. There is not a single correct answer here, just grades of more to less reasonable, as in the real world.)

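5. Estimate the proposed model. Report robust (White) standard errors for $\hat{\beta}$, as well as the regular OLS standard errors. In particular, from the lecture notes we have that

$$\mathrm{Var}_{White}\left(\hat{\beta}\right) = \frac{1}{T}\left(\frac{1}{T}\sum_{t=1}^{T} x_t x_t'\right)^{-1}\left(\frac{1}{T}\sum_{t=1}^{T} x_t x_t'\,\hat{\varepsilon}_t^{2}\right)\left(\frac{1}{T}\sum_{t=1}^{T} x_t x_t'\right)^{-1}, \tag{2}$$

$$\mathrm{Var}_{OLS}\left(\hat{\beta}\right) = \frac{1}{T}\left(\frac{1}{T}\sum_{t=1}^{T} x_t x_t'\right)^{-1}\left(\frac{1}{T}\sum_{t=1}^{T} x_t x_t' \cdot \frac{1}{T}\sum_{t=1}^{T}\hat{\varepsilon}_t^{2}\right)\left(\frac{1}{T}\sum_{t=1}^{T} x_t x_t'\right)^{-1}. \tag{3}$$

(In asymptotic standard errors, we do not adjust for degrees of freedom, which is why we simply divide by T.)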
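To make the mechanics of items 2, 3, and 5 concrete, here is a minimal R sketch on simulated data. It is not a solution and makes several assumptions for illustration only: the AR(1) series `y` is a hypothetical stand-in for rmw, the regressor vector $x_t$ is assumed to stack a constant with the lagged return, and all variable names are made up for this example. It computes the autocorrelations and their cumulative sum, checks the Ljung-Box statistic by hand against `Box.test`, and evaluates equations (2) and (3) directly.

```r
# Sketch on SIMULATED data; the AR(1) series is a hypothetical stand-in for rmw.
set.seed(407)
T_obs <- 600
y <- 0.003 + as.numeric(arima.sim(model = list(ar = 0.15), n = T_obs, sd = 0.02))

# Autocorrelations at lags 1 to 60 and their cumulative sum
rho <- acf(y, lag.max = 60, plot = FALSE)$acf[-1]  # drop the lag-0 term
cum_rho <- cumsum(rho)

# Ljung-Box test on the first 6 autocorrelations: built-in and by hand,
# Q = T (T + 2) * sum_{k=1}^{6} rho_k^2 / (T - k), chi-squared with 6 df under the null
lb <- Box.test(y, lag = 6, type = "Ljung-Box")
k <- 1:6
Q_manual <- T_obs * (T_obs + 2) * sum(rho[k]^2 / (T_obs - k))
p_manual <- pchisq(Q_manual, df = 6, lower.tail = FALSE)

# Illustrative regression of y_{t+1} on x_t = (1, y_t)'; this x_t is an assumption
X <- cbind(1, y[-T_obs])
y_next <- y[-1]
n <- nrow(X)
b <- solve(crossprod(X), crossprod(X, y_next))  # OLS coefficients
e <- as.numeric(y_next - X %*% b)               # fitted residuals

# Equations (2) and (3), dividing by the sample size with no df adjustment
Sxx_inv <- solve(crossprod(X) / n)
meat_white <- crossprod(X * e) / n              # (1/T) sum x_t x_t' e_t^2
meat_ols <- (crossprod(X) / n) * mean(e^2)      # (1/T) sum x_t x_t' times (1/T) sum e_t^2
V_white <- Sxx_inv %*% meat_white %*% Sxx_inv / n
V_ols <- Sxx_inv %*% meat_ols %*% Sxx_inv / n

print(cbind(estimate = as.numeric(b),
            se_white = sqrt(diag(V_white)),
            se_ols = sqrt(diag(V_ols))))
```

Because equation (3) divides by T rather than T minus the number of regressors, `se_ols` here will be slightly smaller than the standard errors printed by `summary(lm(...))`; the two differ by a factor of `sqrt((n - 2) / n)` in this two-regressor case.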

Problem 2: Nonstationarity and regression models

1. Simulate T time-series observations of each of the following two return series N times:

$$\begin{aligned} r_{1,t} &= \mu + \sigma \varepsilon_{1,t}, \\ r_{2,t} &= \mu + \sigma \varepsilon_{2,t}, \end{aligned} \tag{4}$$

where $\mu = 0.5\%$, $\sigma = 4\%$, and the residuals are uncorrelated standard Normals. Let $T = 600$ and $N = 10{,}000$. For each of the N time series, regress

$$r_{1,t} = \alpha + \beta r_{2,t} + \varepsilon_t, \tag{5}$$

and save the slope coefficient as $\beta^{(n)}$, where $n = 1, \ldots, N$. Give the mean and standard deviation of $\beta$ across samples n and plot the histogram of the 10,000 $\beta$'s. Does this correspond to the null hypothesis $\beta = 0$? Do the regression standard errors look ok?

2. Next, construct N price samples of length T based on each return series using

$$\begin{aligned} p_{1,t} &= p_{1,t-1} + r_{1,t}, \\ p_{2,t} &= p_{2,t-1} + r_{2,t}, \end{aligned} \tag{6}$$

with $p_{1,0} = p_{2,0} = 0$ as the initial condition. Now, repeat the regression exercise using the regression

$$p_{1,t} = \alpha + \beta p_{2,t} + \varepsilon_t. \tag{7}$$

Again report the mean and standard deviation of the N estimated $\beta$'s and plot the histogram. Does this correspond to the null hypothesis $\beta = 0$? Do the regression standard errors look ok? Explain what is going on here that is different from the previous return-based regressions.
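One possible way to organize the simulation in Problem 2 is sketched below in R. It is a starting point under stated assumptions, not a reference solution: the helper `sim_once` and the result matrix `res` are hypothetical names, and N is cut to 500 (instead of the 10,000 asked for above) purely so the sketch runs quickly. Each replication draws both return series, cumulates them into prices with a zero initial condition, and stores the slope and its reported OLS standard error from each regression.

```r
# Sketch only: N is reduced from 10,000 to 500 so the example runs quickly.
set.seed(1)
T_obs <- 600
N <- 500
mu <- 0.005    # 0.5%
sigma <- 0.04  # 4%

# One replication (hypothetical helper): returns, prices, and both regressions
sim_once <- function() {
  r1 <- mu + sigma * rnorm(T_obs)
  r2 <- mu + sigma * rnorm(T_obs)
  p1 <- cumsum(r1)  # p_t = p_{t-1} + r_t with p_0 = 0
  p2 <- cumsum(r2)
  cf_r <- summary(lm(r1 ~ r2))$coefficients
  cf_p <- summary(lm(p1 ~ p2))$coefficients
  c(beta_r = cf_r[2, 1], se_r = cf_r[2, 2],
    beta_p = cf_p[2, 1], se_p = cf_p[2, 2])
}

res <- t(replicate(N, sim_once()))  # N x 4 matrix, one row per simulated sample

# Mean and standard deviation across samples for every stored quantity
print(rbind(mean = colMeans(res), sd = apply(res, 2, sd)))

# Histograms of the slope estimates (drawn only in an interactive session)
if (interactive()) {
  hist(res[, "beta_r"], breaks = 40, main = "Return regression", xlab = "slope")
  hist(res[, "beta_p"], breaks = 40, main = "Price regression", xlab = "slope")
}
```

A useful check on whether the regression standard errors "look ok" is to compare the standard deviation of the slope across simulations (the `sd` row for `beta_r` or `beta_p`) with the average standard error reported by the regressions (the `mean` row for `se_r` or `se_p`).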