Stat261

Assignment #5

The files Stat261-Assign5-R-2018.pdf and StudentPerformanceData.pdf are required to complete this assignment. The file Stat261-Assign5-R-2018.Rmd contains the code that generated Stat261-Assign5-R-2018.pdf. Section numbers in the assignment questions refer to the sections in Stat261-Assign5-R-2018.pdf.

We will use the variables G3, sex, and Walc in this data set. Look up the definitions of these variables in the file StudentPerformanceData.pdf.

1. Using the information (summary and graphs) in Section 1.1, comment on the variables G3, sex, and Walc.

2. Section 1.2 contains an analysis of the first 10 grade observations.

(a) Comment on the distribution of these 10 observations given Figures 3 and 4.

(b) What is a 95% confidence interval for the mean of the 10 observations?

(c) Using the 10 observations, perform a test of the hypothesis that the mean grade is 10. Include a concluding sentence which could be incorporated into a report about this data to your boss. Hint: the required computed quantities are given in this section.
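As a reminder of the mechanics behind parts (b) and (c), here is a minimal Python sketch of a one-sample t statistic and the matching 95% confidence interval. The grades below are hypothetical and are not the assignment's observations, and the function name `one_sample_t` is an assumption made for illustration. The critical value 2.262 is the upper 2.5% point of the t distribution with 9 degrees of freedom.

```python
import math
import statistics


def one_sample_t(data, mu0, t_crit):
    """Return (mean, t statistic, (ci_low, ci_high)) for H0: mean == mu0."""
    n = len(data)
    mean = statistics.mean(data)
    # statistics.stdev uses the n - 1 divisor (sample standard deviation).
    se = statistics.stdev(data) / math.sqrt(n)
    t_stat = (mean - mu0) / se
    return mean, t_stat, (mean - t_crit * se, mean + t_crit * se)


# Hypothetical grades, NOT the first 10 observations from the assignment.
grades = [12, 9, 14, 10, 8, 15, 11, 13, 7, 11]
T_CRIT_9DF = 2.262  # upper 2.5% point of t with 9 degrees of freedom

mean, t_stat, ci = one_sample_t(grades, mu0=10, t_crit=T_CRIT_9DF)
print(f"mean = {mean:.2f}, t = {t_stat:.3f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
if abs(t_stat) > T_CRIT_9DF:
    print("reject H0 at the 5% level")
else:
    print("do not reject H0 at the 5% level")
```

The two-sided test at the 5% level and the 95% interval always agree: the hypothesized mean is rejected exactly when it falls outside the interval.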

3. We are interested in comparing the grades for boys and girls in Section 1.3.

(a) Comment on Figure 5.

(b) Assuming that the variances for the boys and girls are equal, perform a test of the hypothesis that the mean grade for boys is the same as the mean grade for girls. Include a concluding sentence which could be incorporated into a report about this data to your boss.

(c) Without assuming that the variances for the boys and girls are equal, perform a test of the hypothesis that the mean grade for boys is the same as the mean grade for girls. Include a concluding sentence which could be incorporated into a report about this data to your boss.

(d) Were your conclusions for the two tests above the same? Why do you suppose that is?
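To make the difference between the two tests in parts (b) and (c) concrete, the following Python sketch computes the pooled-variance t statistic and the unequal-variance (Welch) t statistic with its Satterthwaite degrees of freedom. The two samples are hypothetical, not the assignment data, and the function names `pooled_t` and `welch_t` are assumptions made for illustration.

```python
import math
import statistics


def pooled_t(x, y):
    """Equal-variance two-sample t statistic and its degrees of freedom."""
    n1, n2 = len(x), len(y)
    v1, v2 = statistics.variance(x), statistics.variance(y)
    # Pooled variance: weighted average of the two sample variances.
    sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    diff = statistics.mean(x) - statistics.mean(y)
    return diff / math.sqrt(sp2 * (1 / n1 + 1 / n2)), n1 + n2 - 2


def welch_t(x, y):
    """Unequal-variance t statistic with Satterthwaite degrees of freedom."""
    n1, n2 = len(x), len(y)
    a = statistics.variance(x) / n1
    b = statistics.variance(y) / n2
    diff = statistics.mean(x) - statistics.mean(y)
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return diff / math.sqrt(a + b), df


# Hypothetical grades for two groups, NOT the assignment data.
boys = [10, 12, 9, 14, 11, 13]
girls = [11, 8, 12, 10, 9, 13, 10, 11]

t_p, df_p = pooled_t(boys, girls)
t_w, df_w = welch_t(boys, girls)
print(f"pooled: t = {t_p:.3f} on {df_p} df")
print(f"Welch:  t = {t_w:.3f} on {df_w:.2f} df")
```

When the two samples have the same size, the two statistics coincide and only the degrees of freedom differ; the Welch degrees of freedom never exceed n1 + n2 - 2.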

4. In Section 1.4, we are interested in whether there is a relationship between student grades and Walc, weekend alcohol consumption.

(a) Comment on Figures 6 and 7.

(b) The results from fitting a straight line model to the grades as a function of weekend alcohol consumption are given on page 10. What is the estimated straight line model for this data?

(c) Perform a test of the hypothesis that the slope parameter is zero. Include a concluding sentence which could be incorporated into a report about this data to your boss.

(d) Figure 8 is a Q-Q plot of the residuals from the straight line model fit. Comment on this plot.

(e) Figure 10 contains the side-by-side boxplots of the grades by Walc, with the fitted line drawn on top. Comment on this graph.
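For reference, the quantities reported for a straight line fit (intercept, slope, and the t statistic for a zero slope) can be computed by hand as in the Python sketch below. The five data points are hypothetical and unrelated to the assignment data, and the function name `fit_line` is an assumption made for illustration.

```python
import math


def fit_line(x, y):
    """Least-squares fit of y = b0 + b1 * x.

    Returns (b0, b1, t statistic for H0: slope == 0, residual df).
    """
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    # Residual variance estimate uses n - 2 degrees of freedom.
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s2 = sse / (n - 2)
    se_b1 = math.sqrt(s2 / sxx)
    return b0, b1, b1 / se_b1, n - 2


# Hypothetical data, NOT the Walc and G3 values from the assignment.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

b0, b1, t_slope, df = fit_line(x, y)
print(f"fitted line: y = {b0:.2f} + {b1:.2f} x")
print(f"t for zero slope = {t_slope:.3f} on {df} df")
```

The t statistic is compared with the t distribution on n - 2 degrees of freedom, the same comparison that a regression summary table reports as the p-value for the slope.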

BONUS QUESTIONS:

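1. (BONUS) Suppose that $Y_1, Y_2, \ldots, Y_n$ are independent $N(\alpha, \sigma^2)$. Show that if $\sigma$ is unknown, the likelihood ratio statistic for testing $H_0 : \alpha = \alpha_0$ is given by:

$$D = n \ln\left(1 + \frac{1}{n-1}\,T^2\right), \quad \text{where} \quad T = \frac{\hat{\alpha} - \alpha_0}{s/\sqrt{n}}.$$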
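Before attempting the proof, it can help to confirm the identity numerically. The Python sketch below computes the likelihood ratio statistic directly, as twice the difference between the maximized log likelihoods, and compares it with the closed form in terms of T. The sample is hypothetical and the function names are assumptions made for illustration; the check is not a substitute for the derivation.

```python
import math
import statistics


def normal_loglik(data, alpha, sigma2):
    """Log likelihood of independent N(alpha, sigma2) observations."""
    n = len(data)
    ss = sum((y - alpha) ** 2 for y in data)
    return -0.5 * n * math.log(2 * math.pi * sigma2) - ss / (2 * sigma2)


def lr_direct(data, alpha0):
    """2 * [max log likelihood overall - max log likelihood under H0]."""
    n = len(data)
    alpha_hat = statistics.fmean(data)
    sigma2_hat = sum((y - alpha_hat) ** 2 for y in data) / n  # unrestricted MLE
    sigma2_0 = sum((y - alpha0) ** 2 for y in data) / n       # MLE with alpha = alpha0
    return 2 * (normal_loglik(data, alpha_hat, sigma2_hat)
                - normal_loglik(data, alpha0, sigma2_0))


def lr_formula(data, alpha0):
    """Closed form n * ln(1 + T^2 / (n - 1))."""
    n = len(data)
    t = (statistics.fmean(data) - alpha0) / (statistics.stdev(data) / math.sqrt(n))
    return n * math.log(1 + t ** 2 / (n - 1))


# Hypothetical sample, used only to check the identity.
sample = [4.1, 5.3, 6.0, 4.8, 5.9, 7.2, 5.1]
d_direct = lr_direct(sample, alpha0=5.0)
d_formula = lr_formula(sample, alpha0=5.0)
print(d_direct, d_formula)
```

The two printed values should agree up to floating-point rounding for any sample and any value of alpha0.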

2. (BONUS) Testing equality of variances. Consider $k$ independent normal samples of sizes $n_1, n_2, \ldots, n_k$. Measurements from sample $i$ have unknown variance $\sigma_i^2$. Let $s_1^2, s_2^2, \ldots, s_k^2$ be the sample variances computed from the sample data, which are estimates of $\sigma_1^2, \sigma_2^2, \ldots, \sigma_k^2$. Since the measurements are normally distributed, we know that

$$(n_i - 1)\,s_i^2/\sigma_i^2 \sim \chi^2_{(n_i - 1)} \quad \text{for } i = 1, 2, \ldots, k.$$

Using the above distribution, the log likelihood for $\sigma_i$ is therefore:

$$\ell(\sigma_i) = -(n_i - 1)\ln\sigma_i - (n_i - 1)\,s_i^2/(2\sigma_i^2).$$

(a) Find the joint log likelihood function of $\sigma_1, \sigma_2, \ldots, \sigma_k$ and show that it is maximized for $\hat{\sigma}_i^2 = s_i^2$, $i = 1, 2, \ldots, k$.

(b) Show that if $\sigma_1 = \sigma_2 = \ldots = \sigma_k = \sigma$, then the MLE of $\sigma^2$ is given by

$$s^2_{\text{pooled}} = \left(\sum_{i=1}^{k} (n_i - 1)\,s_i^2\right) \Big/ \left(\sum_{i=1}^{k} (n_i - 1)\right).$$

(c) Show that the likelihood ratio statistic for testing $H_0 : \sigma_1 = \sigma_2 = \ldots = \sigma_k = \sigma$ is given by

$$D = \sum_{i=1}^{k} (n_i - 1)\ln\left(s^2_{\text{pooled}}/s_i^2\right).$$
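As with the first bonus question, the result in part (c) can be checked numerically before proving it. The Python sketch below evaluates the log likelihood given in the question at the separate estimates from part (a) and at the pooled estimate from part (b), then compares twice the difference with the closed form for D. The three samples are hypothetical and the function names are assumptions made for illustration.

```python
import math
import statistics


def loglik(sigma2, n, s2):
    """Log likelihood from the question, written in terms of sigma^2.

    -(n - 1) ln(sigma) - (n - 1) s^2 / (2 sigma^2), with ln(sigma) = 0.5 ln(sigma^2).
    """
    return -(n - 1) * 0.5 * math.log(sigma2) - (n - 1) * s2 / (2 * sigma2)


def pooled_variance(samples):
    """Weighted average of sample variances with weights n_i - 1."""
    num = sum((len(x) - 1) * statistics.variance(x) for x in samples)
    den = sum(len(x) - 1 for x in samples)
    return num / den


def lr_direct(samples):
    """2 * [joint log likelihood at s_i^2 - joint log likelihood at pooled]."""
    sp2 = pooled_variance(samples)
    free = sum(loglik(statistics.variance(x), len(x), statistics.variance(x))
               for x in samples)
    null = sum(loglik(sp2, len(x), statistics.variance(x)) for x in samples)
    return 2 * (free - null)


def lr_formula(samples):
    """Closed form: sum of (n_i - 1) ln(s_pooled^2 / s_i^2)."""
    sp2 = pooled_variance(samples)
    return sum((len(x) - 1) * math.log(sp2 / statistics.variance(x))
               for x in samples)


# Hypothetical samples, used only to check the identity.
samples = [
    [4.1, 5.3, 6.0, 4.8, 5.9],
    [7.2, 3.1, 9.4, 5.5, 6.8, 2.9],
    [5.0, 5.4, 4.7, 5.2],
]
d_direct = lr_direct(samples)
d_formula = lr_formula(samples)
print(d_direct, d_formula)
```

The statistic is zero when all sample variances are equal and grows as they spread apart, which is the behaviour expected of a test of equal variances.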
