Question 1: Designing a Bayesian Network (Reproduced from Russell & Norvig, Exercise 14.11, pg. 561) In your local nuclear power station, there is an alarm that senses when a temperature gauge exceeds a given threshold. The gauge measures the temperature of the core. Consider the Boolean variables: A (alarm sounds), F_A (alarm is faulty), and F_G (gauge is faulty), and the multivalued nodes G (gauge reading) and T (actual core temperature).

a. Draw a Bayesian network for this domain, given that the gauge is more likely to fail when the core temperature gets too high.
b. Is your network a polytree? Why or why not?
c. Suppose there are just two possible actual and measured temperatures, normal and high; the probability that the gauge gives the correct temperature is x when it is working, but y when it is faulty. Give the conditional probability table associated with G.
d. Suppose the alarm works correctly unless it is faulty, in which case it never sounds. Give the conditional probability table associated with A.
e. Suppose the alarm and gauge are working and the alarm sounds. Derive an expression for the probability that the temperature of the core is too high, in terms of the various conditional probabilities in the network.
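As a concreteness check for parts (c) and (d), the two conditional distributions can be sketched in code. This is a hedged sketch: the function names and the 'normal'/'high' string encoding are illustrative choices, not part of the exercise.

```python
# Illustrative sketch of the CPTs asked for in parts (c) and (d).
# x = P(correct reading | gauge working), y = P(correct reading | gauge faulty).

def p_gauge(g, t, f_g, x, y):
    """P(G = g | T = t, F_G = f_g), with g, t in {'normal', 'high'} (part c)."""
    correct = y if f_g else x          # chance the gauge reads the true temperature
    return correct if g == t else 1 - correct

def p_alarm(a, g, f_a):
    """P(A = a | G = g, F_A = f_a): a faulty alarm never sounds; a working
    alarm sounds exactly when the gauge reads 'high' (part d)."""
    p_sounds = 0.0 if f_a else (1.0 if g == 'high' else 0.0)
    return p_sounds if a else 1 - p_sounds
```

A quick sanity check on any table you write down: each conditional distribution must sum to one over the child variable's values for every setting of the parents.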

Question 2: Inference in Bayesian Networks Consider the following Bayesian network.

R: Rush Hour
B: Bad Weather
A: Accident
T: Traffic Jam
S: Sirens

We will denote random variables with capital letters (e.g., R), and the binary outcomes with lowercase letters (e.g., r and ¬r). The network has the following parameters:

P(b) = 0.2
P(r) = 0.15

P(t|r, b, a) = 0.95
P(t|r, ¬b, a) = 0.92
P(t|r, b, ¬a) = 0.90
P(t|r, ¬b, ¬a) = 0.85
P(t|¬r, b, a) = 0.35
P(t|¬r, b, ¬a) = 0.4
P(t|¬r, ¬b, a) = 0.6
P(t|¬r, ¬b, ¬a) = 0.05

P(s|a) = 0.8 P(s|¬a) = 0.2

P(a|b) = 0.6 P(a|¬b) = 0.4

Compute the following terms using basic axioms of probability and the conditional independence properties encoded in the above graph. You can use Bayes Ball properties to simplify the computation, if applicable.

a. P(s, r)
b. P(a, ¬t)
c. P(t|s)
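These quantities can be checked numerically by brute-force enumeration over the joint distribution. A sketch, assuming the structure implied by the given CPTs (R and B are roots, A ← B, T ← R, B, A, and S ← A); the Boolean encoding and helper names are my own:

```python
# Brute-force check of (a)-(c) by enumerating all 2^5 joint assignments.
from itertools import product

P_r, P_b = 0.15, 0.2
P_a_given_b = {True: 0.6, False: 0.4}
P_s_given_a = {True: 0.8, False: 0.2}
P_t_given = {  # keyed by (r, b, a)
    (True, True, True): 0.95,  (True, False, True): 0.92,
    (True, True, False): 0.90, (True, False, False): 0.85,
    (False, True, True): 0.35, (False, True, False): 0.4,
    (False, False, True): 0.6, (False, False, False): 0.05,
}

def joint(r, b, a, t, s):
    """P(R=r, B=b, A=a, T=t, S=s) via the chain rule on the network."""
    p = (P_r if r else 1 - P_r) * (P_b if b else 1 - P_b)
    p *= P_a_given_b[b] if a else 1 - P_a_given_b[b]
    p *= P_t_given[(r, b, a)] if t else 1 - P_t_given[(r, b, a)]
    p *= P_s_given_a[a] if s else 1 - P_s_given_a[a]
    return p

def prob(**evidence):
    """Marginal probability of the given assignments (all others summed out)."""
    return sum(joint(r, b, a, t, s)
               for r, b, a, t, s in product([True, False], repeat=5)
               if all(dict(r=r, b=b, a=a, t=t, s=s)[k] == v
                      for k, v in evidence.items()))

print(prob(s=True, r=True))                  # a. P(s, r)
print(prob(a=True, t=False))                 # b. P(a, ¬t)
print(prob(t=True, s=True) / prob(s=True))   # c. P(t|s)
```

Enumeration also confirms any Bayes Ball simplification you use; for instance, with nothing observed every S–R path is blocked by the collider at T, so P(s, r) = P(s)P(r).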

Question 3: Variable Elimination

For the network above, compute the MAP assignment for the query P(T|a) using variable elimination with the following order: R, S, B, A, T.

Clearly explain each step. For each of the intermediate factors created, explain what probabilistic function it represents.
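As a cross-check on the hand computation, the same elimination can be sketched with dict-based factors. A sketch, not the required solution format: the factor representation and helper names are my own, and since A = a is evidence and T is the query, only R, S, and B from the given order are actually summed out.

```python
# Variable elimination sketch for P(T | a). A factor is a pair
# (tuple of variable names, {assignment tuple: value}); the numbers are
# the Question 2 CPTs with the evidence A = a already plugged in.
from itertools import product

def multiply(f1, f2):
    """Pointwise product of two factors over the union of their variables."""
    (v1, t1), (v2, t2) = f1, f2
    vs = v1 + tuple(x for x in v2 if x not in v1)
    table = {}
    for asg in product([True, False], repeat=len(vs)):
        env = dict(zip(vs, asg))
        table[asg] = (t1[tuple(env[x] for x in v1)]
                      * t2[tuple(env[x] for x in v2)])
    return (vs, table)

def sum_out(var, f):
    """Marginalize var out of factor f."""
    vs, table = f
    keep = tuple(x for x in vs if x != var)
    out = {}
    for asg, p in table.items():
        env = dict(zip(vs, asg))
        key = tuple(env[x] for x in keep)
        out[key] = out.get(key, 0.0) + p
    return (keep, out)

pt = {(True, True): 0.95, (True, False): 0.92,    # P(t | R, B, a)
      (False, True): 0.35, (False, False): 0.6}
factors = [
    (('R',), {(True,): 0.15, (False,): 0.85}),    # P(R)
    (('B',), {(True,): 0.2, (False,): 0.8}),      # P(B)
    (('B',), {(True,): 0.6, (False,): 0.4}),      # P(a | B), evidence applied
    (('S',), {(True,): 0.8, (False,): 0.2}),      # P(S | a), evidence applied
    (('T', 'R', 'B'),
     {(t, r, b): pt[(r, b)] if t else 1 - pt[(r, b)]
      for t, r, b in product([True, False], repeat=3)}),  # P(T | R, B, a)
]

for var in ('R', 'S', 'B'):          # the non-evidence, non-query variables
    rel = [f for f in factors if var in f[0]]
    factors = [f for f in factors if var not in f[0]]
    prod_f = rel[0]
    for f in rel[1:]:
        prod_f = multiply(prod_f, f)
    factors.append(sum_out(var, prod_f))

res = factors[0]
for f in factors[1:]:
    res = multiply(res, f)
vs, table = res                      # a factor over T alone
z = sum(table.values())
posterior = {asg[vs.index('T')]: p / z for asg, p in table.items()}
print(posterior)                     # the MAP value of T is the argmax
```

The intermediate factor produced by each `sum_out` call corresponds to the probabilistic function you are asked to identify at that step of the hand derivation.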

Question 4: Learning with Bayesian Networks Consider the following Bayesian network. Assume that the variables are distributed according to Bernoulli distributions.

a. We are given the following dataset with 146 samples, from which we will estimate the parameters of the model.

A B C D  # Instances
0 0 0 0  1
0 0 0 1  4
0 0 1 0  20
0 0 1 1  4
0 1 0 0  34
0 1 0 1  2
0 1 1 0  32
0 1 1 1  0
1 0 0 0  1
1 0 0 1  0
1 0 1 0  1
1 0 1 1  4
1 1 0 0  10
1 1 0 1  19
1 1 1 0  14
1 1 1 1  0

i. Enumerate the parameters that must be learned. Specify the parameter name and the probability that it represents (i.e., for each parameter, write something of the form θ = Pr(·)).
ii. Give the maximum likelihood estimate for each parameter.
iii. Give the maximum a posteriori estimate for each parameter after applying Laplace smoothing.
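Whichever parameters the network structure in the figure calls for, every estimate in subparts ii and iii reduces to ratios of counts from the table. A hedged counting sketch; the `count` helper and the example parameter Pr(A = 1) are illustrative, not necessarily part of the required parameter set:

```python
# Each row is (A, B, C, D, number of instances); the counts total 146.
data = [
    (0, 0, 0, 0, 1),  (0, 0, 0, 1, 4),  (0, 0, 1, 0, 20), (0, 0, 1, 1, 4),
    (0, 1, 0, 0, 34), (0, 1, 0, 1, 2),  (0, 1, 1, 0, 32), (0, 1, 1, 1, 0),
    (1, 0, 0, 0, 1),  (1, 0, 0, 1, 0),  (1, 0, 1, 0, 1),  (1, 0, 1, 1, 4),
    (1, 1, 0, 0, 10), (1, 1, 0, 1, 19), (1, 1, 1, 0, 14), (1, 1, 1, 1, 0),
]
IDX = {'A': 0, 'B': 1, 'C': 2, 'D': 3}

def count(cond):
    """Number of instances matching every (variable, value) pair in cond."""
    return sum(n for *vals, n in data
               if all(vals[IDX[k]] == v for k, v in cond.items()))

# Example: if Pr(A = 1) is one of the parameters,
theta_A_mle = count({'A': 1}) / count({})               # maximum likelihood
theta_A_map = (count({'A': 1}) + 1) / (count({}) + 2)   # add-one Laplace, 2 outcomes
```

Conditional parameters work the same way: a parameter of the form Pr(D = 1 | A = 1) would have MLE count({'A': 1, 'D': 1}) / count({'A': 1}).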

b. Assume that in addition to the data in the table above, you are given the following incomplete data instances:

   A B C D
S1 1 0 1 ?
S2 1 1 ? 0

We will apply the (soft) EM algorithm on these instances. Initialize the model using your parameter estimates from part (a), subpart ii (i.e., use the MLE).

i. Show the computation of the first E-step, providing the weights for each possible assignment of the incomplete data for each sample.
ii. What are the parameters obtained for the first M-step? Weight each of the samples from the original dataset and the two new samples equally (i.e., you now have 148 samples).
iii. Show the computation of the second E-step.
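The shape of each E-step is the same regardless of the network's exact factorization (which depends on the figure): each completion of a missing value is weighted in proportion to the joint probability of the completed sample under the current parameters, then the weights are normalized. A schematic sketch, where `joint` is a stand-in for the model's joint distribution and `toy_joint` is made up purely for illustration:

```python
def e_step_weights(joint, partial, missing_var):
    """Soft-EM E-step for one sample with a single missing binary variable:
    weight each completion by its joint probability, then normalize."""
    unnorm = {v: joint({**partial, missing_var: v}) for v in (0, 1)}
    z = sum(unnorm.values())
    return {v: w / z for v, w in unnorm.items()}

# Toy illustration with a made-up joint in which only D matters, Pr(D=1) = 0.7:
toy_joint = lambda s: 0.7 if s['D'] == 1 else 0.3
weights = e_step_weights(toy_joint, {'A': 1, 'B': 0, 'C': 1}, 'D')
```

For S1 the missing variable is D; for S2 it is C. In subpart iii the same computation is repeated with the parameters produced by the first M-step.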