Question 1. Sheet Q1 contains data on house characteristics. Assume house prices follow the equation below. This equation gives the actual, not estimated, value of a house. No one knows this equation; we will try to find something close to it using the data we have.

Remodeled is a flag that is 1 if the house is remodeled and 0 otherwise. The error term is normally distributed with mean 5000 and standard deviation 2000.

Moreover, the price per SQFT in neighborhood 1 is $15 higher than the price calculated above, and the price per SQFT in neighborhood 2 is $5 lower than the price calculated above.


A. Simulate house prices using the above information. (Note: after simulating the errors once, copy the values and use the copies; otherwise they will change every time you edit the sheet.)
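For readers working outside Excel, the "simulate once, then freeze" advice in part A can be sketched in Python. This is only an illustration: the seed value and `n_houses` are assumptions, not part of the assignment, and a fixed seed is used here as a stand-in for copying the simulated values.

```python
import random

# Assumption: a fixed seed plays the role of "copy the values and use the
# copies" in Excel, so the same errors come back on every run.
rng = random.Random(42)

n_houses = 10  # hypothetical sample size; use the number of rows in sheet Q1

# Error term from the problem statement: normal, mean 5000, sd 2000.
errors = [rng.gauss(5000, 2000) for _ in range(n_houses)]

# Re-creating the generator with the same seed reproduces the same draws.
rng_again = random.Random(42)
errors_again = [rng_again.gauss(5000, 2000) for _ in range(n_houses)]
print(errors == errors_again)
```

The frozen errors would then be added to the deterministic part of the price equation for each house.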

B. Now we want to find a model for the price of a house, so we can use it to estimate the price of other houses. Remember that from this point on we act as if we do not know the equation above: we only have some data and want to find a model for the price of a house.

Assume that no data is available on whether a house is remodeled, so we cannot use this variable. Using the other variables, estimate a model for house prices. Write the equation for the linear model you are going to estimate, and find the coefficients using Excel Solver (don't forget the intercept!).

C. Calculate the average and standard deviation of the errors of the model in part B.

D. Re-estimate the model of part B, this time without an intercept. Calculate the average of the errors, and compare it with the average of the errors in part C. What does this comparison tell you?
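The comparison in parts C and D can be previewed with a small sketch. The data below are hypothetical numbers made up for illustration (they are not the values in sheet Q1), only SQFT and age are used as regressors for brevity, and NumPy's least-squares routine stands in for Excel Solver minimizing the sum of squared errors.

```python
import numpy as np

# Hypothetical data, for illustration only; use the columns from sheet Q1.
sqft = np.array([1500.0, 2000.0, 2500.0, 3000.0, 3500.0, 4000.0])
age = np.array([30.0, 5.0, 20.0, 12.0, 8.0, 25.0])
price = np.array([221500.0, 293000.0, 330500.0, 387000.0, 444500.0, 473500.0])

# Model with intercept: price = b0 + b1*sqft + b2*age + error.
X_with = np.column_stack([np.ones_like(sqft), sqft, age])
beta_with = np.linalg.lstsq(X_with, price, rcond=None)[0]
mean_error_with = (price - X_with @ beta_with).mean()

# Model without intercept: price = b1*sqft + b2*age + error.
X_without = np.column_stack([sqft, age])
beta_without = np.linalg.lstsq(X_without, price, rcond=None)[0]
mean_error_without = (price - X_without @ beta_without).mean()

print("mean error with intercept:   ", mean_error_with)
print("mean error without intercept:", mean_error_without)
```

With an intercept, the least-squares errors average to zero up to rounding. Without one, nothing forces that, so the average error is generally not zero.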

E. Estimate the model of part B using the Golden Rule of Beta Hat (refer to lecture 2, slide 15), and calculate the variance of the errors.
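A minimal sketch of part E, assuming the "Golden Rule of Beta Hat" refers to the usual closed-form least-squares solution, beta_hat = (X'X)^(-1) X'y (the slide itself is not reproduced here). The data are hypothetical, and dividing the sum of squared errors by n - k is also an assumption; the lecture may divide by n instead.

```python
import numpy as np

# Hypothetical data, for illustration only; use the columns from sheet Q1.
sqft = np.array([1500.0, 2000.0, 2500.0, 3000.0, 3500.0, 4000.0])
age = np.array([30.0, 5.0, 20.0, 12.0, 8.0, 25.0])
price = np.array([221500.0, 293000.0, 330500.0, 387000.0, 444500.0, 473500.0])

# Design matrix with a leading column of ones for the intercept.
X = np.column_stack([np.ones_like(sqft), sqft, age])
n, k = X.shape

# Solve (X'X) beta = X'y, which is equivalent to beta = (X'X)^(-1) X'y.
beta_hat = np.linalg.solve(X.T @ X, X.T @ price)

# Errors (residuals) and their estimated variance.
residuals = price - X @ beta_hat
sigma2_hat = residuals @ residuals / (n - k)  # assumption: divide by n - k

print("beta_hat:", beta_hat)
print("error variance:", sigma2_hat)
```

In Excel the same calculation can be built from the array functions MMULT, MINVERSE, and TRANSPOSE.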

F. Estimate the variance-covariance matrix of the coefficients, using the following formula:

$$\widehat{\mathrm{Var}}(\hat{\beta}) = \hat{\sigma}^2 (X'X)^{-1}$$

where $\hat{\sigma}^2$ is the variance of the error terms from part E.

G. The diagonal of the variance-covariance matrix shows the variance of each coefficient. Estimate the t-statistic for each coefficient using the formula:

$$t_j = \frac{\hat{\beta}_j}{\sqrt{\widehat{\mathrm{Var}}(\hat{\beta}_j)}}$$
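Parts F and G can be sketched together as follows. This assumes the standard OLS expressions: the variance-covariance matrix is the error variance times (X'X)^(-1), and each t-statistic is the coefficient divided by its standard error. The data are hypothetical, and the n - k divisor is an assumption.

```python
import numpy as np

# Hypothetical data, for illustration only; use the columns from sheet Q1.
sqft = np.array([1500.0, 2000.0, 2500.0, 3000.0, 3500.0, 4000.0])
age = np.array([30.0, 5.0, 20.0, 12.0, 8.0, 25.0])
price = np.array([221500.0, 293000.0, 330500.0, 387000.0, 444500.0, 473500.0])

X = np.column_stack([np.ones_like(sqft), sqft, age])
n, k = X.shape

# Coefficients and error variance, as in part E.
XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ price
residuals = price - X @ beta_hat
sigma2_hat = residuals @ residuals / (n - k)  # assumption: divide by n - k

# Part F: variance-covariance matrix of the coefficients.
var_cov = sigma2_hat * XtX_inv

# Part G: standard errors from the diagonal, then t-statistics.
std_errors = np.sqrt(np.diag(var_cov))
t_stats = beta_hat / std_errors

for name, b, t in zip(["intercept", "sqft", "age"], beta_hat, t_stats):
    print(f"{name:>9}: coefficient = {b:12.3f}, t = {t:8.3f}")
```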

H. Identify which coefficients are statistically significant at the 5% level of significance. Using only the significant coefficients, estimate the price of a 3500 SQFT house in neighborhood 3 that is 10 years old.
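The logic of part H can be sketched as below. Every number here is a made-up placeholder, not a result from sheet Q1, and the variable names (`nbhd1`, `nbhd2`) assume neighborhoods 1 and 2 enter the model as dummy variables with neighborhood 3 as the baseline. The critical value 1.96 is the large-sample two-sided 5% cutoff; for a small sample, the value from Excel's T.INV.2T(0.05, n - k) would be more appropriate.

```python
# Hypothetical estimates and t-statistics; replace with your results from parts E-G.
coefs = {"intercept": 95000.0, "sqft": 101.0, "age": -950.0,
         "nbhd1": 52000.0, "nbhd2": -17000.0}
t_stats = {"intercept": 4.8, "sqft": 12.3, "age": -1.2,
           "nbhd1": 3.1, "nbhd2": -0.9}

critical_value = 1.96  # assumption: large-sample two-sided 5% cutoff

# Keep only coefficients whose |t| exceeds the critical value.
significant = {name: b for name, b in coefs.items()
               if abs(t_stats[name]) > critical_value}

# The house to price: 3500 SQFT, 10 years old, neighborhood 3 (both dummies are 0).
house = {"intercept": 1.0, "sqft": 3500.0, "age": 10.0, "nbhd1": 0.0, "nbhd2": 0.0}

estimated_price = sum(b * house[name] for name, b in significant.items())
print(sorted(significant))
print(estimated_price)
```

In this made-up example the age coefficient is dropped because its t-statistic is below the cutoff, so the age of the house does not enter the estimate.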

