1) Do exercises 1, 2 and 3 in section 3.10 of MUS – all references to MUS are to the revised edition.
2) A newly graduated MA student is hired by a federal government department and assigned to write
a paper analysing average unemployment insurance (UI) benefit receipt by federal electoral riding.
(Each Member of Parliament cares a lot about this report.) The student uses Statistics Canada
survey data, which is flawless in that it has no measurement error or other related problems. (Or at
least we will consider it flawless for the purposes of this question.) The sample is a cross-section of
individuals and its size is massive.
The new hire considers this a simple task and runs an OLS regression with the dependent variable
being an indicator for UI receipt (i.e. 1 if the person had received benefits in the last year, 0
otherwise). As “explanatory variables” the recently graduated student includes: age (in years), age
squared, female (a 0/1 indicator), the industry in which the claimant worked prior to the claim (a set
of 32 indicator variables representing 33 industries) and a set of 249 indicators for the 250 electoral
ridings (he leaves district 153, “Central Ottawa”, as the omitted district). (Note: I’m not really sure
how many ridings there are, or that there actually is a “Central Ottawa”, but this is immaterial to the
question.)
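For reference, the regression described above can be written out as a single equation. This is only a sketch of the specification as stated: the symbols (UI_i, ind_ik, riding_ir, the Greek coefficient names) are notation assumed here for illustration, as is numbering the 32 included industry indicators k = 1, ..., 32 and the ridings r = 1, ..., 250.

```latex
% Linear probability model as described in the question (notation assumed).
% UI_i = 1 if person i received UI benefits in the last year, 0 otherwise.
\[
UI_i = \beta_0 + \beta_1\,\mathrm{age}_i + \beta_2\,\mathrm{age}_i^2
     + \beta_3\,\mathrm{female}_i
     + \sum_{k=1}^{32} \gamma_k\,\mathrm{ind}_{ik}
     + \sum_{r \in \{1,\dots,250\} \setminus \{153\}} \delta_r\,\mathrm{riding}_{ir}
     + \varepsilon_i
\]
% One industry (of 33) and riding 153 are the omitted categories.
```

In this notation, the coefficient the new hire points to is \delta_2.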
The new hire shows the director the coefficients and says: “See the coefficient for district 2. It is
positive, but it is small in magnitude and not statistically significant. This implies that the people in
that district, wherever it is, are approximately equally likely to claim UI benefits as those in the
omitted group, which I selected to be Central Ottawa.”
The director starts to laugh (not a good beginning to a new graduate’s career) and she says: “You
must have done something wrong. District 2 is Northern Newfoundland, which has a lot of fishers
and fish plant workers with low levels of formal education and a high propensity to claim benefits. I
can assure you that they use much more UI per capita than those in Central Ottawa, which is a
wealthy area full of older, highly educated and highly paid civil servants who almost never get laid
off. District 2 has a much higher take-up rate.”
How can you explain the difference between the director’s intuition about UI claim rates (which is
correct) and the new graduate’s conclusion based on the regression (which was run correctly - i.e.
what is stated above was actually done)?