Description
Interpreting Statistical Results
For the following 6 questions, imagine
that your boss has had her entire analysis staff quit in a huff, but not before
they performed a bivariate regression and a multiple regression on a critical
issue related to productivity and gave her the results. Your boss has heard
that you have taken POS 303, and she requests your help to understand and
interpret the statistical table that the team has given her. Table 1 presents
the two linear regression models in question.
Your boss describes the two regressions
she requested from the analysis team to you. Both of the models predict how
many dollars of productivity employees created, using a 500-person random
sample of her company’s thousands of employees during a week in September.
First, she wanted to know if just the years spent working at the company (Years
at Company) predict higher productivity (model 1). Then, her analysis team said
they should put together a model (model 2) that also included other factors
that could possibly explain productivity: gender of employee (1 = female, 0 =
male), whether the employee holds a university degree (1 = holds a university
degree, 0 = does not hold a university degree), and each employee’s score on a
work proficiency test that runs from 1 (lowest proficiency) to 100 (highest
proficiency). The team labelled the table in a note below the regression
results.
- Setting aside ethics, based on Table 1, would you tell
your boss that she should (illegally) focus on hiring only men (gender=0)
for the job? Why or why not, discussing the evidence you have in model 2?
(5 points)
- Your boss asks how predictive her work proficiency test
is of employee productivity. She asks how many additional dollars of
productivity, according to the table, an employee who scores 5 points
higher on the test is predicted to generate under this model. Show and
explain the calculation you performed. (7 points)
- Your boss also wants to know if she can trust that the
relationship she finds from this random sample of her employees about work
proficiency test scores will hold up in the rest of her employees who
weren’t sampled. What do you tell her, and based on what evidence? (3
points)
- Your boss is very confused that the coefficient numbers
for the variable “Years at Company” are quite different from one another
in the two models. Why would you tell her they are different? Based on
these models, would you tell her that years of service are a key causal
determinant of employee productivity? (5 points)
- Your boss wants to know if the analysis team really
needed to bother collecting the data on gender, university degree, and
proficiency tests. Did it help explain more of the variation in
productivity in the model? How much more and how do you know from the
table? (5 points)
- Finally, she wants to know if she should place a high
priority in her hiring process on hiring candidates with a university
degree. How would you answer her, and what information
from the table would you use to inform
that answer? (5 points)
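Two of the questions above come down to simple arithmetic on numbers read off a regression table: the predicted change in the outcome for a given change in one predictor, and the gain in explained variation between two models. The following is a minimal Python sketch of that arithmetic only. The function names are illustrative, and the numbers are hypothetical placeholders, not values from Table 1.

```python
def predicted_change(coefficient, delta_x):
    """Predicted change in the outcome when one predictor changes by delta_x,
    holding the other predictors in a linear model constant."""
    return coefficient * delta_x


def r_squared_gain(r2_smaller_model, r2_larger_model):
    """Additional share of outcome variation explained by the larger model."""
    return r2_larger_model - r2_smaller_model


# Hypothetical placeholder values, NOT taken from Table 1.
example_coefficient = 12.5   # dollars of productivity per test point (made up)
example_r2_model1 = 0.25     # made up
example_r2_model2 = 0.5      # made up

print(predicted_change(example_coefficient, 5))
print(r_squared_gain(example_r2_model1, example_r2_model2))
```

In an actual answer, the coefficient and R-squared values would be read from the table, and the interpretation would still need to state that the other variables in the model are held constant.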
Performing Statistical Analysis
For the next 14 questions, I have
provided you with a dataset in bblearn, in the same folder as this test prompt
file and your assignment link, called teachersalary.sav. You will perform data
analysis using it to answer the following questions. It holds data on teachers’
salaries across the 50 U.S. states and the District of Columbia in 2005-2006. It
contains five variables:
- state, the name of the state
- salary, the average salary of teachers in that state
(dollars)
- spending, the average state spending per pupil in that
state (dollars)
- northeastmidwest, an indicator variable taking the
value of 1 if that state is in the northeast or midwest region of the
United States, 0 otherwise
- south, an indicator variable taking the value of 1 if
that state is in the south region of the United States, 0 otherwise
- What are the mean, median, and standard deviation of
the variable average state spending per pupil (spending)? (5 points)
- What does the value of the median of average state
spending per pupil relative to the mean of average state spending per
pupil mean about the distribution of that variable? (1 point)
- Calculate and report the bivariate correlation
(sometimes called Pearson’s correlation or Pearson’s R as well) between
the variables average state spending per pupil (spending) and average
salary of teachers in state (salary). In words, what does this statistic
mean about these two variables? (4 points)
- Perform a difference of means t-test on the variable
average salary of teachers in a state (salary) between states in the south
(south=1) and states not in the south (south=0) with a null hypothesis
that they have the same mean teacher salary. Do not assume equal
variances. What is the p-value of that test, and, in words, what can we
conclude from the test if our decision criterion is the .05 level of statistical
significance for rejecting the null? (3 points)
- Perform a difference of means t-test on the variable
average state spending per pupil in state (spending) between states in the
northeast or midwest (northeastmidwest=1) and states not in the northeast
or midwest (northeastmidwest=0) with a null hypothesis that they have the
same mean state spending per pupil. Do not assume equal variances. What is
the p-value of that test, and, in words, what can we conclude from the
test if our decision criterion is the .05 level of statistical significance
for rejecting the null? (3 points)
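The quantities requested in the questions above (mean, median, standard deviation, Pearson's correlation, and an unequal-variance t-test) can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the course's prescribed procedure: the function names are hypothetical, the toy numbers are made up and are not from teachersalary.sav, and the sketch stops at the Welch t statistic and its degrees of freedom because the Python standard library has no t distribution for computing the p-value.

```python
import math
import statistics


def describe(values):
    """Mean, median, and sample standard deviation (n - 1 denominator)."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "sd": statistics.stdev(values),
    }


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def welch_t(group_a, group_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom
    for a difference of means without assuming equal variances."""
    n_a, n_b = len(group_a), len(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    se_squared = var_a / n_a + var_b / n_b
    t_stat = (statistics.mean(group_a) - statistics.mean(group_b)) / math.sqrt(se_squared)
    df = se_squared ** 2 / (
        (var_a / n_a) ** 2 / (n_a - 1) + (var_b / n_b) ** 2 / (n_b - 1)
    )
    return t_stat, df


# Toy numbers for illustration only; NOT values from teachersalary.sav.
toy_spending = [7000, 8200, 9100, 9900, 14500]
toy_salary = [39000, 42000, 45500, 47000, 58000]

print(describe(toy_spending))
print(pearson_r(toy_spending, toy_salary))
print(welch_t(toy_salary[:3], toy_salary[2:]))
```

With the real dataset, the same quantities, including the p-value, would normally come from statistical software. For example, SciPy's `scipy.stats.ttest_ind` with `equal_var=False` performs this unequal-variance test, assuming SciPy is available.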