Interpreting Regression Coefficients for Categorical Variables



Problem Set 3



Part I: Interpreting Regression Coefficients for Categorical Variables

A researcher is using free- and reduced-price lunch program (FRPL) status to predict students' math test scores. FRPL is often used as a proxy variable for low income or low socioeconomic status, because students from families that have low income or meet other criteria for assistance receive FRPL vouchers at school. For this section of the problem set, consider a scenario where FRPL is dummy coded: a student who is in the FRPL program is coded as 1, and a student who is not in FRPL is coded as 0. That is:

Student in FRPL = 1 (In FRPL)

Student in FRPL = 0 (Not in FRPL)

The researcher runs the proposed model in statistical software. Substituting in the software’s estimates of the model coefficients, the resulting regression formula looks like this:

Where d_i represents the dummy variable for FRPL.
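For reference, a model with a single dummy predictor has the general form sketched below. The symbols b_0 and b_1 are placeholders assumed here for illustration; they are not the software's estimates, which are not reproduced in this text.

```latex
\hat{y}_i = b_0 + b_1 d_i,
\qquad
\hat{y}_i =
\begin{cases}
  b_0       & \text{if } d_i = 0 \\
  b_0 + b_1 & \text{if } d_i = 1
\end{cases}
```

Because d_i takes only the values 0 and 1, the fitted equation produces exactly two predicted values, one for each group.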



1. What is the mean test score for students in the FRPL program?

2. What is the mean test score for students who are not in the FRPL program?

3. Name three assumptions of this model and describe how you would check each of them.

Part II: Running Analysis with One Categorical Predictor

Using the dataset in the HW folder (a revised version of the data also used in lab), a researcher wants to see whether a student’s perception of their own math ability predicts their math score. The variable seesmathrecode is a dummy variable recording whether students agreed with the statement “I see myself as a math person.” Students selected either “agree” or “disagree.”

1. Run a regression analysis with “seesmathrecode” as the predictor and X1MTSCOR (standardized math score) as the dependent variable. Paste your output below.

2. What is the mean standardized math score for students who agreed with the statement? For students who disagreed?

3. Interpret the regression coefficient of seesmathrecode.

4. Bonus: If you were to create a scatter plot of the model above (you can do this in SPSS), you would get two vertical columns of data points: one at the x-value of 0 and one at the x-value of 1. If you were to plot a regression line for this data, the line would cross the two columns at particular points. What would these points be?
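As a rough illustration of how a regression on a single dummy predictor relates to group means, here is a minimal Python sketch using NumPy. The scores and the variable names (`d`, `y`, `b0`, `b1`) are hypothetical and are not taken from the homework dataset; the assignment itself is meant to be run on the course data in SPSS.

```python
import numpy as np

# Hypothetical scores, NOT taken from the homework dataset.
d = np.array([0, 0, 0, 1, 1, 1, 1])  # dummy predictor coded 0/1
y = np.array([48.0, 52.0, 50.0, 58.0, 62.0, 60.0, 64.0])

# Design matrix with an intercept column, fitted by ordinary least squares.
X = np.column_stack([np.ones_like(y), d])
(b0, b1), *_ = np.linalg.lstsq(X, y, rcond=None)

# Group means computed directly from the data, for comparison.
mean_0 = y[d == 0].mean()
mean_1 = y[d == 1].mean()

print(f"intercept b0 = {b0:.3f}, mean of group 0 = {mean_0:.3f}")
print(f"b0 + b1      = {b0 + b1:.3f}, mean of group 1 = {mean_1:.3f}")
print(f"slope b1     = {b1:.3f}, difference in means = {mean_1 - mean_0:.3f}")
```

In this toy example the intercept matches the mean of the group coded 0, and the intercept plus the slope matches the mean of the group coded 1, which is the relationship the questions above ask you to work out for the real data.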

Part III: Regression with a Categorical Predictor and a Continuous Predictor

As usual, the researcher needs to make sure that what’s being measured by math perception is not simply socioeconomic status (SES).

1. Write out the equation for a model with the variables x1sestpos and seesmathrecode as predictors and X1MTSCOR as the dependent variable.
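To show what fitting a model with one continuous and one dummy predictor can look like, here is a minimal Python sketch using NumPy. All values and names (`ses`, `agree`, `score`, and the coefficients 50, 4, and 3) are invented for illustration and say nothing about the homework dataset or its estimates.

```python
import numpy as np

# Hypothetical values, NOT taken from the homework dataset.
ses = np.array([-1.0, -0.5, 0.0, 0.5, 1.0, 1.5])  # continuous predictor
agree = np.array([0, 1, 0, 1, 0, 1])              # dummy predictor coded 0/1

# Scores are built from known coefficients so the fit can be checked exactly.
score = 50.0 + 4.0 * ses + 3.0 * agree

# Design matrix: intercept column, continuous predictor, dummy predictor.
X2 = np.column_stack([np.ones_like(ses), ses, agree])
coef, *_ = np.linalg.lstsq(X2, score, rcond=None)
c0, c_ses, c_agree = coef

print(f"intercept              = {c0:.3f}")
print(f"coefficient on ses     = {c_ses:.3f}")
print(f"coefficient on agree   = {c_agree:.3f}")
```

With both predictors in the model, the coefficient on the dummy variable describes the difference between the two groups at the same value of the continuous predictor, rather than the raw difference in group means.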


