Assignment 2

Assignment Goals:

In this assignment, I hope that you will show that you can:

1. compute correlations, scatterplots, and regression tables using R

2. reach well-reasoned conclusions arising from correlations and scatterplots

3. demonstrate an understanding of the linear regression equation and its components

4. interpret correctly the standard error of prediction or estimate

Assignment Questions:

Professor Schmedlap is pilot testing three new and improved versions of
his already popular and lucrative Schmedlap Continuous Achievement Measure
(SCAM). He has administered the old SCAM and the three new SCAM prototypes
to a sample of children. A partial listing of the three data sets is
as follows:

    SET 1           SET 2           SET 3
    X      Y        X      Y        X      Y
 10.0   8.04     10.0   9.14     10.0   7.46
  8.0   6.95      8.0   8.14      8.0   6.77
 13.0   7.58     13.0   8.74     13.0  12.74
  9.0   8.81      9.0   8.77      9.0   7.11
 11.0   8.33     11.0   9.26     11.0   7.81
 14.0   9.96     14.0   8.10     14.0   8.84
  6.0   7.24      6.0   6.13      6.0   6.08
  4.0   4.26      4.0   3.10      4.0   5.39
 12.0  10.80     12.0   9.13     12.0   8.15
  7.0   4.80      7.0   7.26      7.0   6.42
  5.0   5.60      5.0   4.74      5.0   5.73

In each data set, X is the score students received on the old SCAM, and Y is
the score they received on the new version of the SCAM examined in that data set.

1. Professor Schmedlap begins his data analysis by calculating the correlation
between the old version and each new version of the SCAM. He reasons that a
fairly strong relationship should be present if both versions are measuring
the same underlying construct.

a) Calculate the correlation between the new and old versions of the SCAM in
each of the data sets. (10 marks)

b) How much variance is shared between the new and old SCAM in each data set?
(5 marks)

c) Based solely on the correlational evidence, which new version would you say
is best? Why? (10 marks)
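As a hint for the mechanics only, here is a minimal R sketch of computing a correlation and the proportion of shared variance. The vectors `x` and `y` are made up for illustration and are not taken from the assignment's data sets; `cor()` is the base R function and uses the Pearson method by default.

```r
# Made-up illustration data (NOT from the assignment's data sets)
x <- c(1, 2, 3, 4, 5)   # hypothetical old-test scores
y <- c(2, 4, 5, 4, 5)   # hypothetical new-test scores

r <- cor(x, y)          # Pearson correlation (the default method)
shared <- r^2           # proportion of variance shared by x and y

print(round(r, 4))      # 0.7746
print(round(shared, 2)) # 0.6
```

For the assignment itself, the same two calls would be repeated once per data set, with that set's X and Y columns in place of the made-up vectors.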

2. Professor Schmedlap knows that the new versions of his test will yield
slightly different scores because the scale has changed. He has a novel idea.
He proceeds to calculate the intercept and regression coefficient for each
data set. He reasons that the resultant linear regression equations can be
used by practitioners to convert old SCAM scores to new SCAM scores. In this
way, users of the new SCAM will be able to make comparisons for students who
have scores on the old SCAM. He will, of course, sell these equations as an
option to test users.

a) What is the linear equation for each data set? (10 marks)

b) What is the standard error of estimate for each equation? (10 marks)

c) Generally speaking, what factors affect the size of the standard error of
estimate? (10 marks)

d) Using the linear equation generated for the first data set, what new SCAM
scores would be predicted from old SCAM scores of 6, 7, and 10? (10 marks)

e) What is the predicted range for each of these scores at the 68% confidence
level? (10 marks)
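As a hint for the mechanics only, the following R sketch fits a simple linear regression, extracts the intercept, slope, and standard error of estimate, and builds a rough 68% range as the predicted score plus or minus one standard error of estimate. The data are made up for illustration and are not taken from the assignment; the plus-or-minus-one-standard-error band is an assumption about the intended method, based on the usual textbook approximation.

```r
# Made-up illustration data (NOT from the assignment's data sets)
x <- c(1, 2, 3, 4, 5)   # hypothetical old-test scores
y <- c(2, 4, 5, 4, 5)   # hypothetical new-test scores

fit <- lm(y ~ x)        # least-squares fit of y = a + b * x
b <- coef(fit)          # named vector: "(Intercept)" and "x" (slope)

# Standard error of estimate: sqrt(sum of squared residuals / (n - 2))
see <- summary(fit)$sigma

# Predicted new scores for two hypothetical old scores
new_x <- data.frame(x = c(2, 3))
pred <- predict(fit, newdata = new_x)

# Approximate 68% range: prediction plus or minus one standard error
band <- cbind(lower = pred - see, predicted = pred, upper = pred + see)

print(round(b, 2))      # intercept 2.2, slope 0.6
print(round(see, 4))    # 0.8944
print(round(band, 2))
```

Note that `predict(fit, newdata = new_x, interval = "prediction", level = 0.68)` returns a somewhat wider interval, because it also accounts for uncertainty in the estimated line; state which approach you used in your answer.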

3. For each data set, construct a scatter plot. What do the plots tell you
about the relationship between the new and old SCAM in each data set? Now,
which version of the new SCAM would you say is best, and why? (25 marks)
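As a hint for the mechanics only, a scatter plot with the fitted regression line overlaid can be sketched in base R as follows. The data and axis labels are made up for illustration and are not taken from the assignment.

```r
# Made-up illustration data (NOT from the assignment's data sets)
x <- c(1, 2, 3, 4, 5)   # hypothetical old-test scores
y <- c(2, 4, 5, 4, 5)   # hypothetical new-test scores

fit <- lm(y ~ x)        # least-squares line to overlay on the plot

# Scatter plot of the raw points
plot(x, y,
     xlab = "Old test score (hypothetical)",
     ylab = "New test score (hypothetical)",
     main = "Made-up data",
     pch = 19)

# Add the fitted regression line as a dashed line
abline(fit, lty = 2)
```

Drawing the line over the points makes it easier to judge whether a straight line actually describes the pattern in the data.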
