Manipulating Spreadsheets and Two-factor ANOVAs
In class, we learned that the two-factor ANOVA is a powerful
test that lets us assess the influence of two separate factors on a variable of
interest. We can assess the additive influence of each factor separately, and
we can also assess multiplicative influences by testing whether the two factors
interact (that is, whether the effect of one factor depends on the level of the
other).
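As a minimal sketch of how this test can be set up in R, the block below fits a two-factor ANOVA with an interaction term to the blood calcium values from class. The column names "calcium", "treatment", and "sex" follow the names requested later in this handout. The level labels ("none", "hormone", "female", "male") are assumptions made for illustration only.

```r
# Blood calcium values from the class data set, one row per bird.
# Level labels are illustrative assumptions, not required names.
blood <- data.frame(
  calcium = c(
    16.3, 20.4, 12.4, 15.8,  9.5,   # no hormone, female
    15.3, 17.4, 10.9, 10.3,  6.7,   # no hormone, male
    38.1, 26.2, 32.3, 35.8, 30.2,   # hormone, female
    34.0, 22.8, 27.8, 25.0, 29.3    # hormone, male
  ),
  treatment = factor(rep(c("none", "hormone"), each = 10)),
  sex       = factor(rep(rep(c("female", "male"), each = 5), times = 2))
)

# In an R formula, treatment * sex expands to
# treatment + sex + treatment:sex (two main effects plus the interaction).
fit <- aov(calcium ~ treatment * sex, data = blood)

# One row per term: treatment, sex, treatment:sex, Residuals.
summary(fit)
```

The `treatment` and `sex` rows of the summary correspond to the additive influence of each factor, and the `treatment:sex` row tests whether the two factors interact.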
In class, we have been working with a data set assessing the
influence of hormone treatment and sex on blood calcium levels of birds. Oftentimes,
when data are collected, the spreadsheet is filled out in a way that a
statistical program like R cannot use directly. Because of this, it is
important to understand how a data file should be formatted for R to use it
correctly in an analysis. Many times, data sets are constructed in a cell
format where all of the observations within a treatment group are lumped
together, something like this (this table is provided in the homework folder
for this week and is labeled “bloodcalcium.xlsx”):
| No Hormone Treatment |      | Hormone Treatment |      |
| Female               | Male | Female            | Male |
| 16.3                 | 15.3 | 38.1              | 34   |
| 20.4                 | 17.4 | 26.2              | 22.8 |
| 12.4                 | 10.9 | 32.3              | 27.8 |
| 15.8                 | 10.3 | 35.8              | 25   |
| 9.5                  | 6.7  | 30.2              | 29.3 |
This format lets the data collector avoid repeating
information for each observation. While this is convenient at first, it is
important to understand that R, like most statistical software, needs that
information repeated for every observation in order to analyze the data
correctly. That means we need to restructure this data set so that R can
interpret it: we reduce the number of columns as much as possible, which
increases the number of rows as a side effect. How can we simplify the
information in this table? R requires that all values of the dependent variable
(the values within the table) be in a single column, with each factor in its
own column. Your first objective is to convert “bloodcalcium.xlsx” into a
format that is usable by R. I want you to merge this information (within Excel)
into 3 columns labeled “calcium”, “treatment”, and “sex”. Save the correctly
formatted file as “bloodcalcium.csv”. Remember, it is currently in .xlsx format,
and the new file must be saved as a .csv for R to read it correctly.
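One way to check your Excel work is to build the same three-column layout directly in R and compare. The sketch below does that with the values from the table above, writes them to a temporary .csv, and reads the file back the way R would read your “bloodcalcium.csv”. The level labels ("none", "hormone", "female", "male") and the temporary file path are assumptions for illustration, and your own labels may differ.

```r
# Values copied from the cell-format table, stacked into one column.
calcium <- c(
  16.3, 20.4, 12.4, 15.8,  9.5,   # no hormone, female
  15.3, 17.4, 10.9, 10.3,  6.7,   # no hormone, male
  38.1, 26.2, 32.3, 35.8, 30.2,   # hormone, female
  34.0, 22.8, 27.8, 25.0, 29.3    # hormone, male
)

# The factor labels are repeated on every row (illustrative label choices).
treatment <- rep(c("none", "hormone"), each = 10)
sex       <- rep(rep(c("female", "male"), each = 5), times = 2)

blood <- data.frame(calcium, treatment, sex)

# Write to a temporary .csv (a stand-in for "bloodcalcium.csv") and read it back.
csv_path <- tempfile(fileext = ".csv")
write.csv(blood, csv_path, row.names = FALSE)
blood_in <- read.csv(csv_path, stringsAsFactors = TRUE)

# Inspect the structure and count the observations in each treatment-by-sex cell.
str(blood_in)
table(blood_in$treatment, blood_in$sex)
```

If your own file is formatted correctly, reading it with `read.csv` should give 20 rows and 3 columns, with 5 observations in each combination of treatment and sex.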
On the next page you will find an example of what the .csv
file should look like.