Project 1: Analyzing a Data Set


Purpose:

The purpose of this project is to identify types of variables, describe the context of the data, and perform an exploratory data analysis on a given data set.


Task:

Read the introduction to the acupuncture data set and the data dictionary, and look over the manuscript "Acupuncture for Primary Care: large, pragmatic, randomized trial."

You will use the acupuncture.xlsx data set for this project.

For this data set, the following variables are baseline variables: age, sex, migraine, chronicity, practice id, and group.


Conduct your Exploratory Data Analysis following the steps below:


1. Provide the context for the data set. Be sure to include the variable type for each of the variables.

2. What is the exposure variable for this study? What are the outcome variables?

3. Select one of the outcome variables (headache severity score or headache frequency). You will use this variable for the rest of the project.

4. For your selection, describe the distribution of your variable at baseline and at 1 year. Does it appear that the distribution has changed over time? Justify your answer graphically and/or numerically.

5. Compare the baseline measures for the treatment and control groups for your outcome variable. Does there appear to be a difference between the groups? Is this surprising? Why or why not?

6. Compare the 1-year measures between the treatment and control groups for your outcome variable. Does acupuncture appear to be effective? Justify your answer.

7. Select sex or migraine. For your selection, does the variable differ between treatment and control groups? Justify your answer graphically and/or numerically.

8. Using the variable you selected in Step 7 (sex or migraine), conduct a stratified analysis for your outcome at baseline and at 1 year (see attached table). Does there appear to be a difference between the levels of your selected variable within the treatment group and within the control group? Between the treatment and control groups? Justify your answer.

9. Select age or chronicity. For your selection, does the variable differ between treatment and control groups? Justify your answer graphically and/or numerically.

10. Using the variable you selected in Step 9 (age or chronicity), does the correlation between your variable and your outcome at baseline and at 1 year differ for treatment and control groups? Justify your answer.

11. Write a summary of your results from the steps above.

12. Submit your summary and any tables/graphs.
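
The numerical comparisons asked for above (outcome by group, the stratified analysis, and the per-group correlation) could be sketched roughly as follows. This is only an illustration using the Python standard library, not a required method. The records are toy values invented for the example and are not taken from acupuncture.xlsx, and the field names (group, sex, age, score_baseline, score_1yr) are assumptions that should be replaced with the names given in the data dictionary.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, median, stdev

# Toy records invented purely for illustration; these are NOT values from
# acupuncture.xlsx. All field names are assumed, not taken from the data dictionary.
records = [
    {"group": "treatment", "sex": "F", "age": 34, "score_baseline": 20.0, "score_1yr": 12.0},
    {"group": "treatment", "sex": "M", "age": 51, "score_baseline": 30.0, "score_1yr": 18.0},
    {"group": "treatment", "sex": "F", "age": 45, "score_baseline": 25.0, "score_1yr": 20.0},
    {"group": "control", "sex": "F", "age": 40, "score_baseline": 22.0, "score_1yr": 21.0},
    {"group": "control", "sex": "M", "age": 58, "score_baseline": 28.0, "score_1yr": 26.0},
    {"group": "control", "sex": "M", "age": 29, "score_baseline": 25.0, "score_1yr": 19.0},
]


def summarize(values):
    """Return n, mean, median, and sample standard deviation of a list."""
    return {
        "n": len(values),
        "mean": mean(values),
        "median": median(values),
        "sd": stdev(values) if len(values) > 1 else float("nan"),
    }


def summary_by(rows, keys, outcome):
    """Summarize `outcome` within each combination of the grouping `keys`."""
    buckets = defaultdict(list)
    for row in rows:
        if row[outcome] is not None:  # skip rows where the outcome is missing
            buckets[tuple(row[k] for k in keys)].append(row[outcome])
    return {label: summarize(vals) for label, vals in buckets.items()}


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def correlation_by_group(rows, x, y):
    """Pearson r between fields x and y, computed separately for each group."""
    result = {}
    for g in sorted({row["group"] for row in rows}):
        pairs = [(r[x], r[y]) for r in rows if r["group"] == g and r[y] is not None]
        result[g] = pearson_r([p[0] for p in pairs], [p[1] for p in pairs])
    return result


# Outcome by group at baseline and at 1 year.
baseline_by_group = summary_by(records, ["group"], "score_baseline")
one_year_by_group = summary_by(records, ["group"], "score_1yr")

# Stratified analysis: outcome within each group-by-sex cell.
one_year_stratified = summary_by(records, ["group", "sex"], "score_1yr")

# Correlation of age with the 1-year outcome, separately per group.
age_corr_1yr = correlation_by_group(records, "age", "score_1yr")
```

Each result is a dictionary keyed by group (or by group and stratum), so the cells of a stratified table can be read off directly, for example `one_year_stratified[("treatment", "F")]["mean"]`. The same functions can be rerun with the baseline field, or with migraine or chronicity in place of sex or age, once the real field names are known.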

