For this assignment, we use a survey on cannabis consumption in Canada. The survey was conducted in 2017 by Statistics Canada. The dataset is saved into the file A5datai.rda, where i is the
number assigned to you in Quiz 6. The dataset A5data included in the file contains 17 variables. The
data were obtained through the University Library website. You click on the “file & resources” tab
and choose “Statistics & numerical data”. You then click on “ODESI Data Retrieval”. ODESI stands
for “Ontario Data Documentation, Extraction Service and Infrastructure”. If you like data analysis,
it is a great source. To retrieve the raw data, you expand “Health”, “Canada”, “Canadian Tobacco,
Alcohol and Drugs Survey”, “2017” and “Dataset: Canadian Tobacco, Alcohol and Drugs Survey,
2017: Person file”, which is the survey used in this assignment. The data are in “Metadata”. Once
selected, you can click on the download button on the top right of the right window. For R, you select
the CSV format. It will come with a PDF in which all variables are described. You can also get the
description on the website by clicking on “Variable Description” below “Metadata”. Any survey data
need a little cleaning before you can use them in a regression. In this course, we don’t have time to
cover it, but if you are interested, the document cleanData.R uploaded to Learn shows how to create
the whole dataset. Your file is a subset of this dataset. For example, the variable CAN_010 is the
answer to: “During your lifetime, have you ever used or tried marijuana?”. The possible answers are:
1- Yes, 2- No, 6- Valid Skip, 7- Don’t know, 8- Refusal, and 9- Not stated. I used that variable to
create the variable “EverUsed”, which is 1 if the answer was 1 and 0 if the answer was 2. Individuals
who provided other answers were removed from the sample.
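
The recoding described above can be sketched in R as follows. This is only a minimal illustration, not the code in cleanData.R: the data frame `raw` and its values are made up, and the column name `CAN_010` is an assumption about how the variable is spelled in the raw CSV file.

```r
# Toy stand-in for the raw survey column (values invented for illustration).
# Codes: 1 = Yes, 2 = No, 6 = Valid skip, 7 = Don't know, 8 = Refusal, 9 = Not stated.
raw <- data.frame(CAN_010 = c(1, 2, 6, 1, 7, 2, 8, 9))

# Keep only respondents who answered Yes (1) or No (2).
clean <- raw[raw$CAN_010 %in% c(1, 2), , drop = FALSE]

# EverUsed is 1 if the answer was Yes and 0 if the answer was No.
clean$EverUsed <- as.integer(clean$CAN_010 == 1)

# Quick check of the resulting counts.
table(clean$EverUsed)
```

Filtering before recoding means that the codes 6 to 9 never reach the new variable, so EverUsed contains only 0 and 1, which is what a regression with a binary variable requires.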