Description

OVERVIEW 

This assessment involves writing a report that summarises a data-science-related investigation that you have conducted on data that you have collected yourself. The investigation must involve the main topics covered in the subject, most notably data pre-processing (representation, wrangling, tidying) and exploratory data visualisation using R/RStudio.


It is a merger of Assessments 3 (Exploratory Visualisation) and 4 (Pre-Processing – Parts A and B). However, neither the dataset nor the pre-processing and exploratory steps to be carried out will be provided; you have to make independent choices and decisions.


You will need to find your own data, following good practice. Your dataset must contain at least 1000 observations of at least 5 variables. The only exception is when the data science problem being addressed relates to spatial-temporal data, in which case fewer than 5 variables may be allowed.


Preferably, you should use a dataset relevant to your place of work. Do not use data from textbooks or from R packages. Do not use data from the same public sources that have been used in the subject (e.g. the UCI repository). You can use public data, but it should be appropriate for addressing a relevant data science problem.

