This assessment involves writing a report that summarises a data science-related investigation that you have conducted on data that you have collected yourself. The investigation must involve the main topics covered in the subject, most notably data pre-processing (representation, wrangling, tidying) and exploratory data visualisation using R/RStudio.
It is a merger of Assessments 3 (Exploratory Visualisation) and 4 (Pre-Processing – Parts A and B); however,
neither the dataset nor the pre-processing/exploratory steps to be carried out will be provided. You have to make
independent choices and decisions.
You will need to find your own data using good practices. Your dataset cannot be smaller than 1000 observations
of 5 variables, unless the targeted data science problem relates to spatial-temporal data, in which
case fewer than 5 dimensions could be allowed.
Preferably, you should use a dataset relevant to your place of work. Do not use data from textbooks or from R
packages. Do not use data from the same public sources that have been used in the subject (e.g. UCI repository).
You can use public data, but the data should be appropriate for addressing a relevant data science problem.