In this assignment, you will complete a series of tutorials followed by a data mining task. You will apply the CRISP-DM data mining process to your selected domain. Specifically, you will gain insights into the data, formulate a set of hypotheses, preprocess your data, and build a decision tree model. This assignment is graded on a scale of 100. When discussing the data mining process, please refer to CRISP-DM (shown below). You should submit the final RapidMiner models along with an analysis document.
Data preprocessing tutorials
Analysis Task: After you have installed RapidMiner, complete the following data preprocessing
tutorials (note that the datasets you need for the tutorials are included in the RapidMiner download).
After completing each tutorial, briefly summarize what you did and discuss how it relates to what we have
discussed in class.
Decision Tree tutorial
Analysis Task: Read the following documentation and complete the example at the end of this document:
Analysis Task: Download and open the Supermarket Transactions dataset. Go through the
dataset and think about each attribute. Then, in your analysis document, give the following for each
attribute: its name, a description, its type (numeric or categorical), any outliers or missing values you
notice, and anything else of interest. You should also address data reduction and attribute reduction,
describe the class labels, and explain what they mean.
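
The per-attribute review described above can also be sketched outside RapidMiner. The following is a minimal illustration in Python (standard library only), not part of the required RapidMiner workflow. The function name profile_attribute, the list of missing-value markers, and the 1.5 x IQR outlier rule are assumptions chosen for illustration, and the sample values are made up rather than taken from the Supermarket Transactions dataset.

```python
import statistics

# Assumed markers for missing values; adjust to match the actual file.
MISSING = {"", "?", "NA", "N/A", "null", "None", "nan", "NaN"}


def is_number(text):
    """Return True if the string can be parsed as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def profile_attribute(name, raw_values):
    """Summarise one column: type, missing count, distinct count, IQR outliers."""
    present = [v.strip() for v in raw_values
               if v is not None and v.strip() not in MISSING]
    missing = len(raw_values) - len(present)
    # Treat the attribute as numeric only if every present value parses as a number.
    numeric = bool(present) and all(is_number(v) for v in present)
    summary = {
        "name": name,
        "type": "numeric" if numeric else "categorical",
        "missing": missing,
        "distinct": len(set(present)),
        "outliers": [],
    }
    if numeric and len(present) >= 4:
        numbers = [float(v) for v in present]
        q1, _, q3 = statistics.quantiles(numbers, n=4)
        spread = q3 - q1
        low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
        # Flag values outside the 1.5 x IQR fences as candidate outliers.
        summary["outliers"] = [x for x in numbers if x < low or x > high]
    return summary


# Hypothetical column values, for illustration only.
units = profile_attribute("units_sold", ["10", "12", "11", "13", "12", "200", "", "?"])
gender = profile_attribute("gender", ["Male", "Female", "Male", ""])
print(units)
print(gender)
```

Values flagged this way are only candidates for discussion in the analysis document; whether a value is a true outlier still depends on what the attribute means in the domain.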