Description

In this assignment, you will complete a series of tutorials followed by a data mining task. You will apply the CRISP-DM data mining process to your selected domain. Specifically, you will gain insights into the data, formulate a set of hypotheses, preprocess your data, and build a decision tree model. This assignment is graded on a scale of 100. When discussing the data mining process, please refer to CRISP-DM (shown below). You should submit the final RapidMiner models along with an analysis document.


Step 1: Data preprocessing tutorials

Analysis Task: After you have installed RapidMiner, please complete the following data preprocessing tutorials (please note, the datasets you need for the tutorials are included in the RapidMiner download).


Upon completing each tutorial, briefly summarize what you did and discuss how it relates to what we have discussed in class.


Step 2: Decision Tree tutorial

Analysis Task: Read the following documentation and complete the example at the end of this document:


Step 3: Data Understanding

Analysis Task: Download and open the Supermarket Transactions dataset. Go through the dataset and consider each attribute. Then, in your analysis document, give each attribute's name, a description, its type (numeric or categorical), any outliers or missing values you notice, and anything else of interest. You should also address data reduction and attribute reduction, describe the class labels, and explain what they mean.
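The assignment itself is carried out in RapidMiner, but the per-attribute summary described above can also be roughed out in a few lines of Python as a cross-check. The sketch below is only an illustration under stated assumptions: the sample rows and the column names (`units`, `store_type`, `total`) are hypothetical and are not taken from the Supermarket Transactions dataset, the list of missing-value markers is a guess, and the 1.5 x IQR rule is one common outlier heuristic rather than a method prescribed by the assignment.

```python
import csv
import io
import statistics

# Assumed markers for a missing value; adjust to match the real file.
MISSING = {"", "na", "n/a", "null", "?"}


def is_missing(value):
    """True when a cell is absent or holds one of the assumed markers."""
    return value is None or value.strip().lower() in MISSING


def to_float(value):
    """Parse a cell as a number, returning None when it is not numeric."""
    try:
        return float(value)
    except ValueError:
        return None


def iqr_outliers(numbers):
    """Values outside 1.5 * IQR of the quartiles (a common heuristic)."""
    if len(numbers) < 4:
        return []  # too few points for quartiles to be meaningful
    q1, _, q3 = statistics.quantiles(numbers, n=4)
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    return [n for n in numbers if n < low or n > high]


def profile_attributes(rows):
    """Summarize type, missing count, and outliers for each attribute."""
    profile = {}
    if not rows:
        return profile
    for name in rows[0].keys():
        raw = [row.get(name) for row in rows]
        present = [v for v in raw if not is_missing(v)]
        numbers = [to_float(v) for v in present]
        # An attribute counts as numeric only if every present value parses.
        numeric = bool(present) and all(n is not None for n in numbers)
        summary = {
            "type": "numeric" if numeric else "categorical",
            "missing": len(raw) - len(present),
        }
        if numeric:
            summary["min"] = min(numbers)
            summary["max"] = max(numbers)
            summary["outliers"] = iqr_outliers(numbers)
        else:
            summary["distinct"] = len(set(present))
        profile[name] = summary
    return profile


# Hypothetical sample data standing in for the real dataset.
SAMPLE_CSV = """units,store_type,total
2,Supermarket,10
3,Supermarket,11
2,,12
4,Deluxe,13
3,Deluxe,14
2,Supermarket,15
3,Deluxe,16
2,Supermarket,250
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
profile = profile_attributes(rows)
for attribute, summary in profile.items():
    print(attribute, summary)
```

To try this on the real data, replace `io.StringIO(SAMPLE_CSV)` with an open file handle for the exported dataset. The attribute descriptions, the discussion of data and attribute reduction, and the interpretation of the class labels still have to be written by hand in the analysis document.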

