PROJECT INSTRUCTIONS

You have gotten a taste of what it takes to be a data professional. Now you get a chance to put it all together! Your Final Project is to choose a subject area (domain) that you are interested in and conduct your own data project in that area!

The project is largely focused on the process, and less on the findings. I will be looking for your adherence to the process of conducting a data analytic project, as well as your reflections on what you learned from going through the process.

THE PROCESS

1.      SELECT a domain area of research

2.      FORMULATE a problem statement & hypothesis. Describe in detail the problem you wish to explore.

3.      FRAME the question(s) according to your domain

a. Understand the business

b. Understand the stakeholders

4.      OBTAIN data for your project

a.  Describe the Data: Information about the dataset itself, e.g., the attributes and attribute types, the number of instances, your target variable.

5.      SCRUB the data; this includes cleaning and preparing the data for analysis

6.      ANALYZE the data, looking for patterns and insights (EDA & Analytics)

7.      SUMMARIZE your findings
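The OBTAIN, describe, and SCRUB steps above can be sketched roughly as follows. This is a minimal illustration, not a required approach: the toy dataset, its column names, and the helper functions `load_rows`, `describe`, and `scrub` are all hypothetical. It uses only the Python standard library so it runs anywhere; in a Jupyter Notebook you may well prefer a data-frame library such as pandas.

```python
import csv
import io
from collections import Counter

# Hypothetical toy dataset standing in for a real CSV file; the column
# names and values are made up for illustration only.
RAW_CSV = """age,income,city,purchased
34,52000,Boston,yes
45,,Chicago,no
29,48000,boston ,yes
45,,Chicago,no
51,61000,Denver,no
"""

def load_rows(text):
    """OBTAIN: read the CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def describe(rows, target):
    """Describe the data: instance count, attributes, missing values, target."""
    attributes = list(rows[0].keys())
    missing = {a: sum(1 for r in rows if r[a].strip() == "") for a in attributes}
    return {
        "instances": len(rows),
        "attributes": attributes,
        "missing": missing,
        "target_counts": Counter(r[target] for r in rows),
    }

def scrub(rows):
    """SCRUB: normalize text, convert numeric fields, drop exact duplicates."""
    cleaned, seen = [], set()
    for r in rows:
        row = {
            "age": int(r["age"]),
            # Keep missing income as None rather than guessing a value.
            "income": float(r["income"]) if r["income"].strip() else None,
            "city": r["city"].strip().title(),
            "purchased": r["purchased"].strip().lower(),
        }
        key = tuple(row.items())
        if key not in seen:
            seen.add(key)
            cleaned.append(row)
    return cleaned

rows = load_rows(RAW_CSV)
summary = describe(rows, target="purchased")
clean_rows = scrub(rows)
print(summary["instances"], "instances,", len(summary["attributes"]), "attributes")
print("missing values per attribute:", summary["missing"])
print(len(clean_rows), "rows after scrubbing")
```

Whether to drop, keep, or impute missing values is a judgment call that depends on your data; whichever you choose, the reasoning belongs in your report and reflection.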

 

THE DELIVERABLES

       A project report document (APA formatted), 5 to 8 pages in length, not including the title page, contents page, images/graphics, or references. The report should have the sections below:

       Introduction: this is where you provide a brief description of your personal motivation for the project and the framing question. Tell the reader why they should care about the results you are about to present and why the question you will be answering is important. Include a description of your dataset: what type of data it contains, how many attributes, and how many instances. Note any additional challenges, such as messy or missing values.

       Data Analysis: this is where you describe your data (summary statistics, EDA) and explain the methods you used to analyze it. Discuss how each method works, why it was well suited to your data, and how you applied it.

       Results: this is where you describe and explain your findings. Why do you think you found the results you did and what do you think they mean?

       Conclusion: this should provide a concise answer to the analytical question posed in the introduction, along with a brief explanation of why the analysis produced that answer, which should be consistent with your results section. Additionally, you may wish to pose questions raised by your analysis for future work.

       Reflection: Conduct a self-reflection for each of the phases in The Process section above to uncover key learnings that you can apply to future projects or share with your colleagues. Document at least 4 key learnings from your reflection. Describe the observations, challenges, lucky breaks, emotions, etc. that you experienced during a specific phase.

        Do not just provide diagrams and statistics. Each table and figure included must have a caption (e.g., figure number and textual description) that is referenced from the text (e.g., “Figure 2 shows a frequency diagram for ...”).

       You should also provide your source code as a well-documented and formatted Jupyter Notebook, along with your dataset files.
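As one possible way to produce the summary statistics and a numbered caption for the report, here is a small sketch. The income values and the helpers `summarize` and `captioned_table` are hypothetical, and the plain-text layout is only an illustration; check your APA guide for the exact table and figure format.

```python
import statistics

# Hypothetical cleaned values; in a real project these would come from
# your scrubbed dataset.
incomes = [52000.0, 48000.0, 61000.0, 57000.0, 45000.0]

def summarize(values):
    """Return basic summary statistics for a numeric attribute."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }

def captioned_table(number, description, stats):
    """Format the statistics as plain text with a numbered caption on top."""
    lines = [f"Table {number}. {description}"]
    for name, value in stats.items():
        lines.append(f"{name:<8}{value:>12.2f}")
    return "\n".join(lines)

stats = summarize(incomes)
caption = f"Summary statistics for income in USD (n = {len(incomes)})."
print(captioned_table(1, caption, stats))
```

In the report text you would then refer to the table by its number, for example "Table 1 shows the summary statistics for income."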

 

GRADING

This assignment is worth 100 points, which is 10% of your final grade. Your assignment will be evaluated based on successful compilation and adherence to the project requirements. Grading criteria:

       50 pts for project report

       50 pts for Python implementation

 

BLACKBOARD SUBMISSION

Submit your project to Blackboard by the due date; no late submissions will be accepted. You should submit a well-documented Jupyter Notebook and dataset files. Submit both the .ipynb and .docx files, and name your files First_Lastname_FinalProject.xxx.

