Assignment Description

Context: You are working in a team of business analysts at a large company in the present day. The “AdaBoost” algorithm was published only a few weeks ago and has not yet been widely implemented. The management of your company wants your team to research whether this algorithm could replace an existing classifier that is used in some parts of the business. Your task is to implement the AdaBoost algorithm and to write two documents: a technical report that explores the performance characteristics of your implementation, and a separate executive briefing that provides a summary of your findings and a recommendation about its use in the business.

 

AdaBoost Implementation

Implement the AdaBoost algorithm as described in Week 5. Your implementation must be in Python. You may use existing packages to implement your weak learners, e.g. decision trees from sklearn. Your implementation must be sklearn compatible: in short, you must write a custom AdaBoost class that implements both the fit and predict functions. To confirm compatibility, we will use the check_estimator function from sklearn.
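As a rough illustration of the shape such a class might take, the sketch below follows the usual sklearn conventions: constructor arguments stored unchanged, fitted attributes ending in an underscore, and fit returning self. It rests on several assumptions that are not part of the brief: the class name SimpleAdaBoost is hypothetical, the weak learner is a depth-1 decision tree, and the weight update is the SAMME multiclass variant, which may differ from the version presented in Week 5.

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.tree import DecisionTreeClassifier
from sklearn.utils.validation import check_array, check_is_fitted, check_X_y


class SimpleAdaBoost(ClassifierMixin, BaseEstimator):
    """Hypothetical minimal AdaBoost (SAMME variant) using decision stumps.

    Parameters
    ----------
    n_estimators : int, default=50
        Maximum number of boosting rounds.
    learning_rate : float, default=1.0
        Shrinkage factor applied to each weak learner's weight.
    random_state : int or None, default=None
        Seed passed to every weak learner.
    """

    def __init__(self, n_estimators=50, learning_rate=1.0, random_state=None):
        self.n_estimators = n_estimators
        self.learning_rate = learning_rate
        self.random_state = random_state

    def fit(self, X, y):
        X, y = check_X_y(X, y)
        # Encode labels as 0..K-1 so they can index the vote matrix later.
        self.classes_, y_idx = np.unique(y, return_inverse=True)
        self.n_features_in_ = X.shape[1]
        n_classes = len(self.classes_)
        w = np.full(X.shape[0], 1.0 / X.shape[0])  # uniform initial weights
        self.estimators_, self.estimator_weights_ = [], []
        for _ in range(self.n_estimators):
            stump = DecisionTreeClassifier(
                max_depth=1, random_state=self.random_state
            )
            stump.fit(X, y_idx, sample_weight=w)
            miss = stump.predict(X) != y_idx
            err = np.sum(w[miss])  # weights are kept normalised to sum to 1
            if err >= 1.0 - 1.0 / n_classes:
                break  # weak learner is no better than random guessing
            err = max(err, 1e-10)  # avoid division by zero for a perfect stump
            alpha = self.learning_rate * (
                np.log((1.0 - err) / err) + np.log(n_classes - 1.0)
            )
            self.estimators_.append(stump)
            self.estimator_weights_.append(alpha)
            if not miss.any():
                break  # training set is fitted perfectly
            w *= np.exp(alpha * miss)  # up-weight misclassified samples
            w /= w.sum()
        return self

    def predict(self, X):
        check_is_fitted(self, "estimators_")
        X = check_array(X)
        rows = np.arange(X.shape[0])
        votes = np.zeros((X.shape[0], len(self.classes_)))
        for alpha, stump in zip(self.estimator_weights_, self.estimators_):
            votes[rows, stump.predict(X)] += alpha  # weighted majority vote
        return self.classes_[np.argmax(votes, axis=1)]
```

Compatibility can then be probed with check_estimator from sklearn.utils.estimator_checks, for example check_estimator(SimpleAdaBoost()). This sketch has not been verified against the full check suite, so some checks may still need additional input validation or edge-case handling.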

 

Notes:

You can read the full requirements and details here: https://scikit-learn.org/stable/developers/develop.html

Your code must be documented using NumPy/SciPy style docstrings: https://realpython.com/documenting-python-code/
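For orientation, a NumPy-style docstring separates Parameters, Returns and Examples into underlined sections. The helper below is only an illustration of that layout; the function name weighted_error and its signature are hypothetical and are not required by the brief.

```python
import numpy as np


def weighted_error(y_true, y_pred, sample_weight):
    """Compute the weighted misclassification rate.

    Parameters
    ----------
    y_true : array-like of shape (n_samples,)
        Ground-truth class labels.
    y_pred : array-like of shape (n_samples,)
        Labels predicted by a weak learner.
    sample_weight : array-like of shape (n_samples,)
        Non-negative sample weights; they need not sum to one.

    Returns
    -------
    float
        Total weight of the misclassified samples divided by the total weight.

    Examples
    --------
    >>> weighted_error([1, 0, 1], [1, 1, 1], [1, 2, 1])
    0.5
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    w = np.asarray(sample_weight, dtype=float)
    return float(np.sum(w[y_true != y_pred]) / np.sum(w))
```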

 

Technical Report (15 pages max)

Your report should provide:

- an outline of the algorithm
- a comprehensive analysis and discussion of the performance characteristics of your algorithm, including:
    - analysis of all hyper-parameters on a synthetic dataset (or multiple synthetic datasets)
    - comparison of your implementation on real or benchmark datasets to:
        - the sklearn AdaBoost implementation
        - other standard classifiers of your choice
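One possible way to organise the experiments listed above is a small harness with two parts: a hyper-parameter sweep on a synthetic dataset and a cross-validated comparison on a benchmark dataset. The sketch below is an assumption-laden starting point and does not prescribe a protocol: the function names are hypothetical, make_moons and the breast cancer dataset are arbitrary choices, and sklearn's AdaBoostClassifier stands in where a custom implementation would be added as a further entry.

```python
from sklearn.datasets import load_breast_cancer, make_moons
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def sweep_n_estimators(values, n_samples=300, noise=0.3, seed=0):
    """Return mean 5-fold CV accuracy for each n_estimators value."""
    # Synthetic two-class dataset; the noise level is itself worth varying.
    X, y = make_moons(n_samples=n_samples, noise=noise, random_state=seed)
    results = {}
    for n in values:
        clf = AdaBoostClassifier(n_estimators=n, random_state=seed)
        results[n] = cross_val_score(clf, X, y, cv=5).mean()
    return results


def compare_on_benchmark(seed=0):
    """Return mean 5-fold CV accuracy per model on a benchmark dataset."""
    X, y = load_breast_cancer(return_X_y=True)
    models = {
        # A custom AdaBoost class would be added here as another entry.
        "sklearn AdaBoost": AdaBoostClassifier(n_estimators=50, random_state=seed),
        "random forest": RandomForestClassifier(n_estimators=100, random_state=seed),
        "logistic regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
    }
    return {
        name: cross_val_score(model, X, y, cv=5).mean()
        for name, model in models.items()
    }


if __name__ == "__main__":
    for n, acc in sweep_n_estimators([1, 10, 50]).items():
        print(f"n_estimators={n:3d}  accuracy={acc:.3f}")
    for name, acc in compare_on_benchmark().items():
        print(f"{name:20s} accuracy={acc:.3f}")
```

The same loop structure extends to the other hyper-parameters, such as the learning rate or the depth of the weak learner, and accuracy can be swapped for any other metric through the scoring argument of cross_val_score.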

