Implement the hierarchical agglomerative clustering with the following linkage: single, complete, average and centroid

Description

Data Analysis Assignment:

1. Implement hierarchical agglomerative clustering with the following linkages: single, complete, average and centroid. Your program should be written in R. You are not allowed to use any existing implementation of hierarchical agglomerative clustering in R or any other language; code the algorithm from first principles. However, you can use built-in R functions to visualise your results.

 

2. Apply your program to the NCI microarray data set, which can be downloaded from the course’s Moodle page. This data set has 64 columns and 6830 rows, where each column is an observation (a cell line) and each row represents a feature (a gene). The data set is therefore stored as the transpose of the usual data matrix. Preprocess the data set as appropriate.
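
For the preprocessing in item 2, one possible starting point is sketched below. It is only an illustration under assumptions: the helper name `prep_sketch` is hypothetical, and the choice to drop genes with missing or constant values and to standardise each gene is one reasonable option, not something the assignment prescribes.

```r
# Hypothetical helper (not part of the assignment): orient and standardise
# a genes-by-samples matrix so that each row is one observation.
prep_sketch <- function(raw) {
  x <- t(as.matrix(raw))  # transpose: rows become observations (cell lines)
  # Assumption: drop genes with missing values or zero variance,
  # because they cannot be standardised.
  keep <- apply(x, 2, function(col) all(is.finite(col)) && sd(col) > 0)
  x <- x[, keep, drop = FALSE]
  scale(x)                # centre each gene to mean 0 and scale to sd 1
}

# Toy input with 3 genes (rows) and 3 samples (columns); gene 2 is constant.
toy <- matrix(c(1, 2, 3,
                4, 4, 4,
                2, 4, 6), nrow = 3, byrow = TRUE)
z <- prep_sketch(toy)
```

Whether to standardise at all depends on whether genes with large variance should dominate the distances, so it is worth comparing the clustering with and without this step.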

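The four linkages named in item 1 differ only in how the distance between two clusters is defined. The following is a minimal, deliberately naive sketch of that idea, not a reference solution. The function name `hac_sketch` and its return format are hypothetical, Euclidean distance is an assumption, and centroid linkage is taken here as the plain (unsquared) Euclidean distance between cluster means. It recomputes every cluster distance at every step, so it is slow but easy to check by hand.

```r
# Hypothetical sketch: naive agglomerative clustering without hclust().
# x: numeric matrix with one observation per row.
hac_sketch <- function(x, linkage = c("single", "complete", "average", "centroid")) {
  linkage <- match.arg(linkage)
  x <- as.matrix(x)
  n <- nrow(x)
  # Pairwise Euclidean distances computed by hand.
  sq <- rowSums(x^2)
  d <- sqrt(pmax(outer(sq, sq, "+") - 2 * tcrossprod(x), 0))
  clusters <- as.list(seq_len(n))  # each cluster is a vector of row indices
  merges <- vector("list", max(n - 1, 0))
  heights <- numeric(max(n - 1, 0))

  # Distance between two clusters a and b under the chosen linkage.
  cluster_dist <- function(a, b) {
    if (linkage == "centroid") {
      ca <- colMeans(x[a, , drop = FALSE])
      cb <- colMeans(x[b, , drop = FALSE])
      return(sqrt(sum((ca - cb)^2)))
    }
    block <- d[a, b, drop = FALSE]
    switch(linkage,
           single = min(block),     # closest pair of points
           complete = max(block),   # farthest pair of points
           average = mean(block))   # mean over all cross-cluster pairs
  }

  for (step in seq_len(n - 1)) {
    k <- length(clusters)
    best <- c(1, 2)
    best_d <- Inf
    # Exhaustive search for the closest pair of clusters (first pair wins ties).
    for (i in 1:(k - 1)) {
      for (j in (i + 1):k) {
        dij <- cluster_dist(clusters[[i]], clusters[[j]])
        if (dij < best_d) {
          best_d <- dij
          best <- c(i, j)
        }
      }
    }
    merges[[step]] <- list(clusters[[best[1]]], clusters[[best[2]]])
    heights[step] <- best_d
    clusters[[best[1]]] <- c(clusters[[best[1]]], clusters[[best[2]]])
    clusters[[best[2]]] <- NULL  # remove the absorbed cluster
  }
  list(merges = merges, heights = heights)
}

# Five points on a line: 0, 1, 5, 6, 20.
pts <- matrix(c(0, 1, 5, 6, 20), ncol = 1)
res <- hac_sketch(pts, "complete")
res$heights  # merge heights, in merge order
```

On this toy input the complete-linkage merge heights work out by hand to 1, 1, 6 and 20. Note that centroid linkage, unlike the other three, can produce merge heights that are not monotonically increasing, which matters when drawing a dendrogram.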
