K-means clustering on file19.txt from the HARTIGAN dataset directory


Description

This problem uses the data file file19.txt, reproduced below.

HARTIGAN is a dataset directory that contains test data for clustering algorithms. The data files are all simple text files, and the format of the data files is explained on the web page at https://people.sc.fsu.edu/~jburkardt/datasets/hartigan/hartigan.html

Perform K-means clustering on file19.txt, available from the web page above.

# file19.txt

# Reference:

# John Hartigan,

# Clustering Algorithms,

# Wiley, 1975.

# ISBN 0-471-35645-X

# LC: QA278.H36

# Dewey: 519.5'3

# "Name" is the name of the animal.

# "I", "i", "C", "c", "P", "p", "M", "m", is the tooth pattern, the

# number of top incisors, bottom incisors, top canines, bottom canines,

# top premolars, bottom premolars, top molars, and bottom molars.

"Dentition of Mammals, Hartigan page 170"

9 columns

66 rows

"Name" "I" "i" "C" "c" "P" "p" "M" "m"

"Opossum" 5 4 1 1 3 3 4 4

"Hairy tail mole" 3 3 1 1 4 4 3 3

"Common mole" 3 2 1 0 3 3 3 3

"Star nose mole" 3 3 1 1 4 4 3 3

"Brown bat" 2 3 1 1 3 3 3 3

"Silver hair bat" 2 3 1 1 2 3 3 3

"Pigmy bat" 2 3 1 1 2 2 3 3

"House bat" 2 3 1 1 1 2 3 3

"Red bat" 1 3 1 1 2 2 3 3

"Hoary bat" 1 3 1 1 2 2 3 3

"Lump nose bat" 2 3 1 1 2 3 3 3

"Armadillo" 0 0 0 0 0 0 8 8

"Pika" 2 1 0 0 2 2 3 3

"Snowshoe rabbit" 2 1 0 0 3 2 3 3

"Beaver" 1 1 0 0 2 1 3 3

"Marmot" 1 1 0 0 2 1 3 3

"Groundhog" 1 1 0 0 2 1 3 3

"Prairie Dog" 1 1 0 0 2 1 3 3

"Ground Squirrel" 1 1 0 0 2 1 3 3

"Chipmunk" 1 1 0 0 2 1 3 3

"Gray squirrel" 1 1 0 0 1 1 3 3

"Fox squirrel" 1 1 0 0 1 1 3 3

"Pocket gopher" 1 1 0 0 1 1 3 3

"Kangaroo rat" 1 1 0 0 1 1 3 3

"Pack rat" 1 1 0 0 0 0 3 3

"Field mouse" 1 1 0 0 0 0 3 3

"Muskrat" 1 1 0 0 0 0 3 3

"Black rat" 1 1 0 0 0 0 3 3

"House mouse" 1 1 0 0 0 0 3 3

"Porcupine" 1 1 0 0 1 1 3 3

"Guinea pig" 1 1 0 0 1 1 3 3

"Coyote" 1 3 1 1 4 4 3 3

"Wolf" 3 3 1 1 4 4 2 3

"Fox" 3 3 1 1 4 4 2 3

"Bear" 3 3 1 1 4 4 2 3

"Civet cat" 3 3 1 1 4 4 2 2

"Raccoon" 3 3 1 1 4 4 3 2

"Marten" 3 3 1 1 4 4 1 2

"Fisher" 3 3 1 1 4 4 1 2

"Weasel" 3 3 1 1 3 3 1 2

"Mink" 3 3 1 1 3 3 1 2

"Ferrer" 3 3 1 1 3 3 1 2

"Wolverine" 3 3 1 1 4 4 1 2

"Badger" 3 3 1 1 3 3 1 2

"Skunk" 3 3 1 1 3 3 1 2

"River otter" 3 3 1 1 4 3 1 2

"Sea otter" 3 2 1 1 3 3 1 2

"Jaguar" 3 3 1 1 3 2 1 1

"Ocelot" 3 3 1 1 3 2 1 1

"Cougar" 3 3 1 1 3 2 1 1

"Lynx" 3 3 1 1 3 2 1 1

"Fur seal" 3 2 1 1 4 4 1 1

"Sea lion" 3 2 1 1 4 4 1 1

"Walrus" 1 0 1 1 3 3 0 0

"Grey seal" 3 2 1 1 3 3 2 2

"Elephant seal" 2 1 1 1 4 4 1 1

"Peccary" 2 3 1 1 3 3 3 3

"Elk" 0 4 1 0 3 3 3 3

"Deer" 0 4 0 0 3 3 3 3

"Moose" 0 4 0 0 3 3 3 3

"Reindeer" 0 4 1 0 3 3 3 3

"Antelope" 0 4 0 0 3 3 3 3

"Bison" 0 4 0 0 3 3 3 3

"Mountain goat" 0 4 0 0 3 3 3 3

"Musk ox" 0 4 0 0 3 3 3 3

"Mountain sheep" 0 4 0 0 3 3 3 3

2.2 K-means clustering (2.5 points divided evenly among the components)

Perform K-means clustering on file19.txt, available from the web page above.

This file contains a multivariate mammals dataset; there are 9 columns and 66 rows.

(a) Data cleanup (1 point divided evenly by components below)

(i) Think of what attributes, if any, you may want to omit from the dataset when you do the clustering. Indicate all of the attributes you removed before doing the clustering.

(ii) Does the data need to be standardized?

(iii) You will have to clean the data to remove multiple spaces and make the comma character the delimiter. Please make sure you include your cleaned dataset in the archive file you upload.
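The cleanup in (a)(iii) can be sketched in a few lines. The example below is a minimal Python illustration, not the assignment's required R workflow: it skips comment lines and the title and count lines, splits each remaining line on runs of whitespace while keeping quoted names intact, and writes comma-delimited text. The sample lines are copied from file19.txt with the spacing exaggerated for illustration, and the helper names `clean_lines` and `to_csv` are hypothetical. A file produced this way should be readable in R with read.csv().

```python
import csv
import io
import shlex

# Sample lines from file19.txt; extra spaces added here for illustration only.
RAW = '''# file19.txt
# Dewey: 519.5'3
"Dentition of Mammals, Hartigan page 170"
9 columns
66 rows
"Name" "I" "i" "C" "c" "P" "p" "M" "m"
"Opossum"   5 4 1 1 3 3 4 4
"Hairy tail mole"  3 3 1 1 4  4 3 3
'''


def clean_lines(lines, n_fields=9):
    """Return the header and data rows as lists of fields.

    Comment lines are skipped before splitting, because a stray
    apostrophe in a comment would otherwise confuse shlex.
    Lines that do not have exactly n_fields fields (the title and
    the "9 columns" / "66 rows" lines) are dropped.
    """
    rows = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = shlex.split(line)  # honours double quotes, collapses spaces
        if len(fields) == n_fields:
            rows.append(fields)
    return rows


def to_csv(rows):
    """Serialize rows as comma-delimited text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


rows = clean_lines(RAW.splitlines())
csv_text = to_csv(rows)
print(csv_text)
```

Filtering on the field count is a design shortcut that works for this file because every data row has exactly nine fields; a file with ragged rows would need an explicit check instead.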

(b) Clustering (2 points divided evenly by components below)

(i) Determine how many clusters are needed by running the WSS or Silhouette graph. Plot the graph using fviz_nbclust().

(ii) Once you have determined the number of clusters, run k-means clustering on the dataset to create that many clusters. Plot the clusters using fviz_cluster().

(iii) How many observations are in each cluster?

(iv) What is the total SSE of the clusters?

(v) What is the SSE of each cluster?

(vi) Analyze each cluster to determine how the mammals are grouped and whether the grouping makes sense. Act as the domain expert here; clustering has produced what you asked it to. Examine the results based on your knowledge of the animal kingdom and see whether they meet expectations. Provide a summary of your observations.
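The WSS graph requested in (b)(i) plots the total within-cluster sum of squares against the number of clusters k; the "elbow" where the curve flattens suggests a reasonable k. The sketch below computes that curve in plain Python with a basic Lloyd-style k-means, purely to show what the graph measures. It is not the implementation behind kmeans() or fviz_nbclust(), the eight rows are a small subset of file19.txt chosen for illustration, and the function names, restart count, and seed are assumptions.

```python
import random

# Tooth counts (I, i, C, c, P, p, M, m) for a few rows of file19.txt.
POINTS = [
    (1, 1, 0, 0, 2, 1, 3, 3),  # Beaver
    (1, 1, 0, 0, 0, 0, 3, 3),  # Pack rat
    (3, 3, 1, 1, 4, 4, 2, 3),  # Wolf
    (3, 3, 1, 1, 3, 3, 1, 2),  # Weasel
    (3, 3, 1, 1, 3, 2, 1, 1),  # Jaguar
    (0, 4, 0, 0, 3, 3, 3, 3),  # Deer
    (0, 4, 1, 0, 3, 3, 3, 3),  # Elk
    (2, 3, 1, 1, 3, 3, 3, 3),  # Brown bat
]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(col) / len(points) for col in zip(*points))


def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points]


def kmeans_wss(points, k, rng, max_iter=100):
    """Run one k-means fit from a random start and return its WSS."""
    centers = rng.sample(points, k)
    for _ in range(max_iter):
        labels = assign(points, centers)
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # An empty cluster keeps its previous center.
            updated.append(centroid(members) if members else centers[j])
        if updated == centers:
            break
        centers = updated
    labels = assign(points, centers)
    return sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))


def wss_curve(points, max_k, restarts=10, seed=0):
    """Best WSS over several random restarts for k = 1..max_k."""
    rng = random.Random(seed)
    return {k: min(kmeans_wss(points, k, rng) for _ in range(restarts))
            for k in range(1, max_k + 1)}


curve = wss_curve(POINTS, max_k=5)
for k, wss in curve.items():
    print(k, round(wss, 3))
```

For k = 1 the WSS equals the total sum of squares around the overall mean, which gives a handy sanity check on any implementation. Several restarts are used because a single random start can settle in a poor local minimum.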

Hint: to get the indices of all animals in cluster 1, you would execute:

> which(k$cluster == 1)

assuming k is the variable that holds the output of the kmeans() function call.
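Questions (b)(iii) through (b)(v) all follow from the cluster labels: the size of a cluster is its member count, its SSE is the sum of squared distances from each member to the cluster mean, and the total SSE is the sum over clusters. In R, the kmeans() result exposes these as k$size, k$withinss, and k$tot.withinss. The Python sketch below recomputes them by hand for four rows of file19.txt so the definitions are explicit; the labels are made up for illustration and are not the output of an actual clustering run, and `cluster_summary` is a hypothetical helper name.

```python
from collections import defaultdict

NAMES = ["Beaver", "Pack rat", "Deer", "Elk"]
POINTS = [
    (1, 1, 0, 0, 2, 1, 3, 3),  # Beaver
    (1, 1, 0, 0, 0, 0, 3, 3),  # Pack rat
    (0, 4, 0, 0, 3, 3, 3, 3),  # Deer
    (0, 4, 1, 0, 3, 3, 3, 3),  # Elk
]
# Hypothetical cluster labels, assumed here only to demonstrate the arithmetic.
LABELS = [1, 1, 2, 2]


def cluster_summary(names, points, labels):
    """Return size, SSE, and member names for every cluster label."""
    groups = defaultdict(list)
    for name, point, label in zip(names, points, labels):
        groups[label].append((name, point))

    summary = {}
    for label, members in groups.items():
        pts = [point for _, point in members]
        mean = [sum(col) / len(pts) for col in zip(*pts)]
        sse = sum(sum((x - m) ** 2 for x, m in zip(p, mean)) for p in pts)
        summary[label] = {
            "size": len(pts),
            "sse": sse,
            "members": [name for name, _ in members],
        }
    return summary


summary = cluster_summary(NAMES, POINTS, LABELS)
total_sse = sum(info["sse"] for info in summary.values())

for label in sorted(summary):
    info = summary[label]
    print(label, info["size"], info["sse"], info["members"])
print("total SSE:", total_sse)
```

Listing the member names per cluster is the same information the which(k$cluster == 1) hint retrieves by index, and it is the natural starting point for the domain analysis in (b)(vi).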
