[Solved] The file SalesData2.xlsx contains 5000 rows of data. This...

Description

The file SalesData2.xlsx contains 5000 rows of data. This is another set of data from CPCG, the local coffee and gelato shop from Assignment 4. Each row summarizes a customer transaction, as before. However, the set of attributes is different from the previous dataset. In particular, this dataset has a binary attribute called return_visit that is equal to 1 if the customer returned to CPCG at some point in the future, and 0 otherwise. This attribute will be our dependent variable (or “label” in RapidMiner); we want to know, based on the characteristics of a transaction, which customers are likely to return and which customers are not.

Other attributes include which of the six CPCG Locations the transaction was from, the Time of day at which the transaction occurred, the Discount the customer received in dollars (usually equal to zero), the total Net Sales of the transaction in dollars, and the Tip that the customer gave.

First, import the dataset into RapidMiner, making sure to change the type of return_visit to binominal, and the role of return_visit to label.

1. Building a logistic regression model

a) Create a process in RapidMiner that builds a logistic regression model for this dataset. (For now, you do not need to use Cross Validation, nor make any predictions.) However, you will first need to convert the Location attribute into a set of dummy variables. The easiest way to do this is with the Nominal to Numerical operator; in the “comparison groups” list, choose Location as the “comparison group attribute,” and enter “Bakery” as the comparison group. This will set Bakery as the default location, and create a dummy variable for each of the other five CPCG stores.
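The dummy coding described above can be sketched outside RapidMiner. This is a minimal illustration in plain Python, not the Nominal to Numerical operator itself: the function name `dummy_code`, the column naming, and every location name except "Bakery" are assumptions made for the example.

```python
def dummy_code(values, reference):
    """Return (column_names, rows) of 0/1 dummies, omitting the reference category."""
    # Every category other than the reference gets its own column.
    categories = sorted(set(values) - {reference})
    names = [f"Location = {c}" for c in categories]
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return names, rows


# Hypothetical location names; only "Bakery" comes from the assignment text.
locations = ["Bakery", "Downtown", "Campus", "Bakery", "Downtown"]
names, rows = dummy_code(locations, reference="Bakery")
for loc, row in zip(locations, rows):
    print(loc, dict(zip(names, row)))
```

A Bakery transaction is the row where every dummy is 0, so each dummy coefficient in the regression is read as a difference relative to Bakery. With six locations this yields five dummy columns.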

You will also need to use the Remap Binominals operator to force RapidMiner to code 0 as the “negative” value and 1 as the “positive” value of return_visit.
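Why the positive/negative coding matters can be seen from the log-odds that logistic regression models. The sketch below uses a made-up probability (0.7 is a placeholder, not a value from the dataset) to show that swapping which class counts as "positive" flips the sign of the log-odds, and therefore the sign of every coefficient.

```python
import math


def logit(p):
    """Log-odds of an event with probability p."""
    return math.log(p / (1 - p))


p_return = 0.7  # placeholder probability that return_visit = 1

# Log-odds when 1 is treated as the positive class.
print(round(logit(p_return), 4))
# Log-odds when 0 is treated as the positive class: same size, opposite sign.
print(round(logit(1 - p_return), 4))
```

If the mapping were reversed, a coefficient that appears to reduce the chance of returning would in fact be increasing it, so the remapping needs to be in place before interpreting the output.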

Show a screenshot of the Process panel.

b) Run the process from part a, and show a screenshot of the logistic regression output.

c) Based on your output from part b, identify TWO significant independent variables. If you list more than two, only the first two will be counted as your answer.

d) Based on your output from part b, how does Time appear to influence the likelihood of a customer returning? (Remember that Time indicates the time of day that the transaction occurred; higher values mean the transaction was later in the day.)
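One way to read the Time coefficient is through its odds ratio: in a logistic regression, a one-unit increase in an attribute multiplies the odds of the positive class by exp(coefficient). The sketch below illustrates that relationship only. The coefficients are placeholders, not values from the assignment's output, and `odds_ratio` is a helper name invented for the example.

```python
import math


def odds_ratio(coefficient, delta=1.0):
    """Multiplicative change in the odds of the positive class for a `delta` increase."""
    return math.exp(coefficient * delta)


# Placeholder coefficients, NOT values from the assignment's output.
for beta in (-0.05, 0.0, 0.05):
    direction = "lower" if beta < 0 else "higher" if beta > 0 else "unchanged"
    print(f"beta={beta:+.2f}: odds x {odds_ratio(beta):.3f} per unit of Time ({direction})")
```

A negative coefficient gives an odds ratio below 1 (later transactions are less likely to be followed by a return visit), and a positive coefficient gives one above 1. The conclusion is only meaningful if the coefficient is also statistically significant.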

2. Evaluating classification models

In this question, you will need to use the Cross Validation operator to evaluate the accuracy of the 0-1 classifications made by logistic regression and two other approaches. For ALL parts of the question, use a local random seed of 12345 in the Cross Validation operator’s parameters. Do not remove any attributes from the dataset, even if they were insignificant in the logistic regression model from the first question.

a) Create a process that uses Cross Validation to evaluate your logistic regression model for this dataset. Do not change any parameters other than setting the local random seed to 12345. Show a screenshot of the Cross Validation process (NOT the main Process panel; I want to see what’s happening within the Cross Validation operator).
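For intuition about what happens inside a cross-validation step, here is a rough sketch of a shuffled k-fold split in plain Python. It assumes 10 folds, which the assignment does not state, and `k_fold_indices` is a name invented for this example. Python's random number generator is not RapidMiner's, so the seed 12345 here will not reproduce RapidMiner's partition; it only shows why a fixed seed makes results repeatable.

```python
import random


def k_fold_indices(n_rows, k, seed):
    """Yield (train_idx, test_idx) pairs for shuffled k-fold cross-validation."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # fixed seed -> same split on every run
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        # Train on every fold except the one held out for testing.
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx


# 5000 rows as in SalesData2.xlsx; k=10 is an assumption.
splits = list(k_fold_indices(n_rows=5000, k=10, seed=12345))
print(len(splits), len(splits[0][0]), len(splits[0][1]))
```

Each row is used for testing exactly once, so the performance table aggregates predictions for all 5000 transactions.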

b) Run your process from part a. Show a screenshot of the performance output. (It should be a table that includes an overall accuracy percentage above it.)

c) How many total customers did the logistic regression model predict would return?
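Both the overall accuracy and the number of customers predicted to return can be read off the confusion matrix in the performance output. The sketch below shows the arithmetic with placeholder counts; they are not results from SalesData2.xlsx, and the key labels are only assumed to mirror the "pred." rows and "true" columns of the table.

```python
# Placeholder counts, NOT results from SalesData2.xlsx.
# Keys are (predicted class, true class).
confusion = {
    ("pred 0", "true 0"): 40,
    ("pred 0", "true 1"): 10,
    ("pred 1", "true 0"): 15,
    ("pred 1", "true 1"): 35,
}

total = sum(confusion.values())

# Accuracy: correctly classified rows (the diagonal) divided by all rows.
correct = confusion[("pred 0", "true 0")] + confusion[("pred 1", "true 1")]
accuracy = correct / total

# Customers predicted to return: the whole "pred 1" row, right or wrong.
predicted_return = confusion[("pred 1", "true 0")] + confusion[("pred 1", "true 1")]

print(f"accuracy = {accuracy:.2%}")
print(f"predicted to return = {predicted_return}")
```

Note that the count of predicted returns includes the false positives, so it is the sum of the "pred 1" row rather than only the correctly predicted cell.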

d) Replace the Logistic Regression operator with a k-Nearest Neighbors operator, set k equal to 25, and uncheck “weighted vote.” Run the process. What is the overall accuracy of this k-NN model?
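Unchecking "weighted vote" means each of the k nearest neighbors counts equally, regardless of distance. A minimal sketch of that unweighted majority vote follows, using Euclidean distance and a six-point toy dataset invented for illustration. It uses k=3 because the toy data is too small for the assignment's k=25, and `knn_predict` is a hypothetical helper, not RapidMiner's implementation.

```python
from collections import Counter
import math


def knn_predict(train_X, train_y, query, k):
    """Classify `query` by an unweighted majority vote of its k nearest neighbours."""
    # Pair each training point's distance to the query with its label, nearest first.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    # Every one of the k nearest neighbours gets exactly one vote.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy data invented for illustration: two well-separated clusters.
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_y = [0, 0, 0, 1, 1, 1]

print(knn_predict(train_X, train_y, (1.1, 1.0), k=3))
print(knn_predict(train_X, train_y, (5.0, 5.0), k=3))
```

An odd k such as 25 avoids tied votes when there are only two classes.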

e) Based on your results, which of these two models is more accurate?
