Assignment Instructions
Customer Care professionals capture customer feedback and comments during
their interactions with customers every day. This comment data, in the form
of text, is referred to as unstructured data. Unstructured data has grown
exponentially, and analyzing it is critical for forming business strategies.
You have access to this comment data. You need to build a text mining system
to mine this textual data and (eventually) categorize each comment as Positive
or Negative so that the business can effectively track customer sentiment and
develop appropriate corrective strategies.
For example, the first comment below should be categorized as Positive and the
second as Negative:
· The kind of offers and the kind of treatment you get of being card holder and kind of service and support you receive is great.
· The card service is neither widely available nor acceptable compared to other card.
Download the attached dataset or use the following link:
https://edge.apus.edu/access/content/group/business-common/ANLY/ANLY610/sample_comments2.csv
Perform the following text cleaning techniques in R, and show your work:
· Read through the following tutorial regarding some of the simplest ways of cleaning text data.
· Analyze the data file and come up with what you might consider to be "stop words". Are there any additional words you might want to exclude besides the standard ones? Discuss your findings.
· Read the data into R, and provide the first few rows of data (hint: use the head function).
· Do standard transformations (e.g., converting to a plain text document, removing punctuation, and so on). Detail what you do with screenshots.
· Stem the document for future use (we will use this data in future weeks).
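The steps listed above can be sketched in R roughly as follows. This is only a minimal illustration, not a prescribed solution. It assumes the tm and SnowballC packages are installed, that a local copy of the dataset is saved as sample_comments2.csv in the working directory, and that the comment text is in the first column of the file. The helper name clean_comments and the extra stop word "card" are hypothetical choices made for the example. If the file is not found, the sketch falls back to the two example comments quoted earlier in these instructions.

```r
library(tm)         # corpus handling and standard text transformations
library(SnowballC)  # stemming backend used by tm::stemDocument

# Hypothetical helper (not part of the assignment): applies the standard
# clean-up steps to a character vector and returns a stemmed tm corpus.
clean_comments <- function(text, extra_stopwords = character(0)) {
  # VCorpus + VectorSource stores each comment as a plain text document
  corpus <- VCorpus(VectorSource(text))
  corpus <- tm_map(corpus, content_transformer(tolower))
  corpus <- tm_map(corpus, removePunctuation)
  corpus <- tm_map(corpus, removeNumbers)
  corpus <- tm_map(corpus, removeWords,
                   c(stopwords("english"), extra_stopwords))
  corpus <- tm_map(corpus, stripWhitespace)
  corpus <- tm_map(corpus, stemDocument)
  corpus
}

csv_path <- "sample_comments2.csv"  # assumed local copy of the dataset
if (file.exists(csv_path)) {
  comments <- read.csv(csv_path, stringsAsFactors = FALSE)
  print(head(comments))             # first few rows, as the hint suggests
  text <- comments[[1]]             # assumption: text is in the first column
} else {
  # Fallback: the two example comments from the instructions
  text <- c(
    "The kind of offers and the kind of treatment you get of being card holder and kind of service and support you receive is great.",
    "The card service is neither widely available nor acceptable compared to other card."
  )
}

# Frequencies of the (stemmed) terms that survive the standard stop word
# list can suggest additional, domain-specific stop words.
tdm  <- TermDocumentMatrix(clean_comments(text))
freq <- sort(rowSums(as.matrix(tdm)), decreasing = TRUE)
print(head(freq, 10))

# "card" is an illustrative extra stop word, not a recommendation.
cleaned <- clean_comments(text, extra_stopwords = c("card"))
cleaned_text <- vapply(cleaned,
                       function(doc) paste(as.character(doc), collapse = " "),
                       character(1))
print(cleaned_text)
```

Stop words are removed before stemming here so that the standard English list, which contains unstemmed words, still matches. Whether a frequent term such as "card" should really be excluded depends on what the frequency table shows for the full dataset, and that choice is part of the discussion the assignment asks for.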
Please copy and paste screenshots of your work in R into a Word document for
submission. Provide detailed explanations of your answers.
Use APA format, and provide at least 2 academic references.