Your CEO says he wants to be able to analyze the company's entire dataset of millions and millions of customer records (rather than just a sample), and asks if a BigData environment or a Relational Database system should be used. How would you respond?

data mining

Description

Your CEO says he wants to be able to analyze the company's entire dataset of millions and millions of customer records (rather than just a sample), and asks if a BigData environment or a Relational Database system should be used. How would you respond?

 

 

Question 2 of 30

2 Points

Based on the BigData reading, the analysis, in real-time, of streaming web traffic data refers to which "V"?

  • A. Velocity
  • B. Variability
  • C. All of these
  • D. Volume
  • E. None of these

 

Question 3 of 30

2 Points

From the textbook chapter, match coding is a database process referring to:

  • A. None of these
  • B. A way to authenticate that a person accessing a database has a correct login and password.
  • C. A way to determine what teams play each other in a tournament
  • D. A way to match interested people with others of similar interests
  • E. A way to assign a unique code to a record (e.g. customer) in the database
  • F. All of these

 

Question 4 of 30

2 Points

DBMS is an acronym that refers to:

  • A. The process of Design, Build, Make, Sell a product
  • B. A general reference to a Database Management System
  • C. A Master of Science Degree in Blockchain
  • D. All of these
  • E. None of these

 

Question 5 of 30

2 Points

You want to test consumer reaction to a product. In terms of random samples and representative samples, which of the following is the best way to think about these?

  • A. Neither is important and neither needs to be considered
  • B. Both are important and both need to be considered
  • C. It is more important to have a random sample than a representative sample.
  • D. It is more important to have a representative sample than a random sample


Question 6 of 30

2 Points

Imagine that you are looking at a scatterplot with a best-fit regression line showing the relationship between dollars spent (Y) and Age of consumer (X). You notice that the regression line slopes upward and that many (most) of the points are far away from the regression line. Which of the following conclusions would be true:

  • A. The R-squared value should be high because most of the variance in Dollars Spent can be explained by Age.
  • B. The R-squared value should be low because Age is explaining only a little of the variation in Dollars Spent
  • C. There is no relationship between a scatterplot of the data and the calculation of an R-squared value
  • D. All of the above
  • E. None of the above

 

Question 7 of 30

2 Points

In the scatterplot with a best-fit regression line showing the relationship between dollars spent (Y) and Age of consumer (X), the regression line slopes upward and many (most) of the points are far away from the regression line. Which of the following conclusions would be true?

  • A. The correlation would be low and positive
  • B. The correlation would be high and negative
  • C. The correlation would be low and negative
  • D. The correlation would be high and positive
  • E. All of the above
  • F. None of the above
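
The link between scatter around the line, the correlation, and R-squared in the two scatterplot questions above can be sketched in a few lines. This is a minimal illustration, not part of the quiz: the `ages` and `dollars` values are made-up numbers chosen so that the trend is upward but the points sit far from the line, and `pearson_r` is a hypothetical helper name. It relies on the fact that, in a simple (one-predictor) regression, R-squared equals the square of the Pearson correlation.

```python
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical data: spending drifts upward with age but is very noisy.
ages = [20, 25, 30, 35, 40, 45, 50, 55]
dollars = [60, 10, 80, 20, 90, 30, 100, 40]

r = pearson_r(ages, dollars)
r_squared = r ** 2  # valid for simple regression with one predictor

print(f"r = {r:.2f}, R-squared = {r_squared:.2f}")
```

With data like these, the sign of `r` follows the upward slope, while its small magnitude (and the even smaller `r_squared`) reflects how far the points lie from the line.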

 

Question 8 of 30

 

Imagine that you predict Dollars Spent by including an additional predictor variable, Salary (as well as keeping Age in the model). All coefficients are positive and statistically significant, as is the R-squared. Which of the following statements are accurate?

  • A. Each predictor is positively correlated with Dollars Spent
  • B. Consumers that are older and that have higher salaries spend more dollars
  • C. If we know a new consumer's Age and Salary, and their Age and Salary are within the range of values that the model was built upon, we can predict how many dollars the consumer will spend.
  • D. All of the above
  • E. None of the above

Question 9 of 30

2 Points

You have a set of data points that you've analyzed and modeled with regression, but you now want to draw conclusions for data that are outside the range that you've analyzed. This would be an example of ...

  • A. Interpolation and is not a good idea
  • B. None of these
  • C. Extrapolation and is not a good idea
  • D. Interpolation and is a good idea
  • E. All of these
  • F. Extrapolation and is a good idea.

 

Question 10 of 30

2 Points

If there is a correlation of r = 0.50 between Customer Satisfaction (Y) and Amount Spent (X), then a simple regression model using Amount Spent to predict Customer Satisfaction will have how much explanatory power (as a percentage)?

  • A. 100%
  • B. 25%
  • C. Any of these
  • D. 50%
  • E. None of these
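
The arithmetic behind this question can be checked with a short sketch. It assumes that "explanatory power" means R-squared, and uses the standard result that in a simple regression with a single predictor, R-squared is the square of the correlation coefficient. The function name `explanatory_power` is a hypothetical label for illustration.

```python
def explanatory_power(r):
    """Share of variance explained by a one-predictor regression: R^2 = r^2."""
    return r ** 2

share = explanatory_power(0.50)
print(f"{share:.0%}")  # prints "25%"
```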

 

Question 11 of 30

3 Points

You are performing a Segmentation Analysis and want to target customers that are most likely to respond. A cell in your analytic sample has a response rate of 4%, and the overall response rate for all cells is 1.5%. If your targeted group must have a response rate index greater than 1.0, would this cell be part of the Target group? Answer Yes or No, and explain why.
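
A minimal sketch of the index calculation, assuming the response rate index is defined as the cell's response rate divided by the overall response rate (so that 1.0 means "same as average"). Some texts scale the index by 100 instead; the function name `response_rate_index` is a hypothetical label.

```python
def response_rate_index(cell_rate, overall_rate):
    """Ratio of a cell's response rate to the overall response rate."""
    return cell_rate / overall_rate

index = response_rate_index(0.04, 0.015)
in_target_group = index > 1.0

print(f"index = {index:.2f}, in target group: {in_target_group}")
```

Under that definition, an index above 1.0 simply means the cell responds at a higher rate than the customer base as a whole.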

 

Question 12 of 30

2 Points

Imagine that you run a Marketing Analytics team. Your team has run a Segmentation Analysis and all 10,000 customers in the company database have been assigned to one of three segments. Your Chief Marketing Officer and Chief Financial Officer have asked how much revenue might be expected if the company presented an offer only to the segment of Best customers, and each customer can order one product and each product brings in $5. Your analyst has calculated an expected revenue amount of $50,000. What might you conclude?

  • A. Your analyst has made a critical calculation error.
  • B. $50,000 would require a response rate that would not be possible given the CMO's marketing plans
  • C. $50,000 would mean that the company would have to include customers that are not in the Best segment
  • D. All of the above - so have the analyst resubmit the analysis
  • E. None of the above - your analyst could be right, so present the $50,000 answer
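
One way to sanity-check a revenue estimate like this is to work out the response rate it implies. The sketch below assumes expected revenue = customers contacted × response rate × revenue per order, with at most one order per customer, as stated in the question. The Best-segment size of 3,000 is a made-up figure for illustration only; the question does not give the segment sizes.

```python
def max_revenue(customers, revenue_per_order):
    """Revenue if every contacted customer orders exactly one product."""
    return customers * revenue_per_order

def required_response_rate(target_revenue, customers, revenue_per_order):
    """Response rate needed to reach target_revenue from this group."""
    return target_revenue / max_revenue(customers, revenue_per_order)

total_customers = 10_000   # from the question
price = 5                  # dollars per product, from the question
best_segment = 3_000       # hypothetical size of the Best segment

ceiling_all = max_revenue(total_customers, price)
rate_all = required_response_rate(50_000, total_customers, price)
rate_best = required_response_rate(50_000, best_segment, price)

print(ceiling_all)  # revenue ceiling across the whole database
print(rate_all)     # required rate if all 10,000 were contacted
print(rate_best)    # a value above 1.0 means the target is unreachable
```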


Question 13 of 30

2 Points


You’re an expert in recommending marketing strategies based on Lift calculations. You have a database table showing groups of customers in four quartiles based on scores that predict likelihood to respond. Incremental and Cumulative Lift values for each quartile are available.


Your client wants to launch a campaign to send an offer to customers. The campaign has to have at least a 6.5% overall response rate. To determine if the campaign should include names that are in the 2nd quartile group, you would need to check:

  • A. The cumulative RR of the 2nd quartile to see if it is at least 6.5%
  • B. The incremental RR of the 2nd quartile to see if it is at least 6.5%
  • C. The overall RR across all quartiles to see if it is at least 6.5%
  • D. All of the above
  • E. None of the above
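
The difference between incremental and cumulative response rates (RR) can be sketched as follows. The counts in `mailed` and `responders` are made-up numbers for illustration and do not come from the question. Incremental RR is taken to be the rate within one quartile alone, and cumulative RR the rate across that quartile and all higher-scoring quartiles combined.

```python
# Hypothetical counts per quartile, ordered from highest to lowest score.
mailed = [2500, 2500, 2500, 2500]
responders = [300, 150, 75, 25]

# Incremental RR: response rate within each quartile on its own.
incremental_rr = [r / m for r, m in zip(responders, mailed)]

# Cumulative RR: response rate over this quartile plus all better ones.
cumulative_rr = []
total_responders = total_mailed = 0
for r, m in zip(responders, mailed):
    total_responders += r
    total_mailed += m
    cumulative_rr.append(total_responders / total_mailed)

for q, (inc, cum) in enumerate(zip(incremental_rr, cumulative_rr), start=1):
    print(f"Q{q}: incremental {inc:.1%}, cumulative {cum:.1%}")
```

In this made-up table the 2nd quartile's own rate falls below 6.5%, yet a campaign mailed to quartiles 1 and 2 together still clears it, which is why the two measures answer different questions about a campaign-wide target.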

 

Question 14 of 30

2 Points

Imagine that you plot a cumulative gains chart from a model, and the curve for the model closely follows a linear relationship showing that by targeting 'x'% of the responders, you can capture 'y'% of all orders, and the model's line is such that X=Y. You can conclude that:
