Using SAS, build a k-nearest neighbor (k-NN) model to predict the quality of wine from different factors (input files are provided):
1 - Provide the summary statistics for all the variables from the dataset. Explain some of the key aspects of the dataset.
2 - Review the SAS code in the file knn.sas and, for each SAS statement, explain the code in a comment.
3 - Run k-NN using k = 1, 2, and 3. For each case, provide the code, explain the SAS output, and interpret the results.
4 - Which case (k = 1, 2, or 3) provides the best model? Explain why, using the output from #3.
Thank you.