[Solved] In this assignment, you will need to implement a simple r...

Check Out Our Work & Get Yours Done

Submit Work

Download Sample

Enroll in the complete course for only $250 USD*

Order Now

Submit work Offers

In this assignment, you will need to implement a simple recommender system using a book rating data set DBbook_train_ratings

data mining

Description

Should be done in ipython notebook . Should use scikit learn.

In this assignment, you will need to implement a simple recommender system using a book rating data set DBbook_train_ratings.tsv (reference: https://lists.w3.org/Archives/Public/public-rww/2013Dec/0002.html). The first column of this data set contains user IDs. The second column contains itemIDs (i.e., book ids). The third column contains the rating scores (1 – 5). The purpose of studying this data set is to create a data mining model that recommend books to users. The data set (DBbook_train_ratings.tsv) can be downloaded from D2L.

Please submit an iPython Notebook (you don’t really need to submit the dataset. If you want to change the original dataset, you need to write code to do that). Please use “run all” to run your code before you submit so that your iPython notebook will show the outputs of your code. You will lose 1 point if you do not “run all”. You probably want to copy and modify the code from the ipython notebook ml_100k posted on D2L.

Please develop an ipython notebook titled 770_21_a2_yourlastname and finish the tasks below, where (C) indicates that you need to write code for the task, (O) indicates that you need to show output, and (A) that you need to type your answers using Markdown text. In your iPython notebook, at the beginning of each cell, you need to indicate which task the cell is about. For example, in the cell related to task 1, you should first type “# Task 1: Import data”. If you do not clearly label the cells, you will lose 1-2 points (out of 18 points). Whenever you see “print” in the questions, you need write print statement to print the intended outputs.

1. Please write code to print the number of unique users and the number of unique books in this data set. (C)(O)

2. Please write code to create the utility matrix. Each row of this matrix represents a user, and each column represents an item. Print the first 10 rows of matrix. Please write code to print the number and the percentage of cells in the utility matrix that are not populated. Please write code to fill these empty cells with 0s. (C)(O)

3. Please write code to print the top 5 similar users to userID 2 based on Euclidean distance. (C)(O)

4. Please write code to print the Euclidean distance between itemID 18 and itemID 1. Please write code to print the Enclidean distance between itemID36 and itemID 1. Write a print statement that tells me between itemID36 and itemID18, which is more similar to itemID 1 and why. For example, you can write a print statement like print(“itemID36 is more similar to itemID 1 because some reason…” ). (C)(O)

5. Please write code to print the top 5 similar items to itemID 8010. (C)(O)

6. Write code to remove books and users with less than 20 rating scores from the utility matrix by copying and maybe modifying the following codes. Write code to print the shape of the dataset. (C)(O)

df_item_fre = df_data1.groupby("itemID").count()

df_user_fre = df_data1.groupby("userID").count()

selected_items = df_item_fre[df_item_fre["userID"]>20].index

dense_matrix = dense_matrix[selected_items]

selected_users = df_user_fre[df_user_fre["itemID"]>20].index

dense_matrix = dense_matrix.loc[selected_users]

7. Please use the dataset you obtained from task 6 and write code to remove users that haven’t rated itemID8010, and then please write code to print the counts of the different rating scores of this item (hint: use the function value_counts()). Print the shape of the dataset. (C)(O)

8. Write code to partition the data set you obtained from 7 for validating the performance on predicting rating on itemID 8010. Randomly select 25% of the users as the testing set and the others as the training set. Please print the dimensions of the training set and the testing set. Please write code to print the mean rating of itemID 8010 in the training set and its mean rating in the testing set. (Hint: use dense_matrix[8010].mean() method to calculate the means) (C)(O)

9. Use the training and test dataset obtained in 8 and write code to 1) print the userID of the the user in the 5th row (not userID5) in the test dataset, and 2) predict this user’s rating of itemID 8010 based on the top 5 similar users in the training dataset, and print the user’s predicted rating and the actual rating of the book. (C)(O)

Instruction Files

myworkbook2.docx

20.3 KB

DBbooktrainratings1.tsv

959.1 KB

Price $15

Buy Ready Solution

(664 times downloaded)

OR

Get Same Assignment Done From Scratch

Get instant assignment help service

Related Questions in data mining category

I just want to know if you typed the whole page citation references to the sentences you used

Provide the summary statistics for all the variables from the dataset. Explain some of the key aspects of the dataset.

Students should publish a Tableau Dashboard on tableau public

As part of the formal assessment for the programme you are required to submit a Data Handling and Decision Making report.

List the first name and the last name of each customer who placed more than the average number of orders. Along with the names, show the total each of the listed customers paid on all orders he or she placed.

In this assignment, you will use a technique called “data augmentation” which is often used in deep learning when the training data is small.

Your organization, a consumer automobile research firm, wishes to analyze data from a study of fuel economy among the major automobile models to determine how the variables in the data set correlate with fuel economy.

CREATING A DATABASE/LOGGING SYSTEM FOR TRACKING THE RELEASE OF INFORMATION REQUESTS INTRODUCTION TO THE PROBLEM We are taking the healthcare facility for the research on the process to have a logging or database system for the ROI (Release of Information)

Seeking assistance with a data structures project in java

List the order number & date of all orders in the month of October.

Disclaimer

The ready solutions purchased from Library are already used solutions. Please do not submit them directly as it may lead to plagiarism. Once paid, the solution file download link will be sent to your provided email. Please either use them for learning purpose or re-write them in your own language. In case if you haven't get the email, do let us know via chat support.

Get Higher Grades Now

Tutors Online

Description

Drop Files Here Or Click to Upload

Get Free Quote!

417 Experts Online

Get Instant Help with your Questions &
boost your grades

you can count us with it
Highly Satisfied Students 4.9/5
Based On 19835+ Reviews

Get Help Now

We Provide Services Across The Globe

Disclaimer: The reference papers or solutions provided by Calltutors.com serve as model papers or solutions for students or professionals and are not to be submitted as it is to any institutions. These documents are intended to be used for research and reference purposes only. University and company's logo's are the property of respected owners. We don't have affiliation with the mentioned universities. By using our services means, you agree to our Honor Code , Privacy Policy , Terms & Conditions , Payment , Refund & Cancellation Policy.

Enroll in the complete course for only $250 USD*

In this assignment, you will need to implement a simple recommender system using a book rating data set DBbook_train_ratings

data mining

Description

Instruction Files

Price $15

OR

Get instant assignment help service

Related Questions in data mining category

Disclaimer

Policy

Exploring

Other

Connect With Us

Get Instant Help with your Questions &
boost your grades

you can count us with it
Highly Satisfied Students 4.9/5
Based On 19835+ Reviews

We Provide Services Across The Globe

Enroll in the complete course for only $250 USD*

In this assignment, you will need to implement a simple recommender system using a book rating data set DBbook_train_ratings

data mining

Description

Instruction Files

Price $15

OR

Get instant assignment help service

Related Questions in data mining category

Disclaimer

Policy

Exploring

Other

Connect With Us

Get Instant Help with your Questions & boost your grades

you can count us with it Highly Satisfied Students 4.9/5 Based On 19835+ Reviews

We Provide Services Across The Globe

Get Instant Help with your Questions &
boost your grades

you can count us with it
Highly Satisfied Students 4.9/5
Based On 19835+ Reviews