DEFINITION Collaborative filtering methods are based on collecting and analyzing a large amount of information on users' behavior, activities, or preferences to predict what a user will like based on their similarity to other users.
GET USER LATENT FEATURE → COMPARE YOUR PREFERENCES WITH OTHER USERS' PREFERENCES: recommendations are based on the preferences of users whose tastes are similar to yours.
ADVANTAGE It relies only on user behavior, so there is no need to analyze the items themselves.
ASSUMPTION: A user's preferences stay the same from the past into the future, and personal tastes are correlated across users.
THREE PROBLEMS:
• Cold start: a large amount of existing data on a user is needed in order to make accurate recommendations.
• Scalability: a large amount of computation power is often required to calculate recommendations, as there are usually millions of users and items.
• Sparsity: the number of items is so large that even the most active users will have rated only a small subset of the overall database.
• In addition, it does not work on items that have no ratings.
SPLIT TRAINING AND TESTING SET When splitting the data into training and test sets, randomly select (user, movie) pairs; do not select random users or random movies. The whole idea of "collaborative filtering" (for Netflix) is to predict ratings for movies you haven't watched based on the ratings you provided for the ones you have. If a user is present only in the testing set, the model cannot possibly base its predictions on that user's other ratings.
DEFINITION Content-based filtering methods recommend items that are similar to items that a user liked in the past or is looking at in the present.
1. Represent each item by its features (for example, with tf-idf).
2. Build a user profile (most systems focus on two types of information: a model of the user's preferences and a history of the user's interaction with the recommendation system).
ISSUE Whether the system is able to learn user preferences from the user's actions on one content source and use them across other content types. The value of a recommendation system is significantly lower if it can only recommend content of the same type the user is already using.
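Returning to the collaborative filtering notes: the pair-level split and the "compare your preferences with other users' preferences" step can be sketched in a few lines. This is a minimal illustration, not a production method. The toy ratings, the function names (`split_pairs`, `by_user`, `cosine`, `predict`), the 25% test fraction, and the choice of cosine similarity with a similarity-weighted average are all assumptions made for the example; the notes do not prescribe a specific similarity measure.

```python
import math
import random

# Toy (user, movie, rating) triples; every value here is made up for illustration.
RATINGS = [
    ("ann", "m1", 5), ("ann", "m2", 4), ("ann", "m3", 1),
    ("bob", "m1", 5), ("bob", "m2", 5), ("bob", "m4", 2),
    ("cat", "m1", 1), ("cat", "m3", 5), ("cat", "m4", 4),
    ("dan", "m2", 4), ("dan", "m3", 2), ("dan", "m4", 1),
]


def split_pairs(ratings, test_fraction=0.25, seed=0):
    """Hold out random (user, movie) pairs, not whole users or whole movies."""
    shuffled = ratings[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)


def by_user(ratings):
    """Group triples into {user: {movie: rating}}."""
    table = {}
    for user, movie, rating in ratings:
        table.setdefault(user, {})[movie] = rating
    return table


def cosine(a, b):
    """Cosine similarity restricted to the movies both users have rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[m] * b[m] for m in common)
    norm_a = math.sqrt(sum(a[m] ** 2 for m in common))
    norm_b = math.sqrt(sum(b[m] ** 2 for m in common))
    return dot / (norm_a * norm_b)


def predict(table, user, movie):
    """Similarity-weighted average of other users' ratings for the movie.

    Returns None when no prediction is possible: the user has no training
    ratings, or nobody comparable has rated the movie.
    """
    if user not in table:
        return None
    num = den = 0.0
    for other, their in table.items():
        if other == user or movie not in their:
            continue
        sim = cosine(table[user], their)
        num += sim * their[movie]
        den += sim
    return num / den if den > 0 else None


train, test = split_pairs(RATINGS)
table = by_user(train)
for user, movie, actual in test:
    print(user, movie, "actual:", actual, "predicted:", predict(table, user, movie))
```

Because the split is over pairs, a held-out user usually still has other ratings in the training set, which is what lets `predict` compare them with other users. The two `None` cases mirror the limitations listed above: a user with no training data (cold start) and an item that no comparable user has rated.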