To address some of the limitations of content-based filtering, collaborative filtering uses similarities between users and items simultaneously to provide recommendations. This allows for serendipitous recommendations; that is, collaborative filtering models can recommend an item to user A based on the interests of a similar user B. Furthermore, the embeddings can be learned automatically, without relying on hand-engineering of features.
A Movie Recommendation Example
Consider a movie recommendation system whose training data consists of a feedback matrix in which:
- Each row represents a user.
- Each column represents an item (a movie).
The feedback about movies falls into one of two categories:
- Explicit: users specify how much they liked a particular movie by providing a numerical rating.
- Implicit: if a user watches a movie, the system infers that the user is interested.
To simplify, we will assume that the feedback matrix is binary; that is, a value of 1 indicates interest in the movie, and a value of 0 indicates no observed interest.
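As a minimal sketch of such a binary feedback matrix, the following Python snippet builds one from a list of (user, movie) watch events. The watch log, the user names, and the helper `build_feedback_matrix` are hypothetical and exist only for illustration.

```python
# Hypothetical watch log: each entry is a (user, movie) pair.
# The user names and the events are made up for illustration.
watch_log = [
    ("user_a", "Shrek"),
    ("user_a", "Harry Potter and the Sorcerer's Stone"),
    ("user_b", "Memento"),
    ("user_b", "The Dark Knight Rises"),
]


def build_feedback_matrix(log):
    """Return (users, movies, matrix) where matrix[i][j] is 1 if
    users[i] watched movies[j], and 0 otherwise."""
    users = sorted({user for user, _ in log})
    movies = sorted({movie for _, movie in log})
    user_index = {user: i for i, user in enumerate(users)}
    movie_index = {movie: j for j, movie in enumerate(movies)}

    # One row per user, one column per movie, initialized to 0.
    matrix = [[0] * len(movies) for _ in users]
    for user, movie in log:
        matrix[user_index[user]][movie_index[movie]] = 1
    return users, movies, matrix


users, movies, feedback = build_feedback_matrix(watch_log)
for user, row in zip(users, feedback):
    print(user, row)
```

Rows and columns are sorted alphabetically here only to make the layout deterministic; any fixed ordering would work.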
When a user visits the homepage, the system should recommend movies based on both:
- similarity to movies the user has liked in the past
- movies that similar users liked
For the sake of illustration, let's hand-engineer some features for the movies described in the following table:
|Movie|Rating|Description|
|---|---|---|
|The Dark Knight Rises|PG-13|Batman endeavors to save Gotham City from nuclear annihilation in this sequel to The Dark Knight, set in the DC Comics universe.|
|Harry Potter and the Sorcerer's Stone|PG|An orphaned boy discovers he is a wizard and enrolls in Hogwarts School of Witchcraft and Wizardry, where he wages his first battle against the evil Lord Voldemort.|
|Shrek|PG|A lovable ogre and his donkey sidekick set off on a mission to rescue Princess Fiona, who is imprisoned in her castle by a dragon.|
|The Triplets of Belleville|PG-13|When professional cyclist Champion is kidnapped during the Tour de France, his grandmother and overweight dog journey overseas to rescue him, with the help of a trio of elderly jazz singers.|
|Memento|R|An amnesiac desperately seeks to solve his wife's murder by tattooing clues onto his body.|
Suppose we assign to each movie a scalar in \([-1, 1]\) that describes whether the movie is for children (negative values) or adults (positive values). Suppose we also assign a scalar to each user in \([-1, 1]\) that describes the user's interest in children's movies (closer to -1) or adult movies (closer to +1). The product of the movie embedding and the user embedding should be higher (closer to 1) for movies that we expect the user to like.
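The one-dimensional case can be sketched in a few lines of Python. The scalar values assigned to the movies and users below are assumptions chosen for illustration, not values given in the text, and the helpers `score` and `rank_movies` are hypothetical names.

```python
# Assumed one-dimensional movie embeddings in [-1, 1]:
# negative = for children, positive = for adults.
# These numbers are illustrative guesses, not taken from the source.
movie_scalar = {
    "Shrek": -1.0,
    "Harry Potter and the Sorcerer's Stone": -0.8,
    "The Dark Knight Rises": 0.4,
    "Memento": 1.0,
}


def score(user_scalar, movie):
    """Product of the user scalar and the movie scalar."""
    return user_scalar * movie_scalar[movie]


def rank_movies(user_scalar):
    """Movies sorted from highest to lowest product for this user."""
    return sorted(movie_scalar, key=lambda m: score(user_scalar, m), reverse=True)


child_oriented_user = -0.9  # assumed: prefers children's movies
adult_oriented_user = 0.9   # assumed: prefers adult movies

print(rank_movies(child_oriented_user))
print(rank_movies(adult_oriented_user))
```

Because the product of two numbers with the same sign is positive, a user near -1 ranks children's movies first, and a user near +1 ranks adult movies first.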
In the diagram below, each checkmark identifies a movie that a particular user watched. The third and fourth users have preferences that are well explained by this feature—the third user prefers movies for children and the fourth user prefers movies for adults. However, the first and second users' preferences are not well explained by this single feature.
One feature was not enough to explain the preferences of all users. To overcome this problem, let's add a second feature: the degree to which each movie is a blockbuster or an arthouse movie. With a second feature, we can now represent each movie with the following two-dimensional embedding:
We again place our users in the same embedding space to best explain the feedback matrix: for each (user, item) pair, we would like the dot product of the user embedding and the item embedding to be close to 1 when the user watched the movie, and to 0 otherwise.
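A small sketch of the dot-product scoring described above follows. The embedding values, the user names, and the sign convention for the second dimension (positive = blockbuster, negative = arthouse) are assumptions made for illustration; `dot` and `predicted_scores` are hypothetical helper names.

```python
# Assumed two-dimensional embeddings:
# (children-vs-adult, arthouse-vs-blockbuster).
# All values are hand-picked for illustration.
movie_embeddings = {
    "Shrek": [-1.0, 1.0],
    "The Dark Knight Rises": [1.0, 1.0],
    "Memento": [1.0, -1.0],
}
user_embeddings = {
    "user_a": [-0.5, 0.5],  # assumed: likes children's blockbusters
    "user_b": [0.5, -0.5],  # assumed: likes adult arthouse movies
}


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def predicted_scores(users, movies):
    """Dot product for every (user, movie) pair, as a nested dict."""
    return {
        user: {movie: dot(u, v) for movie, v in movies.items()}
        for user, u in users.items()
    }


scores = predicted_scores(user_embeddings, movie_embeddings)
for user, row in scores.items():
    print(user, row)
```

With these assumed values, `user_a` scores 1.0 on Shrek and 0.0 on The Dark Knight Rises, but -1.0 on Memento rather than 0. Hand-picked embeddings only approximate the target feedback matrix, which is one motivation for learning them instead.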
In this example, we hand-engineered the embeddings. In practice, the embeddings can be learned automatically, which is the power of collaborative filtering models. In the next two sections, we will discuss different models to learn these embeddings, and how to train them.
The collaborative nature of this approach is apparent when the model learns the embeddings. Suppose the embedding vectors for the movies are fixed. Then, the model can learn an embedding vector for the users to best explain their preferences. Consequently, embeddings of users with similar preferences will be close together. Similarly, if the embeddings for the users are fixed, then we can learn movie embeddings to best explain the feedback matrix. As a result, embeddings of movies liked by similar users will be close in the embedding space.
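One way to make the "fix the movie embeddings, learn the user embeddings" step concrete is ordinary least squares: choose the user vector whose dot products with the fixed movie vectors are closest to that user's row of the feedback matrix. This is an assumption made for illustration and is not necessarily the training method the following sections describe. The movie embeddings, the feedback rows, and the helper `fit_user_embedding` are all hypothetical.

```python
import math

# Assumed fixed two-dimensional embeddings for four hypothetical movies.
fixed_movie_embeddings = [
    [-1.0, 1.0],
    [1.0, 1.0],
    [1.0, -1.0],
    [-1.0, -1.0],
]


def fit_user_embedding(movie_embeddings, feedback_row):
    """Least-squares fit of a 2-D user embedding u that minimizes
    sum_j (dot(u, v_j) - a_j)^2 over movies j, by solving the 2x2
    normal equations (V^T V) u = V^T a with Cramer's rule."""
    a11 = a12 = a22 = b1 = b2 = 0.0
    for (x, y), a in zip(movie_embeddings, feedback_row):
        a11 += x * x
        a12 += x * y
        a22 += y * y
        b1 += x * a
        b2 += y * a
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("movie embeddings do not determine a unique solution")
    return [(b1 * a22 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det]


# Hypothetical binary feedback rows (1 = watched) over the four movies.
user_a = fit_user_embedding(fixed_movie_embeddings, [1, 0, 0, 0])
user_b = fit_user_embedding(fixed_movie_embeddings, [1, 0, 0, 1])  # overlaps with user_a
user_c = fit_user_embedding(fixed_movie_embeddings, [0, 1, 1, 0])  # no overlap with user_a

print(math.dist(user_a, user_b), math.dist(user_a, user_c))
```

In this toy setup, the two users who share a watched movie end up closer together than the two users with no overlap, which mirrors the claim that users with similar preferences get nearby embeddings. The same procedure with the roles swapped would fit movie embeddings against fixed user embeddings.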