Project
movierec
Why MovieRec?
I find it fascinating how platforms like Netflix and YouTube are able to know what I like. They never fail to suggest my next favourite show or video, which then keeps me on the screen for hours. Compared to Titanic, this project interests me more because it is more relatable. I got to experience coding a simplified version of a movie recommender system and picked up many new skills along the way!
Project overview
In this project, I'll be merging data and calculating averages to find movies similar to those in a viewer's watchlist and preferences. To do this, I'll be using pandas and two files from Kaggle, which contain viewers' movie ratings and a list of movies.
First, we merge the data!
By merging the datasets from the two files, we'll be able to see the views, ratings, and movie titles together in one table. This will also make it easier to retrieve information later on. We'll merge them on the movie item number and display only the first 10 rows.
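Here's a rough sketch of what the merge could look like. The two tiny tables are made up to stand in for the Kaggle files, and the column names (`user_id`, `item_id`, `rating`, `title`) are assumed for illustration.

```python
import pandas as pd

# Toy stand-ins for the two Kaggle files (column names are assumptions).
ratings = pd.DataFrame({
    "user_id": [1, 1, 2, 2, 3],
    "item_id": [50, 172, 50, 133, 50],
    "rating": [5, 4, 4, 3, 5],
})
titles = pd.DataFrame({
    "item_id": [50, 133, 172],
    "title": [
        "Star Wars (1977)",
        "Gone with the Wind (1939)",
        "Empire Strikes Back, The (1980)",
    ],
})

# Merge on the shared movie item number so every rating carries its title.
df = pd.merge(ratings, titles, on="item_id")

# Show only the first 10 rows (the toy data has just 5).
print(df.head(10))
```

With the real files, the two toy tables would be replaced by `pd.read_csv` calls, while the merge line stays the same.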
Next, calculating the averages!
A single movie can have a large number of ratings and views, so we need to average the ratings to sift out our top movies. Here, we'll form two tables, one for ratings and one for views, each showing the top 5.
Let's do ratings first:
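A minimal sketch of this step, using a small made-up table in place of the merged data (the `title` and `rating` column names are assumed):

```python
import pandas as pd

# Toy merged data: one row per rating (values are invented).
df = pd.DataFrame({
    "title": ["Star Wars (1977)"] * 3 + ["Fargo (1996)"] * 2 + ["Liar Liar (1997)"],
    "rating": [5, 4, 5, 4, 5, 2],
})

# Average rating per movie, highest first, keeping the top 5.
top_rated = (
    df.groupby("title")["rating"]
    .mean()
    .sort_values(ascending=False)
    .head(5)
)
print(top_rated)
```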
Next, the views:
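A similar sketch works for the views, assuming each rating counts as one view, so the number of views is simply the number of ratings per movie. The table is again made up.

```python
import pandas as pd

# Toy merged data: one row per rating (values are invented).
df = pd.DataFrame({
    "title": ["Star Wars (1977)"] * 3 + ["Fargo (1996)"] * 2 + ["Liar Liar (1997)"],
    "rating": [5, 4, 5, 4, 5, 2],
})

# Count the ratings per movie (assumed here to stand for views), top 5.
top_viewed = (
    df.groupby("title")["rating"]
    .count()
    .sort_values(ascending=False)
    .head(5)
)
print(top_viewed)
```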
Now, let's combine the two!
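One way to combine them, sketched on made-up data: put the average rating in a table and add the rating count as a second column. The column name `num_ratings` is a hypothetical choice for this example.

```python
import pandas as pd

# Toy merged data: one row per rating (values are invented).
df = pd.DataFrame({
    "title": ["Star Wars (1977)"] * 3 + ["Fargo (1996)"] * 2 + ["Liar Liar (1997)"],
    "rating": [5, 4, 5, 4, 5, 2],
})

by_title = df.groupby("title")["rating"]

# Average rating per movie as a one-column table...
summary = by_title.mean().to_frame("rating")
# ...plus the number of ratings; pandas lines the rows up by title.
summary["num_ratings"] = by_title.count()
print(summary)
```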
However, this looks a little messy, doesn't it? The numbers are all over the place, so let's sort the table by the number of ratings:
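A small sketch of the sorting, with an invented combined table and an assumed `num_ratings` column:

```python
import pandas as pd

# Toy combined table: average rating and rating count per movie (invented).
summary = pd.DataFrame(
    {"rating": [4.5, 2.0, 4.67], "num_ratings": [2, 1, 3]},
    index=pd.Index(
        ["Fargo (1996)", "Liar Liar (1997)", "Star Wars (1977)"], name="title"
    ),
)

# Most-rated movies first.
summary_sorted = summary.sort_values("num_ratings", ascending=False)
print(summary_sorted)
```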
How do we then find similar movies?
First, we need to compare viewers' ratings of one movie with their ratings of other movies and find the correlation between them. Let's take the Star Wars (1977) film as an example. How do the ratings for this film differ between viewers? Continuing from the previous code...
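A sketch of one way to do this, assuming a pivot table is used to lay out one row per viewer and one column per movie. The ratings below are made up, and the column names are assumed.

```python
import pandas as pd

# Toy merged data: who rated which movie (values are invented).
df = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2, 3, 3, 3, 4],
    "title": [
        "Star Wars (1977)", "Empire Strikes Back, The (1980)", "Liar Liar (1997)",
        "Star Wars (1977)", "Empire Strikes Back, The (1980)",
        "Star Wars (1977)", "Empire Strikes Back, The (1980)", "Liar Liar (1997)",
        "Liar Liar (1997)",
    ],
    "rating": [5, 5, 1, 4, 4, 2, 1, 5, 4],
})

# One row per viewer, one column per movie; unrated movies become NaN.
movie_matrix = df.pivot_table(index="user_id", columns="title", values="rating")

# Each viewer's rating of Star Wars (NaN if they never rated it).
starwars_ratings = movie_matrix["Star Wars (1977)"]
print(starwars_ratings)
```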
Next, we can use this to find how viewers' ratings of the Star Wars (1977) film correspond to their ratings of other movies. By continuing from the code above...
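This could be sketched with pandas' `corrwith`, which correlates every movie column with the Star Wars column using only the viewers who rated both. The viewer-by-movie table below is made up.

```python
import pandas as pd

nan = float("nan")

# Toy viewer-by-movie table (invented ratings; NaN means "not rated").
movie_matrix = pd.DataFrame(
    {
        "Star Wars (1977)": [5, 4, 2, nan],
        "Empire Strikes Back, The (1980)": [5, 4, 1, nan],
        "Liar Liar (1997)": [1, nan, 5, 4],
    },
    index=pd.Index([1, 2, 3, 4], name="user_id"),
)
starwars_ratings = movie_matrix["Star Wars (1977)"]

# Correlate each movie's ratings with the Star Wars ratings.
similar_to_starwars = movie_matrix.corrwith(starwars_ratings)

# Tidy into a table and drop movies with no overlap in viewers.
corr_starwars = similar_to_starwars.to_frame("correlation").dropna()
print(corr_starwars)
```

A correlation near 1 means viewers tended to rate the movie the same way they rated Star Wars, while a value near -1 means the opposite.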
Last but not least, let's sort the data to recommend the next movie! Adding on to the previous code:
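A rough sketch of the sorting step. All the numbers are invented, "Obscure Film (hypothetical)" is a made-up title, and the minimum-ratings cut-off (`MIN_RATINGS`) is an extra assumption added here so that movies rated by only a handful of viewers don't top the list by chance.

```python
import pandas as pd

titles = [
    "Star Wars (1977)",
    "Empire Strikes Back, The (1980)",
    "Liar Liar (1997)",
    "Obscure Film (hypothetical)",
]

# Invented correlations with Star Wars and invented rating counts.
corr_starwars = pd.DataFrame({"correlation": [1.0, 0.75, 0.15, 1.0]}, index=titles)
num_ratings = pd.Series([500, 350, 400, 2], index=titles, name="num_ratings")

MIN_RATINGS = 100  # hypothetical cut-off, not taken from the project

# Attach the rating counts, then keep only well-rated movies.
joined = corr_starwars.join(num_ratings)
popular = joined[joined["num_ratings"] > MIN_RATINGS]

# Drop Star Wars itself and sort by correlation, best match first.
recommendations = (
    popular.drop(index="Star Wars (1977)")
    .sort_values("correlation", ascending=False)
)
print(recommendations)
```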
Finally...
Conclusion
Overall, this project has been quite a learning curve for me. I picked up new skills such as merging datasets, finding correlations, and using many other functions. Still, the experience was very enjoyable, and I hope to further my skills and produce more analyses in the future.