Image by Krists Luhaers

Project
movierec

Why MovieRec?

I find it fascinating how platforms like Netflix and YouTube seem to know what I like. They rarely fail to suggest my next favourite show or video, which keeps me on the screen for hours. Compared to Titanic, this project interests me more because it is more relatable. I got to code a simplified version of a movie recommender system and picked up many new skills along the way!

Project overview

In this project, I'll merge two datasets, compute averages, and use correlation to find movies similar to those in a viewer's watchlist and preferences. To do this, I'll use pandas and two files from Kaggle, which contain viewers' movie ratings and a list of movies.

First, we merge the data!

By merging the datasets from the two files, we'll be able to see the views, ratings, and movie titles in one table. This will also make it easier to retrieve information later on. We'll merge them on the movie item number and display only the first 10 rows.

merge .jpg
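
As a rough sketch of what this merge might look like in pandas (not necessarily the exact code in the image above): the two small tables below are made-up stand-ins for the Kaggle files, and the column names `user_id`, `item_id`, `rating` and `title` are my assumptions.

```python
import pandas as pd

# Hypothetical toy data standing in for the two Kaggle files.
# Column names (user_id, item_id, rating, title) are assumptions.
ratings = pd.DataFrame({
    "user_id": [1, 1, 2, 2, 3],
    "item_id": [1, 2, 1, 3, 2],
    "rating":  [5, 3, 4, 2, 1],
})
movies = pd.DataFrame({
    "item_id": [1, 2, 3],
    "title": ["Star Wars (1977)", "Toy Story (1995)", "Fargo (1996)"],
})

# Attach a title to every rating by matching on the shared item_id column.
df = pd.merge(ratings, movies, on="item_id")

# Display only the first 10 rows (the toy data has fewer than that).
print(df.head(10))
```

With real data, the two tables would presumably be loaded with `pd.read_csv` instead of being typed in by hand.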

Next, calculating averages!

A single movie can have a large number of ratings and views, so we need to summarise them per movie to pick out our top movies. Here, we'll form two tables, one for ratings and one for views, each showing the top 5.

Let's do ratings first:

average ratings.jpg
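
One way this step could be written, assuming the merged table has `title` and `rating` columns (the data here is invented for illustration):

```python
import pandas as pd

# Hypothetical merged data: one row per rating, with the movie title attached.
df = pd.DataFrame({
    "title": ["Star Wars (1977)"] * 3 + ["Toy Story (1995)"] * 2 + ["Fargo (1996)"],
    "rating": [5, 4, 5, 3, 4, 2],
})

# Group the ratings by movie, take the mean, and put the best-rated first.
avg_rating = df.groupby("title")["rating"].mean().sort_values(ascending=False)

# Top 5 movies by average rating.
print(avg_rating.head(5))
```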

Next, the views:

average views.jpg
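
A similar sketch for views. I'm assuming here that a movie's "views" can be approximated by how many ratings it received, so this counts rows per title rather than averaging them (toy data again):

```python
import pandas as pd

# Hypothetical merged data: one row per rating, with the movie title attached.
df = pd.DataFrame({
    "title": ["Star Wars (1977)"] * 3 + ["Toy Story (1995)"] * 2 + ["Fargo (1996)"],
    "rating": [5, 4, 5, 3, 4, 2],
})

# Assumption: number of ratings per movie stands in for number of views.
num_ratings = df.groupby("title")["rating"].count().sort_values(ascending=False)

# Top 5 most-rated movies.
print(num_ratings.head(5))
```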

Now, let's combine the two!

combine.jpg
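
To illustrate the combined table, here is a sketch that computes both numbers in one pass with `groupby(...).agg(...)`. The output column names `rating` and `num_ratings` are my own choice, and the data is made up:

```python
import pandas as pd

# Hypothetical merged data: one row per rating, with the movie title attached.
df = pd.DataFrame({
    "title": ["Star Wars (1977)"] * 3 + ["Toy Story (1995)"] * 2 + ["Fargo (1996)"],
    "rating": [5, 4, 5, 3, 4, 2],
})

# One row per movie: its average rating and how many ratings it received.
summary = df.groupby("title").agg(
    rating=("rating", "mean"),
    num_ratings=("rating", "count"),
)

print(summary)
```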

However, this looks a little messy, doesn't it? The numbers are all over the place, so let's sort the table by the number of ratings:

sort.jpg
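
Sorting like this is a single `sort_values` call. A small sketch, using an invented summary table with assumed column names `rating` and `num_ratings`:

```python
import pandas as pd

# Hypothetical per-movie summary (values invented for illustration).
summary = pd.DataFrame(
    {"rating": [2.0, 4.5, 3.5], "num_ratings": [1, 3, 2]},
    index=pd.Index(
        ["Fargo (1996)", "Star Wars (1977)", "Toy Story (1995)"], name="title"
    ),
)

# Most-rated movies first.
summary_sorted = summary.sort_values("num_ratings", ascending=False)

print(summary_sorted)
```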

How do we then find similar movies?

First, we need to compare viewers' ratings of one movie with their ratings of other movies and find the correlation between them. Let's take Star Wars (1977) as an example. How do the ratings for this film differ between viewers? Continuing from the previous code...

star wars.jpg
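
A common way to set this up, and the one I'm assuming in this sketch, is to pivot the data into a table with one row per viewer and one column per movie, then pick out the Star Wars column. The data and column names are hypothetical:

```python
import pandas as pd

# Hypothetical merged data: (user_id, title, rating).
df = pd.DataFrame(
    [
        (1, "Star Wars (1977)", 5), (1, "Toy Story (1995)", 2),
        (2, "Star Wars (1977)", 4), (2, "Toy Story (1995)", 3),
        (3, "Star Wars (1977)", 2),
    ],
    columns=["user_id", "title", "rating"],
)

# Rows are viewers, columns are movies; a missing rating shows up as NaN.
movie_matrix = df.pivot_table(index="user_id", columns="title", values="rating")

# Each viewer's rating of Star Wars (1977).
starwars_ratings = movie_matrix["Star Wars (1977)"]

print(starwars_ratings)
```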

Next, we can use this to find how viewers' ratings of Star Wars (1977) correspond to their ratings of other movies. Continuing from the code above...

correlation to other movies.jpg
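
Here is a sketch of the correlation step using pandas' `DataFrame.corrwith`, which correlates every column of a table with a given Series. The viewer-by-movie matrix below is invented, and I'm assuming the matrix layout of one row per viewer and one column per movie:

```python
import pandas as pd

# Hypothetical viewer-by-movie rating matrix (rows: viewers, columns: movies).
movie_matrix = pd.DataFrame(
    {
        "Star Wars (1977)": [5, 4, 2, 5],
        "Empire Strikes Back, The (1980)": [5, 4, 1, 4],
        "Toy Story (1995)": [2, 3, 5, 1],
    },
    index=pd.Index([1, 2, 3, 4], name="user_id"),
)

starwars_ratings = movie_matrix["Star Wars (1977)"]

# Pearson correlation of every movie's ratings with the Star Wars ratings.
corr_starwars = movie_matrix.corrwith(starwars_ratings).to_frame("correlation")

# Movies with too little overlap would get NaN; drop them (none in this toy data).
corr_starwars = corr_starwars.dropna()

print(corr_starwars)
```

A value close to 1 means viewers who liked Star Wars tended to rate that movie highly too, while a negative value means the opposite.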

Last but not least, we sort the data to recommend the next movie! Adding on to the previous code:

similar movies.jpg
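
To show how the pieces might fit together, here is a compact end-to-end sketch on made-up data. The `MIN_RATINGS` cut-off is a hypothetical parameter of mine: a movie rated by only a couple of viewers can show a perfect correlation purely by chance, so I'm assuming such movies should be filtered out before sorting.

```python
import pandas as pd

# Hypothetical merged data: (user_id, title, rating).
rows = [
    (1, "Star Wars (1977)", 5), (1, "Empire Strikes Back, The (1980)", 5),
    (1, "Toy Story (1995)", 2), (1, "Fargo (1996)", 4),
    (2, "Star Wars (1977)", 4), (2, "Empire Strikes Back, The (1980)", 4),
    (2, "Toy Story (1995)", 3),
    (3, "Star Wars (1977)", 2), (3, "Empire Strikes Back, The (1980)", 1),
    (3, "Toy Story (1995)", 5), (3, "Fargo (1996)", 1),
    (4, "Star Wars (1977)", 5), (4, "Empire Strikes Back, The (1980)", 4),
    (4, "Toy Story (1995)", 1),
]
df = pd.DataFrame(rows, columns=["user_id", "title", "rating"])

target = "Star Wars (1977)"
MIN_RATINGS = 3  # hypothetical threshold; tune it for the real dataset

# Viewer-by-movie matrix, then correlation of each movie with the target.
movie_matrix = df.pivot_table(index="user_id", columns="title", values="rating")
correlation = movie_matrix.corrwith(movie_matrix[target])

# Put the correlation next to the number of ratings each movie received.
result = pd.DataFrame({
    "correlation": correlation,
    "num_ratings": df.groupby("title")["rating"].count(),
})

# Keep well-rated movies, drop the target itself, most similar first.
recommendations = (
    result[result["num_ratings"] >= MIN_RATINGS]
    .drop(index=target)
    .sort_values("correlation", ascending=False)
)

print(recommendations)
```

In this toy data, Fargo (1996) has only two ratings, so it is filtered out even though its correlation with Star Wars is high.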

Finally...

Conclusion

Overall, this project has been quite a learning curve for me. I picked up new skills such as merging datasets, computing correlations, and using many other functions. Even so, the experience was very enjoyable, and I hope to keep building my skills and produce more analyses in the future.
