Building a Movie Recommendation Engine | Machine Learning Projects

  • Added 1 Feb 2019
  • This Building a Movie Recommendation Engine session is part of the Machine Learning Career Track at Code Heroku. If you would like to enroll in the program, you can reach us on WhatsApp at +91-9967578720.
    Recommendation Web App Demo: www.codeheroku.com/static/movi...
    Part 2: Collaborative Filtering: • Movie Recommendation S...
    How to build a web app for your ML project:
    Part 1: / how-to-turn-your-machi...
    Part 2: / part-2-how-to-turn-you...
    All completed Python scripts and associated datasets are on the class GitHub repo: github.com/codeheroku/Introdu...
    Alternative Link:
    drive.google.com/file/d/1sJ9N...
    Watch all our Machine Learning videos in the playlist here:
    • Machine Learning
    Prerequisites for this workshop can be downloaded by following instructions here:
    www.codeheroku.com/post?name=P...
    At some point, each of us must have wondered where the recommendations that Netflix, Amazon, and Google give us come from. We often rate products on the internet, and the preferences we express and the data we share (explicitly or not) are used by recommender systems to generate those recommendations.
    In this hands-on workshop we will cover the basics of recommendation systems and build our own. We will build a content-based recommendation engine using Python and scikit-learn. We will cover concepts such as cosine distance and Euclidean distance, and when to use each of them. Finally, we will use the IMDB 5000 movie dataset to build a content-based recommendation engine using CountVectorizer and cosine similarity scores between movies.
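
To make the cosine versus Euclidean distinction above concrete, here is a minimal sketch using NumPy only. It is not the workshop's code: the three word-count vectors and both helper functions are assumptions made up for illustration. It suggests why cosine similarity is often preferred for word counts: it compares direction and ignores document length.

```python
import numpy as np

# Hypothetical word-count vectors over the vocabulary ["london", "paris"].
doc_a = np.array([2.0, 1.0])   # e.g. "London Paris London"
doc_b = np.array([4.0, 2.0])   # the same text repeated twice
doc_c = np.array([1.0, 2.0])   # e.g. "Paris Paris London"

def cosine_sim(u, v):
    """Cosine of the angle between u and v (1.0 means same direction)."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def euclidean_dist(u, v):
    """Straight-line distance between u and v."""
    return float(np.linalg.norm(u - v))

# doc_b has the same word proportions as doc_a, only more words.
print(cosine_sim(doc_a, doc_b), euclidean_dist(doc_a, doc_b))
# doc_c has different proportions but a similar length.
print(cosine_sim(doc_a, doc_c), euclidean_dist(doc_a, doc_c))
```

In this toy case Euclidean distance ranks doc_c as closer to doc_a than doc_b is, while cosine similarity treats doc_a and doc_b as pointing in the same direction.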
    Who Should Attend?
    You are curious about machine learning and data science
    You love building things and learning by working on projects
    You are looking for a job in data science / data analytics positions
    Follow us on:
    Instagram: / codeheroku
    Twitter: / codeheroku
    LinkedIn: / mihirthak. .
    Email: hello@codeheroku.com
    WhatsApp: +91-9967578720
  • Science & Technology

Comments • 288

  • @CodeHeroku
    @CodeHeroku  5 years ago +13

    All completed Python scripts and datasets are on our GitHub repo:
    github.com/codeheroku/Introduction-to-Machine-Learning/tree/master/Building%20a%20Movie%20Recommendation%20Engine
    Alternative Link: drive.google.com/file/d/1sJ9N2T2zDQwvywHCC6RCO68olL97Mp4O/view?usp=sharing
    Jupyter Notebook for this tutorial:
    github.com/codeheroku/Introduction-to-Machine-Learning/blob/master/Building%20a%20Movie%20Recommendation%20Engine/Movie_Recommendation_Engine.ipynb

    • @torrestam8527
      @torrestam8527 4 years ago

      I would like to ask: why did you pass a matrix to cosine_similarity? I thought its parameters had to be item 1 and item 2, like this:
      cosine_similarity( item 1, item 2 )

  • @hdsz7738
    @hdsz7738 10 months ago

    just love the way you teach, super simple to understand, no bs, no rocket science, very straightforward and to the point. keep it up and continue providing such quality of stuff

  • @itsybitsykrafter
    @itsybitsykrafter 4 years ago +11

    You teach really well. I bet I'll watch all your videos for your teaching style. Keep up the good work.

  • @shreeharjoshi6143
    @shreeharjoshi6143 2 years ago

    Definitely worth a like and a comment. Really enjoyed how you ensured none were left behind by repeatedly asking questions.

  • @its_sharan54
    @its_sharan54 3 years ago +1

    This is my first time on your channel. You are a really great instructor. Glad to be here, I understood everything.

  • @samkodag4454
    @samkodag4454 5 years ago +77

    I am a beginner and I understood everything.
    Great explanation, sir. Thank you so much. I hope you upload more projects like this.

  • @fahadabbas3640
    @fahadabbas3640 3 years ago +2

    One of the best tutorials I have found for beginners. You made this very easy to understand. Thank you.

  • @superparth1995
    @superparth1995 2 years ago +2

    Great explanation, I would really recommend watching this tutorial if you're completely new to ML. Thanks for this awesome video!

  • @chamaljayasinghe4210
    @chamaljayasinghe4210 2 months ago

    One of the most amazing tutors I have ever met. Good luck ...🎉🎉

  • @aamnaalam3460
    @aamnaalam3460 2 years ago

    The explanation was really really very good, even the slightest points and concepts are explained very clearly.

  • @sseaditya
    @sseaditya 5 years ago +5

    This was incredibly helpful. Thank You!

  • @cain8087
    @cain8087 4 years ago +9

    Simply Amazing. ...such a great explanation.

  • @Prajwal_KV
    @Prajwal_KV 3 years ago

    You are really good at explaining things. Thanks a lot for this video. Learned a lot from this.

  • @yousufabdullah2188
    @yousufabdullah2188 5 years ago +3

    You are really good at explaining things. Keep up the great work. Please keep making new machine learning videos. I would definitely recommend that my friends subscribe to your channel.

    • @CodeHeroku
      @CodeHeroku  5 years ago

      Thanks Yousuf. Love and support from you guys is the only thing that motivates us!

  • @unpatel1
    @unpatel1 2 years ago

    I enjoyed this project and will go over all your other projects. Thank you.

  • @anudeepreddy4124
    @anudeepreddy4124 4 years ago +5

    Thank you, it has changed my perspective on recommendation systems, and it's very helpful, useful content.

  • @sakethayellanki604
    @sakethayellanki604 4 years ago +2

    Thanks a lot for uploading this video...It gave me a head start for my project

  • @prathamsinghal5261
    @prathamsinghal5261 5 years ago +9

    Great job sir....now my confidence is also rising towards the ML and that's all because of you. Thank you Sir

  • @msscrooge1003
    @msscrooge1003 4 years ago +4

    This is so useful! Thanks so much for this tutorial

  • @tungochuy
    @tungochuy 3 years ago

    love this SO SO MUCH. gonna use it for my thesis

  • @santiagoyeomans
    @santiagoyeomans 3 years ago +1

    Awesome video! This is the kind of video I was looking for

  • @sweetytripathi4254
    @sweetytripathi4254 5 years ago

    I tried to learn ML from many videos earlier but didn't get it properly. Thanks for making such videos; they are really helpful, and I finally understand what is going on ... Keep making more videos...

    • @CodeHeroku
      @CodeHeroku  5 years ago +1

      Thanks for your encouragement. The support that we get from our audience is what keeps us motivated ❤

  • @praneethaluru2601
    @praneethaluru2601 3 years ago

    Superb explanation in a short period of time.

  • @theanonymoustalk
    @theanonymoustalk 4 years ago

    where should I start looking if I wanted to preprocess or clean the other features, OR use a more updated csv

  • @1982Dibya
    @1982Dibya 3 years ago

    Awesome explanation. It made my concepts totally clear.

  • @MotiVerseYT2
    @MotiVerseYT2 3 years ago

    Thank you soo much sir for the project.
    Helped me a lot 👍🏻❤️

  • @ilhanmahardika4857
    @ilhanmahardika4857 9 months ago

    Thank you very much sir, very clear explanations

  • @dialloalphaissaga3264
    @dialloalphaissaga3264 3 years ago +1

    Hello Mihir, thank you very much for your tutorial on movie recommendation. I have a question: if I wanted a DataFrame with all the movies and, for each one, its 5 most similar movies in columns, how could I do it?
    Thank you very much for your help.

  • @ImACoolGirl23
    @ImACoolGirl23 3 years ago

    @CodeHeroku Thank you for the video. Can you please tell me what you used to make the web app?

  • @ravikamble8142
    @ravikamble8142 3 years ago

    Sir, it's very helpful to us. Please bring more videos on projects like this.

  • @AnuragSingh-vv3qv
    @AnuragSingh-vv3qv 3 years ago

    Great explanation, I understood everything, but I just wanted to ask: what was the need to combine the features? Why did we do that?
    We could even generate the similarity scores for a row based on different features such as genres, cast, rating, etc. That is possible, right?

  • @torrestam8527
    @torrestam8527 4 years ago

    I would like to ask: why did you pass a matrix to cosine_similarity? I thought its parameters had to be item 1 and item 2, like this:
    cosine_similarity( item 1, item 2 )

  • @ramchandra6126
    @ramchandra6126 5 years ago +2

    Great video boss!

  • @094_cse_srekaravarshannk3

    Thank you very much
    I understood everything
    you are one of my best Teachers
    print("Thank you so much" * 10000000000000)

  • @vanithasenthil1553
    @vanithasenthil1553 3 years ago

    Wonderful explanation. Thank you

  • @kadhanayagipavithra
    @kadhanayagipavithra 4 years ago +1

    incredibly helpful. :)

  • @anilb1076
    @anilb1076 4 years ago

    You can use the iloc function to select all rows of the specified features instead of writing a function and two extra lines of code.

  • @makhoba808
    @makhoba808 4 years ago +1

    Thank you very much. This is awesome!

  • @magmacodes9143
    @magmacodes9143 3 years ago

    This is a great video. Thank you.

  • @ashwinimandani2829
    @ashwinimandani2829 3 years ago

    Great video and great explanation. But can you please tell me how you performed feature selection?

  • @ritvikdayal5970
    @ritvikdayal5970 4 years ago +1

    You can combine several columns like this:
    df['Combined_Features'] = df[['keywords', 'cast', 'genres', 'director']].agg('-'.join, axis=1)
    where '-' is the separator and can be replaced with a blank space.
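
The one-liner in the comment above can be sketched on a toy DataFrame as follows. This is an illustration under assumptions, not the tutorial's code: the column names and values are made up to resemble the dataset, and the `fillna("")` step is added because `str.join` raises a TypeError when a cell is missing.

```python
import pandas as pd

# Toy stand-in for the movie dataset; names and values are assumptions.
df = pd.DataFrame({
    "keywords": ["space war", None],
    "cast": ["Actor A", "Actor B"],
    "genres": ["Action", "Drama"],
    "director": ["Director X", None],
})

features = ["keywords", "cast", "genres", "director"]

# Replace missing values with empty strings, then join each row with a space.
df["combined_features"] = df[features].fillna("").agg(" ".join, axis=1)

print(df["combined_features"].tolist())
```

A blank space is used as the separator here so that a tokenizer such as CountVectorizer would see the original words unchanged.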

  • @RiffswithMohit
    @RiffswithMohit 4 years ago

    Why do we use row on line 23, and how can we make this line dynamic:
    return row['keywords'].... etc.? If you have already defined the features, why do we have to hard-code them here? Please answer, I am stuck at this point.

  • @nusk003
    @nusk003 3 years ago

    You are the best sir ❤️❤️ thank you so much

  • @AhamedKabeer-wn1jb
    @AhamedKabeer-wn1jb 4 years ago

    Beautiful explanation Sir...

  • @_ekcup_chai
    @_ekcup_chai 4 years ago

    @codeheraku I am getting an error while filling NA values: "must have equal len keys and value".

  • @prateekmehndiratta1596
    @prateekmehndiratta1596 5 years ago +4

    In practice, we will mostly get data points whose attributes are 1 or 0, so why use this?

  • @brewedscript1014
    @brewedscript1014 3 years ago

    It's so helpful, but I think it would be a good idea to provide subtitles. Some people watch videos at a faster speed, and if the accent is not clear, it's really hard to follow.

  • @sharaththatikonda5386
    @sharaththatikonda5386 5 years ago +1

    very good explanation!

  • @Prajwal_KV
    @Prajwal_KV 3 years ago

    Could you please tell me when to use Euclidean distance and when to use cosine similarity?

  • @TheMarioAdams
    @TheMarioAdams 3 years ago +3

    Hi there, I have implemented your model and it works perfectly. However, I've tried to use it in my program, where a text file contains a watch list logging what the user has watched, and I'm trying to run the cosine similarity function for each movie in the text file (so each line in the text file). In my code I have put all of your code from this video in a function I defined called 'cosinesimilarity()'. Here is my code:
        watchlist = open("watchlist.txt", "r")
        with open('watchlist.txt') as file:
            for line in file:
                cosinesimilarity(line)
    This looks fine to me; I want it to look at each movie in the text file and do cosine similarity for each, but I keep getting this error:
        ,line 16, in get_index_from_title
            return df[df.title == title]["index"].values[0]
        IndexError: index 0 is out of bounds for axis 0 with size 0
    This is the code it's referring to from the video:
        def get_index_from_title(title):
            return df[df.title == title]["index"].values[0]
    Could you help me out with what to do here? You didn't really explain how 'get_index_from_title()' works in the video.
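
A possible reading of the error above: `df[df.title == title]` keeps only the rows whose title equals the given string exactly, and `.values[0]` takes the first match, so an IndexError appears whenever nothing matches. Lines read from a file still end with a newline, which is enough to prevent a match. The sketch below is an assumption-laden variant, not the video's code: the toy DataFrame, the `strip()` call, the case-insensitive comparison and the `None` return value are all additions for illustration.

```python
import pandas as pd

# Toy stand-in for the movie dataset used in the video (assumed columns).
df = pd.DataFrame({"index": [0, 1], "title": ["Avatar", "Spectre"]})

def get_index_from_title(title):
    """Return the dataset index for a title, or None if no row matches."""
    cleaned = title.strip()  # drop the trailing "\n" left on file lines
    matches = df[df["title"].str.lower() == cleaned.lower()]
    if matches.empty:
        return None  # avoid the IndexError raised by .values[0]
    return int(matches["index"].values[0])

# Lines read from a text file keep their newline, e.g. "Avatar\n".
for line in ["Avatar\n", "spectre\n", "Not In Dataset\n"]:
    print(repr(line), "->", get_index_from_title(line))
```

Callers would then need to check for `None` before looking up similarity scores, for example by skipping titles that are not in the dataset.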

  • @sanskaarpatni9137
    @sanskaarpatni9137 4 years ago

    Great explanation Sir!

  • @ImACoolGirl23
    @ImACoolGirl23 3 years ago

    @CodeHeroku Is this dataset from Kaggle? Can you please share a link to the original repository of the dataset?

  • @AkashRaj-if6di
    @AkashRaj-if6di 3 years ago

    Sir, I have one important question. I ran your index.html, where we have to enter a movie name and the model returns all movies related to that keyword (whatever we enter). Am I right? If so, how can we say that this is an ML or data science problem? It looks like a hard-coded lookup (if the keyword the user enters in the movie search bar is available in our list, display the matches). Does it work like this?

  • @bushraakram3523
    @bushraakram3523 4 years ago

    I don't understand the count matrix function. How does it work with movies?

  • @sharaththatikonda5386
    @sharaththatikonda5386 5 years ago +4

    You did not provide the cleaned dataset on GitHub.

  • @faribazare7350
    @faribazare7350 1 year ago

    Many thanks for your great tutorial. I have a question; when I want to create the movie_index object I get this error. why? and how can I fix it? IndexError: index 0 is out of bounds for axis 0 with size 0

  • @reributan7240
    @reributan7240 3 years ago

    Great Video Sir !!

  • @anupkg2535
    @anupkg2535 3 years ago

    Good one. thanks for the video.

  • @ArunKumar-yb2jn
    @ArunKumar-yb2jn 2 years ago

    Just curious. Data Science projects are usually executed on Jupyter notebook. How come you are using Sublime? Secondly, are you still on Python 2.X? Why?

  • @HEYTHERE-ko6we
    @HEYTHERE-ko6we 3 years ago

    IndexError: index 0 is out of bounds for axis 0 with size 0. I'm getting this error while executing the above program. What to do?

  • @purusottam234
    @purusottam234 2 years ago

    Great Content

  • @reddy764
    @reddy764 4 years ago +1

    Can you provide a series on time series analysis and forecasting in Python?

  • @JohnSamuelM-nt3vx
    @JohnSamuelM-nt3vx 3 years ago

    I have an error in the sorting step. It says the syntax is wrong on the sorted_similar_movies line.

  • @cli_ninja
    @cli_ninja 5 years ago

    I have a question:
    The documentation of cosine_similarity says that the function takes two vectors.
    I used count_matrix.toarray()[0] and count_matrix.toarray()[1] as the arguments of the method.
    It gave me an error telling me to reshape them,
    so I did: count_matrix.toarray()[0].reshape(-1, 1) and count_matrix.toarray()[1].reshape(-1, 1)
    The output was not similar to yours. Why?

    • @CodeHeroku
      @CodeHeroku  5 years ago

      Hello,
      1. Here is the method signature for cosine_similarity from the scikit-learn documentation: cosine_similarity(X, Y=None, dense_output=True)
      i.e. the second parameter Y is optional, and when Y is None it computes the pairwise similarities between all samples in X.
      2. You are trying to compute the distance between the 1st and 2nd rows of your count_matrix, which is not what we are looking for.
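
The exchange above can be sketched on two made-up sentences. This is a minimal illustration, not the code from the video; the `texts` list and all variable names are assumptions. It shows the one-argument pairwise call, the two-argument call with rows shaped `(1, n_features)`, and why `reshape(-1, 1)` gives a different result: it turns each word count into its own one-feature sample.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

texts = ["London Paris London", "Paris Paris London"]  # illustrative only
count_matrix = CountVectorizer().fit_transform(texts)

# One argument: every row is compared with every other row -> (2, 2) matrix.
pairwise = cosine_similarity(count_matrix)

# Two single rows: each must have shape (1, n_features), so use reshape(1, -1).
row0 = count_matrix.toarray()[0].reshape(1, -1)
row1 = count_matrix.toarray()[1].reshape(1, -1)
single = cosine_similarity(row0, row1)  # shape (1, 1)

# reshape(-1, 1) produces n_features samples with one feature each, so the
# result no longer measures similarity between the two documents.
wrong = cosine_similarity(row0.reshape(-1, 1), row1.reshape(-1, 1))

print(pairwise.shape, single.shape, wrong.shape)
print(np.isclose(single[0, 0], pairwise[0, 1]))
```

The single-pair score equals the off-diagonal entry of the pairwise matrix, which is why passing the whole count matrix once is enough to get every movie-to-movie score.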

  • @suryatej2573
    @suryatej2573 3 years ago

    Bro, your explanation was awesome. If possible, can you give me suggestions for a laptop recommendation system? I want to use content-based and collaborative filtering together, i.e. a hybrid method.

  • @harperbye
    @harperbye 3 years ago

    Can we use pca with this to reduce features?

  • @AmanRaj-tj7lj
    @AmanRaj-tj7lj 2 years ago

    My print statement on line 8 throws an error. I am using Python v2 in Sublime Text. It asks me to add parentheses.

  • @KLCS-cv4ef
    @KLCS-cv4ef 4 years ago +1

    Thank you so much sir

  • @hamza4283
    @hamza4283 4 years ago

    Just by giving the name of a movie, how are we getting movies of the same genre? I can't work it out. Can anybody help me? It would be really kind of you.

  • @smkj9405
    @smkj9405 3 years ago

    thanks sir the video was really helpful

  • @pratikjain3656
    @pratikjain3656 4 years ago

    Hey, I want to print the overview of each movie in the output along with its name. The overview is available in the dataset, but I am not able to figure out how to implement it. Anyone, please help.

  • @nikegada5230
    @nikegada5230 3 years ago

    Sir, I have tried to run the project,
    but after I rate a movie the recommendation is not shown on the website.
    How do I show the output on the website?

  • @bhavikdudhrejiya852
    @bhavikdudhrejiya852 3 years ago

    great explanation

  • @yogeshdeveloper5346
    @yogeshdeveloper5346 4 years ago +1

    43:20, why are there no parentheses in the print statement? Are you using Python 2?

  • @gateyo499
    @gateyo499 5 years ago

    Hi Sir, thank you for the great explanation and video. I have tried to run the code in Sublime, however, I faced some issues, one of which is below:
    line 28
    print "Error:", row
    Should it be print("Error:", row)?
    I don't know and need your advice. Thank you.

    • @CodeHeroku
      @CodeHeroku  5 years ago +1

      Yes, you already know the answer: if you are on Python 3, you should add parentheses () to all print statements. So try print("Error");print(row)

    • @gateyo499
      @gateyo499 5 years ago +1

      @CodeHeroku Thank you for your prompt response! Actually, I just realized that Python versions 2 and 3 have this kind of difference. Now confirmed and it works! Thank you.
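
The difference discussed in this thread can be sketched in a few lines. The `row` value below is a hypothetical placeholder, not data from the video.

```python
row = {"title": "Avatar"}  # hypothetical row, for illustration only

# Python 2 only (a SyntaxError on Python 3):
#     print "Error:", row

# Python 3: print is a function, and its arguments are joined by a space.
print("Error:", row)

# Building the same text explicitly, which also works on Python 2.
message = "Error: {}".format(row)
print(message)
```

Both print calls produce the same line of output on Python 3.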

  • @BarbaraAboagye
    @BarbaraAboagye 4 years ago

    Very interesting. For Facebook, could an item also be a product, as in Facebook Marketplace and the ads you are shown?

  • @chetanacharyapv2890
    @chetanacharyapv2890 3 years ago

    What is ML in this? What does the machine learn and give output here?

  • @JayShah_._._
    @JayShah_._._ 3 years ago

    What if a user likes multiple movies? Can we use the content-based approach for that?

  • @amankumarmahore4072
    @amankumarmahore4072 3 years ago +2

    Can you make a tutorial on how to deploy this model using Django?

  • @anytimeanywhere4163
    @anytimeanywhere4163 3 years ago

    Because both places are on the Earth's surface, which we treat as a 2D plane, you are using the x and y axes. If you had to show places somewhere in the sky, you would use a 3D diagram, right?

  • @ishanimenghani6739
    @ishanimenghani6739 4 years ago

    What is the source of the dataset, and which year is it from?

  • @jesti988
    @jesti988 4 years ago

    Can we run the same code in a Jupyter notebook?

  • @vidyamc4340
    @vidyamc4340 3 years ago

    I don't understand what a latent factor means. Please explain.

  • @Rayn_roy
    @Rayn_roy 3 years ago

    Wonderful, brother, super

  • @hlgsagar3792
    @hlgsagar3792 4 years ago

    Good explanation

  • @rajmani4486
    @rajmani4486 4 years ago

    Sir, my VS Code works with sklearn but does not run pandas. How do I set up VS Code for this project? Please help me, sir.

  • @eshaanchauhan5854
    @eshaanchauhan5854 4 years ago

    great video.

  • @akshaykumarr7541
    @akshaykumarr7541 4 years ago

    Can we achieve this using Java?

  • @ijustankit
    @ijustankit 3 years ago

    When I change the movie name, it throws an error:
    Traceback (most recent call last):
      File "f:\CODES\Movie Recommendation Engine(without ui)\movie_recommender_starter.py", line 62, in <module>
        movie_index = get_index_from_title(movie_user_likes)
      File "f:\CODES\Movie Recommendation Engine(without ui)\movie_recommender_starter.py", line 13, in get_index_from_title
        return df[df.title == title]["index"].values[0]
    IndexError: index 0 is out of bounds for axis 0 with size 0

  • @user-yu2jg7sf1r
    @user-yu2jg7sf1r 5 years ago

    Can someone please explain the get_index_from_title function, or a function similar to it?

  • @DZAKY244
    @DZAKY244 4 years ago

    the best tuts

  • @codelta2
    @codelta2 3 years ago

    Any idea how to test and evaluate this type of RS?

  • @vinaykumar1393
    @vinaykumar1393 3 years ago

    How can I put my recommendation engine on a website?

  • @vaideg1859
    @vaideg1859 3 years ago

    Can you share the method for finding accuracy?

  • @r-rk
    @r-rk 2 years ago

    So, how do we evaluate this recommendation system?

  • @shubhamjindal9214
    @shubhamjindal9214 5 years ago

    Sir, I am unable to understand where we are even using a machine learning concept in this code. Aren't we just forming the list of movies based on the matches calculated by the cosine similarity?
    It would be a great help if you could please explain.

    • @CodeHeroku
      @CodeHeroku  5 years ago +4

      That's a really good question, and honestly even I am not convinced by a lot of the answers I have found on whether a content-based recommendation engine qualifies as an ML approach. I believe it is closer to statistical learning (where you make certain assumptions or hypotheses about the problem). Irrespective of the bucket you put it in, the concepts that you learn while building recommendation systems are close to other supervised ML problems and thus are usually taught in most ML courses.

  • @mianmuhammadnouman132
    @mianmuhammadnouman132 4 years ago +1

    Thanks, it helped me a lot.
    I am confused: once we have found the similarities, how do we recommend these movies?

    • @CodeHeroku
      @CodeHeroku  4 years ago

      Once we have found the similarity scores, we can recommend movies similar to what a user has seen and liked before.
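
The step described in the reply above, going from similarity scores to recommendations, can be sketched as follows. This is an illustration under assumptions: the titles, the similarity matrix and the `recommend` helper are all made up, and a real matrix would come from a call such as `cosine_similarity` on the count matrix.

```python
import numpy as np

titles = ["Movie A", "Movie B", "Movie C", "Movie D"]  # placeholder titles

# Hypothetical pairwise similarity matrix (symmetric, 1.0 on the diagonal).
cosine_sim = np.array([
    [1.0, 0.8, 0.1, 0.5],
    [0.8, 1.0, 0.3, 0.2],
    [0.1, 0.3, 1.0, 0.6],
    [0.5, 0.2, 0.6, 1.0],
])

def recommend(liked_index, top_n=2):
    """Return the top_n titles most similar to the liked movie, best first."""
    scores = list(enumerate(cosine_sim[liked_index]))
    ranked = sorted(scores, key=lambda pair: pair[1], reverse=True)
    # Skip the liked movie itself, which always scores 1.0 against itself.
    return [titles[i] for i, _ in ranked if i != liked_index][:top_n]

print(recommend(0))  # most similar to "Movie A"
```

Sorting one row of the matrix in descending order and dropping the movie itself is all the "recommendation" step amounts to in this kind of content-based approach.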

  • @fawazolokodana6189
    @fawazolokodana6189 4 years ago

    CountVectorizer didn't work for me, but TfidfVectorizer did.

  • @AkashRaj-if6di
    @AkashRaj-if6di 3 years ago

    Are you using Python 2.x?

  • @NikhilKumar-vf9nv
    @NikhilKumar-vf9nv 4 years ago

    Why choose the cosine model?