Chat with Llama-3 with HuggingFace & Build a chatbot with Gradio

  • Published: 29 Aug 2024
  • Run Meta Llama 3 8B tutorial with Transformers locally and for free and then test this model by creating an app with Gradio.
    00:01 Introduction
    00:54 Setup
    03:38 Load the model
    06:27 Create the prompt
    08:08 Inference
    10:07 Build the app
    15:21 Test the app
    ▶ Subscribe: bit.ly/subscri...
    ▶ Join my channel: bit.ly/join-ti...
    RELATED VIDEOS:
    ▶ LangChain Tutorials: bit.ly/langcha...
    ▶ Generative AI for DS: bit.ly/genai-f...
    ▶ HuggingFace Tutorials: bit.ly/hugging...
    ▶ Generative AI Tutorials: bit.ly/generat...
    ▶ LLMs Tutorials: bit.ly/llm-tut...
    FOLLOW ME:
    ▶ Medium: / tirendazacademy
    ▶ X: / tirendazacademy
    ▶ LinkedIn: / tirendaz-academy
    Project files: github.com/Tir...
    Hi, I am Tirendaz, PhD. I create content on generative AI & data science. My goal is to make the latest technologies understandable for everyone. Don't forget to subscribe and turn on notifications so you don't miss the latest videos.
    #ai #generativeai #datascience
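
The steps in the outline above (load the model, create the prompt, run inference, build the app) can be sketched roughly as follows. This is a minimal sketch, not the exact code from the video — it assumes the gated `meta-llama/Meta-Llama-3-8B-Instruct` checkpoint (request access on the Hugging Face Hub and log in first), recent `transformers` and `gradio` releases, and Gradio's tuple-style chat history:

```python
MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"  # gated: request access on the Hub


def build_messages(history, user_msg):
    """Convert Gradio's (user, assistant) history pairs into
    chat-template messages, ending with the new user turn."""
    messages = []
    for user_turn, bot_turn in history:
        messages.append({"role": "user", "content": user_turn})
        messages.append({"role": "assistant", "content": bot_turn})
    messages.append({"role": "user", "content": user_msg})
    return messages


def main():
    # Heavy imports kept here so the helper above stays dependency-free.
    import torch
    import gradio as gr
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # fits an 8B model on a 24 GB GPU
        device_map="auto",
    )

    def respond(user_msg, history):
        messages = build_messages(history, user_msg)
        input_ids = tokenizer.apply_chat_template(
            messages, add_generation_prompt=True, return_tensors="pt"
        ).to(model.device)
        # Llama-3 instruct models end turns with <|eot_id|>,
        # so stop on it as well as the regular EOS token.
        terminators = [
            tokenizer.eos_token_id,
            tokenizer.convert_tokens_to_ids("<|eot_id|>"),
        ]
        output = model.generate(
            input_ids,
            max_new_tokens=256,
            eos_token_id=terminators,
            do_sample=True,
            temperature=0.7,
        )
        # Decode only the newly generated tokens, not the prompt.
        return tokenizer.decode(
            output[0][input_ids.shape[-1]:], skip_special_tokens=True
        )

    gr.ChatInterface(respond).launch()


if __name__ == "__main__":
    main()
```

Running the script downloads the weights from the Hub on first use (several GB) and serves the chatbot at a local Gradio URL; the history handling in `build_messages` is why the bot "checks the history" before answering each turn.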

Comments • 12

  • @mustafadogruer3830 · 4 months ago · +4

    You always keep up with the updates. Thanks for the video!

  • @VincentTang-tn1bi · 22 days ago

    It seems the chatbot will not answer before checking the history🤔

  • @mohammadsbeeh6131 · 2 months ago · +1

    For me, I didn't get the same output as you when I debugged the prompt at 07:15.

  • @dheenathsundararajan7225 · 1 month ago · +1

    I did exactly what you did, brother, but the model takes a lot of time to generate the response. Is there any possible fix?

    • @iiTzMemo · 1 month ago

      What are your PC specs?

  • @thetanukigame3289 · 3 months ago

    Thank you for the great video. It was really helpful for getting everything set up. If I may ask: I have a 4090 graphics card and I can see this maxing out my GPU usage, so CUDA should be working correctly. However, my prompts take anywhere between 20 seconds and 2 minutes to return, and after a few questions the chatbot stops responding at all and just stays processing. Is this normal?

    • @TirendazAI · 3 months ago

      Which model size are you using? If you're using llama-3:70B, I think that's normal.

  • @Alicornriderm · 3 months ago

    Thanks for the tutorial! Is this actually running locally? If so, how did it download so quickly?

    • @TirendazAI · 3 months ago

      Yes, the model is running locally and for free. To download the model, you can use ollama.

    • @muhaiminhading3477 · 2 months ago

      @TirendazAI Hi, I have been using ollama with llama3 on this localhost port: 127.0.0.1:11434. But I am confused: how do I load the model with transformers so I can follow your steps?

  • @iiTzMemo · 1 month ago

    Does this work on a Mac with M2 too?