Coding Llama 3 from scratch in PyTorch - Part 1

  • Added May 5, 2024
  • In this video series, you will learn how to train and fine-tune the Llama 3 model from scratch.
    The goal is to code LLaMA 3 from scratch in PyTorch to create models with sizes of 3B, 6B, 35B and 45B params. In this first video, you'll learn about upcycling, downcycling and infini-attention (a short downcycling sketch follows the description below).
    📚Papers:
    - Sparse Upcycling: Training Mixture-of-Experts from Dense Checkpoints: arxiv.org/abs/2212.05055
    - Pre-training Small Base LMs with Fewer Tokens: arxiv.org/abs/2404.08634
    - Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention: arxiv.org/abs/2404.07143
    💻 To follow along, you can use this Colab notebook:
    - github.com/Blaizzy/Coding-LLM...
    🎥 Coding Llama 2 from scratch video series
    Part 1: czcams.com/users/liveXHmag4damTg
    Part 2: czcams.com/users/liveLSWDpFmbE90
    Part 3: • Coding Llama 2 from sc...
  • Science & Technology
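Downcycling here means initializing a smaller model from a larger dense checkpoint rather than from random weights, e.g. by inheriting only the first few transformer layers (the idea explored in the "Pre-training Small Base LMs with Fewer Tokens" paper above). Below is a minimal sketch of that idea using Hugging Face Transformers; the source checkpoint name and the number of layers kept are illustrative assumptions, not the exact recipe from the video.

import torch
from transformers import AutoModelForCausalLM, LlamaConfig, LlamaForCausalLM

# Load the larger dense checkpoint (assumed name; any Llama-style model works).
source = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")

# Build a smaller config that keeps only the first N decoder layers (N is illustrative).
num_layers_to_keep = 8
small_config = LlamaConfig.from_dict(source.config.to_dict())
small_config.num_hidden_layers = num_layers_to_keep
small = LlamaForCausalLM(small_config)

# Inherit the embeddings, the first N layers, the final norm and the LM head.
small.model.embed_tokens.load_state_dict(source.model.embed_tokens.state_dict())
for i in range(num_layers_to_keep):
    small.model.layers[i].load_state_dict(source.model.layers[i].state_dict())
small.model.norm.load_state_dict(source.model.norm.state_dict())
small.lm_head.load_state_dict(source.lm_head.state_dict())

small.save_pretrained("llama-3-downcycled")

The downcycled model is then typically given continued pre-training on a (much smaller) token budget before any instruction fine-tuning.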

Comments • 8

  • @AC-go1tp
    @AC-go1tp 16 days ago +3

    This is a very thoughtful and great initiative! Researchers with enough gray matter but limited means can still be in the game. Thank you PC🙏!

    • @princecanuma
      @princecanuma 15 days ago

      Most welcome!
      It’s my pleasure:)
      I lived through this so others don’t have to.

  • @ngamcode2485
    @ngamcode2485 5 days ago

    This is very impressive and great content. Thank you.

  • @kishoretvk
    @kishoretvk 15 days ago

    Super impressive. Great value.
    One question: how do I further train the model on my custom content instead of LoRA?
    Can we do further full training on it and add new memory?

    • @princecanuma
      @princecanuma 9 days ago

      Most welcome!
      You can do that, but that can be very expensive.
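For readers following this thread: full-parameter continued training (instead of LoRA) updates every weight of the model, which is why it gets expensive. Here is a rough sketch under assumed names (base checkpoint, a local text corpus, placeholder hyperparameters), not the author's exact setup.

import torch
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_id = "meta-llama/Meta-Llama-3-8B"  # assumed base checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token  # Llama tokenizers ship without a pad token
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Plain causal-LM objective over your own corpus (local text files; path is a placeholder).
data = load_dataset("text", data_files={"train": "my_corpus/*.txt"})["train"]
data = data.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=2048),
                remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="llama3-full-ft",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=16,
        learning_rate=1e-5,
        num_train_epochs=1,
        bf16=True,
    ),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()  # every parameter is trainable; no adapters are inserted

Unlike LoRA, this keeps gradients and optimizer states for all weights, so the GPU memory and compute cost are far higher.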

  • @vivekpadman5248
    @vivekpadman5248 3 days ago

    Bro, how did you train Llama 3 without a paper?