Coding Llama-2 from scratch in PyTorch - Part 2

  • Published 28 Feb 2024
  • In this video series, you will learn how to train and fine-tune the Llama 2 model from scratch.
    The goal is to code LLaMA 2 from scratch in PyTorch to create models with 100M, 250M, and 500M parameters. In this second video, you'll learn about the different attention mechanisms (MHA, MQA, and GQA) in detail and how to implement them in the 100M model we built last time; a minimal sketch of the idea appears below.
    This is a step-by-step guide to implementing the Llama 2 model based on the research paper.
    To follow along you can use this Colab notebook:
    colab.research.google.com/dri...
  • Science & Technology
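
The sketch below illustrates how MHA, MQA, and GQA relate: all three share the same attention computation and differ only in how many key/value heads back the query heads. This is not the video's exact code; the module and parameter names (`GroupedQueryAttention`, `dim`, `n_heads`, `n_kv_heads`) are illustrative assumptions.

```python
# A minimal grouped-query attention (GQA) sketch in PyTorch.
# Setting n_kv_heads == n_heads recovers MHA; n_kv_heads == 1 recovers MQA.
# Names and hyperparameters are hypothetical, not taken from the video.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GroupedQueryAttention(nn.Module):
    def __init__(self, dim: int, n_heads: int, n_kv_heads: int):
        super().__init__()
        assert n_heads % n_kv_heads == 0, "query heads must divide evenly into KV groups"
        self.n_heads = n_heads
        self.n_kv_heads = n_kv_heads
        self.head_dim = dim // n_heads
        # Queries get n_heads projections; keys/values get only n_kv_heads
        self.wq = nn.Linear(dim, n_heads * self.head_dim, bias=False)
        self.wk = nn.Linear(dim, n_kv_heads * self.head_dim, bias=False)
        self.wv = nn.Linear(dim, n_kv_heads * self.head_dim, bias=False)
        self.wo = nn.Linear(n_heads * self.head_dim, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        bsz, seqlen, _ = x.shape
        q = self.wq(x).view(bsz, seqlen, self.n_heads, self.head_dim).transpose(1, 2)
        k = self.wk(x).view(bsz, seqlen, self.n_kv_heads, self.head_dim).transpose(1, 2)
        v = self.wv(x).view(bsz, seqlen, self.n_kv_heads, self.head_dim).transpose(1, 2)
        # Repeat each K/V head so every group of query heads shares one K/V head
        group = self.n_heads // self.n_kv_heads
        k = k.repeat_interleave(group, dim=1)
        v = v.repeat_interleave(group, dim=1)
        # Causal scaled dot-product attention (PyTorch >= 2.0)
        out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        out = out.transpose(1, 2).contiguous().view(bsz, seqlen, -1)
        return self.wo(out)

# Example usage: 8 query heads sharing 2 KV heads (GQA)
attn = GroupedQueryAttention(dim=512, n_heads=8, n_kv_heads=2)
y = attn(torch.randn(2, 16, 512))  # -> (2, 16, 512)
```

The practical payoff of MQA/GQA is a smaller KV cache at inference time: fewer key/value heads means less memory and bandwidth per decoded token, at some cost in model quality that GQA's intermediate grouping is meant to recover.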
