Monarch Mixer: Making Foundation Models More Efficient - Dan Fu | Stanford MLSys #86

  • Added 24 Jul 2024
  • Episode 86 of the Stanford MLSys Seminar Series!
    Monarch Mixer: Making Foundation Models More Efficient
    Speaker: Dan Fu
    Abstract:
    Machine learning models are increasingly being scaled in both sequence length and model dimension to reach longer contexts and better performance. However, existing architectures like Transformers scale quadratically along both these axes. In this talk I'll discuss Monarch Mixer (M2), a new architecture that uses the same sub-quadratic primitive along both sequence length and model dimension. M2 mixes information along the sequence and model dimensions using Monarch matrices, a simple class of expressive structured matrices that captures many linear transforms, achieves high hardware efficiency on GPUs, and scales sub-quadratically.
    Bio:
    Dan Fu is a PhD student in the Computer Science Department at Stanford University, where he is co-advised by Christopher Ré and Kayvon Fatahalian. His research is at the intersection of systems and machine learning and focuses on developing algorithms and architectures to make machine learning more efficient.
    Monarch Mixer arXiv: arxiv.org/abs/2310.12109
    FlashFFTConv arXiv: arxiv.org/abs/2311.05908
    --
    Stanford MLSys Seminar hosts: Simran Arora, Dan Fu
    Twitter:
    / simran_s_arora
    / realdanfu
    --
    Check out our website for the schedule: mlsys.stanford.edu
    Join our mailing list to get weekly updates: groups.google.com/forum/#!for...
    #machinelearning #ai #artificialintelligence #systems #mlsys #computerscience #stanford
  • Science & Technology
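
The abstract above describes Monarch matrices as structured matrices that scale sub-quadratically. As a rough illustration only, the sketch below assumes one common formulation, M = P L P R, where L and R are block-diagonal with b blocks of size b x b and P is the permutation that transposes a b x b grid. The exact placement of the permutations varies between write-ups, and the function names here are hypothetical, not taken from the M2 codebase. For N = b * b, each block-diagonal multiply costs about N^1.5 operations instead of the N^2 of a dense matrix-vector product.

```python
import numpy as np

def transpose_perm(v, b):
    """Permutation P: view v as a b x b grid (row-major) and transpose it."""
    return v.reshape(b, b).T.reshape(-1)

def monarch_matvec(x, L, R):
    """Compute (P L P R) x without forming the dense N x N matrix.

    L, R: arrays of shape (b, b, b) holding b diagonal blocks of size b x b.
    x:    vector of length N = b * b.
    Assumed formulation for illustration; not the reference implementation.
    """
    b = L.shape[0]
    # Block-diagonal R: block k multiplies the k-th chunk of x.
    y = np.einsum("kij,kj->ki", R, x.reshape(b, b)).reshape(-1)
    y = transpose_perm(y, b)
    # Block-diagonal L applied to the permuted vector.
    y = np.einsum("kij,kj->ki", L, y.reshape(b, b)).reshape(-1)
    return transpose_perm(y, b)

def block_diag(blocks):
    """Dense N x N matrix with the given b x b blocks on its diagonal."""
    b = blocks.shape[0]
    D = np.zeros((b * b, b * b))
    for k in range(b):
        D[k * b:(k + 1) * b, k * b:(k + 1) * b] = blocks[k]
    return D

def monarch_dense(L, R):
    """Dense equivalent of the same product, used only to check the fast path."""
    b = L.shape[0]
    n = b * b
    # Row m of P selects the input index that transpose_perm moves to position m.
    P = np.eye(n)[transpose_perm(np.arange(n), b)]
    return P @ block_diag(L) @ P @ block_diag(R)

rng = np.random.default_rng(0)
b = 4
L = rng.standard_normal((b, b, b))
R = rng.standard_normal((b, b, b))
x = rng.standard_normal(b * b)
fast = monarch_matvec(x, L, R)
slow = monarch_dense(L, R) @ x
```

Here `fast` and `slow` should agree up to floating-point error; the dense version exists only as a correctness check, since building it defeats the purpose of the structure.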

Comments • 4

  • @kvotheosem-sangue 1 month ago

    Explained so clearly! The paper gets confusing once it gets into the math because the material is so dense. Thanks for extending it to a video format.

  • @jawadmansoor6064 6 months ago +1

    arXiv link please?

    • @backtofocused438 6 months ago +1

      Indeed! It is such wonderful work and such a fantastic way to learn, and I would have expected that for such a fantastic scientific exploration of this.

    • @StanfordMLSysSeminars 6 months ago +1

      Added to the description!