The Next 100x - Gavin Uberti | Stanford MLSys #92

  • Published Feb 27, 2024
  • Episode 92 of the Stanford MLSys Seminar Series!
    The Next 100x - How the Physics of Chip Design Shapes the Future of Artificial Intelligence
    Speaker: Gavin Uberti
    Abstract:
    Moore's law is slowing down, but AI models are rapidly getting bigger. Why exactly is this happening? How did chip designers deal with it in the past? Why is it happening unevenly across transistors, wires, and memory? And how can AI designers avoid fighting these physical limitations and work with them instead?
    Bio:
    Gavin is the founder of Etched, a company making highly specialized AI chips for Transformers. Before founding Etched, Gavin studied math at Harvard and worked for Xnor and OctoML building AI compilers like Apache TVM. His interests lie in AI scaling laws, watermarking and watermark detection, and in the interaction of chip design with the above topics.
    --
    Stanford MLSys Seminar hosts: Avanika Narayan, Benjamin Spector, Michael Zhang
    Twitter:
    / avanika15​
    / bfspector
    / mzhangio
    --
    Check out our website for the schedule: mlsys.stanford.edu
    Join our mailing list to get weekly updates: groups.google.com/forum/#!for...
    #machinelearning #ai #artificialintelligence #systems #mlsys #computerscience #stanford
  • Science & Technology

Comments • 9

  • @jaytau • 4 months ago +7

    Would it be possible to use an external mic for the speaker and the person who asks the question?
    It's quite challenging to hear.

  • @sucim • 4 months ago +2

    Very interesting and well presented!

  • @LazyDotDev • 10 days ago +2

    Great talk, but why didn't anyone ask questions about competition? What is to prevent Nvidia, AMD, or Intel from producing niche chips like this? With their R&D teams, quality-assurance systems, warranties, and supply chains, they likely thought of this already, and if not, should be able to deploy a more competitive and reliable solution fast.
    That being said, I really appreciate Gavin breaking down the history here; I learned a lot of new things.

    • @manonamission2000 • 2 days ago

      Corporations tend to move slowly... it is less expensive (relatively, in $ and time) for a nimble company to attempt to innovate like this... also, the gamble is that the Sohu platform becomes so appetizing that it ends up as an acquisition target... again, both are simply bets... not without risk.

    • @LazyDotDev • 2 days ago

      @manonamission2000 Sure, you could argue some leaders like Blockbuster moved slowly when the rising leader Netflix transitioned to online and on-demand content.
      However, unlike on-demand streaming services, Gen AI is the most revolutionary technology of our time, and if this direction were so promising and yet as simple as creating a niche chip focused solely on transformers, then you'd think Intel and AMD, with their massive R&D teams, would already be doing it to get an edge on Nvidia.
      These serious business questions should have been asked. I'll do more research, but it's hard to take any of this seriously if such a basic question was not asked and answered.

  • @nauy • 2 months ago +1

    Nice history lesson. Nothing about the "next 100x" promised in the title.

  • @briancase6180 • 10 days ago

    Dude, you're at Stanford; I think students know what an inverter does. This was an ML seminar talk? How? And how did this have anything to do with the topics explicitly raised in the abstract? Just asking... And, BTW, HBM isn't the only type of memory that's relevant, especially for inference, which is, BTW, the focus of his company.

  • @vicaya • 4 months ago +1

    37:40, as you already realized, LLMs (and the transformer architecture in general) are memory-constrained, so the extra FLOPS are wasted until TSMC productizes SOT-MRAM. Groq with SRAM is a more realistic short-term approach for small models.

  • @georgehart5182 • 4 months ago +3

    It's cool, but this is going to be a long road. The main problem is software at the IR level (e.g. CUDA), not necessarily hardware. There are many companies that can make interesting transistor permutations and have been doing it for a long time, and they are not magically "accelerating superintelligence". This is a software ecosystem problem more than anything else. Good luck.