The Next 100x - Gavin Uberti | Stanford MLSys #92
- Added 27. 02. 2024
- Episode 92 of the Stanford MLSys Seminar Series!
The Next 100x - How the Physics of Chip Design Shapes the Future of Artificial Intelligence
Speaker: Gavin Uberti
Abstract:
Moore's law is slowing down, but AI models are rapidly getting bigger. But why exactly is this happening? How did chip designers deal with it in the past? Why is it happening unevenly across transistors, wires, and memory? And how can AI designers avoid fighting the physical limitations and work with them instead?
Bio:
Gavin is the founder of Etched, a company making highly specialized AI chips for Transformers. Before founding Etched, Gavin studied math at Harvard and worked for Xnor and OctoML building AI compilers like Apache TVM. His interests lie in AI scaling laws, watermarking and watermark detection, and in the interaction of chip design with the above topics.
--
Stanford MLSys Seminar hosts: Avanika Narayan, Benjamin Spector, Michael Zhang
Twitter:
/ avanika15
/ bfspector
/ mzhangio
--
Check out our website for the schedule: mlsys.stanford.edu
Join our mailing list to get weekly updates: groups.google.com/forum/#!for...
#machinelearning #ai #artificialintelligence #systems #mlsys #computerscience #stanford - Science and Technology
Would it be possible to use an external mic for the speaker and the person who asks the question?
It's quite challenging to hear.
Very interesting and well presented!
Great talk, but why didn't anyone ask questions about competition? What is to prevent Nvidia, AMD, or Intel from producing niche chips like this? With their R&D teams, quality assurance systems, warranties, and supply chains, they have likely thought of this and, if not, should be able to deploy a more competitive and reliable solution fast.
That being said, I really appreciate Gavin breaking down the history here; I learned a lot of new things.
Corporations tend to move slowly... it is less expensive (relatively, $ and time) for a nimble co to attempt to innovate like this... also, the gamble is the Sohu platform becomes so appetizing that it ends up as an acquisition target... again, both are simply bets... not without risk
@manonamission2000 Sure, you could argue some leaders like Blockbuster moved slowly when the rising leader Netflix transitioned to online and on-demand content.
However, unlike on-demand streaming services, Gen AI is the most revolutionary technology of our time, and if this direction were so promising and yet as simple as creating a niche chip focused solely on transformers, then you'd think Intel and AMD, with their massive R&D teams, would already be doing it to get an edge on Nvidia.
These serious business questions should have been asked. I'll do more research, but it is hard to take any of this seriously if such a basic question was not asked or answered.
Nice history lesson. Nothing about the ‘next 100x’ promised in the title.
Dude, you're at Stanford; I think students know what an inverter does. This was an ML seminar talk? How? And, how did this have anything to do with the topics explicitly raised in the Abstract? Just asking.... And, BTW, HBM isn't the only type of memory that's relevant especially for inference, which is, BTW, the focus of his company.
37:40: since you already recognize that LLMs (and the transformer architecture in general) are memory constrained, the extra FLOPS are wasted until TSMC productizes SOT-MRAM. Groq with SRAM is a more realistic short-term approach for small models.
It's cool, but this is going to be a long road. The main problem is software at the IR (e.g. CUDA), not necessarily hardware. There are many companies that can make interesting transistor permutations and have been doing it for a long time, and they are not magically "accelerating superintelligence". This is a software ecosystem problem more than anything else. Good luck.