Disrupting Distributed ML feat. Guanhua Wang | Stanford MLSys Seminar Episode 25

  • Added on 25. 07. 2024
  • Episode 25 of the Stanford MLSys Seminar Series!
    Disruptive Research on Distributed ML Systems
    Speaker: Guanhua Wang
    Abstract:
    Deep Neural Networks (DNNs) enable computers to excel at many applications such as image classification, speech recognition, and robotics control. To accelerate DNN training and serving, parallel computing is widely adopted, but system efficiency becomes a major issue when scaling out. In this talk, I will make three arguments toward better system efficiency in distributed DNN training and serving (toy sketches of each idea follow below the description).
    First, Ring All-Reduce is not the optimal scheme for model synchronization, but Blink is. By packing spanning trees rather than forming rings, Blink adapts flexibly to arbitrary network topologies and achieves near-optimal network throughput. Blink is filed as a US patent and is used by Microsoft. It has drawn significant industry attention, including from Facebook (distributed PyTorch team) and ByteDance (parent company of the TikTok app), and was featured at NVIDIA GTC China 2019 and in coverage from Baidu and Tencent.
    Second, communication can be eliminated entirely via sensAI's class parallelism. sensAI decouples a multi-task model into disconnected subnets, each responsible for the decisions of a single task. Its low-latency, real-time model serving has attracted interest from several Bay Area venture capital firms.
    Third, Wavelet is more efficient than gang scheduling. By intentionally delaying task launches, Wavelet interleaves the peak memory usage of different waves of training tasks on the accelerators, improving both compute and on-device memory utilization. Multiple companies, including Facebook and Apple, have shown interest in the Wavelet project.
    Speaker bio:
    Guanhua Wang is a final-year CS PhD candidate in the RISELab at UC Berkeley, advised by Prof. Ion Stoica. His research lies primarily at the intersection of ML and systems, including fast collective communication schemes for model synchronization, efficient parallel model training, and real-time model serving.
    --
    0:00 Starting soon
    4:23 Presentation
    36:50 Discussion
    The Stanford MLSys Seminar is hosted by Dan Fu, Karan Goel, Fiodar Kazhamiaka, Piero Molino, Chris Ré, and Matei Zaharia.
    Twitter:
    / realdanfu
    / krandiash
    / w4nderlus7
    --
    Check out our website for the schedule: mlsys.stanford.edu
    Join our mailing list to get weekly updates: groups.google.com/forum/#!for...
    #machinelearning #ai #artificialintelligence #systems #mlsys #computerscience #stanford #berkeley
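    --
    A toy sketch of the spanning-tree primitive Blink packs: one reduce-then-broadcast over a single spanning tree. This is not Blink's implementation; the topology, node count, and function names are illustrative, and the real system overlaps many such trees to use every link.

    import numpy as np

    def tree_allreduce(chunks, parent):
        """All-reduce one gradient shard per node over a spanning tree.

        chunks : dict node -> np.ndarray (local gradient shard)
        parent : dict node -> parent node (the root maps to None)
        """
        # Derive children lists and find the root from the parent pointers.
        children = {n: [] for n in chunks}
        root = None
        for n, p in parent.items():
            if p is None:
                root = n
            else:
                children[p].append(n)

        # Reduce phase: each node folds its children's partial sums into its own.
        def reduce_up(n):
            total = chunks[n].copy()
            for c in children[n]:
                total += reduce_up(c)
            return total

        global_sum = reduce_up(root)

        # Broadcast phase: every node ends up holding the global sum.
        return {n: global_sum.copy() for n in chunks}

    # Four GPUs on a chain 0-1-2-3, expressed as a spanning tree rooted at 0.
    grads = {n: np.full(4, float(n)) for n in range(4)}
    parent = {0: None, 1: 0, 2: 1, 3: 2}
    print(tree_allreduce(grads, parent))  # every node holds [6. 6. 6. 6.]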
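    A toy sketch of the class-parallelism serving pattern sensAI enables: one small, disconnected subnet per class, each of which could sit on its own device with no inter-device communication; the prediction is just an argmax over per-class confidences. The logistic scorers below are placeholders, not sensAI's actual pruning procedure.

    import numpy as np

    rng = np.random.default_rng(0)
    NUM_CLASSES, DIM = 3, 8

    # One independent binary scorer ("subnet") per class.
    subnets = [{"w": rng.normal(size=DIM), "b": 0.0} for _ in range(NUM_CLASSES)]

    def score_one_class(subnet, x):
        # Runs in isolation; in a real deployment this would live on its own GPU.
        return 1.0 / (1.0 + np.exp(-(subnet["w"] @ x + subnet["b"])))

    def predict(x):
        # No cross-subnet traffic: gather per-class confidences, then argmax.
        confidences = [score_one_class(s, x) for s in subnets]
        return int(np.argmax(confidences)), confidences

    print(predict(rng.normal(size=DIM)))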
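    A toy timeline for the Wavelet argument: gang scheduling launches every training task at the same step, so their memory peaks coincide, while staggering launches into waves interleaves the peaks and lowers the device-memory high-water mark. The memory profile and offsets are made-up numbers, not measurements from the project.

    # Per-step memory (GB) of one task: forward builds activations, backward frees them.
    PROFILE = [2, 6, 10, 6, 2]

    def peak_memory(launch_offsets, steps=12):
        """Highest total memory over a short timeline, each task repeating PROFILE."""
        peak = 0
        for t in range(steps):
            used = sum(PROFILE[(t - off) % len(PROFILE)]
                       for off in launch_offsets if t >= off)
            peak = max(peak, used)
        return peak

    print("gang scheduling:", peak_memory([0, 0]))  # peaks stack: 20 GB
    print("wavelet-style  :", peak_memory([0, 2]))  # peaks interleave: 12 GB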

Comments • 1

  • @tingsun5547 · 10 months ago

    Great talk. Thanks, Dr. Wang!