Unlocking Developer Productivity across CPU and GPU with MAX: Chris Lattner

SdĂ­let
VloĆŸit
  • čas pƙidĂĄn 11. 09. 2024
  ‱ Today's leading generative AI applications have workloads that span high-performance GPU compute, CPU preprocessing, data loading, and orchestration, often spread across a combination of Python, C++/Rust, and CUDA C++, which increases complexity and slows the cycle of innovation. This talk explores the capabilities and power of the Modular Mojo programming language and the Modular Accelerated Xecution (MAX) platform, which unify CPU and GPU programming into a single Pythonic programming model that is simple and extensible. The result is reduced complexity, improved developer productivity, and faster innovation. We'll walk through CPU and GPU support with real-world examples, showing how AI application developers can use MAX and Mojo to define an end-to-end AI pipeline and overcome these complexities (a minimal code sketch follows the speaker bio below).
    Recorded live in San Francisco at the AI Engineer World's Fair. See the full schedule of talks at www.ai.enginee... & join us at the AI Engineer World's Fair in 2025! Get your tickets today at ai.engineer/2025
    About Chris
    Chris Lattner is a co-founder and the CEO of Modular, which is building an innovative new developer platform for AI and accelerated compute. Modular provides an AI engine that accelerates PyTorch and TensorFlow inference, as well as the MojođŸ”„ language, which extends Python into the systems and accelerator programming domains. He also co-founded the LLVM compiler infrastructure project, the Clang C++ compiler, the Swift programming language, the MLIR compiler infrastructure, and the CIRCT project, and has contributed to many other commercial and open-source projects at Apple, Tesla, Google, and SiFive.
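
    For a concrete picture of the "single Pythonic programming model" the abstract describes, here is a minimal sketch of serving a model through the MAX Engine Python API roughly as it existed at the time of this talk. Treat the names as assumptions: the max.engine module, InferenceSession, load, and execute are taken from Modular's 2024 documentation but have evolved across releases, and "model.onnx" and the input name "input_ids" are hypothetical.

      # Minimal sketch: one Python process covers loading, inference, and
      # pre/post-processing, instead of spanning Python, C++/Rust, and CUDA C++.
      # Assumed API names: max.engine.InferenceSession, .load(), .execute().
      import numpy as np
      from max import engine

      # Compile and load the model once; MAX targets the hardware it finds,
      # so the same script covers the CPU and GPU cases the talk contrasts.
      session = engine.InferenceSession()
      model = session.load("model.onnx")  # hypothetical model file

      # Inputs are plain NumPy arrays keyed by the model's input names.
      tokens = np.zeros((1, 128), dtype=np.int64)  # hypothetical input
      outputs = model.execute(input_ids=tokens)
      print(outputs)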

Komentáƙe • 6

  ‱ @jianghong6444 ‱ 1 month ago

    At 8:16 the presenter compares MAX against llama.cpp using CPU inference. The main contributor of llamafile claims that llama.cpp mainly focuses on the GPU stack (which sort of makes sense, since CPU can be comparatively slower), so I'm not sure how big of an impact that would be.

  ‱ @haichengwu799 ‱ 1 month ago

    Do you turn on split-K or stream-K in CUTLASS? Your measurement of CUTLASS does not look correct. Haicheng @ NVIDIA

  ‱ @JL-1735 ‱ 1 month ago ‱ +16

    I have zero interest in Modular or in MAX as long as it's not fully open source. They have the right to make it closed, but "we are making some things open" without any clarity or guarantee that the rest of the stack will eventually become open amounts to it just being closed source. I would consider it a rug pull, as Chris has been teasing the community and earning positive press as if it were an open-source project.

    ‱ @LisaSamaritan ‱ 1 month ago ‱ +5

      He explains it on Lex Fridman's podcast #381 (you can jump to 02:21:57). Basically, he had a bad experience making Swift, where everyone wanted new functionality at the same time as the core parts were being developed, which led to a bunch of bugs and rewrites, and he doesn't want to make that mistake again.
      He will release parts as they become stable enough, so that this will not happen.

    ‱ @LisaSamaritan ‱ 1 month ago

      Besides, all of his other projects* are open source, so why do you think he wouldn't do it again?
      * The LLVM/MLIR compiler infrastructure
      The Clang compiler
      The Swift programming language
      The biggest question was surrounding MAX. MAX is written in Mojo but isn't part of the language. It now has a free license for local/on-prem use; you have to pay for using it in the cloud and for commercial support.
      [Also, nothing prevents you from writing your own MAX-like solution in Mojo... Modular has to make money somehow, and the license seems fair. Most people get it for free, and the ones that can afford to pay will pay.]
      But even without MAX, you will have Mojo, which is as simple to use as Python and can run any Python program with an expected 2-10x speed improvement (compared to CPython, without any optimization).
      A 10-100x improvement if you use the Mojo-specific, low-level parts (basically like writing that part in Rust).
      And on rare occasions you can get a greater improvement; there is one algorithm that has shown something like a 36,000x speedup (if I remember correctly).
      As with everything, whatever extra speed you get depends on many factors.
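
      The large multipliers in this comment come from tight numeric loops. As a rough illustration (not Modular's benchmark code; the kernel and sizes here are made up for the example), this is the kind of pure-Python baseline such comparisons start from; a Mojo port that adds static types and vectorization to the inner loop is where the big speedups are claimed:

        # Pure-Python Mandelbrot-style escape-time kernel: an all-Python
        # inner loop of float math, the classic target for compiled ports.
        import time

        def escape_count(cx: float, cy: float, max_iter: int = 200) -> int:
            """Iterations before the point escapes |z| > 2."""
            x = y = 0.0
            for i in range(max_iter):
                if x * x + y * y > 4.0:
                    return i
                x, y = x * x - y * y + cx, 2.0 * x * y + cy
            return max_iter

        def render(width: int = 200, height: int = 200) -> list:
            """Evaluate the kernel over a small grid of complex points."""
            return [
                escape_count(-2.0 + 3.0 * i / width, -1.5 + 3.0 * j / height)
                for j in range(height)
                for i in range(width)
            ]

        start = time.perf_counter()
        grid = render()
        print(f"{len(grid)} points in {time.perf_counter() - start:.3f}s")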

  ‱ @RickySupriyadi ‱ 1 month ago

    OMG, if there is a new standard API for communicating with LLMs, that would really change the world if they all used this standard: automation in simple steps! Uh, but not really... what about security? Like rogue LLMs roaming around and exploiting those APIs. Wow, more talks like these please.
    Oh, and if it's open source, maybe LLMs communicating with those APIs might be more secure? Maybe.