The Largest Mamba LLM Experiment Just Dropped
- Published Jul 6, 2024
- Check out HubSpot's ChatGPT at work bundle! clickhubspot.com/2os
A long-awaited sequel in LLM research has appeared: AI21Labs has dropped the biggest Mamba experiment yet, and it is on par with other open-source LLMs! Just with a few twists...
Original Mamba Paper
[Paper] arxiv.org/abs/2312.00752
[Code] github.com/state-spaces/mamba
MambaFormer
[Paper] arxiv.org/pdf/2402.04248.pdf
AI21Labs
[Blog] www.ai21.com/blog/announcing-...
[Huggingface] huggingface.co/ai21labs/Jamba...
[NVIDIA NIM] nvda.ws/3Jn5pxb
VideoMamba
[Paper] arxiv.org/abs/2403.06977
[Code] github.com/OpenGVLab/VideoMamba
Special thanks to LDJ for helping out with the content in this video!
This video is supported by the kind Patrons & YouTube Members:
🙏Andrew Lescelius, alex j, Chris LeDoux, Alex Maurice, Miguilim, Deagan, FiFaŁ, Daddy Wen, Tony Jimenez, Panther Modern, Jake Disco, Demilson Quintao, Shuhong Chen, Hongbo Men, happi nyuu nyaa, Carol Lo, Mose Sakashita, Miguel, Bandera, Gennaro Schiano, gunwoo, Ravid Freedman, Mert Seftali, Mrityunjay, Richárd Nagyfi, Timo Steiner, Henrik G Sundt, projectAnthony, Brigham Hall, Kyle Hudson, Kalila, Jef Come, Jvari Williams, Tien Tien, BIll Mangrum, owned, Janne Kytölä, SO, Hector, Drexon, Claxvii 177th, Inferencer, Michael Brenner
[Discord] / discord
[Twitter] / bycloudai
[Patreon] / bycloud
[Music] Massobeats - Lush
[Profile & Banner Art] / pygm7
[Video Editor] Silas
0:00 Intro
1:16 Hubspot
2:24 Jamba
8:08 VideoMamba
Check out HubSpot's ChatGPT at work bundle here: clickhubspot.com/2os
unfortunately topping the last mamba edit is way too hard, but I guess now at least we know *_mamba is real_*
Have you seen Google's Griffin and Hawk?
If mamba does not scale well, we still have diffusion models for text
Why not both?
Jamba Mamba ¡Ay, caramba!
well said
@fireship game up your memes this boy is strapped to the teeth.
Would be interesting to see the infinite context from the "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" paper explained.
Ikr. I wonder why that paper didn't get more traction
love your memes so much
wait... this is not a @fireship video, damn
close enough
Hey, would you be interested in making a video about ponydiffusion ?
Isn't pony diffusion just a latent diffusion foundation model, like stable diffusion?
@@kolkoki I've got no clue about any of that, sorry. I just know that, at least back then, Pony revolutionized accuracy for character LoRAs and made generations of already existing characters much more accurate than other checkpoints.
I don't watch this channel much, but I did see that epic Mamba short in one of your videos and it has been ingrained in my mind ever since.
Love your video essays, good and easy to understand and a nice way to catch up on SOTA methods.
Very nice
could we use it through ollama?
so is this cheaper than Mistral 7B? ❤
Appreciate these videos. The main thing I've heard regarding Mamba vs. Transformers is that discoveries of optimizations within Transformers are still abundant. Quantization alone is massive in enabling the networks to run on average hardware, and the ridiculousness of 1.58-bit quantization working is incredible, whereas with Mamba no quantization is available.
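On the 1.58-bit point: that figure comes from BitNet b1.58, which quantizes each weight to the ternary set {-1, 0, +1} using an absmean scale. A minimal numpy sketch of that rounding scheme (the function name and example matrix are mine, not from any released code):

```python
import numpy as np

def ternary_quantize(w, eps=1e-8):
    # Absmean ternary quantization in the spirit of BitNet b1.58:
    # scale by the mean absolute weight, then round each entry to
    # the nearest value in {-1, 0, +1}.
    scale = np.mean(np.abs(w)) + eps
    w_q = np.clip(np.round(w / scale), -1, 1)
    return w_q, scale

w = np.array([[0.9, -0.05, 0.4],
              [-1.2, 0.02, 0.7]])
w_q, scale = ternary_quantize(w)
# w_q holds only -1, 0, +1; w_q * scale is a coarse approximation of w
```

Each ternary weight needs log2(3) ≈ 1.58 bits, which is where the name comes from.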
If you enjoy YouTube and it pays the bills then sure, but play it safe so you don't make life much harder than necessary. Plus, you might be able to do research at the same time and present it to people in a more consumable form.
Everyone is combining models rn. Some people combine NeRF and GS and that worked as well. I guess that ML will become just a mixer for architectures at least for some commercial devs
we need one called Mongoose
I'm trying to write BitNet layers in Verilog
dank af
so what's next? Kalman filter with learned dynamics?
Oh god. How much of a memelord can you be?! The "can you get much higher" right after the lobotomy? I love you man.
Isn't mashing together RNNs and Transformers just RWKV?
Nah bro infini attention is where it's at
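For context on the hybrid question: RWKV reformulates attention itself into a pure recurrence, while Jamba instead interleaves separate attention (Transformer) layers and Mamba (SSM) layers in one stack. A toy numpy sketch of just the interleaving idea; the layer implementations, sizes, and ratio here are made up, and real Jamba blocks also include MoE, gating, and normalization:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # toy hidden size

def recurrent_layer(x, decay=0.9):
    # Toy linear recurrence (stand-in for an SSM/Mamba block):
    # h_t = decay * h_{t-1} + x_t, emitted at every step.
    h = np.zeros(d)
    out = np.empty_like(x)
    for t, x_t in enumerate(x):
        h = decay * h + x_t
        out[t] = h
    return out

def attention_layer(x):
    # Toy single-head self-attention (stand-in for a Transformer block).
    scores = x @ x.T / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ x

x = rng.normal(size=(16, d))  # sequence of 16 toy token embeddings
# Interleave the two layer types, Jamba-style, with residual connections.
for layer in (recurrent_layer, attention_layer,
              recurrent_layer, attention_layer):
    x = x + layer(x)
```

The point of the interleaving is that the recurrent layers keep per-step state in O(1) memory while the occasional attention layers restore exact token-to-token lookups.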
wait a sec bycloud still makes videos? :V
The part on Jamba honestly sounds like someone making shit up with fake words, but that's actually all real.
The "Microservices" video by KRAZAM is now reality.
Every time I hear Mamba I can only think of the Python CLI
so... still waiting on the GGUF file, eh?
7:17 LLM Models live inside ATM Machines
Gemma 7B competing with Llama 70B, Mixtral, and Jamba? Damn, scale that thing up.
The Mambaaaaaaa the Mamba is reaaaaaaaaaaaaallllllll
3:36
It would still be good for people wanting small models to run on very cheap devices without needing all the quality, no?
what happened with Hyena?
Obviously. I published in December of 2023: "Anchoring Global Security: Autonomous Shipping with Mind-Reading AI, GPT-core and MAMBA-core Agents, RAG-Fusion, AI Communities, Hive-AI and the Human Psyche" #mindreading #AI #agentcores #Mamba2 and GPT4, 5 and sequential models #IDE
we live in the future bros
Man, I'm tired of waiting for GPT-5. What are they waiting for?
They're currently red-teaming the model
@@VisionaryPathway thanks for answering! How long do you think it will take until release?
@@user-fr2jc8xb9g personally, I think it's releasing anytime within the next 4-12 weeks (my own opinion/prediction)
329th view. Can I get a heart?
Why copy Fireship's thumbnails? Sad, man.
There's no way you think someone can own the format of, "character on the right highlighting big text on the left"??? Thumbnails are like, the least important part of a video when you watch it as a viewer, but it's the most important part when it comes to grabbing viewers' attention. Why shouldn't you use other creators' ideas on what works, when that's not where your creative input is, and it's super important to know you have a successful thumbnail style?
Who cares, we're here for him, not his thumbnail
He's been making these style thumbnails for 2+ years now. It's not copying, and it never will be. It's fine to take inspiration from other people when you like their work. And have you considered that he could have also just had this idea himself? It's extremely common for multiple people to have essentially the exact same idea.
Thumbnails look similar because there are literally common guidelines that are proven to improve the reach of any YT video either by being more likeable to eyes or because algorithm picks them to trending tab
Didn't fireship copy this guy?
1st
First
In the next improvement paper... they're going to suggest a 'hybrid architecture' where you skip the mamba layer entirely....
nobody really uses vanilla attention in LLMs, so like most of what Mamba claims is BS
It's extremely obvious that the thumbnails are replicas of Fireship's. I know you're trying to grow your channel, but it's a little off-putting.
this dude is copying fireship
maybe he's his otosan
Please stop copying fireship content and thumbnails
Pathetic @fireship ripoff.