LPUs, NVIDIA Competition, Insane Inference Speeds, Going Viral (Interview with Lead Groq Engineers)

  • Published 27 May 2024
  • This is an interview with Andrew Ling (VP, Compiler Software) and Igor Arsovski (Chief Architect and Fellow) from Groq. We cover topics ranging from the founding story to chip design and manufacturing and so much more. Plus, they reveal how Groq's insane inference speed can generate much better quality from existing models!
    Check out Groq for Free: www.groq.com
    Join My Newsletter for Regular AI Updates 👇🏼
    www.matthewberman.com
    Need AI Consulting? ✅
    forwardfuture.ai/
    My Links 🔗
    👉🏻 Subscribe: / @matthew_berman
    👉🏻 Twitter: / matthewberman
    👉🏻 Discord: / discord
    👉🏻 Patreon: / matthewberman
    Rent a GPU (MassedCompute) 🚀
    bit.ly/matthew-berman-youtube
    USE CODE "MatthewBerman" for 50% discount
    Media/Sponsorship Inquiries 📈
    bit.ly/44TC45V
  • Science & Technology

Comments • 273

  • @Batmancontingencyplans
    @Batmancontingencyplans 2 months ago +72

    Matt is flying high, kudos buddy for landing this interview!

  • @andresprieto6554
    @andresprieto6554 2 months ago +10

    I am only 11 minutes in, but I love how passionate and knowledgeable Igor is about his industry.

    • @IGame4Fun2
      @IGame4Fun2 2 months ago

      He's as fast as Groq, saying "yeah, yeah, good..." before the question is finished 😂

  • @MrLargonaut
    @MrLargonaut 2 months ago +75

    Grats on landing the interview!

  • @kamelirzouni4730
    @kamelirzouni4730 2 months ago +25

    Matt, thank you so much for the interview. You addressed many questions I was eager to understand. The point that truly astounded me was how inference affects model behavior, significantly enhancing response quality. This is a game-changer. Groq has managed to combine speed and quality. I'm eager for it to become widely available and to have the opportunity to run it locally.

  • @maslaxali8826
    @maslaxali8826 2 months ago +22

    I was not expecting this... Wow bro, natural interviewer

  • @alelondon23
    @alelondon23 2 months ago +10

    well done, Matthew! great interview.
    These guys at Groq are crushing it! Great attitude, OOTB thinking, hard work, letting their delivery speak for itself. A very refreshing alternative to the typical over-hyped promises of vaporware. Thank you, Groq!

  • @GuidedBreathing
    @GuidedBreathing 2 months ago +10

    15:40 The holy grail of an automated vectorizing compiler: threading, multi-core synchronization... peak performance, kicking the compiler to the side under the hood for finance applications. Great interview thus far ☺️ The repeating loop for reasoning is in hardware on the Groq chip; yep, that makes things a lot faster and very exciting, to repeat itself for the reasoning 👏👏👏 Good job

  • @74Gee
    @74Gee 2 months ago +9

    When asked about running locally on a cellphone they skillfully avoided the fact that you need a rack of chips for inference - although working as an integrated system, the 500+ tokens per second come from around 500+ chips.

    • @ritteradam
      @ritteradam 2 months ago

      Actually Igor answered honestly before Andrew took over: SRAM takes much more area per bit than DRAM, so it's not a good fit for LLMs.

    • @74Gee
      @74Gee 2 months ago +1

      @@ritteradam There are pros and cons. SRAM uses less power and produces less heat, so it's a good fit.
      The simple honest answer is that you need hundreds of Groq chips, so it's not viable for personal computing. But that would be a hype-killer, wouldn't it.

    • @MDougiamas
      @MDougiamas 2 months ago

      Well but remember what they have is on 14nm … new chips are being designed for 2nm … Groq 3 might be vastly more portable and powerful
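
      A quick back-of-envelope sketch of the rack-of-chips point, assuming the ~230 MB of on-chip SRAM per card cited elsewhere in these comments and fp16 weights (both assumptions, not official Groq specs):

        import math

        SRAM_PER_CHIP_GB = 0.23   # ~230 MB of SRAM per chip (assumed)
        BYTES_PER_PARAM = 2       # fp16 weights (assumed)

        def chips_needed(params_billions: float) -> int:
            """Chips required just to hold the weights in on-chip SRAM."""
            model_gb = params_billions * BYTES_PER_PARAM
            return math.ceil(model_gb / SRAM_PER_CHIP_GB)

        print(chips_needed(70))   # Llama 70B -> ~609 chips: racks, not phones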

  • @aaronpitters
    @aaronpitters 2 months ago +5

    Great interview! So the innovation to create a simpler design and faster chip came because they didn't have the money to hire people to create a traditional chip. Love that!

    • @albeit1
      @albeit1 2 months ago

      Constraints force people to innovate. The obstacle is often the way.

  • @albeit1
    @albeit1 2 months ago +2

    The traffic scheduling analogy is interesting. Each vehicle at every moment occupies a particular space, and no other vehicle can occupy it. If you can schedule all of them and every pedestrian, you can maximize throughput.
    That also reminds me of one reason service-oriented architectures work. Small web requests and small vehicles both get out of the way a lot faster. Two herds of mopeds crossing paths can do that a lot faster than two trains.
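
    A tiny sketch of that scheduling idea, with a made-up op list and fixed latencies (nothing here is Groq's actual compiler): when every unit's timing is known up front, the schedule is computed once as a timetable, and runtime needs no locks, queues, or arbitration.

        # (op, unit, inputs, latency in ticks) - hypothetical deterministic
        # units, listed in dependency order
        OPS = [
            ("load_a", "mem", [], 2),
            ("load_b", "mem", [], 2),
            ("mul",    "alu", ["load_a", "load_b"], 1),
            ("store",  "mem", ["mul"], 2),
        ]

        def schedule(ops):
            """Assign each op a start tick: it begins once its inputs are
            ready and its unit is free. All decided before 'runtime'."""
            ready, unit_free, timetable = {}, {}, []
            for name, unit, inputs, latency in ops:
                start = max([ready[i] for i in inputs] + [unit_free.get(unit, 0)])
                ready[name] = unit_free[unit] = start + latency
                timetable.append((start, name, unit))
            return timetable

        for tick, name, unit in schedule(OPS):
            print(f"t={tick}: {name} on {unit}")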

  • @gynthos6368
    @gynthos6368 2 months ago +26

    I just realised, you look like Jon from Garfield

  • @torarinvik4920
    @torarinvik4920 2 months ago +9

    Awesome, please do more of these "expert interviews" if you can :D

  • @adtiamzon3663
    @adtiamzon3663 2 months ago +2

    😍 Matt, such an interesting and informative interview with the lead Groq engineers, Igor and Andrew! Their presentation was easy to comprehend, indeed. Thank you, guys. Keep it simple and relatable. Keep innovating. 🌞👏👏👍💐💞🕊

  • @aiAlchemyy
    @aiAlchemyy 2 months ago +12

    That's some amazing, valuable content

  • @Alice8000
    @Alice8000 2 months ago +9

    Nice work Groq boys!

  • @SirajFlorida
    @SirajFlorida 2 months ago +4

    Wow, great job on this interview. I've been really excited about Groq. Thumb clicked. LoL

  • @nicolashuray1356
    @nicolashuray1356 2 months ago +1

    Just wow! Thanks Matt, Andrew, and Igor for that incredible interview about Groq's architecture. I'm just fascinated by the beauty of that design and all the use cases it's going to unlock!

  • @NahFam13
    @NahFam13 2 months ago +2

    THIS IS THE CONTENT I WANTED TO SEE!!
    Dude I literally complained about a video you made and you have NO idea how happy it makes me to see you doing this interview and asking the types of questions I would ask.

  • @justinIrv1
    @justinIrv1 2 months ago +3

    Incredible interview! Thank you all.

  • @PLACEBOBECALP
    @PLACEBOBECALP 2 months ago +8

    I think Matt was having the best day of his life talking to these two guys; I don't think that smile left Matt's face for the entire interview. Great interview, and about time someone asked questions that matter, instead of the parroted repetition of "when will this and that be ready, is it AGI, will robots call me nasty names behind my back?"

    • @matthew_berman
      @matthew_berman 2 months ago +5

      Lol. Indeed I was having a blast!!

    • @PLACEBOBECALP
      @PLACEBOBECALP 2 months ago +2

      @@matthew_berman Ha ha, me too man... well, until my jaw hit the floor when he described the architecture of the chip at the smallest scale: 10,000 transistors fit in a single blood cell, and they need to use extreme ultraviolet light... it truly blew my mind. Do you know if Moore's law allows for an additional reduction in scale, or is 4nm the limit? If it is, I assume the technology to build chips atom by atom must have been going on in the background for years in preparation for this long-understood inevitability?

    • @Maelzelmusic
      @Maelzelmusic 2 months ago

      To my understanding, you can go smaller, down to 2 or 3 nm, but there's a point where the size gets so small that you enter the quantum realm of wave/particle nature, and then you get other problems, mainly related to cooling and interpretability of results. I'm just going by memory here, but you can research further in Perplexity or other types of search. It's a very interesting topic. PS: Marques Brownlee/MKBHD has a great video on quantum computers actually.
      Cheers.

  • @joe_limon
    @joe_limon 2 months ago +13

    This is the single greatest interview I have seen this year

    • @matthew_berman
      @matthew_berman 2 months ago +1

      thank you joe!

    • @joe_limon
      @joe_limon 2 months ago +1

      @@matthew_berman I think the AI they described at the end could finally, reliably answer the "how many words are in your next response?" question.

  • @TheJohnTyra
    @TheJohnTyra 2 months ago +4

    This is fantastic Matt!! 🎉 Really enjoyed the technical deep dive on this hardware architecture. 🤓💯

  • @JMeyer-qj1pv
    @JMeyer-qj1pv 2 months ago +6

    Nvidia announced that their upcoming Blackwell chip improves inference speed by 30x. I wonder if that will bring it close to Groq's inference speed or if Groq will still be faster. I'm also curious why the Groq architecture doesn't work for training LLMs.

    • @PaulStanish
      @PaulStanish 2 months ago +1

      To the best of my knowledge, the memory doesn't need to change as much for backpropagation so they don't need to be as conservative with timing assumptions etc.

    • @seanyiu
      @seanyiu 2 months ago

      The cost for GPU will always be much higher regardless of performance

  • @koen.mortier_fitchen
    @koen.mortier_fitchen 2 months ago +1

    This interview is so cool. I follow the Matts for all my AI news: Matt Wolfe, MattVidPro, and Matthew 👌

  • @AIApplications-lg1ud
    @AIApplications-lg1ud 1 month ago

    Thank you! Awesome conversation! The idea that the Groq architecture would also yield better LLM answers and less hallucination is revolutionary.

  • @gkennedy_aiforsocialbenefit
    @gkennedy_aiforsocialbenefit 2 months ago +1

    Truly incredible interview! wow! Andrew and Igor are brilliant, cool and humble...Just like you Matt. So refreshing. Really excited about the last question and answer concerning Agents. Deeply grateful to you and happy for you Matt. Have been following every video of yours from the onset.

  • @JoseP-cw3je
    @JoseP-cw3je 2 months ago +10

    To run Llama 70B unquantized with Groq cards of 230MB, you'd need a staggering 1,246 of them at $20K each; that's $25 million total. Their crazy 80TB/s bandwidth would let you run the entire model stupidly fast on this setup. But good luck with the 249kW power draw! For comparison, with H100s for that same $25M you get 833 units at $30K per GPU. Each H100 has "only" 80GB of VRAM, so the 280GB model would need to be split across 3-4 GPUs. But with 833 GPUs, you could run around 238 instances instead of just 1 with Groq. The H100 rig would still chug 583kW, so even if Groq cards were 80x the speed of an H100, they would still be 3x behind the H100 in price per performance; to be competitive they would need to be close to $7K.
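
    Running the commenter's own numbers as code (every figure below is from the comment above, not a verified spec):

        BUDGET = 25_000_000
        MODEL_GB = 280                        # unquantized (fp32) Llama 70B

        # Groq side: ~230 MB of SRAM per card at an assumed $20K each
        groq_cards = round(MODEL_GB / 0.23)   # ~1,217; the comment's 1,246
                                              # implies slightly less usable SRAM
        groq_cost = groq_cards * 20_000       # ~$24M for a single model copy

        # H100 side: 80 GB of VRAM at $30K each
        h100s = BUDGET // 30_000              # 833 GPUs for the same budget
        h100_copies = int(h100s / (MODEL_GB / 80))   # ~238 parallel model copies

        print(groq_cards, groq_cost, h100s, h100_copies)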

    • @diga4696
      @diga4696 2 months ago +1

      I would say close to $5K. Blackwell with its DGX stack is a ready-to-rack solution which will offer even better price per performance, and working with a familiar stack is huge for bigger clients.

    • @dewardsteward6818
      @dewardsteward6818 2 months ago +2

      Please provide a legitimate source for the $20K. The Mouser thing people point at is a joke.

    • @actepukc
      @actepukc 2 months ago +1

      Haha, this breakdown does make you wonder what other burning questions Matt couldn't ask during the interview. Maybe Groq's pricing strategy will be revealed in the sequel, just like he hinted at follow-up questions?

  • @nuclear_AI
    @nuclear_AI 2 months ago +2

    In the context of computing and chips, when folks talk about a 7 nm (nanometer) process or a 5 nm (nanometer) process, they're referring to the size of the smallest feature that can be created on a chip. Smaller nanometer processes mean more transistors can be packed into the same space, leading to more powerful and efficient chips.
    I hope this helps visualize how incredibly small a nanometer is and the scale at which modern technology operates. It's like a magical journey from the world we see down to the realm of atoms and molecules, all packed into the tiny silicon chips powering the gadgets we use every day! 👇👇👇
    Imagine you have a meter stick. It's about as long as a guitar or a bit taller than a large bottle of soda. That's our starting point: one meter.
    Meter (m) - Our starting point. Picture it as the height of a guitar.
    Decimeter (dm) - Divide that meter stick into 10 equal parts, and each part is a decimeter. Think of it like the length of a large notebook, or a bit shorter than the width of your keyboard.
    Centimeter (cm) - Take one of those decimeters and chop it into 10 smaller pieces. Each piece is now a centimeter, roughly the width of your fingernail or a large paperclip.
    Millimeter (mm) - If we slice a centimeter into 10 tiny slivers, you get millimeters. That's about the thickness of a credit card or a heavy piece of cardboard.
    Now, hold onto your hat, because we're about to shrink down into the world of the incredibly tiny:
    Micrometer (µm) - Dive deeper and slice a millimeter into 1,000 pieces. Each piece is a micrometer, also known as a micron. You can't see these with your eyes alone; it's about the size of bacteria or a strand of spider silk.
    Nanometer (nm) - And now, the star of our journey! Cut one of those micrometers into 1,000 even tinier pieces. These are nanometers. A nanometer is so small that it's used to measure atoms, molecules, and the tiny features on computer chips. To put it in perspective, a human hair is about 80,000 to 100,000 nanometers wide. So, we're talking seriously small scales here.
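
    The same scale walk as straight arithmetic, each step a factor of 10 or 1,000 down from one meter:

        scales = [
            ("meter (m)",       1.0),
            ("decimeter (dm)",  1e-1),
            ("centimeter (cm)", 1e-2),
            ("millimeter (mm)", 1e-3),
            ("micrometer (um)", 1e-6),
            ("nanometer (nm)",  1e-9),
        ]
        for name, meters in scales:
            print(f"{name:16} = {meters / 1e-9:>13,.0f} nm")

        # A ~90,000 nm wide human hair spans ~12,857 features of a 7 nm process.
        print(round(90_000 / 7))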

  • @autohmae
    @autohmae 2 months ago

    Thanks for this interview! Great to see you were able to get it; you can be proud. And even if there are things you don't know, this is often still very useful: asking simple questions lets them think and speak instead of giving short answers.
    Regardless of whether they are a big deal or not, it helped me better understand the inefficiencies in the existing systems. There might be many questions I would have asked that Matt wouldn't know where to start with, especially if I had time to think about them... but these more surface-level questions are very useful. I knew parts were hand-written/tuned and that there is a big research area just in networking things together, but I never really got the big picture. Removing inefficiencies is a huge deal; removing a whole bunch of them at multiple levels is a game changer.
    It also shows that if an important part of CUDA is hand-written and took so many man-hours from really smart people, then AMD can't catch up as easily as many would like to see (their reasoning being: competition is good).

  • @jessicas-discoveries-age-6-12
    @jessicas-discoveries-age-6-12 2 months ago +1

    Great interview, Matt, really insightful. Being able to talk to LLMs in real time will actually make it feel like we are that much closer to AGI, even if there is still work to do to make it happen in reality.

  • @kongchan437
    @kongchan437 2 months ago +1

    Great to hear more tech pioneers from U of T, starting with Dr. Hinton himself. I remember our big Lisp manual was not like a commercially published textbook, so maybe it was made by U of T researchers? I remember seeing some very long Lisp programs and wondering which grad student had that highly abstract recursive-thinking ability.

  • @seancriggs
    @seancriggs 2 months ago +1

    Outstanding content, Matt!
    Very well managed and explained.
    Thank you for doing this!

  • @planetchubby
    @planetchubby 2 months ago +4

    this interview is awesome, really cool

  • @howardelton6273
    @howardelton6273 2 months ago +1

    Awesome interviewer achievement unlocked. This is a great format.

  • @cablackmon
    @cablackmon 2 months ago

    This is SUPER interesting and enlightening, especially the part about how inference speed can affect the actual quality of the output. Thank you! Keep it up, Matt!

  • @markwaller650
    @markwaller650 2 months ago

    Amazing interview and insights. Really interesting - how you asked the questions to make this accessible to us. Thank you all!

  • @RikHeijmen
    @RikHeijmen 2 months ago +2

    Matt! Wow! Did you find out more about the last thing they talked about, feeding the answer back multiple times and asking questions in a slightly different way? It seems like a new way of using the Groq chat rather than a new model, right?

    • @unom8
      @unom8 2 months ago

      It sounds like energy based modelling, no?

  • @kumargaurav2170
    @kumargaurav2170 2 months ago

    To date, the best video for providing insights about LPUs beyond just their faster inference speed. You should do more such videos, as they unlock so much behind-the-scenes for normal people. Outstanding video and outstanding company, Groq 🙏🏻🙏🏻

  • @jimg8296
    @jimg8296 1 month ago

    Fantastic interview. Learned so much. Thank you.

  • @instiinct_defi
    @instiinct_defi 2 months ago +2

    Amazing, This content is greatly appreciated!🔥🔥

  • @ZeroIQ2
    @ZeroIQ2 2 months ago

    That was a great interview, so much interesting information, good job Matthew!

  • @user-eo1vg6oc3v
    @user-eo1vg6oc3v 2 months ago +1

    An interesting combo of ideas was presented. One was using Claude 3 Opus to train the much smaller Claude 3 Haiku, which is quicker by being smaller, and prompting it step by step. Then it was suggested that adding Quiet-STaR, to rethink before answering, could make the answers 10-50% more accurate. This architecture on Groq seems to simplify the traffic flow with 'one way' timed traffic. The final suggestion about reiterating the question could be handled by Quiet-STaR, which automates that by directing a review of the whole process before answering; that gave 10-50% more accuracy, especially for math or code. So when will this be usable by the general public? A Groq cloud app?
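
    One way to picture that reiterate-and-review loop in code. generate() is a hypothetical stand-in for any fast inference call, not an actual Groq or Quiet-STaR API; the point is that at hundreds of tokens per second, a few extra passes per question still feel instant:

        def generate(prompt: str) -> str:
            """Hypothetical model call; plug in any fast inference API here."""
            raise NotImplementedError

        def answer_with_review(question: str, passes: int = 2) -> str:
            draft = generate(question)
            for _ in range(passes):
                # Feed the draft back, slightly rephrased, and keep the revision.
                draft = generate(
                    f"Question: {question}\n"
                    f"Draft answer: {draft}\n"
                    "Review the draft for mistakes and give an improved answer."
                )
            return draft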

  • @Maelzelmusic
    @Maelzelmusic 2 months ago

    Lovely video, Matt. Huge props for your evolution :).

  • @semeandovidaorg
    @semeandovidaorg 2 months ago

    Great interview!!! Thank you!

  • @bladestarX
    @bladestarX 2 months ago +1

    Great interview, Matt; you are the best. I think Groq helped create awareness of the benefits of designing and optimizing a chip for inference. However, wasn't this already known by leading companies like NVIDIA? GPUs just happened to be the most appropriate existing architecture for AI training and inference. Remember, prior to ChatGPT, it was all about AI classification and training; inference was just not a thing. Without that focus on inference, something like an LPU would simply not be justified for mass production. So the reason the big players don't have LPUs is simply that the demand for them was not there before ChatGPT woke the world up to LLMs. LPUs actually have a simpler architecture and fewer components than a general-purpose GPU. I believe Groq will benefit from being first, but it will be very difficult to defend or keep up with the larger chip makers, as they have the infrastructure to create LPUs that will probably perform 10x faster than Groq's 14nm.

    • @GavinS363
      @GavinS363 2 months ago +2

      This comment doesn't make any sense; what infrastructure is it that you speak of Navita having that gives them a huge advantage in designing chips? I think you mistakenly believe that companies such as Groq and Nivita are not only designing these chips but manufacturing them as well; this is incorrect.
      The only company that both designs and manufactures silicon is Intel; the rest all only design and then subcontract out to fabs. Usually it's TSCM, which only builds chips to spec and does not design them itself. That's how it is now and how it will remain for the foreseeable future. Trying to build a fab without having access to nation-level money is basically impossible at this point.

    • @bladestarX
      @bladestarX 2 months ago

      @@GavinS363 Everyone knows NVIDIA itself does not operate fabrication plants (fabs) for chip production but outsources the manufacturing to third-party foundries like TSMC and Samsung. They focus on design and development. Don't they have facilities for research and development, testing, and other purposes related to their products and technologies? You don't consider these critical infrastructure? How about their 30,000 employees, including their scientists, engineers, and architects? Do you think those give them an advantage when designing LPUs? Not sure why you thought I was explicitly talking about fabs, especially on a video about chip design and architecture. Maybe I should have said chip producer instead of manufacturer?

    • @user-ey6fd9im8o
      @user-ey6fd9im8o 2 months ago

      @@GavinS363 Make sure your spelling is correct first. NVIDIA, not Nivita, and TSMC, not TSCM.

  • @kingrara5758
    @kingrara5758 2 months ago

    great interview, so interesting. Loved seeing everyone's enthusiasm. Your videos are my favourite source of AI news. big thank you.

  • @BradleyKieser
    @BradleyKieser 2 months ago

    Absolutely the best interview ever! WOW!

  • @charlestheodorezerner2365
    @charlestheodorezerner2365 2 months ago

    Love your content. Thank you for all you do. And I love Groq. This was a really fresh look into an area (namely, the inner workings of hardware) that is rarely covered. So this was great.
    One insane benefit to Groq that I wish you had asked about: energy consumption. I gather that Groq chips are not only vastly faster, they are also vastly more energy efficient, which is insane when you think about it. Typically, energy consumption increases significantly with increases in speed (compare a 4090 to a 4060). Not Groq. It's blazingly fast while using a small fraction of the energy of a traditional GPU. This is a HUGE deal to me, not only because it decreases the cost of inference, but for environmental reasons. When you scale up the compute necessary to power the world's inference needs, the energy impact is scary. I wouldn't be surprised if AI inference becomes a greater source of greenhouse gas emissions than automobile use in a few years. And if I understand it correctly, Groq chips are massively more ecologically friendly. Ultimately, that should be as big a deal as the speed itself. Would love to understand better why they are so much more efficient....

  • @glennm7086
    @glennm7086 2 months ago

    Perfect level of detail. I wanted an LPU primer.

  • @vinaynk
    @vinaynk 1 month ago

    Very informative. This thing will be the heart of skynet :)

  • @rikhoffbauer
    @rikhoffbauer 2 months ago

    This is great! More like this! Very interesting and insightful

  • @darwinboor1300
    @darwinboor1300 2 months ago

    Thanks, gentlemen.
    The comparison seems to be between a momentum-bound industry, locked to existing architectures and looking for better ways to play musical chairs with its data, and a startup (Groq) practicing first principles to produce a new hardware model suited to the task at hand, one that moves data and results through memory and compute in multiple parallel queues.
    I look forward to seeing more from Groq.

  • @NoCodeFilmmaker
    @NoCodeFilmmaker 2 months ago +2

    Their API is really competitive too

  • @RonLWilson
    @RonLWilson 2 months ago +1

    Interesting!
    BTW, I spent my career with asynchronous software, and synchronous software was a big no-no in that it was too rigidly coupled; we needed to handle sloppy data flows over a distributed architecture.
    That said, we did write some of the drivers in hand-written assembly language that was synchronous, where we needed the speed.

  • @831Miranda
    @831Miranda 2 months ago

    Great interview! Very accessible info! 🎉❤

  • @albeit1
    @albeit1 2 months ago +1

    Creating hardware specifically designed to serve LLMs reminds me of why vertical integration works. Things get created or optimized to serve the mission. The company doesn’t have to adapt to how existing industries are doing things.

  • @savant_logics
    @savant_logics 2 months ago +1

    Thanks! Great interview.👍

  • @jonniedarko
    @jonniedarko 2 months ago

    By far my favorite video you have done! ❤

  • @elyakimlev
    @elyakimlev 2 months ago +1

    Good interview. I just wish you hadn't mentioned phones. I really wanted to know if they could create GPU-sized hardware for PCs that would outperform an RTX 3090 at inference while being able to run bigger models than the RTX can.

  • @coulterjb22
    @coulterjb22 2 months ago

    Great interview. I would have loved to hear how they are working on lowering manufacturing costs and when that might happen. My very limited understanding is that these chips are more expensive to make.

  • @netsi1964
    @netsi1964 2 months ago

    ARM was originally created the same way: design the instructions first, then the hardware. It was also originally the Acorn RISC Machine, as it was to be used inside the Acorn BBC microcomputer.

  • @scotlandcorpnaics2385
    @scotlandcorpnaics2385 1 month ago

    Outstanding discussion!

  • @fpgamachine
    @fpgamachine 2 months ago

    Very interesting talk, thanks!

  • @manishpugalia8559
    @manishpugalia8559 2 months ago

    Too good, very good learning. Kudos

  • @JariVasell
    @JariVasell 2 months ago +2

    Great interview! 🎉

  • @nicknick6464
    @nicknick6464 2 months ago +1

    Thanks for the great interview. I have a question: since their chip is on a quite old process (14nm), they must be thinking about an updated version based on 5nm or below. When will it be available, and how much faster will it be?

  • @goodtothinkwith
    @goodtothinkwith 2 months ago +1

    Great job Matt! It sounded like it would scale, but might be limited by the die size in the fab..? Is there a limit to how many chips can be chained together like one big chip? I.e., can many Groqs compete with Cerebras' massive chips? When can we get an agent-based Llama 2 (or 3!) that has this kind of reflexive thinking Andrew mentioned at the end? Good stuff!

    • @goodtothinkwith
      @goodtothinkwith 2 months ago +1

      Maybe even more provocatively, if a bunch of Groqs were chained together to be the size of Cerebras' chips, just how large of an LLM could it run?

  • @swamihuman9395
    @swamihuman9395 2 months ago

    - Fascinating.
    - Thx.

  • @AlexanderBukh
    @AlexanderBukh 2 months ago +2

    well spoken, aaight

  • @frankjohannessen6383
    @frankjohannessen6383 2 months ago

    The fact that their chip is built on 14nm transistors is insane. That's what Nvidia used for the GTX 10-series back in 2017. Imagine how fast Groq would be with 4nm transistors.

  • @nvda2damoon
    @nvda2damoon 1 month ago

    fantastic interview!

  • @seamussmyth2312
    @seamussmyth2312 2 months ago +2

    Great interview 🎉

  • @Raskoll
    @Raskoll 2 months ago +1

    These guys are actual geniuses

  • @jpdominator
    @jpdominator 1 month ago

    Simplicity never wins. Simplicity is using someone else’s library. Complexity is writing your own to increase performance. Complexity is going down several layers and working there. Igor did something extremely complex to create something more simple than the conventional.

  • @marktrued9497
    @marktrued9497 2 months ago

    Great interview!

  • @AncientSlugThrower
    @AncientSlugThrower 2 months ago

    Great interview for a great channel.

  • @KitcloudkickerJr
    @KitcloudkickerJr 2 months ago +1

    wonderful interview

  • @scott701230
    @scott701230 2 months ago +1

    The Groq chip sounds amazing.

  • @ZychuPL100
    @ZychuPL100 2 months ago +2

    This sounds like the LPU is a neuron! They basically created an artificial neuron that can be connected to other neurons, so this is like an artificial brain. Awesome!

    • @executivelifehacks6747
      @executivelifehacks6747 2 months ago

      That is the sense I got too. Why is the human brain efficient? Lots of parallel computation, not overly fast. That being said, it's not working the whole time, at least not all of it, AFAIK.

  • @shyama5612
    @shyama5612 2 months ago +1

    Would love a comparison between the Groq LPU and TPU v5p

  • @issiewizzie
    @issiewizzie 2 months ago

    Great interview

  • @janewairimu5625
    @janewairimu5625 2 months ago

    These Groq guys need funding in the billions to stop them giving in to large corporate bullying, as is happening to Inflection and Stability.
    Their work is so precious, yet tantalizing to the big corporations.

  • @rbdvs67
    @rbdvs67 2 months ago

    I wonder what, if any, the power-requirement differences are with the Groq architecture. Are they planning on making this on more current 4-5 nm silicon? Amazing interview and very exciting.

  • @kostaspramatias320
    @kostaspramatias320 2 months ago +1

    Darn, that's gonna be epic!

  • @testchannel7896
    @testchannel7896 2 months ago +1

    great interview

  • @ArnoldJagt
    @ArnoldJagt 2 months ago

    I have such a huge project for Groq as soon as it can handle digesting a big chunk of software.

  • @arturoarturo2570
    @arturoarturo2570 2 months ago +1

    Super instructive

  • @skitzobunitostudios7427
    @skitzobunitostudios7427 2 months ago

    Matt, are you going to interview Cerebras next? I would like you to maybe get two chaps from each company on a cast with you and have a little 'Shoot Out' of thoughts.

  • @jayconne2303
    @jayconne2303 2 months ago

    Very nice model of traffic at an intersection.

  • @ryzikx
    @ryzikx 2 months ago +2

    very good fantastic content 🤯🤯

  • @KCM25NJL
    @KCM25NJL 2 months ago

    Man, the Groq7B PCI-e accelerator card would be such an easy win..... guess we can keep dreaming :)

  • @jayeifler8812
    @jayeifler8812 1 month ago

    Speed is important but so is energy efficiency.

  • @Artfully83
    @Artfully83 2 months ago

    Ty

  • @segelmark
    @segelmark 2 months ago

    Cool that they achieve this on a 16nm process; without knowing anything, it feels like they might be able to get ~2x more performance and ~4x less power usage and size just by moving to the leading edge.

  • @PseudoProphet
    @PseudoProphet 2 months ago +6

    Elon should buy groq and run Grok on it. 😂😂

    • @caseyvallett8953
      @caseyvallett8953 2 months ago +1

      Oh the poetic justice that would be! At least use their architecture/hardware...

  • @transquantrademarkquantumf8894

    Nice Show

  • @1242elena
    @1242elena 2 months ago

    That's awesome 😎

  • @sherpya
    @sherpya 2 months ago

    essentially their architecture does need a complex scheduler to coordinate "nodes", true parallelism

  • @mithralforge8160
    @mithralforge8160 2 months ago

    No mention of the Blackwell announcement this week?

  • @CalinColdea
    @CalinColdea 2 months ago +2

    Thanks