Why does this GPU have an SSD? - AMD Radeon Pro SSG

  • Uploaded 4. 07. 2024
  • Get $25 off all pairs of Vessi Footwear with offer code LinusTechTips at www.Vessi.com/LinusTechTips
    SmartDeploy: Claim your FREE IT software (worth $580!) at lmg.gg/OTTP7
    AMD announced the Radeon Pro SSG in 2016, combining a GPU with onboard SSD storage - but when it launched in 2017, practically nobody bought it. Was it simply ahead of its time, or was it truly a dud?
    Discuss on the forum: linustechtips.com/topic/14183...
    Check out the Radeon™ Pro SSG: geni.us/yQ43dMY
    Buy some LTT Store Dot Com Cable Ties: www.lttstore.com/products/cab...
    Buy an Intel Core i9-12900K: geni.us/hrzU
    Buy an ASUS TUF Z690-PLUS WIFI D4: geni.us/mgWYr2
    Buy a Noctua NH-D15: geni.us/vnuvpW
    Buy a Corsair Force MP600: geni.us/TkxIgO
    Purchases made through some store links may provide some compensation to Linus Media Group.
    ► GET MERCH: lttstore.com
    ► AFFILIATES, SPONSORS & REFERRALS: lmg.gg/sponsors
    ► PODCAST GEAR: lmg.gg/podcastgear
    ► SUPPORT US ON FLOATPLANE: www.floatplane.com/
    FOLLOW US ELSEWHERE
    ---------------------------------------------------
    Twitter: / linustech
    Facebook: / linustech
    Instagram: / linustech
    TikTok: / linustech
    Twitch: / linustech
    MUSIC CREDIT
    ---------------------------------------------------
    Intro: Laszlo - Supernova
    Video Link: • [Electro] - Laszlo - S...
    iTunes Download Link: itunes.apple.com/us/album/sup...
    Artist Link: / laszlomusic
    Outro: Approaching Nirvana - Sugar High
    Video Link: • Sugar High - Approachi...
    Listen on Spotify: spoti.fi/UxWkUw
    Artist Link: / approachingnirvana
    Intro animation by MBarek Abdelwassaa / mbarek_abdel
    Monitor And Keyboard by vadimmihalkevich / CC BY 4.0 geni.us/PgGWp
    Mechanical RGB Keyboard by BigBrotherECE / CC BY 4.0 geni.us/mj6pHk4
    Mouse Gamer free Model By Oscar Creativo / CC BY 4.0 geni.us/Ps3XfE
    CHAPTERS
    ---------------------------------------------------
    0:00 Intro
    0:53 What is an... SSG?
    1:27 SSD performance
    2:06 Is this like DirectStorage?
    2:55 The SSG API
    4:11 Enter Adobe
    5:00 But... Why?
    6:02 Can we... Upgrade it?
    7:04 Why is direct-to-GPU storage important?
    7:58 Conclusion - Why it won't come back
  • Science & Technology

Comments • 2.2K

  • @alphapuggle
    @alphapuggle Před 2 lety +4115

    I'm glad there's finally an answer to why that GPU does

    • @32bites
      @32bites Před 2 lety +80

      Has Anyone Really Been Far Even as Decided to Use Even Go Want to do Look More Like?

    • @noamtsur
      @noamtsur Před 2 lety +111

      but no one ever asks how is GPU

    • @brownie2648
      @brownie2648 Před 2 lety +43

      @@32bites r/ihadastroke

    • @brownie2648
      @brownie2648 Před 2 lety +3

      @@noamtsur the gpu iz brocken :((((

    • @tobiwonkanogy2975
      @tobiwonkanogy2975 Před 2 lety +1

      this gpu does well. Gpu fricks.

  • @Legatron17
    @Legatron17 Před 2 lety +5870

    Truly opened my eyes as to why the GPU does

  • @loicgregoire3058
    @loicgregoire3058 Před 2 lety +103

    Those would be crazy useful in AI applications - imagine loading your whole dataset onto the GPU once without having to reload it for each training iteration

    • @steevem4990
      @steevem4990 Před 2 lety +2

      I actually thought that's where they were going when Anthony swapped the SSDs

    • @fluidthought42
      @fluidthought42 Před rokem

      Especially now that ML has moved away from NVIDIA proprietary tech
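
A minimal sketch of the idea in this thread, assuming PyTorch and a toy dataset small enough to stage in device memory (the sizes and model here are invented for illustration): copy the data to the GPU once, so every training iteration reads from VRAM instead of re-streaming batches from disk or host RAM each epoch.

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# Hypothetical toy dataset that fits in GPU memory.
features = torch.randn(100_000, 256)
labels = torch.randint(0, 10, (100_000,))

# Stage the whole dataset on the GPU once...
features, labels = features.to(device), labels.to(device)

model = torch.nn.Linear(256, 10).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
loss_fn = torch.nn.CrossEntropyLoss()

for epoch in range(5):
    # ...so each iteration slices tensors already resident in VRAM
    # instead of copying them over PCIe every epoch.
    for i in range(0, features.shape[0], 1024):
        x, y = features[i:i + 1024], labels[i:i + 1024]
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()
```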

  • @davidg5898
    @davidg5898 Před 2 lety +389

    If the API was actually widely rolled out, something like this would be incredibly useful for science departments at universities (which is a niche market, but not an insubstantial one).

    • @jinxash3580
      @jinxash3580 Před 2 lety +23

      Would have been ideal for training deep learning models using this GPU

    • @davidg5898
      @davidg5898 Před 2 lety +16

      @Sappho I was doing astronomical simulations (my work was more with globular cluster formation and evolution, tens of thousands to millions of stars, sometimes coupled with a molecular cloud during early formation) and there definitely would have been a performance boost if the read-out/save time of each time slice could have been sped up by having the GPU dump it straight to storage.
      Just as you also described, most of my work was done on a Beowulf cluster with a lot of high-powered off-the-shelf GPUs.

    • @TheCodyLaxton
      @TheCodyLaxton Před 2 lety +7

      Niche but actually a large market haha - I worked in university research at the National Weather Center. Basically any excuse to build something cool for research is the path most travelled by pre-doctoral students and undergrads.

    • @lazertroll702
      @lazertroll702 Před 2 lety

      meh ... there's only so many times storage malware can give the feelz...
      although .. iffins the gpu executed from that storage on init ... 🤔

  • @pezz1232
    @pezz1232 Před 2 lety +496

    I wonder how long until the title changes lol

  • @mirandahw
    @mirandahw Před 2 lety +578

    Ah, yes, "Why does this GPU"
    Gotta love the original titles...

  • @joshuatyler4657
    @joshuatyler4657 Před 2 lety +380

    Imagine being the engineering team that made all of that work, only for the industry to say "ok. cool."

    • @mnomadvfx
      @mnomadvfx Před 2 lety +15

      The people who actually buy them to work with appreciate it.
      The people who just commentate on it are not "the industry" at all, just glorified journalists.

    • @JonatasAdoM
      @JonatasAdoM Před 2 lety +2

      @@mnomadvfx If the industry needed it, it would exist. Just look at how much better enterprise tools and systems are.
      Also, it's a victory against ever more complex and complicated hardware.

    • @paulpoco22
      @paulpoco22 Před 2 lety +3

      Just like VESA Local Bus video cards

    • @lazertroll702
      @lazertroll702 Před 2 lety

      @@JonatasAdoM aye! #KludgeDeath

  • @tranquilitybase8100
    @tranquilitybase8100 Před 2 lety +54

    This technology eventually found a home... AMD used it in the PS5 and Xbox Series. Both systems can load into RAM directly from the SSD, bypassing the CPU.

    • @Pacbandit13
      @Pacbandit13 Před 2 lety +1

      Is that true?

    • @louism771
      @louism771 Před 2 lety +6

      Well, this led to DirectStorage, which we have today on these consoles and will soon have on Windows 11. Pretty much the same idea, technologically a bit different I guess.

    • @derpythecate6842
      @derpythecate6842 Před 2 lety +2

      I thought bypassing the CPU would be difficult/insecure, and I did some research and was right.
      A complete CPU bypass would mean being able to skip the kernel layer and all its security checks, which gives you arbitrary read/write to memory from disk - a loophole that should be minimized.
      What DirectStorage does is simply use the GPU to decompress compressed data sent over the bus from the SSD, which is then handed to the CPU to decode and execute. This basically just speeds up the data retrieval pipeline but doesn't expose any loopholes, as it is fundamentally the same underlying mechanism all computers use today to fetch data.
      The AMD SSG card in the video can do such caching because the GPU doesn't execute the kernel code, which means that while you can still write malicious code to target the GPU, it's way more self-contained than executing it directly on the CPU, which controls all your processes including your OS.
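
As a rough illustration of the conventional path described above (SSD to kernel/CPU to host RAM, then a PCIe copy into VRAM), here is a minimal sketch assuming NumPy and PyTorch; the file name and sizes are made up. DirectStorage and the SSG API both aim to shorten this chain so the GPU-visible buffer is filled from NVMe with far less CPU involvement.

```python
import numpy as np
import torch

ASSET = "textures.bin"  # hypothetical asset file for this sketch

# Create a dummy 16 MiB asset so the example runs end to end.
np.random.default_rng(0).integers(0, 256, 16 << 20, dtype=np.uint8).tofile(ASSET)

# Conventional path: the CPU performs the file I/O into host RAM...
host_copy = np.fromfile(ASSET, dtype=np.uint8)

# ...and only then is the data copied over PCIe into GPU memory.
if torch.cuda.is_available():
    gpu_copy = torch.from_numpy(host_copy).to("cuda")
```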

  • @mini-_
    @mini-_ Před 2 lety +1385

    Everyone always ask "Why does this GPU?", but never asks "How is this GPU?" 😔

  • @Steamrick
    @Steamrick Před 2 lety +229

    'Why does this' indeed. I think the question the manufacturer asked is 'why not?'.

    • @Worthy_Edge
      @Worthy_Edge Před 2 lety +9

      @Czarodziej stronger than my will to click this link and be greeted by cringe

    • @nikoheino3927
      @nikoheino3927 Před 2 lety

      @Czarodziej stronger than my will to live after reading your comment.

    • @nobody7817
      @nobody7817 Před 2 lety

      @@Worthy_Edge I think you meant to say "greeted" but still that was a HILARIOUS retort!"

  • @Tsudico
    @Tsudico Před 2 lety +74

    I wonder if something like a separate PCIe daughter card with an interface similar to SLI or Crossfire would have worked better. It wouldn't have shared bandwidth across the PCIe bus but would still have allowed direct access to the SSDs installed.

    • @ryanchappell5962
      @ryanchappell5962 Před 2 lety +3

      Hey this seems like a really cool idea.

    • @jeromenancyfr
      @jeromenancyfr Před 2 lety +3

      I'm not sure I understand - aren't the SLI and Crossfire interfaces very slow by themselves? The data would move through PCIe, like any NVMe drive

    • @Rakkakaze
      @Rakkakaze Před 2 lety +1

      @@jeromenancyfr I imagine the idea is, GPU calls for data, pass over link, then out through pci... CPU calls for data, pass through pci.

    • @mnomadvfx
      @mnomadvfx Před 2 lety

      That eats into compute density though if you want to have several of them per node.

    • @legendp2011
      @legendp2011 Před 2 lety

      Well, my understanding is that's basically how NVMe DirectStorage drives are going to work (and they're already working like that in the PS5)

  • @cleverclever2317
    @cleverclever2317 Před 2 lety +2

    7:58 hello back mr editor

  • @BurntFaceMan
    @BurntFaceMan Před 2 lety +312

    "This is hilarious" : "You can totally run your system without any additional storage as long as you are ok with the overhead of sharing bandwidth with a GPU." Anthony's sense of humour differs from most. Moss would be proud.

    • @saikanzen1762
      @saikanzen1762 Před 2 lety +9

      I read that in Moss' voice and I cannot agree more.

    • @WarPigstheHun
      @WarPigstheHun Před 2 lety +5

      Goddamn Moss and his old Fetts..

    • @adriancoanda9227
      @adriancoanda9227 Před rokem

      Ah, if you have PCIe 64x - since most used programs run their tasks in RAM and after that the boot disk is barely used - you could use such cards in low-profile setups without chopping up the GPU cooling solution. If it also had a CPU socket to handle the graphics management, that would be something

  • @watercannonscollaboration2281

    I learned that this was a thing a month ago when doing research on the WX 2100 and I’m surprised no major tech channel did something funny with it

    • @cool-soap
      @cool-soap Před 2 lety +13

      @@joz534 no, run.

    • @cmd8086
      @cmd8086 Před 2 lety +3

      Maybe because it is too expensive?

  • @hdrenginedevelopment7507
    @hdrenginedevelopment7507 Před 2 lety +3

    That kind of reminds me of the old Real3D Starfighter PCI Intel i740 gfx card from wayyyy back in the day. Intel had just released the AGP bus architecture and the i740 was their first foray into the discrete graphics space…probably to help support the launch of AGP 1X, because it wasn’t all that fast otherwise. For the majority of the non-AGP systems, Real3D built a card with an AGP-PCI bridge chip that basically had an AGP bus and dedicated SDRAM AGP texture memory on board, in addition to the i740’s local SGRAM framebuffer RAM like any other graphics card. It was pretty cool at the time. They were sold with 4-8 MB framebuffer plus 8-16 MB AGP texture memory for a max of whopping 24 MB total onboard. They weren’t very fast, but they supported 1024x1024 px texture tile resolution whereas the vast majority of the competition including 3DFX only supported 256x256 pixels max resolution texture tiles. It was slow, but it looked so much better than anything else on the market and helped milk some extra capability from old non-AGP slot systems…perfect tradeoff people like Nintendo 64 players were used to dealing with, lol. 3DFX Voodoo 2 cards had a similar structure with separate RAM for framebuffer and texturing. Ok, now I’m done dating myself 😂

  • @vectrobe
    @vectrobe Před 2 lety +11

    There's also one thing that wasn't really mentioned: they launched EPYC and Threadripper around the same time, which effectively provided the same functionality. This card came from a timeframe where NVMe RAIDs were an amazing concept but the PCIe lanes needed for them were often hard to come by, even on the Xeon and Opteron series

  • @Hobo_X
    @Hobo_X Před 2 lety +743

    I honestly wondered if this GPU was actually hiding a secret, that Microsoft had these to base DirectStorage work off of for all these years while they worked on it. Maybe now that it's finally public and AMD has actual tangible research into this as the product actually exists... well, I don't know... imagine if RDNA3 has this as a surprise feature to work amazingly with DirectStorage?!

    • @nielsbishere
      @nielsbishere Před 2 lety +51

      This is exactly what's needed to blow graphics to a new area, think of the huge scenes you could render

    • @epobirs
      @epobirs Před 2 lety +39

      The beta setup for DirectStorage used Nvidia RTX cards, as Nvidia was already doing work in the same direction for RTX I/O, aimed at the workstation market. Remember, they needed something that was going to work in the PCs people will own in the foreseeable future rather than create something requiring a costly niche hardware design. If Microsoft used them in R&D at all, it was more likely for the Series X/S Velocity Architecture, as a proposed console design was less sensitive to non-standard hardware if the cost was good. Even then, this wasn't very close, as the major component there (and in the PS5) is the controller functionality with the dedicated decompression block. Offloading those operations from the CPU and GPU is a big factor in letting the console perform optimally.
      I strongly suspect that Microsoft and AMD will try to push an open standard for a PC hardware spec that will bring a version of Velocity Architecture to PC to give DirectStorage the full functionality it has on Xbox. This needs to be a vendor-independent spec to get Intel and Nvidia on board, otherwise it will remain a niche that game developers will be reluctant to use. A recent previous example would be DirectML, which is hardware agnostic and relies on the drivers to bridge the gap between PCs and vendors of ML-focused hardware. Thus the ML hardware can live in the CPU, GPU, or a separate device on the PCIe bus; the user doesn't need to know, so long as the driver tells the system what to look for and how to talk to it.

    • @kkon5ti
      @kkon5ti Před 2 lety +2

      This would be amazing

    • @erdem--
      @erdem-- Před 2 lety +7

      At this point, I think we don't need CPUs. It is cheaper and better (for gaming) to produce all-in-one APU designs, like how the PS5 and other game consoles are designed.

    • @zaidlacksalastname4905
      @zaidlacksalastname4905 Před 2 lety +2

      According to some market analysts, top RDNA4 could come with 512 gigs of pcie gen 4 memory

  • @EvanMorgoch
    @EvanMorgoch Před 2 lety +212

    With respect to the random read speeds (1:41): why not test the drives independently from the SSG, or use MP600 drives in the SSG, to get a proper apples-to-apples comparison? The drives' firmware may just be crap and account for why the random speeds don't scale nearly as well.

    • @ThranMaru
      @ThranMaru Před 2 lety +3

      Ain't nobody got time for that.

    • @flandrble
      @flandrble Před 2 lety +4

      Because driver overhead for RAID increases latency. On AM4 you're losing approx 30% of your IOPS even if all your SSDs are connected to the CPU and not the chipset. Intel is nowhere near this bad (same with Windows) but it's still a loss.

    • @ayoubboulehfa3932
      @ayoubboulehfa3932 Před 2 lety +4

      @@ThranMaru well, they tested a GPU from 2017 that no one has, so yes, they have time

    • @bigweeweehaver
      @bigweeweehaver Před 2 lety +2

      @@ayoubboulehfa3932 has nothing to do with time and more with uniqueness to interest the viewer into clicking on the video.

    • @I2obiNtube
      @I2obiNtube Před 2 lety +1

      Because then you'd just be testing drive performance which wouldn't make sense. It's end to end testing

  • @andrewbrooks2001
    @andrewbrooks2001 Před 2 lety

    Great video and information presentation! Thank you!

  • @SuperLarryJo
    @SuperLarryJo Před 2 lety

    So good to see Anthony back on Anthony Tech Tips

  • @scorch855
    @scorch855 Před 2 lety +669

    If ML libraries target this platform, it seems like it could be a compelling option. Nowadays models are getting so large that even 24 GB of VRAM is not enough. Yes, the performance would undoubtedly be worse using SSDs, but the alternative is not being able to use the model at all.
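
One hedged sketch of that trade-off, assuming PyTorch (the layer sizes are invented): parameters live in host memory, standing in for bigger-but-slower storage, and each layer is streamed onto the GPU only for the moment it is needed - slower than keeping everything resident in VRAM, but the model runs at all.

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# A toy "model too big for VRAM": keep every layer off the GPU and stream
# one layer at a time onto the device during the forward pass.
layers = [torch.nn.Linear(4096, 4096) for _ in range(8)]  # invented sizes

@torch.no_grad()
def forward_streaming(x: torch.Tensor) -> torch.Tensor:
    x = x.to(device)
    for layer in layers:
        layer.to(device)           # copy just this layer's weights in
        x = torch.relu(layer(x))
        layer.to("cpu")            # evict it to make room for the next one
    return x

out = forward_streaming(torch.randn(32, 4096))
print(out.shape)  # torch.Size([32, 4096])
```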

  • @CoolJosh3k
    @CoolJosh3k Před 2 lety +293

    Actually handy for when your motherboard does not have enough M.2 slots.
    You can buy these as just the RAID 0 cards that will plug in and use PCIe x4.

    • @WayStedYou
      @WayStedYou Před 2 lety +30

      They could literally give you more m.2 if they gave you a pcie card with m.2 slots

    • @upperjohn117aka
      @upperjohn117aka Před 2 lety +35

      @@WayStedYou but those dont look cool

    • @Bobis32
      @Bobis32 Před 2 lety +35

      @@WayStedYou As someone who uses an ITX system, when PCIe 5.0 comes out I really hope something like this comes back, as even GPUs barely use the extra bandwidth from PCIe 4.0 - why not put some M.2 slots on GPUs, especially with MDA coming in the near future

    • @somefish9147
      @somefish9147 Před 2 lety

      @@Bobis32 power and bandwidth

    • @virtualtools_3021
      @virtualtools_3021 Před 2 lety +11

      @@somefish9147 oh yeah bc SSD use sooooooooooooooooooooo much power

  • @thesix______
    @thesix______ Před 2 lety

    7:55 thx editor

  • @xaytana
    @xaytana Před 2 lety +3

    I'd be curious to see this concept again once m.2 key f finally sees some use; though if we never see a future where there's high bandwidth busses with tight memory timings, essentially combining what GPUs and CPUs like, this concept should be put off to key H, J, K, or L, to not confuse high bandwidth GPU memory with tight timing CPU memory on key f, assuming a future memory standard ever actually makes the switch. Though with how fast devices are becoming, it'd be cool to see a unified memory-storage platform where the only difference is if the chip itself is considered volatile or not, essentially the original concept of Optane on steroids; this would also be cool if there's semi-volatile chips where a sudden shutdown could retain otherwise volatile data.

  • @jseen9568
    @jseen9568 Před 2 lety +216

    When PCIe Gen 4 first came out, everyone was saying how it wasn't practical because it wouldn't be used fully. I said then that it would be more interesting if you saw some instances where multiple uses through a single PCIe x16 slot could take place without any hindering of performance. This would be one of those scenarios. Not useful, but pretty cool.

    • @BrentLobegeier
      @BrentLobegeier Před 2 lety +29

      Couldn't agree more. When someone made a car, everyone said horses were better. Without manufacturers trying things outside of the box we would never progress, and I have no idea why everyone is so against innovation. Noone is forcing anyone to become early adopters of anything, and most things people were skeptical about soon became integral to everyday life. With progression comes niche products like this, but at least we can say they are trying.

    • @bojinglebells
      @bojinglebells Před 2 lety +15

      and now we're up to PCIe 5.0 with Alder Lake...there's even consideration to adjust NVMe storage standards from 4 lanes down to 2 because of how much bandwidth 4.0 and now 5.0 offer.
      I would love a product like this if only to gain more NVMe storage without taking up extra slots

    • @CheapSushi
      @CheapSushi Před 2 lety +4

      @@bojinglebells same, I love the dual functionality. I get a pretty decent GPU and 4 NVMe slots in two PCIe slots instead of three if I had to get a separate addon card. I personally love using up all my 7 slots with lots of cards.

    • @jseen9568
      @jseen9568 Před 2 lety +2

      @@bojinglebells And I think about some more niche area like small form factor PCs and even the NUC extreme. With the speed and bandwidth increases, these types of compute cards could make for near instantaneous connections and make those types of products more viable

  • @DasFuechschen
    @DasFuechschen Před 2 lety +212

    I remember the launch event for this at SIGGRAPH. AMD "gifted" some of those cards to RED, which then gave them to some Indian filmmakers who had previously beta-tested the card on animation and editing for one of their movies, if I remember correctly. But TBH, I have more memories of the after-party than the event itself.

  • @ravencorvus7903
    @ravencorvus7903 Před 2 lety

    Really needed some Anthony today. Not disappointed.

  • @beythastar
    @beythastar Před 2 lety +1

    I've been waiting for this video for such a long time! Finally, I can see how it performs!

  • @seireiart
    @seireiart Před 2 lety +39

    "Why does this GPU?!!"
    Great question.

    • @Worthy_Edge
      @Worthy_Edge Před 2 lety +3

      Only 11 minutes and there’s already 2 bot replies

    • @seireiart
      @seireiart Před 2 lety

      @@Worthy_Edge These bots can't just chill. Can they?!!

  • @GuusKlaas
    @GuusKlaas Před 2 lety +48

    Man, from what I recall, this thing was baller for Revit/CAD work. Those needed the entire model in VRAM, and it'd be a massive hurdle to do that over SSD > CPU > MEM > GPU. This was pre-host bus controller, which is the 'not as fancy' name for directstorage. Allowing devices 'other' than the main controller in a PCIe network to take control of another device. Like a GPU just... assuming direct control of an SSD (after some mediation obv) to just load stuff off without the big overhead. Obviously since then we also got (first on AMD, later on Intel) SSD's direct on CPU, rather than a PCH in-between (like Intel had until recently when they figured out that just 16 lanes from CPU was not enough).

    • @MrTrilbe
      @MrTrilbe Před 2 lety +6

      I was kinda thinking the same, or using it for parallelised ML or big data applications - it is a workstation card after all. Running an OpenCL-coded ML algorithm directly from 2TB of fast storage on the GPU, that's a lot of test data.

    • @ProjectPhysX
      @ProjectPhysX Před 2 lety +7

      It's very interesting for computational fluid dynamics too. Although there are ways to make CFD codes require less VRAM (demos on my YT channel), you always want maximum resolution possible. You could do colossal resolutions with TBs of storage. But in practice it's unusable, because even the 14GB/s is extremely slow, and you would rewrite this storage 24/7 which would quickly degrade/break it. With VRAM directly, you can do 400-3300 GB/s. So the SSG never really took off.

    • @Double_Vision
      @Double_Vision Před 2 lety +1

      I occasionally deal with massive scenes and mesh or particle caches in Redshift for Maya, and Redshift could use this for sure! The same goes for trying to use Redshift to render massive print images where Redshift's out-of-core technology could benefit from having all this storage connected directly to the GPU core. No more Out-Of-Memory failures!
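
Back-of-the-envelope arithmetic behind the bandwidth point in the reply above, using only the figures quoted there (2 TB of onboard NVMe, roughly 14 GB/s aggregate from the SSDs, and hundreds to thousands of GB/s from VRAM):

```python
# One full pass over the card's storage at the quoted rates.
dataset_gb = 2048   # the SSG's 2 TB of NVMe
ssg_gbps = 14       # aggregate NVMe read speed quoted above
vram_gbps = 2000    # upper end of the VRAM bandwidth range quoted above

print(f"pass from NVMe: {dataset_gb / ssg_gbps:.0f} s")   # ~146 s
print(f"pass from VRAM: {dataset_gb / vram_gbps:.1f} s")  # ~1.0 s
```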

  • @bertoonz
    @bertoonz Před 2 lety

    Hey, you're rocking this video, man! Nice hosting

  • @chadlumpkin2375
    @chadlumpkin2375 Před 2 lety +2

    This reminds me of the Intel math coprocessors for the 286/386 CPUs, before floating-point unit (FPU) processing became the default for all x86 processors. With the 486, Intel introduced the 486DX with the FPU and the 486SX with the FPU disabled.

  • @fuzzynine
    @fuzzynine Před 2 lety +43

    Boy, this is awesome. I wish you would show more obscure tech. I feel like watching retro computer channels right now. Only with new stuff. :D
    Thanks. This is really awesome!

  • @zoey.steelimus
    @zoey.steelimus Před 2 lety +55

    LTT: "Why does this GPU?"
    Me: "Yes, but have you considered HOW the GPU does?"

  • @NdxtremePro
    @NdxtremePro Před 2 lety +7

    I wonder how hard implementing a Direct Storage layer over the API would be.

    • @mnomadvfx
      @mnomadvfx Před 2 lety

      Probably easier because you are cutting out a middle man - though there might be some latency introduced as they communicate with each other.

  • @JedismyPet
    @JedismyPet Před 2 lety

    whoever did the ad animation i love you for adding Saitama

  • @vgaggia
    @vgaggia Před 2 lety +35

    I wonder how it'd work with deep learning stuff, if the memory capacity would outweigh the speed.

    • @ilyearer
      @ilyearer Před 2 lety +16

      I was surprised there was no mention of that potential application as well.

    • @ZandarKoad
      @ZandarKoad Před 2 lety +6

      @@ilyearer Same. Seriously looking hard at this card now, since memory size is an upper limit on the types of existing neural nets you can fine tune. RTX 3090 has only 24 Gigs compared to this, 2048 Gigs. Yikes.

  • @kaseyboles30
    @kaseyboles30 Před 2 lety +20

    Adding modular storage to a GPU makes sense if it's directly useable by the GPU itself. A game could preload the textures and models to the storage and use them from there similar to how direct storage works, but potentially faster and lower latency.

    • @ProjectPhysX
      @ProjectPhysX Před 2 lety +4

      It's very interesting for computational fluid dynamics too. Although there are ways to make CFD codes require less VRAM (see the demos on my YT channel), you always want maximum resolution possible. You could do colossal resolutions with TBs of storage. But in practice it's unusable, because even the 14GB/s is extremely slow, and you would rewrite this storage 24/7 which would quickly degrade/break it. With VRAM directly, you can do 400-2000 GB/s. So the SSG never really took off.

    • @vamwolf
      @vamwolf Před 2 lety +1

      Yes and no. Games still use SDR textures to the point that HD assets are not worth it atm.

    • @kaseyboles30
      @kaseyboles30 Před 2 lety +4

      @@ProjectPhysX For applications like that, where data is rewritten constantly, I think just adding SODIMM slots for DDR5 would be ideal. With 4 slots you could add a ton of RAM. Not as fast as the GDDR RAM, but good enough to be worthwhile.

    • @ravenof1985
      @ravenof1985 Před 2 lety +1

      @@kaseyboles30 i feel this is the answer for a lot of GPU applications, from low budget cards (4GB VRAM not enough anymore, pop a desktop DIMM in the expansion slot) to the high end, populate all 16+ DIMM slots for maximum AI/machine learning/CFD performance.

    • @aravindpallippara1577
      @aravindpallippara1577 Před 2 lety +1

      @@ravenof1985 Aye, it would be faster and cheaper in the long run - though you aren't breaking new ground in VRAM unless you're going for HEDT with a Threadripper CPU or something

  • @Sencess
    @Sencess Před 2 lety

    0:01 LOL whose idea was that, EPIC intro

  • @SJA962
    @SJA962 Před 2 lety

    You look so much more comfortable with the sponsor reads. Good job.

  • @MerpSquirrel
    @MerpSquirrel Před 2 lety +10

    I could see this being used for machine learning or data analysis for Microsoft R. Good use case for DirectStorage.

    • @willgilliam9053
      @willgilliam9053 Před 2 lety

      train a model with very limited host CPU usage... ya that would be cool

  • @jfolz
    @jfolz Před 2 lety +20

    Everyone asks "Why does GPU?"
    Nobody asks "How does GPU?"

  • @vladislavkaras491
    @vladislavkaras491 Před 2 lety

    Interesting.
    Thanks for the video!

  • @Michplay
    @Michplay Před 2 lety +4

    it just amazes me that Direct Storage / RTX IO is taking this long for a demo to test with

  • @thatsgottahurt
    @thatsgottahurt Před 2 lety +3

    Hope to see some Direct Storage content soon.

  • @benjaminlynch9958
    @benjaminlynch9958 Před 2 lety +48

    Huge use case for AI training. Anything over 80GB of memory means training has to move from GPUs to CPUs today, and that means a slowdown of multiple orders of magnitude. Unfortunately, AMD has never had any real market share in the AI/ML world because their software support - even in 2020 - sucks.

    • @ManuSaraswat
      @ManuSaraswat Před 2 lety +5

      how about in 2022?

    • @WisestPongo
      @WisestPongo Před 2 lety

      @@RyTrapp0 ye but intel bad

    • @fernbear3950
      @fernbear3950 Před 2 lety

      Wear-out makes it a non-starter; for inference, though, it could maybe be a monster in the right circumstances.

    • @ProjectPhysX
      @ProjectPhysX Před 2 lety +4

      AMD has introduced their new MI250X GPU with 128 GB memory.
      But still you can never have enough memory. I'm working with CFD (see my YT channel), and there it's the same problem: You always want maximum resolution possible. You could do colossal resolutions with TBs of storage. But in practice the SSG is unusable, because even the 14GB/s is extremely slow, and you would rewrite this storage 24/7 which would quickly degrade/break it. With VRAM directly, you can do 400-2000 GB/s. So the SSG never really took off.

    • @ZandarKoad
      @ZandarKoad Před 2 lety +1

      @@ProjectPhysX Thanks, I figured as much. It's a shame. Memory in the TB range truly opens up new possibilities for deep learning.

  • @sirfer6969
    @sirfer6969 Před 2 lety

    Love your work Anthony, keep it up =)

  • @mrkezada5810
    @mrkezada5810 Před 2 lety +3

    I think one of the most productive uses for this GPU would be enabling fast unified-memory-style access when programming with OpenCL or something like that. Although that is a really niche and low-level use case, mostly research-focused.
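
A minimal sketch of that kind of host-visible, zero-copy-style buffer in plain OpenCL, assuming pyopencl and an available OpenCL runtime (this uses the standard CL_MEM_USE_HOST_PTR path, not the SSG's proprietary API):

```python
import numpy as np
import pyopencl as cl

ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)
mf = cl.mem_flags

host = np.arange(1 << 20, dtype=np.float32)

# USE_HOST_PTR asks the runtime to back the buffer with the host allocation,
# which on supporting platforms avoids an explicit staging copy.
buf = cl.Buffer(ctx, mf.READ_WRITE | mf.USE_HOST_PTR, hostbuf=host)

prog = cl.Program(ctx, """
__kernel void scale(__global float *a) { a[get_global_id(0)] *= 2.0f; }
""").build()

prog.scale(queue, host.shape, None, buf)
cl.enqueue_copy(queue, host, buf)  # sync results back into the host view
print(host[:4])  # [0. 2. 4. 6.]
```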

  • @abhivaryakumar3107
    @abhivaryakumar3107 Před 2 lety +6

    Ngl Anthony is my favourite LTT member and it makes me so happy whenever I see his face in a thumbnail:))

    • @LordYamcha
      @LordYamcha Před 2 lety +5

      Same but these bots goddamnit

    • @abhivaryakumar3107
      @abhivaryakumar3107 Před 2 lety +4

      @@LordYamcha I stg what the absolute fuck is this, commented 2 minutes ago and there are already 2 bots

  • @00kidney
    @00kidney Před 2 lety +6

    Everyone is asking "Why does this GPU?" but I'm just glad to see an upload featuring Anthony.

  • @490o
    @490o Před 2 lety

    I like how you guys kept that title

  • @keldwikchaldain9545
    @keldwikchaldain9545 Před 2 lety +2

    When I saw that board I thought they were gonna have a complex memory controller that'd drive the nvme drives with the normal ddr memory as a cache, not as literal storage devices sitting on the gpu for fast load times.

  • @kevinheimann7664
    @kevinheimann7664 Před 2 lety +7

    Would be interesting if such an idea were combined with Optane memory, with a driver using it as second-level RAM

    • @suhendiabdulah6061
      @suhendiabdulah6061 Před 2 lety

      What did you mean by Optane? Can Optane store data? Sorry if I am wrong

  • @markk8104
    @markk8104 Před 2 lety +6

    Did you try seeing if one of the versions of graphics card powered SQL works well on this? Current issue with this is the data transfer speed with the CPU step involved. So might be worthwhile trying that.

    • @cheeseisgud7311
      @cheeseisgud7311 Před 2 lety +1

      It really wouldn't help unless the sql server used the api this GPU needs to directly access the file

  • @floogulinc
    @floogulinc Před 2 lety +1

    Actually this design is kinda awesome for mini itx machines where storage expansion is very limited and you're already using your only PCIe slot for the GPU.

  • @genesisphoenix00
    @genesisphoenix00 Před 2 lety

    For someone who used to build SFF, this would have been a godsend in 2017 - in fact, even now it's still good. I had a Dan Case A4-SFX, and most of its volume is dedicated to the GPU and CPU. Yes, you can cram in three 2.5" drives, but boy, you need custom cables for everything - motherboard, CPU, GPU - to make space for the drives. Even the Lian Li TU105 only has 1-2 drive mounts, and with ITX boards a low-end one comes with maybe one M.2 slot and a high-end one with two. Having this would solve so many space issues for me; my Steam library is already 6TB

  • @juniperburton7693
    @juniperburton7693 Před 2 lety +3

    I do like this idea. Would be cool to see it come back. This would be really great, actually, for space confined builds. It seems... unique

  • @user-rd3jw7pv7i
    @user-rd3jw7pv7i Před 2 lety +9

    I can see this being used for ONE specific use case: instead of having a separate SSD-in-one-enclosure and a GPU taking up more than 1 or 2 PCIe slots (i.e. one LIQID Honey Badger and one GPU), just use this!
    This card actually makes sense, and I'm sad to see this tech not taking off, because if you know how and why to use it, this is revolutionary!

    • @Craft97pl
      @Craft97pl Před 2 lety

      With DirectStorage, sharing bandwidth with the SSD is no problem. The problem is the GPU itself - in a few years it will suck.

    • @adriancoanda9227
      @adriancoanda9227 Před rokem

      @GoSite Solder a better one, flash the matching firmware, done - or you could adapt socket-like mountings and replace the GPU as often as you want. How do you think they test a GPU before it is assembled?

  • @Pratalax
    @Pratalax Před 2 lety

    Thanks for wishing me a great day Ed Itor

  • @MushroomKingdoom
    @MushroomKingdoom Před 2 lety

    Hey, I did not know about this as an option - great idea!!
    Usually there is enough bandwidth for both components, even in Gen 3 PCIe.

  • @writingpanda
    @writingpanda Před 2 lety +33

    Anthony is fantastic. Just wanted to say he's doing an excellent job with these videos. Kudos, Anthony!

  • @0tool505
    @0tool505 Před 2 lety +5

    I think brands should be more transparent and start answering the consumers why does the GPU do

  • @ShiroKage009
    @ShiroKage009 Před 2 lety +1

    This would have been awesome for things like genomic alignment and similar applications that lost performance due to latency when attempting to utilize GPUs.

  • @cestialfall84
    @cestialfall84 Před 2 lety

    I love watching linus tech tips while not understanding a single thing, yet enjoying it

  • @lakituwick7002
    @lakituwick7002 Před 2 lety +5

    After years of searching, I finally understood why this gpu does.

  • @Respectable_Username
    @Respectable_Username Před 2 lety +3

    Interesting that there's no discussion of what benefit this could bring to ML on large datasets. Is it that the SSDs being that close doesn't provide enough of a benefit to data transfer speeds, or is it the price being too expensive for those doing ML research at places such as universities?

  • @MrPruske
    @MrPruske Před 2 lety +1

    I feel like the only person that could have made use of this was the slow-mo guys in 2017. I'd like to see them try to use it now

  • @howthetechworks3742
    @howthetechworks3742 Před 2 lety

    thanks the editor

  • @JorgeMendoza-qx5bp
    @JorgeMendoza-qx5bp Před 2 lety +6

    Video Idea
    Could we get an updated video for 2022 of your
    "3D Modeling & Design - Do you REALLY need a Xeon and Quadro??" video.
    A cheap computer for 3D CAD modeling.

    • @commanderoof4578
      @commanderoof4578 Před 2 lety

      Blender + EEVEE = you need a potato and will still render multiple minutes of frames before something such as 3DS max even does a dozen

  • @CharcharoExplorer
    @CharcharoExplorer Před 2 lety +4

    5:35 - That is not true. HBM2 is still connected by a 1024-bit memory bus per stack. It's just that 2 stacks of HBM2 = 2048-bit, while 2 stacks of HBM1... also means a 2048-bit bus. They are exactly the same here. HBM2 brought much higher capacities, higher speeds, and lower latencies; it didn't change the connection it had. The Radeon VII and the R9 Fury, for example, are both 4096-bit machines - one just has 16GB of HBM2 while the other has 4GB of HBM1.

    • @bsadewitz
      @bsadewitz Před 2 lety +1

      Reading your post, for some reason I recalled this:
      en.m.wikipedia.org/wiki/Personal_Animation_Recorder
      I had the PC version. It used its own dedicated IDE bus, had its own framebuffer, etc. Upon its release, there was only one HDD that was capable of the sustained throughput required. The images also don't quite convey how huge these cards were. It is probably the heaviest PC expansion card I have ever handled.
      It did not compress the video whatsoever, and could not use the system's bus/IDE controller--too demanding. Furthermore, IIRC the video was stored as still images, one frame per file. I don't recall whether it used FAT or a proprietary filesystem. It was primarily intended for playing back 3d animation, but you could use it for whatever you wanted. I think it cost at least $1000US.

  • @stefanhoffmann8417
    @stefanhoffmann8417 Před 2 lety

    8:55 I fractured my finger tip once with this "pull tab"

  • @fat_pigeon
    @fat_pigeon Před 2 lety

    6:10 Probably the screws are ferrous, but they're stainless steel, which responds only weakly to a magnet. Try sticking a magnet right onto the screwdriver bit; the stronger magnetic field should pick them up.

  • @lukaaleksic9284
    @lukaaleksic9284 Před 2 lety +3

    LTT always brings a smile to my face.

  • @Owenzzz777
    @Owenzzz777 Před 2 lety +3

    This was the first GPU with M.2 slots, but definitely not the only one today. NVIDIA EGX-A30/40/100 are the new ones designed for a completely different purpose. Although technically they are NICs with a GPU, an ARM SoC, and M.2 SSD slot.

  • @ianemery2925
    @ianemery2925 Před 2 lety

    Pro tip for small, non-ferrous screws: use a tiny bit of Blu-Tack to stick the screwdriver to the screw head, then a larger blob to remove the screw if it stays in the threads and you want it back.

  • @jmssun
    @jmssun Před 2 lety

    It was used to accelerate large-scale industrial ray tracing and simulation. The industrial scene files (of factories with all their parts) are so large that they usually would not fit in regular RAM/VRAM, and having them on an SSD within the GPU makes random lookups into such humongous scenes possible

  • @Sweenus987
    @Sweenus987 Před 2 lety +3

    They should add 1TB SSD directly to the board and have it used as a more long term cache that could store data from multiple applications that load things into the GPU memory and then load it from this storage into its global memory when needed instead of going through the CPU at all

    • @mnomadvfx
      @mnomadvfx Před 2 lety

      Once a new memory tech comes along that is less power/heat intensive they may just add it directly to the chip packaging ala HBM.
      In theory they could already just add it to that, but even the much higher endurance SLC NAND has wear limits.
      You don't want to bolt memory that can wear out directly onto the packaging of the processor.

  • @PostalTwinkie
    @PostalTwinkie Před 2 lety

    Reminds me of 3DFX's "upgradeable" GPU they were working on in the 90s.

  • @BromTeque
    @BromTeque Před 2 lety +1

    It might be useful for select machine learning applications.

  • @frknaydn
    @frknaydn Před 2 lety +4

    The main use case could be AI research. When we run our applications, it sometimes takes too much time to load files for training; this way it could be a lot faster. I wish you guys tested more than just games - computers are not just game platforms. Please add some software development tests as well: compile a Node.js or Go program, run some simple AI training.

  • @LycanWitcher
    @LycanWitcher Před 2 lety +2

    i'd imagine this is where pci gen 4 or especially 5 could have shined if this concept kept going to present. No worries about sharing bandwidth with the gpu as there is plenty to go around, far more than the graphics card and m.2 drives combined could saturate.

  • @phillee2814
    @phillee2814 Před 2 lety

    If they'd rolled out proper driver support for that wee beastie for all OSs, it could have been awesome, with a Hyper M.2 card in another 16-lane slot, and some nice big NVMes in both, you could have a mirrored pair of 32TB VDEVs - with maybe parity added by having a couple on the motherboard as well. Downsize each by a wee bit with partitions to allow for L2ARC, SLOG and boot partitions to be split between lanes/devices and mirrored and striped for the bits you want to (so not swap - that could be stripe only). Stuff it with fast ram and a decent processor and you have a heck of a graphics workstation or gaming rig. All for the lack of decent driver support, which if it came with source code for the Linux drivers, would be easy for game or video software developers to hook into.

  • @theruisu21
    @theruisu21 Před 2 lety

    deep learning could be a nice use case. full batch training!

  • @gertjanvandermeij4265
    @gertjanvandermeij4265 Před 2 lety +38

    Would still love to see a GPU with some sort of GDDR slots, so everybody can choose their own amount of VRAM!

    • @coni7392
      @coni7392 Před 2 lety

      It would be amazing to be able to add more VRAM to my card

    • @psycronizer
      @psycronizer Před 2 lety +3

      @@coni7392 why ? your GPU can only access and throw around only so much data, and oddly enough the GPU's are tailored exactly to how much ram they have, might be useful for static images at high res, but high frame rates at higher res ? not so much.

    • @oiytd5wugho
      @oiytd5wugho Před 2 lety +3

      The expense in no way justifies the benefit. The only thing you'd get is limited upgradibility. GPUs have a highly specified memory controller, basically supporting a few variations in volume, like, a chip might support 4, 8 and 16 gigs of discrete memory ICs each holding 512MB and nothing else

    • @lazertroll702
      @lazertroll702 Před 2 lety

      @@psycronizer assuming dynamic physical ram size and that firmware binds the addresses on init, would there really be no advantage in gaming, like having more loaded pre-render objs or prefetch code?
      it seems that the disadvantage is letting the os treat them as global instead of driver-exclusive/defined fs ..? 🤨

    • @psycronizer
      @psycronizer Před 2 lety

      @@lazertroll702 not really, transfer speeds from non display storage to frame buffer are really a non issue now, so at some point adding more ram just makes for higher cost with no benefit

  • @kandmkeane
    @kandmkeane Před 2 lety +23

    This GPU has always made me want to know if Intel's Optane M.2 drives could be used. Would they even work? Would there be any use cases for that? Any benefits?
    Probably not, but it's just such an interesting opportunity to experiment with mixing different computer technologies…

    • @christopherschmeltz3333
      @christopherschmeltz3333 Před 2 lety +1

      I haven't used Optane much, but the technology is fundamentally more like non-volatile RAM, with higher performance and endurance but a fraction of the capacity as a comparably priced NAND Flash SSD. It's most commonly effective when utilized as a hybrid cache layer, like how it's built into Intel H10 and H20. I don't think the data center grade M.2 have achieved 1.5TB yet, last time I noticed that capacity was reserved for special use server DIMM!
      Therefore, I expect Optane should mostly function in this SSG, but the benefits of upgrading would probably just be how long the M.2 would last with medium sized but frequently changing data sets before wearing out and if you're using it's API to not get performance bottlenecked elsewhere. Perhaps use the API to write your own storage subsystem using two Optane 64GB 80mm long cache drives and two high capacity 2TB 110mm long storage drives... but I'm not aware of when an ordinary M.2 RAID card feeding multiple compute GPUs wouldn't be more practical.

    • @mnomadvfx
      @mnomadvfx Před 2 lety

      @@christopherschmeltz3333 Exactly.
      Optane phase change memory tech hasn't even breached 4 layer device limitations.
      While the state of the art in 3D/VNAND is already up to 170 layers and counting.
      In short Intel bet on the wrong tech foundations to build Optane upon - it simply isn't well suited to 3D scaling which is a necessity for modern memory as area scaling is already reaching problematic limitations.

    • @christopherschmeltz3333
      @christopherschmeltz3333 Před 2 lety

      @@mnomadvfx Intel fabrication definitely bet on the wrong 10nm tech, but Optane will probably hold onto a smaller niche than planned until hardware RAM disks make a comeback. You know, like the Gigabyte i-RAM back in the DDR1 era... there are newer and older examples, but Gigabyte's seemed to have been noticed by the most PC enthusiasts and should be simple to research.

  • @thepolarblair1
    @thepolarblair1 Před 2 lety

    It's fantastic to see Anthony so comfortable. MOAR Anthony!

  • @matikaevur6299
    @matikaevur6299 Před 2 lety

    I feel old...
    I remember times when a discrete sound card was essential for gaming... soldering my own LPT DAC... and the bizarre (experimental) situation when the sound card (Sound Blaster AWE32) had more memory than the PC (28MB vs 16MB). The Gravis Ultrasound equivalent (don't remember the exact model) went only up to 16MB

  • @VoteOrDie99
    @VoteOrDie99 Před 2 lety +3

    Add a CPU and power supply to the card and it's essentially a gaming console/computer in a card. I wonder what potential this brings

  • @IvanpilotNX1
    @IvanpilotNX1 Před 2 lety +11

    AMD when was creating this gpu:
    AMD: Hmmm... We need a different gpu, something different.
    That one worker: Boss and if we combine storage with a gpu
    AMD: Hmmm... That idea is... PERFECT, another increase James, good job 😀

  • @sexylexy22100
    @sexylexy22100 Před 2 lety +1

    Would be great for CFD-on-GPU workflows. Generally, if you run out of space in RAM, it crashes your multi-day set of calculations and you have nothing - so you could do much larger computations with this card

  • @jimmymifsud1
    @jimmymifsud1 Před 2 lety

    Anthony is a gift that keeps giving; I would never have found him without LTT

  • @o9mb
    @o9mb Před 2 lety +3

    Damn

  • @martinlagrange8821
    @martinlagrange8821 Před 2 lety +6

    Well... I would have a use for it. When running TensorFlow through a GPU as a coprocessor for neural networks, the SSG would result in supercomputer performance for complex multi-level networks. It's not for apps & games - it's for AI!

  • @ahmedmdmahfuz3811
    @ahmedmdmahfuz3811 Před 2 lety

    Wow! Great idea

  • @Czeckie
    @Czeckie Před 2 lety

    4:07 such a pro 'check this out'

  • @tvollogy
    @tvollogy Před 2 lety +3

    "Why does this GPU?"

  • @TheOnlyTwitchR6
    @TheOnlyTwitchR6 Před 2 lety +7

    If I had to guess, this didn't take off because we didn't have OS direct access to GPU storage
    I really want this to become normal in the future to throw m.2's onto the GPU

  • @theredscourge
    @theredscourge Před 2 lety +1

    Actually it wouldn't be too bad if they put say 50GB of flash storage on the card, but it would need to be a type that can withstand a LOT of writes. They'd need some software to allow the user to choose which 1-2 AAA games at a time that you'd like to have their relevant texture files cached directly on the GPU. Or they could try to develop some sort of Windows Prefetch cache type thing, where it aggressively uses the 80/20 rule to try to identify the slowest and most loaded texture files that each game uses as you play it, then start saving a history of which files it will want to slowly pre-load onto the card the next time you go to play it. Perhaps they could pre-calculate what those are on a per-game basis and distribute some sort of map file, sorta like how the GPU drivers these days load different profiles for each game.
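
A toy sketch of the history-based preloading idea above (all file names, sizes, and the cache budget are invented): tally how much traffic each asset generated last session, then greedily pin the heaviest hitters until the on-card budget is full.

```python
from collections import Counter

# Hypothetical access log: (file, bytes read) events gathered while playing.
access_log = [
    ("env/city_albedo.ktx", 512_000_000),
    ("env/city_normal.ktx", 512_000_000),
    ("chars/hero_4k.ktx",   256_000_000),
    ("env/city_albedo.ktx", 512_000_000),
    ("ui/icons.ktx",          8_000_000),
]

CACHE_BUDGET = 1_000_000_000  # pretend on-card flash budget, in bytes

# Weight each file by how much traffic it generated (the 80/20 idea),
# then greedily pin the heaviest hitters until the budget is used up.
traffic, size = Counter(), {}
for name, nbytes in access_log:
    traffic[name] += nbytes
    size[name] = nbytes

pinned, used = [], 0
for name, _ in traffic.most_common():
    if used + size[name] <= CACHE_BUDGET:
        pinned.append(name)
        used += size[name]

print("preload next session:", pinned)
```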

  • @nohomeforfreepeople2894

    I want this for my vector editing programs. Loading large graphics files would make my print processing and RIP so much faster (if Corel, Adobe, and Roland would take advantage of it)

  • @undeadlolomat8335
    @undeadlolomat8335 Před 2 lety +4

    Ah yes, ’Why does this GPU?’ 😂