Why does this GPU have an SSD? - AMD Radeon Pro SSG
- Uploaded July 4, 2024
- Get $25 off all pairs of Vessi Footwear with offer code LinusTechTips at www.Vessi.com/LinusTechTips
SmartDeploy: Claim your FREE IT software (worth $580!) at lmg.gg/OTTP7
AMD announced the Radeon Pro SSG in 2016 combining a GPU and onboard SSD - But when it launched in 2017, practically nobody bought it. Was it simply ahead of its time, or was it truly a dud?
Discuss on the forum: linustechtips.com/topic/14183...
Check out the Radeon™ Pro SSG: geni.us/yQ43dMY
Buy some LTT Store Dot Com Cable Ties: www.lttstore.com/products/cab...
Buy an Intel Core i9-12900K: geni.us/hrzU
Buy an ASUS TUF Z690-PLUS WIFI D4: geni.us/mgWYr2
Buy a Noctua NH-D15: geni.us/vnuvpW
Buy a Corsair Force MP600: geni.us/TkxIgO
Purchases made through some store links may provide some compensation to Linus Media Group.
► GET MERCH: lttstore.com
► AFFILIATES, SPONSORS & REFERRALS: lmg.gg/sponsors
► PODCAST GEAR: lmg.gg/podcastgear
► SUPPORT US ON FLOATPLANE: www.floatplane.com/
FOLLOW US ELSEWHERE
---------------------------------------------------
Twitter: / linustech
Facebook: / linustech
Instagram: / linustech
TikTok: / linustech
Twitch: / linustech
MUSIC CREDIT
---------------------------------------------------
Intro: Laszlo - Supernova
Video Link: • [Electro] - Laszlo - S...
iTunes Download Link: itunes.apple.com/us/album/sup...
Artist Link: / laszlomusic
Outro: Approaching Nirvana - Sugar High
Video Link: • Sugar High - Approachi...
Listen on Spotify: spoti.fi/UxWkUw
Artist Link: / approachingnirvana
Intro animation by MBarek Abdelwassaa / mbarek_abdel
Monitor And Keyboard by vadimmihalkevich / CC BY 4.0 geni.us/PgGWp
Mechanical RGB Keyboard by BigBrotherECE / CC BY 4.0 geni.us/mj6pHk4
Mouse Gamer free Model By Oscar Creativo / CC BY 4.0 geni.us/Ps3XfE
CHAPTERS
---------------------------------------------------
0:00 Intro
0:53 What is an... SSG?
1:27 SSD performance
2:06 Is this like DirectStorage?
2:55 The SSG API
4:11 Enter Adobe
5:00 But... Why?
6:02 Can we... Upgrade it?
7:04 Why is direct-to-GPU storage important?
7:58 Conclusion - Why it won't come back - Science & Technology
I'm glad there's finally an answer to why that GPU does
Has Anyone Really Been Far Even as Decided to Use Even Go Want to do Look More Like?
but no one ever asks how is GPU
@@32bites r/ihadastroke
@@noamtsur the gpu iz brocken :((((
this gpu does well. Gpu fricks.
Truly opened my eyes as to why the GPU does
This*
🌚👍
Registering my comment before this blows up.
Exactly why I dislike the clickbait titles, they don't tell us why the gpu does!
But why?
Those would be crazy useful in AI applications. Imagine loading your whole dataset onto your GPU once, without having to reload it for each training iteration
i actually thought that was where they were going when Anthony switched the SSDs
Especially now that ML has moved away from NVIDIA proprietary tech
If the API was actually widely rolled out, something like this would be incredibly useful for science departments at universities (which is a niche market, but not an insubstantial one).
Would have been Ideal to train models for Deep Learning using this GPU
@Sappho I was doing astronomical simulations (my work was more with globular cluster formation and evolution, 10s of thousands to millions of stars, sometimes coupled with a molecular cloud during early formation) and there definitely would have been a performance boost if the read-out/save time of each time slice could have been sped up by having the GPU dump it straight to storage.
Just as you also described, most of my work was done on a Beowulf cluster with a lot of high-powered off-the-shelf GPUs.
Niche but actually large market haha, I worked in university research at the National Weather Center. Basically any excuse to build something cool for research is the path most travelled by pre-doctoral and under grads.
meh ... there's only so many times storage malware can give the feelz...
although .. iffins the gpu executed from that storage on init ... 🤔
I wonder how long until the title changes lol
probably within the first hour or 2
Same
Why does that GPU?
I bet they leave it out of spite lol
Why do they do that? Linus does seem to do that a lot and it’s confusing.
Ah, yes, "Why does this GPU"
Gotta love the original titles...
This gpu do because it does
What the GPU doin'
How's the gpu doin'?
Not yet lol
It does because it will do, quality title
Imagine being the engineering team that made all of that work, only for the industry to say "ok. cool."
The people that actually buy them for working on appreciate it.
The people who just commentate on it are not "the industry" at all, just glorified journalists.
@@mnomadvfx If the industry needed it it would exist. Just look how much better enterprise tools and systems are.
Also, a victory against ever more complex and complicated hardware.
Just like VESA Local Bus video cards
@@JonatasAdoM aye! #KludgeDeath
This technology eventually found a home... AMD used it in the PS5 and Xbox Series. Both systems can load into RAM directly from the SSD, bypassing the CPU.
Is that true?
Well, this led to Direct Storage, which we have today on these consoles and will soon have on Windows 11. Pretty much the same idea; technologically a bit different, I guess.
I think bypassing the CPU is difficult/insecure, and I did some research and was right.
Complete CPU bypass would mean being able to skip the kernel layer, and all the security checks which gives you arbitrary read/write to memory from disk that should be minimized as it provides a loophole.
What DirectStorage does is simply use the GPU to decompress compressed data sent over the bus from the SSD, which then hands it to the CPU to decode and execute. This basically just speeds up the data retrieval pipeline, but doesn't expose any loopholes, as it is fundamentally the same underlying mechanism all computers use today to fetch data.
The AMD SSG card in the video can do such caching as the GPU doesn't execute the kernel code, which means that while you can still write malicious code in to target the GPU, it's way more self contained than executing it directly on the CPU which takes control of all your processes including your OS.
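The speedup from that pipeline is easy to put in rough numbers. A back-of-the-envelope sketch (every figure below is an illustrative assumption, not a measured spec):

```python
# Back-of-envelope sketch of why GPU-side decompression (the DirectStorage
# idea) speeds up asset loading. Every figure below is an illustrative
# assumption, not a measured number.

ASSET_GB        = 8.0   # uncompressed asset data to load
RATIO           = 2.0   # assumed 2:1 compressibility
PCIE_GBPS       = 32.0  # rough PCIe 4.0 x16 bandwidth
CPU_DECOMP_GBPS = 4.0   # assumed CPU inflate rate (output bytes)
GPU_DECOMP_GBPS = 80.0  # assumed massively parallel GPU inflate rate

# Path 1: CPU decompresses, then ships uncompressed data over the bus.
t_cpu = ASSET_GB / CPU_DECOMP_GBPS + ASSET_GB / PCIE_GBPS

# Path 2: ship compressed data over the bus, GPU decompresses on arrival.
t_gpu = (ASSET_GB / RATIO) / PCIE_GBPS + ASSET_GB / GPU_DECOMP_GBPS

print(f"CPU path: {t_cpu:.3f} s, GPU path: {t_gpu:.3f} s")
```

With these made-up rates the GPU-decompress path is roughly 10x faster, which matches the general claim that moving decompression off the CPU mainly shortens the retrieval pipeline rather than changing the security model.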
Everyone always ask "Why does this GPU?", but never asks "How is this GPU?" 😔
Original
how does this gpu? *
What da GPU doing?
What's tha gpu doing?
Lol I laughed out loud to this.
'Why does this' indeed. I think the question the manufacturer asked is 'why not?'.
@Czarodziej stronger than my will to click this link and be greeted by cringe
@Czarodziej stronger than my will to live after reading your comment.
@@Worthy_Edge I think you meant to say "greeted" but still, that was a HILARIOUS retort!
I wonder if they had done something like a separate PCIe daughter card with something similar to the SLI or Crossfire interfaces would have worked better. It wouldn't have shared bandwidth across the PCIe bus but still allowed direct access to the SSDs installed.
Hey this seems like a really cool idea.
I am not sure I understand, aren't the SLI and Crossfire interfaces very slow by themselves? The data would move through PCIe... like any NVMe drive
@@jeromenancyfr I imagine the idea is, GPU calls for data, pass over link, then out through pci... CPU calls for data, pass through pci.
That eats into compute density though if you want to have several of them per node.
well my understanding is thats basically how nvme direct storage drives are going to work (and are already working like that in the ps5)
7:58 hello back mr editor
"This is hilarious" : "You can totally run your system without any additional storage as long as you are ok with the overhead of sharing bandwidth with a GPU." Anthony's sense of humour differs from most. Moss would be proud.
I read that in Moss' voice and I cannot agree more.
Goddamn Moss and his old Fetts..
Ah, if you have PCIe x16 to spare: most programs task in RAM, and after boot the disk is barely used, so you could use such cards in low-profile setups without chopping up the GPU cooling solution. If it also had a CPU socket to handle the graphics management, that would be something
I learned that this was a thing a month ago when doing research on the WX 2100 and I’m surprised no major tech channel did something funny with it
@@joz534 no, run.
Maybe because it is too expensive?
That kind of reminds me of the old Real3D Starfighter PCI Intel i740 gfx card from wayyyy back in the day. Intel had just released the AGP bus architecture and the i740 was their first foray into the discrete graphics space…probably to help support the launch of AGP 1X, because it wasn’t all that fast otherwise. For the majority of the non-AGP systems, Real3D built a card with an AGP-PCI bridge chip that basically had an AGP bus and dedicated SDRAM AGP texture memory on board, in addition to the i740’s local SGRAM framebuffer RAM like any other graphics card. It was pretty cool at the time. They were sold with 4-8 MB framebuffer plus 8-16 MB AGP texture memory for a max of whopping 24 MB total onboard. They weren’t very fast, but they supported 1024x1024 px texture tile resolution whereas the vast majority of the competition including 3DFX only supported 256x256 pixels max resolution texture tiles. It was slow, but it looked so much better than anything else on the market and helped milk some extra capability from old non-AGP slot systems…perfect tradeoff people like Nintendo 64 players were used to dealing with, lol. 3DFX Voodoo 2 cards had a similar structure with separate RAM for framebuffer and texturing. Ok, now I’m done dating myself 😂
There's also one thing that wasn't really mentioned: they launched EPYC and Threadripper around the same time, which effectively provided the same functionality. This card came from a timeframe when NVMe RAIDs were an amazing concept but the PCIe lanes needed for them were often hard to come by, even on the Xeon and Opteron series
I honestly wondered if this GPU was actually hiding a secret, that Microsoft had these to base DirectStorage work off of for all these years while they worked on it. Maybe now that it's finally public and AMD has actual tangible research into this as the product actually exists... well, I don't know... imagine if RDNA3 has this as a surprise feature to work amazingly with DirectStorage?!
This is exactly what's needed to blow graphics to a new area, think of the huge scenes you could render
The beta setup for DirectStorage used Nvidia RTX cards, as Nvidia was already doing work in the same direction for RTX I/O, aimed at the workstation market. Remember, they needed something that was going to work in the PCs people will own in the foreseeable future rather than create something requiring a costly niche hardware design. If Microsoft used them in R&D at all, it was more likely for the Series X/S Velocity Architecture, as a proposed console design was less sensitive to non-standard hardware if the cost was good. Even then, this wasn't very close as the major component there (and in the PS5) is the controller functionality with the dedicated decompression block. Offloading those operations from CPU and GPU are a big factor in letting the console perform optimally.
I strongly suspect that Microsoft and AMD will try to push an open standard for a PC hardware spec that will bring a version of Velocity Architecture to PC to give DirectStorage the full functionality it has on Xbox. This needs to be a vendor-independent spec to get Intel and Nvidia on board, otherwise it will remain a niche that game developers will be reluctant to use. A recent previous example would be DirectML, which is hardware agnostic and relies on the drivers to bridge the gap between PCs and vendors of ML-focused hardware. Thus the ML hardware can live in the CPU, GPU, or a separate device on the PCIe bus; the user doesn't need to know so long as the driver tells the system what to look for and how to talk to it.
This would be amazing
At this point, I think we don't need CPUs. It is cheaper and better (for gaming) to produce all-in-one APU designs, like how the PS5 and other game consoles are designed.
According to some market analysts, top RDNA4 could come with 512 gigs of pcie gen 4 memory
With respect to the random read speeds (1:41); Why not test the drives independently from the SSG, or use MP600 drives in the SSG to get a proper apples to apples comparison? The drives firmware themselves may just be crap and account for why the random speeds don't scale nearly as well.
Ain't nobody got time for that.
Because driver overhead for RAID increases latency. On AM4 you're losing approx 30% of your IOPS even if all your SSDs are connected to the CPU and not the chipset. Intel is nowhere near this bad (same with Windows), but it's still a loss.
@@ThranMaru well, they tested a GPU from 2017 that no one has, so yes, they have time
@@ayoubboulehfa3932 has nothing to do with time and more with uniqueness to interest the viewer into clicking on the video.
Because then you'd just be testing drive performance which wouldn't make sense. It's end to end testing
Great video and information presentation! Thank you!
So good to see Anthony back on Anthony Tech Tips
If ML libraries targeted this platform, it seems like it could be a compelling option. Nowadays models are getting so large that even 24 gigs of VRAM is not enough. Yes, the performance would undoubtedly be worse using SSDs, but the alternative is not being able to use the model at all.
THIS, so much this.
would that be much faster than just using swap?
@@artlessbene maybe not faster, perhaps cheaper
Code it for m1 ultra 128gb of vram
Nvidia wants you to buy their NVLink systems instead. Or A100 80GB.
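The "worse but usable" trade-off in that thread is basically out-of-core execution: pull one layer at a time from storage instead of holding the whole model in VRAM. A minimal stand-alone sketch of the idea (file layout, layer count, and the dot-product "compute" are all invented for illustration; a GPU-attached SSD would just make the load step faster):

```python
import os
import struct
import tempfile

# Hypothetical out-of-core sketch: stream model layers from SSD-backed
# storage one at a time instead of holding everything in memory. Tiny
# sizes so it runs anywhere; all names and numbers are made up.

path = os.path.join(tempfile.mkdtemp(), "weights.bin")
n_layers, layer_elems = 4, 1024

# "Model on disk": one float32 block per layer, values 0, 1, 2, ...
with open(path, "wb") as f:
    for v in range(n_layers * layer_elems):
        f.write(struct.pack("<f", float(v)))

layer_bytes = layer_elems * 4  # bytes per float32 layer block

def load_layer(i):
    # Read just one layer's weights from storage (the out-of-core step).
    with open(path, "rb") as f:
        f.seek(i * layer_bytes)
        raw = f.read(layer_bytes)
    return struct.unpack(f"<{layer_elems}f", raw)

# Stand-in for per-layer compute: dot product with an all-ones input.
acts = [sum(load_layer(i)) for i in range(n_layers)]
print(acts)
```

Peak memory here is one layer rather than the whole model, which is the whole appeal when the model is bigger than VRAM.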
Actually handy for when your motherboard does not have enough m.2 slots.
You can buy these as just RAID 0 cards that plug in and use PCIe x4.
They could literally give you more m.2 if they gave you a pcie card with m.2 slots
@@WayStedYou but those dont look cool
@@WayStedYou as someone who uses an ITX system, when PCIe 5.0 comes out I really hope something like this comes out, as even GPUs barely use the extra bandwidth from PCIe 4.0. Why not put some M.2 slots on GPUs, especially with MDA coming in the near future
@@Bobis32 power and bandwidth
@@somefish9147 oh yeah bc SSD use sooooooooooooooooooooo much power
7:55 thx editor
I'd be curious to see this concept again once M.2 Key F finally sees some use. Though if we never get high-bandwidth buses with tight memory timings, essentially combining what GPUs and CPUs like, this concept should be moved off to Key H, J, K, or L, so as not to confuse high-bandwidth GPU memory with tight-timing CPU memory on Key F, assuming a future memory standard ever actually makes the switch. With how fast devices are becoming, it'd be cool to see a unified memory-storage platform where the only difference is whether the chip itself is considered volatile, essentially the original concept of Optane on steroids. It would also be cool to have semi-volatile chips where a sudden shutdown could retain otherwise volatile data.
When PCIe Gen 4 first came out, everyone was saying how it wasn't practical because it wouldn't be used fully. I said then that it would be more interesting if you saw instances where multiple uses could share a single PCIe x16 slot without any hindrance to performance. This would be one of those scenarios. Not useful, but pretty cool.
Couldn't agree more. When someone made a car, everyone said horses were better. Without manufacturers trying things outside of the box we would never progress, and I have no idea why everyone is so against innovation. No one is forcing anyone to become early adopters of anything, and most things people were skeptical about soon became integral to everyday life. With progression comes niche products like this, but at least we can say they are trying.
and now we're up to PCIe 5.0 with Alder Lake...there's even consideration to adjust NVMe storage standards from 4 lanes down to 2 because of how much bandwidth 4.0 and now 5.0 offer.
I would love a product like this if only to gain more NVMe storage without taking up extra slots
@@bojinglebells same, I love the dual functionality. I get a pretty decent GPU and 4 NVMe slots in two PCIe slots instead of three if I had to get a separate addon card. I personally love using up all my 7 slots with lots of cards.
@@bojinglebells And I think about some more niche area like small form factor PCs and even the NUC extreme. With the speed and bandwidth increases, these types of compute cards could make for near instantaneous connections and make those types of products more viable
I remember the launch event of this at SIGGRAPH. AMD "gifted" some of those cards to RED, which then gave them to some Indian filmmakers who had previously beta-tested the card in animating and editing one of their movies, if I remember correctly. But TBH, I have more memories of the after-party than the event itself.
Which movies were they?
Furry
I bet, them furry after parties are insane
@@GeneralKenobi69420 ok thanks 🙂 haven't watched it
Really needed some Anthony today. Not disappointed.
I've been waiting for this video for such a long time! Finally, I can see how it performs!
"Why does this GPU?!!"
Great question.
Only 11 minutes and there’s already 2 bot replies
@@Worthy_Edge These bots can't just chill. Can they?!!
Man, from what I recall, this thing was baller for Revit/CAD work. Those needed the entire model in VRAM, and it'd be a massive hurdle to do that over SSD > CPU > MEM > GPU. This was pre-host bus controller, which is the 'not as fancy' name for directstorage. Allowing devices 'other' than the main controller in a PCIe network to take control of another device. Like a GPU just... assuming direct control of an SSD (after some mediation obv) to just load stuff off without the big overhead. Obviously since then we also got (first on AMD, later on Intel) SSD's direct on CPU, rather than a PCH in-between (like Intel had until recently when they figured out that just 16 lanes from CPU was not enough).
I was kinda thinking the same, or using it for parallelised ML or big-data applications. It is a WS card after all; running an OpenCL-coded ML algorithm directly from 2TB of fast storage on the GPU, that's a lot of test data.
It's very interesting for computational fluid dynamics too. Although there are ways to make CFD codes require less VRAM (demos on my YT channel), you always want maximum resolution possible. You could do colossal resolutions with TBs of storage. But in practice it's unusable, because even the 14GB/s is extremely slow, and you would rewrite this storage 24/7 which would quickly degrade/break it. With VRAM directly, you can do 400-3300 GB/s. So the SSG never really took off.
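The resolution-vs-memory trade mentioned there is easy to sketch. Assuming something like 55 bytes per lattice cell (a ballpark sometimes quoted for FP16-compressed D3Q19 LBM solvers; treat it as an assumption), the largest cubic grid each capacity allows is:

```python
# Rough arithmetic: how CFD grid resolution scales with memory capacity.
# The 55 bytes/cell figure is an assumed ballpark for an FP16-compressed
# lattice-Boltzmann (D3Q19) solver, not a measured value.

BYTES_PER_CELL = 55

def max_cubic_resolution(capacity_bytes):
    cells = capacity_bytes // BYTES_PER_CELL
    return int(cells ** (1 / 3))  # side length of the largest cubic grid

vram_16gb = 16 * 1024**3   # a 16 GB HBM2 card
ssd_2tb   = 2 * 1000**4    # the SSG's 2 TB of onboard NAND

print(max_cubic_resolution(vram_16gb), max_cubic_resolution(ssd_2tb))
```

The SSD capacity buys roughly a 5x larger grid side, but as the comment notes, the ~14 GB/s access speed (versus hundreds to thousands of GB/s for VRAM) and constant rewriting are what killed it in practice.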
I occasionally deal with massive scenes and mesh or particle caches in Redshift for Maya, and Redshift could use this for sure! The same goes for trying to use Redshift to render massive print images where Redshift's out-of-core technology could benefit from having all this storage connected directly to the GPU core. No more Out-Of-Memory failures!
Hey. you rocking this video man! Nice hosting
This reminds me of the Intel math coprocessors for the 286/386 CPUs, before floating-point unit (FPU) processing became the default for all x86 processors. With the 486, Intel introduced the 486DX with the FPU and the 486SX with the FPU disabled.
Boy, this is awesome. I wish you would show more obscure tech. I feel like watching retro computer channels right now. Only with new stuff. :D
Thanks. This is really awesome!
LTT: "Why does this GPU?"
Me: "Yes, but have you considered HOW the GPU does?"
How about WHO does this GPU?
I'll do you one better. Why is GPU?
but where does this GPU?!
When does GPU
@@totallynotthebio-lizard7631 2017
I wonder how hard implementing a Direct Storage layer over the API would be.
Probably easier because you are cutting out a middle man - though there might be some latency introduced as they communicate with each other.
whoever did the ad animation i love you for adding Saitama
I wonder how it'd work with deep learning stuff, if the memory capacity would outweigh the speed.
I was surprised there was no mention of that potential application as well.
@@ilyearer Same. Seriously looking hard at this card now, since memory size is an upper limit on the types of existing neural nets you can fine tune. RTX 3090 has only 24 Gigs compared to this, 2048 Gigs. Yikes.
Adding modular storage to a GPU makes sense if it's directly useable by the GPU itself. A game could preload the textures and models to the storage and use them from there similar to how direct storage works, but potentially faster and lower latency.
It's very interesting for computational fluid dynamics too. Although there are ways to make CFD codes require less VRAM (see the demos on my YT channel), you always want maximum resolution possible. You could do colossal resolutions with TBs of storage. But in practice it's unusable, because even the 14GB/s is extremely slow, and you would rewrite this storage 24/7 which would quickly degrade/break it. With VRAM directly, you can do 400-2000 GB/s. So the SSG never really took off.
Yes and no. Games still use sdr texture to the point ...hd assets are no worth it atm.
@@ProjectPhysX for applications like that where data is re-written constantly I think just adding sodimm slots for ddr 5 would be ideal. With 4 slots you could add a ton of ram. Not as fast as the gddr ram, but good enough to worthwhile.
@@kaseyboles30 i feel this is the answer for a lot of GPU applications, from low budget cards (4GB VRAM not enough anymore, pop a desktop DIMM in the expansion slot) to the high end, populate all 16+ DIMM slots for maximum AI/machine learning/CFD performance.
@@ravenof1985 aye would be faster and cheaper in the long run - though you aren't breaking grounds in vram unless going for an hedt with threadripper cpu or something
0:01 LOL whose idea was that, EPIC intro
You look so much comfortable about sponsors. Good job.
I could see this being used for machine learning or data analysis for Microsoft R. Good usecase for direct storage.
train a model with very limited host CPU usage... ya that would be cool
Everyone asks "Why does GPU?"
Nobody asks "How does GPU?"
Interesting.
Thanks for the video!
it just amazes me that Direct Storage / RTX IO is taking this long for a demo to test with
Hope to see some Direct Storage content soon.
Huge use case for AI training. Anything over 80GB of memory means training has to move from GPUs to CPUs today, and that means a slowdown by multiple orders of magnitude. Unfortunately, AMD has never had any real market share in the AI/ML world because their software support, even in 2020, sucks.
how about in 2022?
@@RyTrapp0 ye but intel bad
Wearout makes it a nonstarter, for inference though maybe could be a monster in the right circumstances.
AMD has introduced their new MI250X GPU with 128 GB memory.
But still you can never have enough memory. I'm working with CFD (see my YT channel), and there it's the same problem: You always want maximum resolution possible. You could do colossal resolutions with TBs of storage. But in practice the SSG is unusable, because even the 14GB/s is extremely slow, and you would rewrite this storage 24/7 which would quickly degrade/break it. With VRAM directly, you can do 400-2000 GB/s. So the SSG never really took off.
@@ProjectPhysX Thanks, I figured as much. It's a shame. Memory in the TB range truly opens up new possibilities for deep learning.
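The wear-out objection raised above can be put in rough numbers. A sketch assuming TLC-class endurance of ~3000 P/E cycles and ignoring write amplification and over-provisioning (both assumptions; real endurance varies by NAND type):

```python
# How long the SSG's NAND would survive continuous scratch writes.
# Endurance figure is an assumption (TLC-class ~3000 P/E cycles),
# ignoring write amplification and over-provisioning.

CAPACITY_TB = 2.0     # the SSG's onboard NAND
PE_CYCLES   = 3000    # assumed program/erase cycles per cell
WRITE_GBPS  = 14.0    # the card's roughly quoted peak throughput

total_writable_tb = CAPACITY_TB * PE_CYCLES  # total TB writable over life
seconds_to_wear = total_writable_tb * 1e12 / (WRITE_GBPS * 1e9)
days_to_wear = seconds_to_wear / 86400

print(f"~{days_to_wear:.1f} days of continuous writes")
```

Under these assumptions the drives last on the order of days, not years, of 24/7 rewriting, which is why NAND behind the GPU works for read-mostly datasets but not as a VRAM substitute.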
Love your work Anthony, keep it up =)
I think one of the most productive uses for this GPU is enabling fast unified-memory access when programming with OpenCL or something like that. Although that is a really niche, low-level use case, mostly research-focused.
Ngl Anthony is my favourite LTT member and it makes me so happy whenever I see his face in a thumbnail:))
Same but these bots goddamnit
@@LordYamcha I stg what the absolute fuck is this, commented 2 minutes ago and there are already 2 bots
Everyone is asking "Why does this GPU?" but I'm just glad to see an upload featuring Anthony.
I like how you guys kept that title
When I saw that board I thought they were gonna have a complex memory controller that'd drive the nvme drives with the normal ddr memory as a cache, not as literal storage devices sitting on the gpu for fast load times.
Would be interesting if such an idea were combined with Optane memory, with a driver using it as second-level RAM
What did you mean with Optane? Can Optane store data? Sorry if I am wrong
Did you try seeing if one of the versions of graphics card powered SQL works well on this? Current issue with this is the data transfer speed with the CPU step involved. So might be worthwhile trying that.
It really wouldn't help unless the sql server used the api this GPU needs to directly access the file
Actually this design is kinda awesome for mini itx machines where storage expansion is very limited and you're already using your only PCIe slot for the GPU.
For someone who used to build SFF, this would have been a godsend in 2017; in fact even now it's still good. I had a Dan Case A4-SFX and most of its volume is dedicated to the GPU and CPU. Yes, you can cram in three 2.5" drives, but boy, you need custom cables for everything (mb, cpu, gpu) to make space for the drives. Even the Lian Li TU105 only has 1-2 drive mounts, and ITX boards come with maybe 1 M.2 on the low end and 2 on the high end. Having this would solve so many space issues for me; my Steam library is already 6TB
I do like this idea. Would be cool to see it come back. This would be really great, actually, for space confined builds. It seems... unique
I can see this being used for ONE specific use case. Instead of having a separate SSD enclosure and GPU taking up more than 1 or 2 PCIe slots, i.e. 1 LIQID Honey Badger and 1 GPU, just use this!
This card actually makes sense, and I'm sad to see this tech not taking off, because if you know how and why to use it, this is revolutionary!
With Direct Storage, sharing bandwidth with the SSD is no problem. The problem is the GPU itself; in a few years it will suck.
@GoSite Solder a better one, flash a matching firmware, done. Or you could adapt socket-like mountings and replace the GPU as often as you want. How do you think they test GPUs before they're assembled?
Thanks for wishing me a great day Ed Itor
hey I did not know about this as an option, great idea!!
Usually there is enough bandwidth for both components, even on Gen 3 PCIe.
Anthony is fantastic. Just wanted to say he's doing an excellent job with these videos. Kudos, Anthony!
Agreed Anthony is the man!
I subscribe because of Anthony
Anthony is the #techgod a fuckin goat.
I think brands should be more transparent and start answering the consumers why does the GPU do
This would have been awesome for things like genomic alignment and similar applications that lost performance due to latency when attempting to utilize GPUs.
I love watching linus tech tips while not understanding a single thing, yet enjoying it
After years of searching, I finally understood why this gpu does.
Interesting that there's no discussion of what benefit this could bring to ML on large datasets. Is it that the SSDs being that close doesn't provide enough of a benefit to data transfer speeds, or is the price too high for those doing ML research at places such as universities?
I feel like the only person that could have made use of this was the slow-mo guys in 2017. I'd like to see them try to use it now
thanks the editor
Video Idea
Could we get an updated video for 2022 of your
"3D Modeling & Design - Do you REALLY need a Xeon and Quadro??" video.
A cheap computer for 3D CAD modeling.
Blender + EEVEE = you only need a potato, and it will still render multiple minutes of frames before something such as 3ds Max even does a dozen
5:35 - That is not true. HBM2 is still connected by a 1024-bit memory bus per stack. It's just that 2 stacks of HBM2 = 2048-bit, while 2 stacks of HBM1 also means a 2048-bit bus. They are exactly the same here. HBM2 brought much higher capacities, higher speeds, and lower latencies; it didn't change the connection it had. The Radeon VII and the R9 Fury, for example, are both 4096-bit machines; one is just 16GB of HBM2 while the other is 4GB of HBM1.
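The bus-width point is easy to sanity-check: bandwidth is just bus width times per-pin data rate. The per-pin rates below are approximate public figures, used here as assumptions:

```python
# Sanity-check HBM bandwidth from bus width x per-pin data rate.
# Per-pin data rates are approximate public figures (assumptions).

def bandwidth_gbs(bus_bits, gbps_per_pin):
    return bus_bits * gbps_per_pin / 8  # bits -> bytes per second

fury_hbm1 = bandwidth_gbs(4096, 1.0)  # R9 Fury: 4096-bit, ~1 Gbps/pin
vii_hbm2  = bandwidth_gbs(4096, 2.0)  # Radeon VII: 4096-bit, ~2 Gbps/pin

print(fury_hbm1, vii_hbm2)
```

Same 4096-bit connection in both cases; HBM2's gains come from the higher per-pin rate and stack capacity, which matches the correction above.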
Reading your post, for some reason I recalled this:
en.m.wikipedia.org/wiki/Personal_Animation_Recorder
I had the PC version. It used its own dedicated IDE bus, had its own framebuffer, etc. Upon its release, there was only one HDD that was capable of the sustained throughput required. The images also don't quite convey how huge these cards were. It is probably the heaviest PC expansion card I have ever handled.
It did not compress the video whatsoever, and could not use the system's bus/IDE controller--too demanding. Furthermore, IIRC the video was stored as still images, one frame per file. I don't recall whether it used FAT or a proprietary filesystem. It was primarily intended for playing back 3d animation, but you could use it for whatever you wanted. I think it cost at least $1000US.
8:55 I fractured my finger tip once with this "pull tab"
6:10 Probably the screws are ferrous, but they're stainless steel, which responds only weakly to a magnet. Try sticking a magnet right onto the screwdriver bit; the stronger magnetic field should pick them up.
LTT always brings a smile to my face.
This was the first GPU with M.2 slots, but definitely not the only one today. NVIDIA EGX-A30/40/100 are the new ones designed for a completely different purpose. Although technically they are NICs with a GPU, an ARM SoC, and M.2 SSD slot.
Pro tip for small, non-ferrous screws: use a tiny bit of Blu-Tack to stick the screwdriver to the screw head, then a larger blob to remove it if it stays in the screw threads and you want it back.
It was used to accelerate large-scale industrial ray tracing and simulation. The industrial scene files (of factories with complete parts) are so large that they usually would not fit in regular RAM/VRAM, and having them on an SSD within the GPU makes random lookups into such a humongous scene possible
They should add a 1TB SSD directly to the board and use it as a longer-term cache that stores data from the multiple applications that load things into GPU memory, then load it from this storage into global memory when needed instead of going through the CPU at all
Once a new memory tech comes along that is less power/heat intensive they may just add it directly to the chip packaging ala HBM.
In theory they could already just add it to that, but even the much higher endurance SLC NAND has wear limits.
You don't want to bolt memory that can wear out directly onto the packaging of the processor.
Reminds me of 3DFX's "upgradeable" GPU they were working on in the 90s.
It might be usefull for select machine learning applications.
The main use case could be AI research. When we run our applications, it sometimes takes too much time to load files for training; this way it could be a lot faster. I wish you guys tested more than just games. Computers aren't just game platforms. Please add some software development tests as well: compile a Node.js or Go program, run some simple AI training.
i'd imagine this is where PCIe Gen 4, or especially 5, could have shined if this concept had kept going to the present. No worries about sharing bandwidth with the GPU, as there is plenty to go around, far more than the graphics card and M.2 drives combined could saturate.
If they'd rolled out proper driver support for that wee beastie for all OSs, it could have been awesome, with a Hyper M.2 card in another 16-lane slot, and some nice big NVMes in both, you could have a mirrored pair of 32TB VDEVs - with maybe parity added by having a couple on the motherboard as well. Downsize each by a wee bit with partitions to allow for L2ARC, SLOG and boot partitions to be split between lanes/devices and mirrored and striped for the bits you want to (so not swap - that could be stripe only). Stuff it with fast ram and a decent processor and you have a heck of a graphics workstation or gaming rig. All for the lack of decent driver support, which if it came with source code for the Linux drivers, would be easy for game or video software developers to hook into.
Deep learning could be a nice use case. Full-batch training!
Would still love to see a GPU with some sort of GDDR slots, so everybody can choose their own amount of VRAM!
It would be amazing to be able to add more VRAM to my card
@@coni7392 Why? Your GPU can only access and throw around so much data, and oddly enough GPUs are tailored to exactly how much RAM they have. It might be useful for static images at high res, but high frame rates at higher res? Not so much.
The expense in no way justifies the benefit. The only thing you'd get is limited upgradability. GPUs have a highly specific memory controller, basically supporting only a few variations in volume; e.g. a chip might support 4, 8, or 16 gigs of discrete memory ICs, each holding 512MB, and nothing else.
@@psycronizer Assuming a dynamic physical RAM size, with the firmware binding the addresses on init, would there really be no advantage in gaming, like having more pre-rendered objects loaded or code prefetched?
It seems the disadvantage is letting the OS treat them as global instead of driver-exclusive/defined FS..? 🤨
@@lazertroll702 Not really; transfer speeds from non-display storage to the frame buffer are a non-issue now, so at some point adding more RAM just makes for higher cost with no benefit.
This GPU has always made me wonder whether an Intel Optane M.2 could be used. Would it even work? Would there be any use cases for that? Any benefits?
Probably not, but it's just such an interesting opportunity to experiment with mixing different computer technologies...
I haven't used Optane much, but the technology is fundamentally more like non-volatile RAM, with higher performance and endurance but a fraction of the capacity of a comparably priced NAND flash SSD. It's most often effective as a hybrid cache layer, like how it's built into the Intel H10 and H20. I don't think the data-center-grade M.2 drives have reached 1.5TB yet; last time I checked, that capacity was reserved for special-use server DIMMs!
So I expect Optane would mostly function in this SSG, but the benefit of upgrading would probably come down to how long the M.2 would last with medium-sized but frequently changing data sets before wearing out, and whether you're using its API to avoid getting performance-bottlenecked elsewhere. Perhaps use the API to write your own storage subsystem with two 64GB 80mm Optane cache drives and two high-capacity 2TB 110mm storage drives... but I'm not aware of a case where an ordinary M.2 RAID card feeding multiple compute GPUs wouldn't be more practical.
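A toy model of the tiered arrangement this comment describes: a small fast tier (standing in for the Optane cache drives) serving reads in front of a large slow tier (the big NAND storage drives). The class name, capacities, and `slow_read` callback are all made up for illustration; real tiering lives in the driver or filesystem, not five lines of Python:

```python
from collections import OrderedDict

class TieredStore:
    """Tiny sketch of a two-tier storage subsystem: reads hit a small
    fast cache when possible and fall back to the big slow tier."""
    def __init__(self, fast_capacity, slow_read):
        self.fast = OrderedDict()          # key -> data, kept in LRU order
        self.fast_capacity = fast_capacity
        self.slow_read = slow_read         # fallback read from the slow tier
        self.hits = self.misses = 0

    def read(self, key):
        if key in self.fast:
            self.hits += 1
            self.fast.move_to_end(key)     # mark as recently used
            return self.fast[key]
        self.misses += 1
        data = self.slow_read(key)         # slow-tier read
        self.fast[key] = data              # promote into the fast tier
        if len(self.fast) > self.fast_capacity:
            self.fast.popitem(last=False)  # evict least recently used
        return data

store = TieredStore(fast_capacity=2, slow_read=lambda k: f"blob-{k}")
for k in ["a", "b", "a", "c", "a"]:
    store.read(k)
```

With a working set bigger than the fast tier, the hit rate depends entirely on access locality, which is exactly why Optane's endurance mattered for this role: the cache tier absorbs most of the write traffic.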
@@christopherschmeltz3333 Exactly.
Optane phase change memory tech hasn't even breached 4 layer device limitations.
While the state of the art in 3D/VNAND is already up to 170 layers and counting.
In short Intel bet on the wrong tech foundations to build Optane upon - it simply isn't well suited to 3D scaling which is a necessity for modern memory as area scaling is already reaching problematic limitations.
@@mnomadvfx Intel fabrication definitely bet on the wrong 10nm tech, but Optane will probably hold onto a smaller niche than planned until hardware RAM disks make a comeback. You know, like the Gigabyte i-RAM back in the DDR1 era... there are newer and older examples, but Gigabyte's seems to have been noticed by the most PC enthusiasts and should be simple to research.
It's fantastic to see Anthony so comfortable. MOAR Anthony!
I feel old ..
I remember times when a discrete sound card was essential for gaming... soldering my own LPT DAC... and the bizarre (experimental) situation when the sound card (Sound Blaster AWE32) had more memory than the PC (28MB vs 16MB). The Gravis Ultrasound equivalent (don't remember the exact model) only went up to 16MB.
Add a CPU and power supply to the card and it's essentially a gaming console/computer in a card. I wonder what potential this brings
APU
AMD when creating this GPU:
AMD: Hmmm... we need a different GPU, something different.
That one worker: Boss, what if we combine storage with a GPU?
AMD: Hmmm... that idea is... PERFECT. Another raise, James, good job 😀
Would be great for CFD-on-GPU workflows. Generally, if you run out of space in RAM, it crashes your multi-day set of calculations and you're left with nothing, so you could do much larger computes with this card.
Anthony is a gift that keeps giving; I would never have found him without LTT.
Damn
Well... I would have a use for it. When running TensorFlow with a GPU as a coprocessor for neural networks, the SSG would give supercomputer-like performance for complex multi-level networks. It's not for apps & games; it's for AI!
Wow! Great idea
4:07 such a pro 'check this out'
"Why does this GPU?"
If I had to guess, this didn't take off because the OS didn't have direct access to the GPU's storage.
I really want this to become normal in the future: throwing M.2s onto the GPU.
Actually it wouldn't be too bad if they put say 50GB of flash storage on the card, but it would need to be a type that can withstand a LOT of writes. They'd need some software to allow the user to choose which 1-2 AAA games at a time that you'd like to have their relevant texture files cached directly on the GPU. Or they could try to develop some sort of Windows Prefetch cache type thing, where it aggressively uses the 80/20 rule to try to identify the slowest and most loaded texture files that each game uses as you play it, then start saving a history of which files it will want to slowly pre-load onto the card the next time you go to play it. Perhaps they could pre-calculate what those are on a per-game basis and distribute some sort of map file, sorta like how the GPU drivers these days load different profiles for each game.
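The "80/20 rule" step this comment proposes can be sketched very simply: from a log of texture-file accesses, pick the smallest set of files that covers most of the traffic, and those become the candidates for the GPU-local cache. The function, file names, and threshold below are all hypothetical illustration, not any real driver profile format:

```python
from collections import Counter

def hot_files(access_log, fraction=0.8):
    """Return the smallest set of files covering `fraction` of all
    accesses -- the 80/20 candidates worth pre-loading onto the card."""
    counts = Counter(access_log)
    total = sum(counts.values())
    chosen, covered = [], 0
    for name, n in counts.most_common():   # most-accessed files first
        if covered >= fraction * total:
            break
        chosen.append(name)
        covered += n
    return chosen

# Toy access log: 6 hits on one texture, 3 on another, 1 straggler.
log = ["city.tex"] * 6 + ["hero.tex"] * 3 + ["menu.tex"]
print(hot_files(log))  # the straggler never makes the cut
```

A per-game "map file" like the comment suggests would essentially be the output of this function, precomputed and shipped alongside the driver's existing per-game profiles.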
I want this for my vector editing programs. Loading large graphics files would make my print processing and rip so much faster (if Corel, Adobe, and Roland would take advantage of it)
Ah yes, ’Why does this GPU?’ 😂