Unboxing an NVIDIA RTX 6000 Ada Lovelace GPU for Deep Learning

  • Added Sep 6, 2024

Comments • 71

  • @HeatonResearch • 1 year ago • +11

    One correction from the video: NVIDIA RTX 6000 Ada Generation actually has the 4th gen Tensor cores and 3rd gen RT cores.

    • @kolakareem2441 • 1 year ago

      Thank you, Prof. Jeff. I have an NVIDIA GeForce RTX 3060. Earlier today I tried to install the NVIDIA TAO Toolkit on my Windows PC for some tasks, but unfortunately I discovered that the toolkit only supports Ubuntu. Is there a way to use the toolkit on Windows, please?

  • @AngeloKrs878 • 1 year ago • +8

    This is awesome. For us deep learning enthusiasts, it feels like being a child opening a new toy.

  • @maximinmaster7511 • 1 year ago • +18

    Hello, it would be interesting to compare the processing time of a deep learning program on the RTX 6000 with the Google Colab Pro & Pro+ offerings.

  • @codepour • 1 year ago • +8

    Wow. I'm truly jealous! Are you going to create a video about performance (vs. Ampere A6000 maybe)?

  • @mr.electronx9036 • 1 year ago • +3

    I would love to see gaming performance in 4K path tracing.

  • @jedcheng5551 • 1 year ago • +3

    I have no idea who comes up with this naming scheme. We have the Quadro RTX 6000 (Turing) in our cluster, and the documentation has to include a side note saying: we are not using the RTX 6000 Ada.
    If someone publishes a paper saying "our algorithm takes 1 day to run on this dataset on an Nvidia RTX 6000", how should one interpret the performance? The improvement from a Quadro RTX 6000 to an RTX 6000 Ada could be up to 4x.

    • @HeatonResearch • 1 year ago • +1

      I know! I had to do 2 retakes when I called it an A6000 by mistake. I would have named it the L6000, for Lovelace, but that's just me. I guess it's like hurricanes when you run out of letters.

  • @tHeWasTeDYouTh • 1 year ago • +1

    You know it's serious when the gloves come out.

  • @gtk2k • 1 year ago • +4

    I would like to see a deep learning speed comparison between two A6000s with NVLink and two RTX 6000 Adas.

  • @woolfel • 1 year ago • +1

    Must be nice to get a freebie RTX A6000 video card. To think just 5 years ago, getting a card with 48 GB seemed excessive and really expensive. Now it's still expensive, but necessary to work with images that are bigger than 300x300 pixels.

  • @cypher2030 • 1 year ago • +3

    The RTX 4090 uses Ada Lovelace and CUDA 12 too. Do you have any tutorial on how to compile TensorFlow with CUDA 12? The only alternative I found is using the NVIDIA NGC container. This works, but I'd rather not use containers or create my own image. Thank you!

    • @AI-xi4jk • 1 year ago • +1

      I previously took the NGC container, found the TF source dir, compiled from inside the container, then copied the wheel file out. Not sure if this is still possible, but you can try. I did it because I needed TF with TensorRT support.

  • @jonnymacn9457 • 1 year ago • +4

    This might be a stupid question, but why is the 4090 a huge beast while this is slim and tiny? I understand the 6000 isn't for gaming, but hypothetically, if you are pushing this GPU for deep learning, would you hear the single fan, and would it thermal throttle?

    • @HeatonResearch • 1 year ago • +5

      That is a good question, and if I had an extra $8K I might do a teardown video. My best guess is this is part of the extra engineering (cost) that goes into making it a denser design with different cooling. Similarly, the A100, H100, V100, etc. are all fairly slim.

    • @tortugatech • 1 year ago • +7

      Games have no multi-GPU support anymore, so you'd better make the biggest, fattest single GPU you can! If you want the best performance, you have to push it to the absolute limits of its efficiency curve... ML models, on the other hand, CAN use multi-GPU setups: you can theoretically add 4 of these to a desktop, or 8 to a server, where they're probably going to be watercooled anyway.
      So how is this GPU so much smaller?
      For one, even though the RTX 6000 and RTX 4090 use the same chip (AD102) and have the same boost clock, 2500 MHz, their base clocks are massively different: 915 MHz vs 2235 MHz respectively. So as long as the GPU clock in real workloads stays above 915 MHz on the RTX 6000, it's performing "in spec", as it should... Meanwhile, the 4090 regularly boosts beyond 2850 MHz without any overclocking, and it has to do this without making too much noise, since it's going to sit in someone's living room or bedroom, where people prefer quieter GPUs.
      The RTX 6000 is a workhorse, with professional-grade drivers unlocking some of the more obscure capabilities of the AD102 chip. It's primarily intended to go into a server, where noise output or part price is the least of the buyer's concerns, but energy efficiency, reliability, and ease of upgrading are. So it has to fit the same 267 mm long, 2-slot-thick form factor as the last 10 years of compute cards, so that it's possible to fit 8 of them into a server from 2019, for example.
      The 4090 uses GDDR6X memory that performs 5-10% better than GDDR6 (and for cheaper than top-tier GDDR6), but it has a disproportionately higher thermal output and power draw for that minuscule gain, so the RTX 6000 simply uses GDDR6 for an easy win in energy and thermal efficiency. All of this makes the card much more energy- and thermally-efficient, allowing a smaller form factor than its bigger brother, the RTX 4090.
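
The memory-type tradeoff described above can be sanity-checked with back-of-the-envelope arithmetic. The numbers below are assumptions taken from public spec sheets (384-bit bus on both cards, roughly 20 Gbps effective GDDR6 on the RTX 6000 Ada vs roughly 21 Gbps GDDR6X on the 4090), not figures from the video:

```python
# Peak memory bandwidth = effective data rate (Gbit/s per pin) * bus width (bits) / 8
def peak_bandwidth_gbs(data_rate_gbps: float, bus_width_bits: int) -> float:
    return data_rate_gbps * bus_width_bits / 8

rtx_6000_ada = peak_bandwidth_gbs(20.0, 384)   # GDDR6
rtx_4090     = peak_bandwidth_gbs(21.0, 384)   # GDDR6X

print(f"RTX 6000 Ada: {rtx_6000_ada:.0f} GB/s")   # 960 GB/s
print(f"RTX 4090:     {rtx_4090:.0f} GB/s")       # 1008 GB/s
print(f"GDDR6X advantage: {(rtx_4090 / rtx_6000_ada - 1) * 100:.0f}%")  # 5%
```

The roughly 5% delta is consistent with the "5-10% better than GDDR6" claim in the comment.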

    • @ProjectPhysX • 1 year ago • +2

      @@tortugatech Mostly agree. But there is no "unlocking of more obscure capabilities"; that is an ancient misconception from the Kepler era, where GK110 gaming GPUs had their FP64 rate cut down in drivers compared to their workstation counterparts.
      The RTX 6000 Ada and RTX 4090 are the exact same AD102 chip, except for a couple of disabled CUDA cores on the 4090. The AD102 chip is almost entirely incapable of FP64. The 6000 Ada does not have any additional compute features; it is a downclocked 4090 with 2x the VRAM.
      Many professional applications are only bound by VRAM capacity and bandwidth, so it makes sense to reduce GPU clocks rather than have it burn 450W when it's mostly just waiting on VRAM access. The 2-slot cooler allows packing multiple of these into a workstation or server. This is also the reason why all gaming cards nowadays have ridiculously oversized coolers; otherwise people would use the much cheaper and equally fast gaming cards in their GPU servers.
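
The "mostly waiting on VRAM" point is the classic roofline argument: a kernel's arithmetic intensity (FLOPs per byte moved) relative to the ridge point of peak compute over peak bandwidth decides whether it is memory- or compute-bound. A minimal sketch, using assumed round numbers for the RTX 6000 Ada (~91 TFLOPS FP32, ~960 GB/s) rather than anything stated in the thread:

```python
def bound(flops_per_byte: float, peak_tflops: float = 91.0, peak_bw_gbs: float = 960.0) -> str:
    """Roofline model: below the ridge point a kernel is memory-bound, above it compute-bound."""
    ridge = peak_tflops * 1e12 / (peak_bw_gbs * 1e9)  # FLOPs/byte at the ridge point (~95 here)
    return "memory-bound" if flops_per_byte < ridge else "compute-bound"

# Bandwidth-heavy kernels (elementwise ops, normalization layers) sit far
# below ~95 FLOPs/byte, so extra clock speed buys them nothing.
print(bound(4))     # memory-bound
print(bound(200))   # compute-bound (e.g. a large dense matmul)
```

This is why downclocking a VRAM-limited card costs little real-world performance while cutting power substantially.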

    • @tortugatech • 1 year ago

      @@ProjectPhysX I'm not talking about FP64 compute here; I'm talking about application-specific driver-level optimizations that are enabled in professional drivers for Quadros and Titans but are disabled in Nvidia's consumer drivers. This was last confirmed back in late 2020 by Nvidia; you can check Linus's 3090 review video at 9:30, where the Turing Titan beat out the massively more compute-capable 3090 in some professional workloads...

    • @ProjectPhysX • 1 year ago • +2

      @@tortugatech There is no such thing as driver-level optimizations; at best it's bug fixes. These performance differences are due to artificial software limiters: some "professional" closed-source crap software cripples itself if it detects "GeForce" in the device name, for example Siemens NX or CATIA.
      I know, because I write software for high-end GPUs.

  • @babepp2813 • 1 year ago • +1

    Hi professor, I have a question:
    will you make a video of this beast on native Linux?

  • @bdouglas • 9 months ago

    Love the channel and content! Thank you!

  • @datapro007 • 1 year ago • +1

    NVLink - Isn't it true that if it had it, you could combine two cards for a seamless 96GB of RAM? Not that I'd complain about 48GB 😆 Great channel Jeff - thanks!

    • @reynaldozabala9704 • 7 months ago

      Correct me if I'm wrong, but NVLink support was pulled over a year ago on newer drivers, since PCIe Gen 5.0 supports dual GPUs at close to NVLink speed.

    • @datapro007 • 7 months ago

      @@reynaldozabala9704 "Now in its fourth generation, NVLink connects host and accelerated processors at rates up to 900 gigabytes per second (GB/s). That's more than 7x the bandwidth of PCIe Gen 5"

    • @reynaldozabala4474 • 7 months ago • +1

      @@datapro007 I stand corrected on the speed; however, the point still stands: NVLink support is not available on Ada Lovelace generation cards like the 4090 and RTX 6000 Ada. It only runs on Hopper/data-center GPUs.

    • @TheLokiGT • 4 months ago

      This card supports P2P over PCIe, unlike the 4090. A bit less bandwidth than NVLink (64 vs 112 GB/s), but still sufficient for two cards.

    • @TheLokiGT • 4 months ago

      @@reynaldozabala9704 PCIe 5.0 would provide more bandwidth than 3090 or A6000 NVLink (128 vs 112 GB/s).
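
The bandwidth figures traded in this thread are easy to sanity-check. A sketch assuming PCIe 5.0's 32 GT/s per lane with 128b/130b line encoding, and the 900 GB/s fourth-generation NVLink figure quoted above:

```python
def pcie_bandwidth_gbs(gt_per_s: float, lanes: int) -> float:
    """Per-direction PCIe throughput in GB/s after 128b/130b line encoding."""
    return gt_per_s * lanes * (128 / 130) / 8

gen5_x16 = pcie_bandwidth_gbs(32, 16)
print(f"PCIe 5.0 x16: {gen5_x16:.0f} GB/s each way, {2 * gen5_x16:.0f} GB/s bidirectional")
# Roughly 63 GB/s each way, ~126 GB/s total - which is why fourth-gen
# NVLink's 900 GB/s is quoted as "more than 7x the bandwidth of PCIe Gen 5":
print(f"NVLink 4 vs PCIe 5.0 x16: {900 / (2 * gen5_x16):.1f}x")
```

So both commenters are right in part: PCIe 5.0 x16 does exceed a single Ampere NVLink bridge, while datacenter NVLink remains several times faster.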

  • @karolstylok542 • 1 year ago • +1

    Is it better to use a 4090 if I am going to use an LSTM network?

    • @AI-xi4jk • 1 year ago • +1

      This is basically a 4090 with double the memory. Its memory clock is a bit lower, but it has other advantages like encoders/decoders, lower power, etc. It will depend on your model size.
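
Whether the doubled VRAM matters can be estimated up front. A rough sketch of the "depends on your model size" point, assuming FP32 weights with Adam (weights + gradients + two moment buffers, roughly 16 bytes per parameter) and a simplified per-layer formula (one bias set per gate; frameworks like PyTorch add a second, so treat this as a lower bound), with hypothetical layer sizes:

```python
def lstm_params(input_size: int, hidden_size: int, layers: int = 1) -> int:
    """Approximate parameter count of a stacked LSTM (4 gates per layer)."""
    total, in_dim = 0, input_size
    for _ in range(layers):
        # 4 gates, each with input weights, recurrent weights, and a bias
        total += 4 * (hidden_size * (in_dim + hidden_size) + hidden_size)
        in_dim = hidden_size
    return total

p = lstm_params(input_size=512, hidden_size=2048, layers=4)
# FP32 weights + gradients + Adam moments ~ 16 bytes per parameter
print(f"{p / 1e6:.0f}M params, ~{p * 16 / 2**30:.1f} GiB before activations")
```

Even a fairly large LSTM fits comfortably in 24 GB; the 48 GB card pays off mainly through bigger batches, longer sequences, or much larger models.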

  • @yueyue6463 • 1 year ago

    I am also looking at this card for V-Ray, but it no longer has NVLink; wondering if 48 GB is enough for rendering a very huge scene.

  • @alishershermatov5953 • 8 months ago

    Can anyone tell me how long fine-tuning Stable Diffusion takes on average with an RTX 6000 Ada?

  • @451 • 1 year ago

    Are these comparable to an A100 for deep learning? Thinking of getting a Lambda Labs setup!

  • @ariel88579 • 1 year ago

    How are the thermals under full load? Interested to see if this will be a viable option long term for 3D rendering.

  • @Cooper3312000 • 1 year ago

    Don't think I would ever buy a $7,000 GPU. The first-gen A6000 is very tempting now with the price drop. I wouldn't use it for deep learning, but for vGPU in a VMware server.

  • @eprohoda • 1 year ago

    Enjoyed it, perfect sharing. Catch ya later, HeatonResearch!

  • @falsevacuum1988 • 1 year ago

    Is an FP32-grade GPU suitable for AI? I always thought that AI training and scientific calculations were FP64, i.e. for Tesla (H100) cards.
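
Short answer to this question: deep learning trains almost entirely in FP32 (and increasingly FP16/BF16 on tensor cores); FP64 matters for HPC simulation, not neural networks, which is why Ada workstation cards can run FP64 at a tiny fraction of their FP32 rate without hurting AI workloads. The precision gap between the formats is easy to inspect:

```python
import numpy as np

# Machine epsilon: the smallest relative step each format can represent.
print(np.finfo(np.float32).eps)  # ~1.19e-07 - ample headroom for gradient descent
print(np.finfo(np.float64).eps)  # ~2.22e-16 - needed for e.g. long numerical integrations

# Training is tolerant of FP32 (or lower) precision because SGD is itself noisy;
# double precision buys no accuracy for typical neural-network loss landscapes.
```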

  • @Maisonier • 1 year ago

    Wow, this is amazing... but I don't know why I'm watching these videos; I will never have enough money to buy these GPUs...

  • @glenyoung1809 • 1 year ago

    Things are going to get even more confusing.
    Rumors persist that Nvidia is thinking of launching an RTX 4090 Ti and a Titan RTX Ada!
    Look up "Nvidia RTX 4090 Ti and Titan RTX Ada: Everything We Know" on Tom's Hardware.
    The main difference from the RTX 6000 Ada is that the 6000 has 96MB of L2 while the others have 72MB; main memory is 24GB in the 4090 Ti and 48GB in the Titan Ada.
    Power consumption (TDP) is as follows:
    4090: 450W,
    4090 Ti: 600W,
    Titan Ada: 800W, i.e. these are fancy space heaters...
    Apparently the 6000 Ada has "only" a 300W TDP because it's not meant to push 4K video frames out as fast as possible, but to live in a data center environment where power consumption and computational efficiency matter (GFLOPS per watt).
    I thought high-end gamers had deep pockets till I looked at the 6000 Ada pricing... deep learning GPUs are for those with very deep pockets and a very serious commitment.
    Pricing: RTX 4090: $1599 USD MSRP
    RTX 4090 Ti: arm + leg
    Titan RTX Ada: arm + leg + kidney
    RTX 6000 Ada: as with the Titan, but also throw in your spleen...
    Of course you could also rent an AI instance from AWS, MS, Google, etc. by the hour as needed.
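
The "GFLOPS per watt" point above is just a division over the quoted TDPs. The peak FP32 numbers below are assumptions from public spec sheets (~82.6 TFLOPS for the 4090, ~91.1 TFLOPS for the RTX 6000 Ada), not figures stated in the thread:

```python
cards = {
    # name: (assumed peak FP32 TFLOPS, TDP in watts from the comment above)
    "RTX 4090":     (82.6, 450),
    "RTX 6000 Ada": (91.1, 300),
}

for name, (tflops, tdp) in cards.items():
    # GFLOPS/W = (TFLOPS * 1000) / watts
    print(f"{name}: {tflops / tdp * 1000:.0f} GFLOPS/W")
```

On those assumptions the 300W card comes out roughly 1.65x more efficient per watt, which is the whole datacenter design argument.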

  • @ANIRUDHMURALEEDHARAN • 1 year ago

    Does any other variant of this GPU support PCIe 5.0?

  • @apek5101 • 1 year ago

    But when does it come back in stock?

  • @tl2uz • 1 year ago • +1

    Beautiful

  • @TheLokiGT • 4 months ago

    This time at least you won't daisy-chain PCIe connectors :D

  • @Rock1Pop2Punk3Metal4 • 1 year ago • +2

    I can play League of Legends with this?

    • @HeatonResearch • 1 year ago • +1

      If it works with one of the high-end 40-series, then likely.

  • @jamesbp • 7 months ago

    Great vid, thanks.

  • @redfield126 • 1 year ago • +1

    Nice GPU. I have a question, maybe a dumb one: does TensorFlow support Ada? I mean, what versions of the libs do I have to install to make it work 100%?

    • @HeatonResearch • 1 year ago • +1

      Yes, it works just fine. Install the latest driver.

    • @redfield126 • 1 year ago • +1

      @@HeatonResearch Thank you for the answer. I started learning and working on ML a few months ago. I learn a lot from your channel. Eager to see the next video with this new beast at work!

  • @XxXx-sc3xu • 1 year ago

    love this vid!

  • @Biedropegaz • 1 year ago • +1

    A video about unboxing an expensive GPU, authored by a person with a PhD; the world is going crazy :-(

    • @HeatonResearch • 1 year ago

      ikr

    • @Biedropegaz • 1 year ago

      @@HeatonResearch Sorry for the comment, but unboxing videos are something like a ceremony for having a new precious item... best wishes for Jeff :-)

  • @rendermanpro • 1 year ago

    "I'm happy to get just the GPU" - if I could get it for free, I'd be happy as well without the box :)

  • @JamesSmith-sw3nk • 1 year ago • +1

    Please give us some gaming/synthetic benchmarks. It's a very uncommon card.

  • @bikurifacebook4553 • 1 year ago

    Your new toy... so happy

  • @henryfawcett5447 • 6 months ago

    your leng

  • @qjiao8204 • 1 year ago

    I really doubt this guy knows enough about deep learning. For most of the video he just did the unboxing and installation without showing any examples of the performance.

  • @ilovehotdogs125790 • 1 year ago

    Now run GTA

  • @bikurifacebook4553 • 1 year ago

    NVIDIA RTX 6000 Ada Generation

  • @bikurifacebook4553 • 1 year ago • +1

    Humans pay $$$ to see an astronaut on a horse