RTX 4090 Chip deep-dive
- Added on 2 June 2024
- Join me on a deep dive into Nvidia's AD102 chip, the chip powering the RTX 4090! We will discover the true size of this "chip monster" manufactured in TSMC's 4N process node, talk about the large L2 cache, a possible RTX 4090 Ti, discuss pricing and the future of Ada Lovelace.
Follow me on Twitter: / highyieldyt
0:00 AD102: The huge chip inside the RTX 4090
3:03 Process nodes, transistors density & die-size
7:02 Where did all the transistors go?
7:35 RT-Cores & Tensor-Cores
9:26 Memory System / L2 Cache
10:51 GPU Binning
13:41 Power Efficiency
16:27 Pricing & 4090 Ti
ComputerBase Review (german): www.computerbase.de/2022-10/n...
TechPowerUp GPU Database: www.techpowerup.com/gpu-specs/
Der8auer RTX 4090 video: • The RTX 4090 Power Tar... - Science & Technology
Were you surprised by the true size of the chip inside the RTX 4090? And how much bigger did you guess it was compared to the 3090 Ti?
I was surprised that it is slightly smaller in mm²
For transistor count I guessed >2.2x, since it's about 80% faster in games. The smaller total area was a surprise; it shows how tiny 4N is. It couldn't be much larger anyway, 700-800mm² is the maximum size for the process, right?
Keep up the good work, really enjoyed the information you provided and the way you did it .
Makes me think there is no point in more power other than to saturate 8K. For 4K, this is all you need. If they based a console off this chip, it would render reality.
very few channels go this deep inside the chip technology, thanks for the videos.
That's just simply not true, just off the top of my head Asianometry is one of many other creators who goes deep into these subjects. But by all means, HighYield is a really great creator!
@@WAY2PWN that's still very few
Because it is not a very popular topic. Look at the number of comments in the comment section. "Very few" people understand or wish to know what a transistor or chip is. You are probably the only person in your immediate social circle who knows anything about transistors, unless you are an IT engineer. 😄
I keep rooting for your channel to get some exposure. I know it won't mean shit, but the bunch of us out here watching appreciate every frame of your videos. Keep it up man, this is one of the very few "hardcore arch" channels that tickle me just the right way.
The usual excellent job on content, presentation, editing and sound. Keep them coming!
FYI - I always make it to the end, you have good analysis and perspective. Yes, I was surprised at the silicon size; from other tubers I expected it to be larger. That is a staggering amount of transistors too! Now I'm so curious what AMD will release, and how it will be priced.
here, thanks for making these in-detail videos. I really like every video of yours so far for going into real depth. Keep up this Awesomeness!
Found your channel recently, loving the detail and clarity of your analysis! Keep it up 👍
Always so interesting video - Great pace and nice calm voice too!
Great deep dive, thanks!! Not that it is unexpected from you. FYI I did watch all the way through yesterday, I just couldn't comment on my TV and didn't have my phone handy, so I came back to make sure after DLing the new video for my drive to get the little one. I'm sure it's great. Thanks again!
Great video, very informative. I wonder if the 88% binning for the 4090 might mean yields are good but not great on 4N. I do find it strange that they didn't consider enabling more transistors but clocking the chip lower. It would have had similar performance at much less power.
350 W? Then create an unlimited-power version, clock it 10% higher, and that's your Ti or whatever. Then Nvidia would have to disable more yields to a 4080 or 4080 Ti.
The RTX 4090 was tuned down a lot from what was planned, making it pretty efficient.
I was curious about how much of the chip was disabled, given its huge size on a cutting-edge process. Thanks for the info.
I thought it still retained the hefty 96MB of L2$ that the leaks mentioned; at 72MB it looks like the 4080-class part we heard about in the first rounds of rumours about AD.
I know I'm not the only one who glanced at the thumbnail and read AD102 as "ADIOS"... For a second I thought Nvidia was calling it quits.
More like, Adios hasta la vista baby ;)
Love the in depth explanations! keep em coming
Thanks for doing what you do!! Good content are often less hyped but it takes the same or even more effort!!!
What an incredibly interesting deep dive. I am looking forward to future videos!
I really love this type of context. As a render programmer, I find this stuff really fascinating.
these videos are always so well produced!
New to this channel. Really like the fine details of getting into the "nitty gritty" behind the generation on generation
Good work...love the deep dive
Don't usually comment, but as an undergrad student in comp engineering, your channel has taught me so many things our course has left out. And other tech channels haven't tried as hard as you have. Thank you very much 🥰. Hope I get to learn a lot more
Best explanation yet! Well done!
Was the media engine changed... i.e. AV1 decoding hardware added? Enjoyed the watch!
Love this high end stuff! Really, though, I'm most interested to see if this trickles down to lower priced cards this generation.
Thanks for these analysis vids.
Another solid deep dive into a part that lots of gamers talk about without really caring to look into it. Thank you for dissecting that beast, its even more impressive after all the info from your presentation. Also congrats for the excellent production value of your content. Like in football (USA), you got to go ALL THE WAY! Hats off to Green Team.
Good work! New sub🍻🏆
I promise I did not look this up now, but I have seen the counts before, but have maybe misremembered them
My guess for how many more transistors AD102 has:
GA102: 2.7x
TU102: 4x
GP102: 6.5x
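For reference, guesses like these can be checked with a few lines of arithmetic. This is a minimal sketch; the transistor counts are assumptions taken from commonly published spec listings (such as the TechPowerUp GPU Database linked in the description), not figures stated in this thread.

```python
# Approximate transistor counts in billions.
# Assumed from public spec listings, not from this thread.
TRANSISTORS_BN = {
    "AD102": 76.3,
    "GA102": 28.3,
    "TU102": 18.6,
    "GP102": 11.8,
}


def ad102_ratio(chip: str) -> float:
    """Return how many times more transistors AD102 has than `chip`."""
    return TRANSISTORS_BN["AD102"] / TRANSISTORS_BN[chip]


for chip in ("GA102", "TU102", "GP102"):
    print(f"AD102 vs {chip}: {ad102_ratio(chip):.1f}x")
```

Under these assumed counts, the guesses above land very close to the computed ratios.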
Awesome analysis! Great video!
This is the first of your videos that I watched, seems cool!
Quality video, excellent! Thank you for all this information
Yes, a comment on watching all the way. The binning examples were fascinating. Now to go search for your other deep dives.
My guess was twice as big. I guess that’s not as bad if you consider how much ends up being dark silicon on the 4090!
Rewatched again great overview
Watched till the end to get the full scope of your review.
Fantastic breakdown, thank you!
Nice one thanks. Was wondering if you could you make a vid comparing AMD and Nvidia Architectures and more specifically CUDA vs CU. I think with RDNA3 such a comparison would be very interesting.
Excellent video!
I’m a simple man- I see a High Yield post? I click
I really love your videos. They are so precise and informative. 👍
I already knew the transistor count and the die-size (I'm such a nerd ... 😂).
I think Nvidia might have binned the RTX 4090 that way to obtain a higher yield rate (remember, the N4 process is still relatively new and TSMC will probably have a higher percentage of faulty transistors than the "old" and well-known 8 nm Samsung process). It doesn't really make sense, to push a die with 100 % active transistors, when the process is not yet ready to deliver such a high yield rate. Nvidia will buy some time with the RTX 4090/4080 for TSMC to streamline the new process and they will deliver a fully activated AD102 when the yields justify such a step.
It doesn't have to be this way, just some thoughts on it. 👌
Thank you so much for making this video, it's refreshing to see specialised insight with the perfect balance of existing knowledge assumptions and detailed explanations.
This video is amazing, I'm watching all the way through.
I’m enjoying the content you’re putting out, keep up the great work!
I appreciate it!
Great analysis!
Subbed! Great channel.
There's a lot of tech channels but you stand out as one of the better in-depth knowledgeable channels. You're not just vomiting unconfirmed news or repeating the same thing everyone else does. I'll watch your channel until you stop bro, and I hope you don't stop.
That was a great video, thank you 👌
Good video!
Great content. My kind of spoon-fed knowledge dump. Keep feeding me these awesome videos.
Thanks for the video, very interesting
Good video, but I have some points:
1. From die shots available, RT & Tensor units use very little die space (
The 4090 has the most over-engineered cooler we will probably ever see. Well, wait.. the 4080 16GB will use the same one!!! Damn, crazy. I’m capped at 120fps at 4K in Destiny 2 at max settings and I don’t see more than 50°C peak heat and 58°C hotspot. Peaks at 60% power draw (HWiNFO). It’s insanely efficient
Finally, someone going into DETAIL about leading edge tech, I always learn something fun every video
I made it that far :) interesting video, thank you
I feel Nvidia's decision to max out the 4090 is based upon their belief that OC users are a big part of their top-of-the-line customers.
Impressive analysis.
Leaving the comment as asked lol. I've been binging your videos because I guess I love my job animated.
I watched the whole video ☺️
I love deep-dives into hardware 🥰
And the reason for all this seems to be the monster RDNA3, we'll see.
I would love more videos like this 💝 detailed yet simple words ✨
wish u all the success 💖🏆
Driver improvements could probably make use of that extra TDP
Very interesting. Can't wait to see your take on Radeon's 7000 offering
Still watching - fascinating stuff, especially back to back with the AMD presentation
The power draw limiting, from my experience, isn't quite as good as some make out. I ran some benchmarks at 90% PL and noticed some small wobbles in clock speed; the end result of the bench had the same average fps and just 1 less max. But it should be noted that at 100% PL the card wasn't pulling near 100%, so limiting it to 90% wasn’t really a 10% drop in power draw.
I then tested again at 80% PL, and the clocks were very noticeably wobbling. I can't remember now what the fps drop was - it was small, but there was a drop. Which is fine, you are now actually cutting power (in this instance) by the full 10% below the 90% marker. It probably does increase efficiency, but I dislike such wobbling.
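The point about the slider can be sketched in a few lines. This is only an illustration under stated assumptions: the 450 W board limit matches the figure mentioned elsewhere in the comments, while the 420 W stock draw is a hypothetical number, and real cards regulate power dynamically rather than with a hard cutoff.

```python
def effective_power_cut(actual_draw_w: float, board_limit_w: float, limit_pct: float) -> float:
    """Watts actually removed when the power-limit slider is set to limit_pct.

    The slider caps power at a percentage of the board limit. If the card was
    already drawing less than that cap, nothing is cut.
    """
    cap_w = board_limit_w * limit_pct / 100.0
    return max(0.0, actual_draw_w - cap_w)


BOARD_LIMIT_W = 450.0  # board power limit of the card
STOCK_DRAW_W = 420.0   # hypothetical draw in a benchmark at 100% PL

for pct in (100, 90, 80):
    cut = effective_power_cut(STOCK_DRAW_W, BOARD_LIMIT_W, pct)
    print(f"{pct}% PL: cap {BOARD_LIMIT_W * pct / 100:.0f} W, cut {cut:.0f} W")
```

With these numbers, the step from 100% to 90% removes only a small part of the nominal 10%, while the step from 90% to 80% removes the full 10% of the board limit.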
I wonder if they could bring back the multiple gpus on one card approach like their older 90 series cards used
I've been very curious to understand better this Binning business, would the same chip go into a 4080 as 4090 but the difference being how many "bad" transistors were found? Love the vids, tech geek nerds all the way!!!!
Well that was informative
7:41 The use of "exponential" as stated is plainly incorrect. Doubling the base number of transistors on the chip results in double the scaled number of transistors, likewise, quadrupling the base quadruples the scaled number of transistors. The factor on the base and scaled transistor count is the same, which describes a *linear* mapping.
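The distinction can be made concrete with a small sketch. The scaling factor of 3 and the growth base of 2 are arbitrary illustrative values, not numbers from the video.

```python
def scaled_linear(base: float, factor: float = 3.0) -> float:
    """Linear mapping: the result is the base multiplied by a fixed factor."""
    return factor * base


def scaled_exponential(base: float, growth: float = 2.0) -> float:
    """Exponential growth, for contrast: the base appears in the exponent."""
    return growth ** base


for base in (2, 4, 8):
    print(base, scaled_linear(base), scaled_exponential(base))
```

Doubling the input doubles the linear result, whereas the exponential result is squared.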
watched to the end 😀
Rare quality analysis!
My guess is that the area is 430mm² and that it's 75% larger than the GA chip.
I am both excited & excited !!!!!!!!!!!!!!!!!!!!!!!!
I am surprised that people start the video and then do not finish it. Unless interrupted of course. This is great stuff!
I'm happy you liked it!
Can they make 700 mm + area with 4nm node? If yes this would be more transistors?
Strictly looking at density, N4 is 2.4-3x denser than 8LPP. So, I’d go with 2.7x.
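A rough cross-check of that range against whole-chip averages, as a sketch only. The transistor counts and die areas are assumptions taken from public spec listings rather than from this thread, and an average over the full die (logic, cache, I/O) is not the same thing as the peak density of a process node.

```python
# Assumed public figures: (transistors in billions, die area in mm^2).
CHIPS = {
    "GA102": (28.3, 628.0),  # Samsung 8 nm
    "AD102": (76.3, 609.0),  # TSMC 4N
}


def density_mtr_per_mm2(transistors_bn: float, area_mm2: float) -> float:
    """Average density in million transistors per mm^2."""
    return transistors_bn * 1000.0 / area_mm2


ga = density_mtr_per_mm2(*CHIPS["GA102"])
ad = density_mtr_per_mm2(*CHIPS["AD102"])
print(f"GA102: {ga:.1f} MTr/mm^2, AD102: {ad:.1f} MTr/mm^2, ratio {ad / ga:.2f}x")
```

Under these assumptions the chip-level ratio falls inside the 2.4-3x range quoted above.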
How does it relate to their deep learning accelerator?
Do they use the same chip with other features activated or a completely different chip?
Nvidia has different architectures; for AI acceleration they have "Hopper" instead of "Ada Lovelace". The AD102 chip includes Tensor cores (mainly to accelerate DLSS), which are also used on Hopper for AI stuff (like ChatGPT). So it's related, but Nvidia has a different chip with a much larger AI focus.
Made it to 13:42… gonna watch it all
Every product I see using TSMC's 4nm node performs with an extreme uplift in performance and efficiency. Just shows how strong tsmc's fab processes are.
I don't have to move the power slider. It is anyway using less power in the games I play.
Watched to the end.
Idk much about those super high end setups some people use, but spending transistors on new and evolving technologies rather than just scaling up seems a much better use of resources. I'd really expect diminishing returns on scaling up these beasts even more an irritating Moore's law
Is there a die that isn't binned and is fully used? what power would it draw and what would it be like?
There's a version that's only very slightly binned (about 1.4%) and used for the RTX 6000, a professional GPU. The RTX 6000 has much lower clock speeds though, and a TDP of only 300W. A 4090 Ti based on a similar chip, with higher clock speeds, should be around 500W TDP.
Maybe I'm gonna talk about it in my next video ;)
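The binning percentages mentioned here can be reproduced from SM counts. A minimal sketch, assuming the commonly listed figures of 144 SMs for a full AD102 die, 128 for the RTX 4090 and 142 for the Ada-based RTX 6000; these counts are not stated in this thread.

```python
FULL_AD102_SMS = 144  # assumed SM count of a fully enabled AD102 die

# Assumed enabled SM counts per product (public spec listings).
ENABLED_SMS = {
    "RTX 4090": 128,
    "RTX 6000 (Ada)": 142,
}


def disabled_percent(enabled_sms: int, full_sms: int = FULL_AD102_SMS) -> float:
    """Share of SMs that are disabled, in percent."""
    return 100.0 * (full_sms - enabled_sms) / full_sms


for name, sms in ENABLED_SMS.items():
    print(f"{name}: {sms}/{FULL_AD102_SMS} SMs, {disabled_percent(sms):.1f}% disabled")
```

Counting by SMs only approximates the share of disabled transistors, since cache and other blocks can be cut down separately.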
@@HighYield damn
So what is the commercial product for the top AD102 bin?
Currently it's the RTX 6000 professional card, and in the future there might be a similar 4090 Ti or Titan. Check out my 4070 / 4090 Ti fact checking video if you are interested :)
@@HighYield sweet, thanks.
Turing Top die has around 3000+ cuda cores, Not sure about 1080ti but i know it is similar to 2070super at ,2560cuda cores
Are you saying that nvidia makes only 4090 chips and depending on the defect inside the chip its sold as other type 4080, 4070?
Great Channel
Nvidia produces different chips for the 4090, 4080 and 4070, but for example the 4070 Ti and the 4070 will be based on the same chip, only the 4070 has more defects. And if Nvidia releases a 4090 Ti at some point, it will be based on the same chip as the 4090, only with fewer defects.
I'm curious to see how far the 5000 series cards will go. Will the rtx5090 run Cyberpunk 2077 at a full 4k (no DLSS) at 120fps+?
Did watch all the way LOL
I agree, that NVidia could and should have released a much more attractive product by using a lot less power and being a LOT smaller.
It would have been an instant buy for me.
that's incredible
I reckon at some point they will release an unbinned version of the GPU as the 4090 Ti once they improve their defective chip %
Fully agree! I've talked about it more in my RTX 4070 & 4090 Ti video: czcams.com/video/BpM6zcusweY/video.html
Still here!
With quality videos like this, I hope everyone watches till the end.
Thanks for the kind words!
Maybe the TDP is high because they’re gonna reuse the same board and cooler for 4090ti.
I have a 4090, and I overclocked it, resulting in about a 10% improvement in render times. This is huge when considering how long some renders might take.
What program are you using for rendering?
@@HighYield Octane Render. The improved raytracing performance has decreased render times by 2x compared to 3090Ti; unlike gaming, this engine sees a much bigger leap in performance. You can go and check out Octanebench results and exclude multi-GPU setups to see the scores; you can also set it to display average or maximum scores.
I am watching the full video. You are a genius YouTuber. Best and most honest tech YouTuber
This video is *fantastic* . You have no idea how great it is to just "appreciate" how much work goes into these GPU designs rather than having people mope and bitch about prices and bad marketing on NVIDIA's part. I plan to buy a RTX 4090 FE in the coming months to start making YT content and I have *no* doubt that I will not have to change my GPU for some years to come!
Consumers will always demand more for less, that's just how it works. If we stopped demanding better, it would be bad for the market
9bn transistors are disabled, which means a 4090 Ti would be very interesting, because there is a lot of room to improve, unlike in older generations
I watched all the way
I have a 4090 FE. I for one am glad that they gave us 450w and up to 600w with the proper PSU connections. The 4090 is an ultra enthusiast class card. Ultra enthusiasts like to tweak and play with their hardware and see what it's capable of. I would take a power limit slider and 600w capability even if it's hot and super inefficient and have the option to limit it myself, or push it hard to see what it can do 10 times out of 10 over Nvidia bios limiting it to 350 watts and forcing us to have to flash the vbios or worse, do some kind of hardware mod. I think the fact that it is so efficient and can easily just be power limited to 350w while losing practically no performance is just a bonus.
I left a comment down below. What you going to do about it?
On a more serious note, I can see the full theoretical die and number of transistors for a particular chip? I never knew that this information was available. I need to figure out where I can find this info.
TechPowerUp has an amazing GPU database, check it out!
@@HighYield That was the first thing I checked. I even looked up the individual die and Nvidia's PDF on the chip itself. There wasn't such information anywhere. I found how many transistors there are on a chip, but I could not find how many transistors there are on an individual chip.
Competition is the answer on how Nvidia was able to give 2.7 times more transistors.. It is always what drives innovation forward. They probably could have done it before except they didn't have any competition.
Nvidia 4N is a custom optimized N5 node (5nm). The N in 4N is Nvidia.
It doesn't use 4nm which is N4
N4 is also just an optimized 5nm node, TSMC doesn't have a "real" 4nm process node.