This Server CPU is so FAST it Boots without DDR5

  • Added 30. 05. 2024
  • This server CPU has 64GB of HBM2e memory onboard like a GPU or AI accelerator (e.g. the NVIDIA A100 or Habana Gaudi2) that lets it do so many cool things. We take a look at the supercomputer CPU and find that it can be used for a number of other use cases. The Intel Xeon Max 9480 is a really cool server processor.
    STH Main Site Article: www.servethehome.com/intel-xe...
    STH Top 5 Weekly Newsletter: eepurl.com/dryM09
    Note: Intel loaned us not just the CPUs, but also the system we used for this piece. The system has already been returned. Because of that loan, we are treating this as an Intel-sponsored video.
    ----------------------------------------------------------------------
    Become a STH YT Member and Support Us
    ----------------------------------------------------------------------
    Join the STH YouTube membership to support the channel: / @servethehomevideo
    STH Merch on Spring: the-sth-merch-shop.myteesprin...
    ----------------------------------------------------------------------
    Where to Find STH
    ----------------------------------------------------------------------
    STH Forums: forums.servethehome.com
    Follow on Twitter: / servethehome
    Follow on LinkedIn: / servethehome-com
    Follow on Facebook: / servethehome
    Follow on Instagram: / servethehome
    ----------------------------------------------------------------------
    Other STH Content Mentioned in this Video
    ----------------------------------------------------------------------
    - 4th Gen Intel Xeon Scalable Launch Video: • $17K Sapphire Rapids S...
    - 4th Gen Intel Xeon Scalable Launch Article: www.servethehome.com/4th-gen-...
    - Intel Xeon W-3400 3-system Builds: • Building BIG and QUIET...
    - AMD EPYC Competition (Genoa-X and Bergamo): • FANTASTIC 128 Core and...
    ----------------------------------------------------------------------
    Timestamps
    ----------------------------------------------------------------------
    00:00 Introduction
    01:47 Explaining Intel Xeon Max and HBM2e Memory
    05:23 Using Intel Xeon Max
    09:53 Performance
    14:42 Power Consumption
    17:00 Key Lessons Learned
    19:05 Wrap-up
  • Science & Technology

Comments • 370

  • @DigitalJedi
    @DigitalJedi 8 months ago +340

    I worked on this CPU! Specifically the bridge dies between the CPU tiles. I figured I'd share some fun facts about those CPU tiles here for you guys:
    Each CPU tile has 15 cores. Yes, 15. The room that the 16th would occupy is instead taken up by the combined memory controllers and HBM PHYs.
    There is not one continuous interposer. Instead, each CPU tile sits on top of EMIB "bridge" dies, as I call them. This strategy is more similar to Apple's than AMD's, or even Meteor Lake's. This is because Sapphire Rapids is so enormous that it exceeds the reticle limit of the machines that make normal interposers.
    There are 4 CPU tiles and 10 bridges. The tiles each have 5 connections: 3 on one edge and 2 on the neighboring edge. 2 of the tiles are mirror images of the other 2. You can get a diagonal pair by rotating one about the center axis 180 degrees, but the other 2 have to be mirrored to keep the connections in the right place.

    • @ummerfarooq5383
      @ummerfarooq5383 8 months ago +15

      Can it play starfield

    • @marcogenovesi8570
      @marcogenovesi8570 8 months ago +6

      @@ummerfarooq5383 can Starfield play?

    • @DigitalJedi
      @DigitalJedi 8 months ago +53

      @@ummerfarooq5383 There is enough PCIe and RAM for 7 players to each have the P-cores of a 12900K and their own full-bandwidth 4090.

    • @johnmijo
      @johnmijo 8 months ago +13

      @@DigitalJedi Thanks for your insight, always nice to see engineers talk about the work they do ;)
      I'm busy playing Starfield and porting it to my C128. Why? Because I think that Z-80 will work as a nice co-processor to the 8510 CPU, ha :p

    • @GeekProdigyGuy
      @GeekProdigyGuy 8 months ago +2

      Any special reason why there are asymmetric 3+2 bridges instead of 3 on both sides?
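
A quick sanity check on the bridge count described in the comment above (the 4-tile, 10-bridge, 3+2-edge layout is taken from the commenter's description, not from an official floorplan):

```python
# Each of the 4 Sapphire Rapids tiles exposes 5 EMIB connections:
# 3 on one internal edge and 2 on the neighboring internal edge.
# Every bridge die joins exactly 2 tile-side connections, so the
# bridge count is total_connections / 2.
tiles = 4
connections_per_tile = 3 + 2   # 3 on one edge, 2 on the adjacent edge
total_connections = tiles * connections_per_tile
bridges = total_connections // 2

print(bridges)  # 10, matching the "4 tiles and 10 bridges" figure
```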

  • @stefannilsson2406
    @stefannilsson2406 8 months ago +151

    I hope they evolve this and bring it to the workstation Xeons. I would love to have an unlocked Xeon with built-in memory.

    • @jondadon3741
      @jondadon3741 8 months ago

      Yo same

    • @stefannilsson2406
      @stefannilsson2406 8 months ago +19

      @@startrekkerll5635 What do you mean? You still have memory slots that you can put memory in...

  • @L0S7N01S3Deus
    @L0S7N01S3Deus 8 months ago +39

    Considering the new AMX instructions and all that bandwidth afforded by HBM, it would be very interesting to see benchmarks for AI tasks, like running Stable Diffusion or LLaMA models. How would they stack up against GPUs performance-wise, or in power and cost efficiency? It would be very relevant in the current datacenter GPU shortage!

  • @maxhammick948
    @maxhammick948 8 months ago +37

    Without the RAM slots taking up width, you could pack a HBM-only server incredibly dense - maybe 3 dual socket modules across a 19" rack? Not many data centres could handle that power density, but it would be pretty neat to see

    • @RENO_K
      @RENO_K 8 months ago +5

      💀💀 the cooling on that bad boy is gonna be insane

    • @sanskar9679
      @sanskar9679 2 months ago

      @@RENO_K With 3M's liquid that boils at around 50 Celsius you could maybe pack almost a thousand per rack

  • @Mr76Pontiac
    @Mr76Pontiac 8 months ago +12

    One of the nice things about "Serve the HOME" (emphasis on HOME) is that we get a glimpse of what we'll be running in our HOMES as low-end servers in 30 years....
    I'm 5 minutes in and I can't imagine the cost of these things when they come to market, not to mention the cost of the REST of the hardware.

  • @Gastell0
    @Gastell0 8 months ago +16

    Damn, that localized memory is incredible for SQL instance/shard, web server cache and so much more.
    HBM memory runs at lower wattage than DDR memory, with significantly higher bus width and lower frequency required to achieve high bandwidth (afaik).
    p.s. Didn't show the bottom of it even once =\

    • @aarrondias9950
      @aarrondias9950 8 months ago

      Bottom of what?

    • @Gastell0
      @Gastell0 8 months ago

      @@aarrondias9950 the cpu module/pcb

    • @aarrondias9950
      @aarrondias9950 8 months ago

      @@Gastell0 1:01

    • @Gastell0
      @Gastell0 8 months ago

      @@aarrondias9950 Ooh, that was in the introduction. I looked everywhere but there, thanks!
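
The wide-bus, low-clock point made in this thread can be put into back-of-the-envelope numbers. This is a sketch: the 3.2 GT/s HBM2e figure is a common speed grade for 1024-bit HBM2e stacks and DDR5-4800 is the Sapphire Rapids platform speed; the product's actual HBM clock may differ, so treat the totals as illustrative peaks, not measurements:

```python
def bw_gbps(bus_bits: int, transfers_per_s: float) -> float:
    """Peak bandwidth in GB/s: bus width in bytes times transfer rate."""
    return (bus_bits / 8) * transfers_per_s / 1e9

# One HBM2e stack: 1024-bit bus; Xeon Max carries 4 stacks (4 x 16GB).
hbm_stack = bw_gbps(1024, 3.2e9)   # ~409.6 GB/s per stack at 3.2 GT/s
hbm_total = 4 * hbm_stack          # ~1638 GB/s across the package

# One DDR5-4800 channel: 64-bit bus; 8 channels per socket.
ddr5_channel = bw_gbps(64, 4.8e9)  # ~38.4 GB/s per channel
ddr5_total = 8 * ddr5_channel      # ~307 GB/s per socket

print(f"HBM2e: {hbm_total:.0f} GB/s vs DDR5: {ddr5_total:.0f} GB/s")
```

The HBM side gets its bandwidth from a bus 16x wider per stack than a DDR5 channel, at a lower clock, which is also why it can run at lower power per bit moved.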

  • @shammyh
    @shammyh 8 months ago +17

    Great content Patrick!! Been waiting to hear about these for a while... And you always get the cool stuff first. 😉

    • @ServeTheHomeVideo
      @ServeTheHomeVideo 8 months ago +5

      This one took a long time. Partly due to the complexity but also moving STH to Scottsdale and doing like 40 flights over the summer. I was hoping to get this live before Taiwan last week.

  • @edplat2367
    @edplat2367 8 months ago +7

    I can't wait for 5-10 years from now, when we'll see this come to high-end gaming machines.

  • @gsuberland
    @gsuberland 8 months ago +14

    On the topic of 1000W power draw: I believe these use the same CPU power delivery topology that Intel showed a while back during some of the lab tours (e.g. I believe one of der8auer's videos in the extreme OC labs showed this off). A relatively small number of VRM phases on the motherboard provides an intermediate package voltage, followed by a massive number of on-die power stages (100+) parallelised into a huge segmented polyphase buck converter, which helps reduce ohmic losses and PDN impedance by moving the regulation closer to the point of load on the die. The combined continuous output current of the on-package converters appears to be 1023A, logically limited by the number of bits in the relevant power management control register. This kind of current delivery would be unworkable with a traditional VRM, but since the phases are physically distributed around the package, the average current density is heavily reduced.
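
The 1023 A figure in the comment above is consistent with a 10-bit current-limit field; that inference comes from the number itself, not from any documented register layout:

```python
# A 10-bit unsigned field saturates at 2^10 - 1.
bits = 10
max_current_a = 2**bits - 1

print(max_current_a)  # 1023, matching the quoted continuous current limit
```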

  • @sehichanders7020
    @sehichanders7020 8 months ago +18

    8:53 I always figured HBM was the endgame for the entire Optane thing. Too bad it never really panned out, since it had mad potential and could have changed how we think about, for example, database servers altogether. Intel is sometimes so far ahead of themselves that even they can't catch up (and then something like Arc happens 🤦‍♀)

    • @TheExard3k
      @TheExard3k 6 months ago

      HBM gets wiped on power loss like any other memory. It has nothing to do with Optane and persistent memory

    • @sehichanders7020
      @sehichanders7020 6 months ago +1

      @@TheExard3k It's not about persistence. But when your persistent storage is as fast and low-latency as Optane was supposed to be, you can get away with much smaller memory pools, hence you can use faster HBM.
      The entire promise behind Optane was that it is so fast (especially IOPS-wise) that you don't need to keep your entire application's data in memory.

  • @Strykenine
    @Strykenine 8 months ago +8

    Love a good datacenter CPU discussion!

  • @BlackEpyon
    @BlackEpyon 8 months ago +7

    Some of us remember when CPUs had L2 cache external to the CPU. Then Slot 1 had the cache integrated onto the same card as the CPU, and when the Pentium III came out, L2 cache moved completely onto the CPU die. I don't see external RAM going away any time soon, just because of how useful it is to be able to add more RAM, but this seems to be following the same evolution, and the performance gains it brought. Perhaps one day we'll see internal RAM on consumer CPUs as well!

    • @RENO_K
      @RENO_K 8 months ago

      That's seriously cool

    • @fangzhou3235
      @fangzhou3235 6 months ago +1

      No, the original Pentium III (0.25 µm Katmai) did not have on-die L2. That only came with the 0.18 µm Coppermine version, which was super cool. The 500 MHz Coppermine could OC to 666 MHz without breaking a sweat.

    • @maxniederman9411
      @maxniederman9411 3 months ago +1

      Ever heard of M-series macs?

  • @cy5911
    @cy5911 8 months ago +7

    Can't wait to buy these 5 years from now and use it for my homelab 🤣

  • @BusAlexey
    @BusAlexey 8 months ago +4

    Yes! Waited a long time for this monster CPU

  • @thatLion01
    @thatLion01 8 months ago +1

    Amazing content. Thank you intel for sponsoring this.

  • @CobsTech
    @CobsTech 8 months ago +11

    While I work with virtualisation a lot rather than specific high-performance workloads, this has always raised a question for me, even when playing around with a legacy Xeon Phi 5110P coprocessor: how would a chip like this handle memory failure? Nowadays whenever we have a memory failure, ECC kicks in as a first resort, and then you have options such as memory mirroring so your workloads can continue with a reduced amount of available memory.
    How would a chip like this handle it if, say, one of the HBM packages were defective or outright didn't work? Does the BIOS of the system have any form of mirroring? Considering this is four separate packages working as one, would this prevent the chip from booting up at all?
    Great coverage though, always fun to see what new products in the HPC sector bring to the table.

    • @skunch
      @skunch 8 months ago +2

      If the memory fails, throw it out. This is the way now: integration of core components at the sacrifice of modularity and repairability

    • @autohmae
      @autohmae 8 months ago +1

      I don't know if this system supports it, but CPU hotplugging exists. Maybe the least useful way to do it, but that would be one way

  • @hermanwooster8944
    @hermanwooster8944 8 months ago +4

    I remember you telling me this episode was coming a few weeks ago! The idea of memory-on-a-chip would be sweet for the consumer audience. It was worth the wait. :)

    • @ServeTheHomeVideo
      @ServeTheHomeVideo 8 months ago +2

      Took a little longer than expected because of a trip to Taiwan. I hope you have a great week

    • @BlackEpyon
      @BlackEpyon 8 months ago

      Similar to how L2 cache used to be external to the CPU, then moved adjacent to the CPU with the Slot 1 and Slot A, and then moved completely internal to the CPU die, gaining performance with each evolution.

  • @shiba7651
    @shiba7651 8 months ago +12

    Pfff the cpu in my server is so fast it boots with ddr3

  • @chaosfenix
    @chaosfenix 8 months ago +13

    I hope this is something that filters down to consumer parts. Especially for APUs with integrated graphics, we are pretty clearly getting to the point where they are limited by memory bandwidth. The Z1 Extreme with 8 CPU cores and 12 GPU cores is only about 5-30% faster than the Z1 with only 6 CPU cores and 4 GPU cores. These two chips are meant to operate within the same power limits and run the same architectures. Given all that, you would think something with 3x as many GPU cores would be much faster, but that just isn't the case, and my guess is that it is probably due to memory bandwidth. GPUs are bandwidth-hungry, and there is a reason GPUs pack their own specialized memory. I wonder if combining this with an APU couldn't let that iGPU stretch its legs to its full potential. Here is hoping.

    • @ummerfarooq5383
      @ummerfarooq5383 8 months ago

      I want someone to run Starfield on it just for show. Of course, let the CPU be overclocked to 5 GHz

    • @chriswright8074
      @chriswright8074 8 months ago +1

      Amd instinct

    • @DigitalJedi
      @DigitalJedi 8 months ago +12

      The issue is that HBM is very expensive, and doing HBM right means a pretty much ground-up design for your chip, not only to fit the PHYs for the kilobit+ bus, but also to handle the different controllers, and possibly dual controllers if you still want DDR5 options.
      I've worked with HBM, and when you get to the class of connection density it requires, you need to spend the big bucks for a silicon interposer. Radeon Fiji did this; Vega, the Radeon VII, and the Titan V also come to mind. That is a whole massive die you need to make and then stack at least 2 other dies on top of.
      An HBM APU sounds awesome, I agree; we even saw a glimmer of it with the i7-8809G, which had a 24-CU Vega M GH GPU and 4GB of HBM. The more practical approach for right now, though, would be something with a dedicated GDDR controller. Even just 128-bit 8GB would be plenty, as that is already around 288GB/s of bandwidth you aren't fighting the CPU over.

    • @-szega
      @-szega 8 months ago +2

      Meteor Lake has hundreds of megs of L4 cache in the interposer, presumably mostly for the iGPU and as a low-power framebuffer (somewhat like the M1).

    • @chaosfenix
      @chaosfenix 8 months ago +2

      @@DigitalJedi Yeah, I know there are definite issues. HBM has a 4096-bit bus, which is gigantic compared to anything else and is why you need the complex interposer. Intel's EMIB looks interesting and may help in that respect, but we will have to see. Personally, I would not keep the option for additional DDR5; this would be replacing it. Many systems already use soldered memory, so this would simply be an extension of that. I would dare say 90% of consumers don't bother upgrading the RAM in their computers anyway, so if it is balanced properly it wouldn't be much of an issue.
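
The 288GB/s figure for a 128-bit GDDR sideport quoted in this thread checks out if you assume 18 GT/s GDDR6, which is the speed grade where the number comes out exactly; this is just the arithmetic, not a product spec:

```python
bus_bits = 128
rate_gtps = 18                       # GDDR6 at 18 GT/s (assumed speed grade)
gbps = (bus_bits / 8) * rate_gtps    # bytes per transfer * GT/s = GB/s

print(gbps)  # 288.0 GB/s, matching the figure in the comment
```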

  • @Superkuh2
    @Superkuh2 8 months ago +10

    64GB is kind of small for any AI workload that would take advantage of the memory bandwidth.

    • @GeekProdigyGuy
      @GeekProdigyGuy 8 months ago +2

      Compare it to GPU VRAM - sure, top-of-the-line GPUs have slightly more, but the H100 is pretty much industry standard and has 80GB. Considering CPUs are definitely going to have way lower throughput than GPUs, it doesn't seem like capacity would be the issue.

    • @ThelemaHQ
      @ThelemaHQ 8 months ago

      It's HBM2e, which also works like VRAM; it's super fast. BTW my Tesla P40 24GB with GDDR5 gets 2.50 sec in Stable Diffusion, while the P100 16GB with HBM gets 0.8 - 1.5. Now imagine I use double P100s

    • @Superkuh2
      @Superkuh2 8 months ago

      @@ThelemaHQ Stable Diffusion isn't really memory bandwidth limited. Things like transformer-based large language models are.

  • @--JYM-Rescuing-SS-Minnow
    @--JYM-Rescuing-SS-Minnow 8 months ago +1

    thanks 4 the tech vid Patrick!! wowee 4 Intel Xeon Max!! gotta get a few!! giddy up!!

  • @georgeindestructible
    @georgeindestructible 8 months ago

    The ventilation in these looks great.

  • @waldmensch2010
    @waldmensch2010 8 months ago

    I tested Xeon Max a few months ago for KVM/VMware and it did not perform well. It is only useful for HPC. Nice video

  • @gheffz
    @gheffz 8 months ago +1

    Thanks!! Subscribed, All.

  • @OVERKILL_PINBALL
    @OVERKILL_PINBALL 8 months ago +5

    Interesting CPU for sure. All about finding the best use case. I was thinking this CPU might also be used to drive faster networking if it is using the HBM memory. Not sure if that was tested.

  • @RR_360
    @RR_360 8 months ago +1

    I would love to have one of those old servers in your studio.

  • @ytmadpoo
    @ytmadpoo 8 months ago +2

    I'm wondering how it would do running Prime95. With multiple cores per worker, it can hammer the memory pretty hard so the throughput of HBM should significantly boost the per-iteration speed, assuming the clock rates of the cores are decent. Tuning the worker threads to stick with the NUMA nodes would give the ideal performance (4 worker threads, each using all 14 cores on the same NUMA node). We did some similar tests way back when on a Xeon Phi and it was pretty decent although the HBM on there was much smaller so it still had to go out to "regular" memory quite often which slows things down. I've found that going over regular DDR4, it only takes a couple of cores in a worker to saturate the memory bus, although you do still get marginal improvements as you add cores. By the time I got above 10-12 cores per worker though, you can actually see a degradation as the individual cores are just sitting there waiting for RAM so the overhead can make iteration times drop.
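

The tuning described above (4 Prime95 workers, each pinned to the 14 cores of one NUMA node on a 56-core Xeon Max) can be sketched in Python. `os.sched_setaffinity` is the Linux call that does the pinning; the flat 4x14 core numbering is an assumption matching the comment's layout, not something Prime95 derives on its own, and real NUMA-to-core maps should be read from `lscpu` or `/sys`:

```python
import os

def numa_core_sets(nodes: int = 4, cores_per_node: int = 14):
    """Split a flat core numbering into one core set per NUMA node."""
    return [set(range(n * cores_per_node, (n + 1) * cores_per_node))
            for n in range(nodes)]

def pin_worker_to_node(node: int) -> None:
    """Pin the current process to all cores of one NUMA node (Linux only)."""
    os.sched_setaffinity(0, numa_core_sets()[node])

# 4 workers x 14 cores covers all 56 cores of a Xeon Max 9480:
sets = numa_core_sets()
print([min(s) for s in sets])  # first core of each node: [0, 14, 28, 42]
```

The same effect is usually achieved from the shell with `numactl --cpunodebind=N --membind=N`, which also keeps the worker's allocations on that node's memory.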

  • @MrHav1k
    @MrHav1k 8 months ago +3

    Good call out of the Intel Developer Cloud there at the end. It's so important to try these kinds of systems out to see if you'll even benefit from these features before you go out and drop a massive bag of $$$ on procuring one.

    • @magfal
      @magfal 8 months ago +2

      Does AMD have a similar service?
      I've been wondering about the benefits of buckets of L3 cache.

    • @MrHav1k
      @MrHav1k 8 months ago +1

      @@magfal AMD doesn’t offer anything like the IDC to my knowledge. Just another edge Intel’s size and resources can deliver.

    • @shanent5793
      @shanent5793 8 months ago

      ​@@magfal Supermicro has their Jumpstart remote access, they can lend you an AMD server. Bergamo was even available pre-release

  • @berndeckenfels
    @berndeckenfels 8 months ago +3

    Not running DDR5 to save on cooling does not sound very realistic - who would want to run 100 cores with no additional memory?

  • @exorsuschreudenschadenfreude
    @exorsuschreudenschadenfreude 8 months ago +1

    sick bro

  • @EyesOfByes
    @EyesOfByes 8 months ago +2

    8:52 My thought is: why didn't Apple try to acquire the Optane IP and patents? Then we wouldn't have to worry about write endurance, and we'd also get an even lower-latency SoC in combination with the massive amount of L2 cache Apple has

    • @uncrunch398
      @uncrunch398 8 months ago +2

      Optane drives have failed due to write endurance being exceeded, when used as DRAM extensions IIRC. Its best placement is as a large swap space or a cache for tiered storage, to preserve the endurance and power-on time of the other tiers. Intel stopped production/development and sold it because it did not sell well enough; the purchaser IIRC was a company primarily focused on memory. Enterprise and high-end prosumer SSDs serve sufficiently where it fits best, for a tiny fraction of the cost per capacity.

    • @Teluric2
      @Teluric2 8 months ago

      Because Apple knows they have no chance in the HPC biz. Apple rules where looks matter.

  • @stevesloan6775
    @stevesloan6775 8 months ago +1

    I’m keen to see full high performance computers on die utilising a derivative of this tech.

  • @nobodhilikeshu4092
    @nobodhilikeshu4092 8 months ago

    My computer boots without DDR5 too. Nice to see they're starting to catch up. ;)

  • @thomaslechner1622
    @thomaslechner1622 8 months ago +2

    What are the Cinebench results, single and multi? That is all that counts at the end of the day....

  • @GeoffSeeley
    @GeoffSeeley 8 months ago +6

    @2:23 Ah, so Intel isn't above "gluing" together chips like AMD eh? Ya Intel, we remember.

    • @ServeTheHomeVideo
      @ServeTheHomeVideo 8 months ago +3

      You know I was sitting in the front row when that presentation was given in Oregon back in 2017

    • @billymania11
      @billymania11 5 months ago

      @@ServeTheHomeVideo Kind of a long time ago. Things can change in that length of time, right Patrick?

  • @jmd1743
    @jmd1743 8 months ago +3

    Honestly it feels like once AMD did their monster-sized CPU, everyone stopped caring about keeping things conventional, like how it only took one couple to get everyone dancing at the school dance.

  • @user-ed1pb1qs7r
    @user-ed1pb1qs7r 8 months ago

    You are saying words faster than this processor can handle. I wanted to see traditional tests of this processor in AIDA64, Cinebench, 3DMark

  • @EyesOfByes
    @EyesOfByes 8 months ago +2

    So, GDDR6X has higher latency than standard DDR5. How is HBM2e in this sense?

  • @jlficken
    @jlficken 8 months ago +1

    I love enterprise hardware!
    I'm still rocking E5-26XX V4 CPU's at home though 😞

  • @matthiaslange392
    @matthiaslange392 8 months ago

    With the tiles it looks a little like the chip that's pulled out of Schwarzenegger's head in Terminator 2. 😎

  • @lamhkak47
    @lamhkak47 8 months ago

    Is it possible to apply such a design to a GPU? A bit like HBCC for AMD, but where you can install DIMM modules on the GPU to give it extra RAM for various purposes, such as running large AI models, running heavily modded KSP, and trying novice shitty programs that leak memory for no reason.

  • @davelowinger7056
    @davelowinger7056 8 months ago +1

    You know, I imagine the CPU of the future would be a CPU sandwich, with 4 to 64 FireWire ports. First the northbridge, now system memory

  • @LaserFur
    @LaserFur 8 months ago +1

    I wonder how long it will be before the system boots up with just the cache and then an ACPI message tells the OS when the main memory is online. This would help with the long DDR5 training time.

    • @bradley3549
      @bradley3549 8 months ago

      Something like that would be valuable in the consumer market I reckon. Servers are already notorious for long boot times so I don't think there is a lot of incentive at the moment to enable a fast boot.

  • @gl.72637
    @gl.72637 8 months ago +1

    Is this comparable to the Nvidia Grace ARM-based CPU with 144 cores that Linus Tech Tips showed 3 months back? Or is this just Intel trying to catch up? I would like to see a video comparing server against server.

    • @ServeTheHomeVideo
      @ServeTheHomeVideo 8 months ago

      This has been in production and is being installed into the Aurora supercomputer which will likely be the #1 in the world in November. Grace Superchip you cannot buy yet (we covered it on the STH main site) despite the hype.

  • @matsv201
    @matsv201 7 months ago

    I used to work on developing ultra-efficient telecom servers that just ran on normal Intel i-series CPUs.
    The one we had could go down to 10W for the whole board with a full Intel Xeon CPU if the memory was removed. With the memory they drew like 40 watts. (This was quite a while back, around the Sandy Bridge era)

  • @gusatvoschiavon
    @gusatvoschiavon 8 months ago +2

    I would love to have an ARM CPU with HBM memory

    • @ServeTheHomeVideo
      @ServeTheHomeVideo 8 months ago

      That is powering the former #1 supercomputer: www.servethehome.com/supercomputer-fugaku-by-fujitsu-and-riken-revealed-at-no-1/

  • @whyjay9959
    @whyjay9959 8 months ago +2

    Do you think DIMMs could disappear in favor of embedded DRAM and CXL memory?

    • @ServeTheHomeVideo
      @ServeTheHomeVideo 8 months ago +1

      I think CXL memory in the PCIe Gen6 generation will have more bandwidth and be more interesting, but some applications will still like locally attached. More interesting is if there is optically attached memory.
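
Some rough numbers behind the Gen6-era CXL comment above. This is a sketch: PCIe 6.0 runs 64 GT/s per lane with PAM4 and FLIT encoding, and the figures here are raw link rates that ignore protocol overhead, so real CXL.mem payload bandwidth would land somewhat lower:

```python
# PCIe 6.0: 64 GT/s per lane, roughly 8 GB/s per lane per direction raw.
lane_gbps = 64 / 8             # GB/s per lane per direction
x16_gbps = 16 * lane_gbps      # ~128 GB/s per direction for an x16 link

# For scale: one DDR5-4800 channel (64-bit) moves ~38.4 GB/s.
ddr5_channel = 4.8 * 8

print(x16_gbps, round(x16_gbps / ddr5_channel, 1))  # one x16 Gen6 link ~ 3.3 DDR5 channels
```

That ratio is why CXL-attached memory starts looking interesting in the Gen6 generation, even though locally attached DDR5 and HBM still win on latency.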

  • @Veptis
    @Veptis 8 months ago +2

    Isn't this also the kind of Xeon where you pay to "unlock" some of the accelerators and the frequency curve?
    Also, it's not really a workstation part, sadly. Intel is marketing their Xeons for workstations, while I want a GPU Max 1100 (PVC-56) as a workstation card. I have hopes for announcements next week. Intel is demoing it on the Intel Developer Cloud and I had a chance to try it.
    I believe my workstation will still get an i9-14900K with custom loop cooling (slight chance of a TEC)

  • @El.Duder-ino
    @El.Duder-ino 7 months ago

    Enterprise and personal chips will continue to be ever more tightly integrated, and they'll mimic a motherboard more than the chips we see today (also in size). Just check the Cerebras chip... the memory system is still way behind the compute.

  • @PingPong-em5pg
    @PingPong-em5pg 8 months ago +1

    "HBM memory" resolves to "High Bandwidth Memory memory" ;)

  • @ravnodinson
    @ravnodinson 21 days ago +1

    What kind of place would be using something like this and what would they be running on it? This kind of tech is fascinating to me and I don't even know what it's used for.

    • @ServeTheHomeVideo
      @ServeTheHomeVideo 21 days ago +1

      Often supercomputer clusters. See the new Intel Aurora supercomputer as an example.

    • @ravnodinson
      @ravnodinson 20 days ago

      @@ServeTheHomeVideo It is amazing!! 2 billion billion calculations per second. One thing that interests me that was mentioned being done on Aurora was doctors studying neurology and mapping out the brain's neurological pathways. What does the program running that even look like, and why does it need such mind-bending computational power? I know I'm in way over my head, but to me it's such awe-inspiring work.

  • @ZanderSwart
    @ZanderSwart 8 months ago +1

    as a Xeon 2650v2 daddy this makes me proud

  • @shoobidyboop8634
    @shoobidyboop8634 8 months ago

    When will this be available for desktop PCs?

  • @TheAnoniemo
    @TheAnoniemo 8 months ago

    Can't wait for ASRock to create a mini-ITX board for this and just have no DDR5 slots.

  • @ted_van_loon
    @ted_van_loon 8 months ago

    Sleep states are probably an early-version problem, since it likely has to do with the memory needing constant power. In the future, with a motherboard that supports 2 separate CPU voltages at the same time (based on pin groups), or if the CPUs get some added-in logic, it should probably work.
    Of course they might not have given it priority, since honestly a CPU like this right now makes the most sense in a server. While it is also great for video editing, 3D modeling, rendering, and simulation, most such software likely doesn't support it well enough yet.
    Good and well-maintained FOSS software like Blender might support it quite rapidly and quite well, but many companies have shown themselves to be very slow and ignorant in adopting new tech, like Adobe (even though they seem to accept AI pretty well now), and things like SolidWorks, which still don't understand that modern computers have more than 1 CPU core.

  • @tomstech4390
    @tomstech4390 8 months ago +7

    Imagine if AMD started adding HBM2E or HBM3 (that Samsung connection they have) onto their EPYCs, as well as the 1152MB of L3 cache and the 96 fast cores.

    • @IamBananas007
      @IamBananas007 8 months ago +4

      Mi300 APU

    • @tomstech4390
      @tomstech4390 8 months ago

      @@IamBananas007 24 cores, but yeah fair point. :D

    • @post-leftluddite
      @post-leftluddite 8 months ago +2

      Well, Phoronix published reviews of the newest AMD EPYCs, including Bergamo, and they literally destroyed even the HBM versions of the Sapphire Rapids chips... so apparently AMD doesn't need HBM

    • @VideogamesAsArt
      @VideogamesAsArt 7 months ago +1

      @@tomstech4390 Their MI300C has no GPU cores at all and is 96 Zen 4 cores with HBM, but it's unclear whether they will release it, since there might not be enough demand; their V-Cache already gives them a lot of memory on-die

  • @majstealth
    @majstealth 8 months ago +2

    Damn, these 2 CPUs alone each have half the RAM of each of my ESX hosts - wow

  • @aarcaneorg
    @aarcaneorg 8 months ago

    I called it! Less than a year after I asked when we would be able to boot servers without even needing to add RAM right here on one of your videos, and here we are! Somebody saw my comment and made it happen!

    • @bradley3549
      @bradley3549 8 months ago +1

      Hate to burst your bubble, but the CPU design timeline is such that they would have been actively working on this CPU design for *years* prior to review samples being available.

    • @aarcaneorg
      @aarcaneorg 8 months ago

      @@bradley3549 be that as it may, a lot of things, like the ability to use the onboard cache like system ram, are minor revisions that can be made in firmware or opcodes, the kind of tweaks that can happen at the end. The extra cache was planned for years. Booting from it was my idea.

    • @bradley3549
      @bradley3549 8 months ago

      @@aarcaneorg You're definitely not the first to think of integrating ram and CPU and then booting from it. That's been a feature of CPUs for a LONG time. Just not x86 CPUs. Sorry.

  • @shanent5793
    @shanent5793 8 months ago +1

    What is so difficult about the integration that Intel does but AMD does not? Why is this harder to do than AMD Instinct HBM or Versal HBM? If HBM is used as cache how many sets does it support and how long does it take to search 16GB of cache for a hit?

    • @lukas_ls
      @lukas_ls 8 months ago +1

      It’s "3D" Stacking, that makes it much more expensive. It’s similar to HBM Packaging (but still different) and not just a couple of Dies glued together on the same package. AMD could so it but they want lower costs.
      AMD uses these packaging techniques but not in Ryzen/EPYC CPUs

  • @shadowarez1337
    @shadowarez1337 8 months ago +1

    Hmmm, Nvidia should take a stack of that HBM2e for a new Shield console. And they are sorta hybridizing the next consumer CPU with on-die RAM like Apple did with the M1/M2 SoCs. Interesting times ahead; I can get a frequency-tuned EPYC with enough cores and cache to build out a nice fast NAS.

  • @MNGermann
    @MNGermann 8 months ago

    "I will use this photo that I took at an Intel event where I look awesome" :) :P

  • @matthiaslange392
    @matthiaslange392 8 months ago +1

    These Xeons will Serve The Home - all homes at once 😎
    But who needs this power? Usually storage is the slowest part of a system, and you're better off investing in faster storage than in faster CPUs. Most of the time several cores are idling.
    But I'm sure there are some strange physics simulations as a use case... simulating earthquakes, weather or nuclear fusion... or simply having the fastest Minecraft server of all 😉

  • @michaelmcconnell7302
    @michaelmcconnell7302 8 months ago +1

    How cool

  • @lordbacon4972
    @lordbacon4972 8 months ago

    Actually I was wondering: would the Intel Xeon Max be a good gaming CPU?

  • @ted_van_loon
    @ted_van_loon 8 months ago

    RAM in an APU would eventually also greatly reduce cost. HBM of course is expensive and such.
    But it might become normal to see APUs become the new general CPUs and make APUs more like SoCs. Essentially it allows adding many more features, and RAM in the CPU allows for much simpler and cheaper motherboards and such.
    Meaning that RAM integration in low-end chips allows making super cheap and power-efficient chips (instead of more normal memory modules).
    That said, despite HBM being much more expensive on these high-end systems, it is great. Actually, many years ago, when HBM and HBM2 were still cheap to make (cheap enough to be used in mid-tier gaming GPUs), I also recommended doing essentially the same: using something like HBM directly in a CPU.

  • @m5a1stuart83
    @m5a1stuart83 8 months ago

    But how long does it take to compile a C++ project?

  • @Clobercow1
    @Clobercow1 7 months ago

    I'm curious how well this thing can run Factorio. It might set records. That game needs hella cache and memory bandwidth / latency.

  • @uncrunch398
    @uncrunch398 8 months ago +1

    I foresee people trying this with everything they'd ever do (at least anything that fits in 64GB) without DRAM.

  • @SP-ny1fk
    @SP-ny1fk 8 months ago +1

    Yeah yeah yeah but when can I expect this in my homelab? lol

  • @EyesOfByes
    @EyesOfByes 8 months ago

    But can it run Crysis, or Minecraft at max render distance?

  • @applebiter69
    @applebiter69 8 months ago

    Where is the Cinebench score?

  • @ThelemaHQ
    @ThelemaHQ 8 months ago

    I'm still waiting for a Xeon comeback. I stuck with Xeon until the Gold 6140 before switching to dual red-team EPYC 7742s.

  • @velo1337
    @velo1337 6 months ago +1

    Are you doing a follow-up with this CPU?

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  6 months ago +1

      We will have a video with Xeon Max in it later this week.

    • @velo1337
      @velo1337 6 months ago

      @@ServeTheHomeVideo Would be nice to get some SuperPi, CPU-Z, 7-Zip and Geekbench benchmarks for the 9480.

  • @Stoinksky
    @Stoinksky 8 months ago

    So it uses the storage as RAM?

  • @JohnKhalil
    @JohnKhalil 8 months ago

    First official Windows CPU!

  • @richfiles
    @richfiles 8 months ago

    I wish Apple would adopt this memory style for their Apple Silicon SoCs. No current Mac has upgradable memory. You buy the SoC configured with a memory capacity from the factory, and that's it... Sure would be nice to have off the factory floor fast RAM, and _user expandable_ memory expansion slots for future upgrades.
    I really am liking the direction Intel is going with these!

    • @billymania11
      @billymania11 5 months ago

      Everybody thinks Apple is being stingy or playing games with RAM. Memory of that type can't be slotted. Because of timing and signal propagation, the LPDDR memory has to sit close to the CPU and be soldered. Which in a way leads to HBM memory. I think that will happen and Apple might do that in the consumer or PRO space at some point.

    • @richfiles
      @richfiles 5 months ago

      @@billymania11 What are you even talking about? Numerous laptops and desktops have slotted RAM. Your high-speed RAM remains factory-determined, as part of the SoC, and "slow" RAM can be slotted in at a later date by the user.
      Many computers have used fast/slow RAM configurations. Every modern computer already does this, to a degree, with cache. This is merely adding one more layer in between: SoC fast RAM, and slower socketed RAM.

    • @billymania11
      @billymania11 5 months ago

      Sure Rich, whatever you say. @@richfiles

    • @richfiles
      @richfiles 5 months ago

      @@billymania11 I am literally describing what is inside laptops _today..._ I work in a PC repair shop. I have been building and repairing computers most of my life; my first computer repair was in 1989. Look up how cache memory works. Computers have had different amounts of different-speed memory on and off die for decades. Most CPUs have at least 2 or 3 levels of cache memory, plus the external RAM accessed through the memory controller (also on die with modern CPUs). Some computers (mostly long ago) had both fast and slow RAM, accessed directly by the CPU for the fast RAM and through a memory controller for the slow RAM. The Amiga did this. Even many modern PCs can do this. If you have a matched pair of faster RAM modules in a pair of DIMM sockets on one channel, and a slower matched pair of RAM modules in the DIMM sockets of a separate memory channel, then many CPUs will be able to run each channel at its best speed. There is no reason you can't have a high-speed memory controller with some channels directed to on-SoC chiplet RAM (HBM or HBM-like), while _ALSO_ having some memory channels reserved for slower slotted RAM (either in SODIMM or the newly developed CAMM socket). There is literally no reason a computer manufacturer can't do this, particularly in lower factory memory configurations, where less high-speed factory-installed chiplet SoC RAM is installed.
      You say "sure" like it's something unbelievable... I work on laptops every weekday. More have slotted RAM than don't, and some already solder some RAM on board and have a secondary slot for expansion. No reason you can't have some higher-speed RAM on the SoC, as configured from the factory, and use other memory channels for slower socketed RAM.
      I'd LOVE to have sockets in my Mac Studio, so I could add to the already-present 32GB of high-speed RAM... But YES, Apple is being stingy, because they are profiting from people buying the RAM they _expect to use someday_ right now, while it's still expensive, rather than just buying the RAM they know they need to be high speed and adding slower RAM in the future to alleviate usage for miscellaneous tasks, freeing up the high-speed RAM for more intensive tasks.

  • @rweninger
    @rweninger 8 months ago +1

    HBM is the future. I wonder how long it will take until it reaches consumer CPUs. Though upgrading RAM wouldn't be possible then anymore.

    • @whyjay9959
      @whyjay9959 8 months ago +1

      CXL could allow upgrading RAM then.

    • @rweninger
      @rweninger 8 months ago +1

      @@whyjay9959 HBM2 has a bandwidth of 420 GB/sec. There is quite some way to go for PCIe to allow CXL RAM expansion at that speed.
      PCIe 7 x16 only manages 240 GB/sec. PCIe 7 isn't even out yet, and HBM3 is already beginning its rollout in 2024 with a whopping 512 GB/sec of bandwidth.
      Even the latency on the bus would be way too high, even if the bandwidth were reached.
      With HBM, memory expansions die out. CXL only helps for "slow" DDR5 and DDR6. The HBM standard even states that the RAM must be on the processing logic die.

    • @whyjay9959
      @whyjay9959 8 months ago +1

      ​@@rweninger I think you mean bytes? Found a chart showing 128 gigabytes per second for PCIe gen6 x16. But sure, it's all a tradeoff. CPU-integrated chiplets get inherent performance advantages from having the shortest simplest connections but cannot be changed, so they will probably continue to be combined with slower, more flexible types of memory as preferred.

    • @rweninger
      @rweninger 8 months ago

      @@whyjay9959 I don't get your point. You are right about PCIe 6 x16 and 128 gigabytes; I wrote about PCIe 7, which comes out in 2025.
      You are right, I meant bytes. Sorry about that. If I can correct it, I will.
      Anyway, even not considering HBM3, in 2025 that means HBM2 runs at 25% or maybe 50% speed. That's not a tradeoff, that is... unusable for such a memory.
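
The bandwidth figures traded in this thread can be sanity-checked with a few lines of arithmetic. A minimal sketch, assuming the public per-lane raw signaling rates for each PCIe generation (32/64/128 GT/s) and the commenter's ~420 GB/s HBM2 figure; it ignores encoding and protocol overhead, which is why the PCIe 7 result lands slightly above the 240 GB/s effective number cited above.

```python
# Rough PCIe x16 vs HBM2 bandwidth comparison (illustrative only).
# Per-lane raw signaling rates in GT/s; real throughput is lower
# once encoding and protocol overhead are accounted for.
PCIE_GTS = {5: 32, 6: 64, 7: 128}

HBM2_GBS = 420  # the per-device figure cited in the thread above

def pcie_x16_gbs(gen: int) -> float:
    """Approximate raw one-direction bandwidth of an x16 link in GB/s."""
    return PCIE_GTS[gen] * 16 / 8  # GT/s * lanes / 8 bits per byte

for gen in (5, 6, 7):
    bw = pcie_x16_gbs(gen)
    print(f"PCIe {gen}.0 x16 ~ {bw:.0f} GB/s "
          f"({bw / HBM2_GBS:.0%} of ~{HBM2_GBS} GB/s HBM2)")
```

Even at raw line rate, a full x16 gen-7 link reaches only a bit over half of a single HBM2 device's bandwidth, which is the core of the argument above.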

  • @tostadorafuriosa69
    @tostadorafuriosa69 8 months ago

    What would someone use this much power for?

  • @Marc_Wolfe
    @Marc_Wolfe 8 months ago

    Run a game on it, damn it.

  • @maou5025
    @maou5025 4 months ago

    Can you do some gaming benchmark with HBM only? To see infinite money performance lol.

  • @pete3897
    @pete3897 8 months ago +3

    115 pounds?! Wow, that's really cheap ;-)

  • @miigon9117
    @miigon9117 8 months ago

    I think "without RAM sticks" is a better title than "without DDR5".

  • @hgbugalou
    @hgbugalou 8 months ago

    This is the future. It's inevitable that all RAM will be on the CPU.

  • @kenzieduckmoo
    @kenzieduckmoo 8 months ago +1

    So what I'm seeing here is that Apple's claim that they couldn't add DDR5 slots to the Mac Pro because of unified memory was just their engineers not being allowed to do it.

  • @sykoteddy
    @sykoteddy 8 months ago

    I just find it hilarious that the dies look like the Windows logo. No, I'm not a Windows or Microsoft fanboy; rather the opposite.

  • @Artoooooor
    @Artoooooor 8 months ago

    It has more on-chip memory than my computer has RAM. Darn.

  • @simonhazel1636
    @simonhazel1636 7 months ago

    Question on video quality: everything looks fine except Patrick's face, which is super red, yet the photos of Patrick shown in the video look fine.

    • @simonhazel1636
      @simonhazel1636 7 months ago

      Just to note, it's only at the 4K YouTube setting; if I bump it down to 1440p or 1080p, the issue disappears.

  • @Mihonisuto
    @Mihonisuto 7 months ago

    DSG PCIe5 Accelerator?

  • @Dweller12Videos
    @Dweller12Videos 8 months ago

    A whole lotta glue

  • @SB-qm5wg
    @SB-qm5wg 8 months ago +1

    115lbs in a 2U. That's a thick boi 💪

    • @concinnus
      @concinnus 8 months ago +1

      Seriously. And it's not even water cooled! IME, 115# would be ~5U. 2U was ~60#.

  • @czolus
    @czolus 8 months ago

    So, like the now-defunct Xeon Phi?

  • @AlexandruVoda
    @AlexandruVoda 8 months ago +2

    Well, that is certainly a chip that will not serve the home, but it is very cool nonetheless.

  • @CyberdriveAutomotive
    @CyberdriveAutomotive 8 months ago +1

    I like how Intel made fun of AMD for using chiplets, saying they're "glued together", and now they're doing it lol

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  8 months ago +1

      You have to remember I was one of the people in the room when that presentation was made (and we had EPYC 7601 already in the lab at the time)

  • @davelowinger7056
    @davelowinger7056 8 months ago

    You know if you were born 50 years earlier you would be a horse race caller. Oh wait a minute you still are

  • @Azureskies01
    @Azureskies01 8 months ago

    MI300 would like to have a word...
    *Intel on suicide watch*

  • @KiraSlith
    @KiraSlith 4 months ago

    I'm usually an Intel hater, but man, Threadripper and the GPU mining boom destroying Phi completely messed up superscalar development, and AMD just never filled that market niche back out again with EPYC (they had this niche locked down with Opteron), so there was just a pit there that ARM was slowly trickling into like groundwater. Database hosting apps really needed these bulk-core chips ages ago, but it's good we're at least getting something comparable now in the form of Xeon Max.

  • @alastor2010
    @alastor2010 7 months ago +1

    Isn’t using HBM to cache DDR5 just like using DRAM to cache DRAM?

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  7 months ago

      In a way, yes. But think of it more as caching slower/ higher latency/ higher trace power far DRAM to faster/ lower latency/ lower trace power close HBM. There is a big difference between access over a few mm on package and going out of the package, through the socket, through the motherboard, through the DDR5 socket, onto the DDR5 module and so forth.
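
The near/far tradeoff described in that reply can be illustrated with a simple weighted-average latency model. A minimal sketch; the latency numbers are made-up placeholders for "close HBM" and "far DRAM", not measured Xeon Max figures.

```python
# Effective memory latency when near HBM caches far DDR5.
# Latency inputs are illustrative placeholders, not measured values.
def effective_latency_ns(hit_rate: float, hbm_ns: float, ddr_ns: float) -> float:
    """Weighted average: hits served by near HBM, misses served by far DRAM."""
    return hit_rate * hbm_ns + (1.0 - hit_rate) * ddr_ns

# With hypothetical 100 ns HBM and 130 ns far-DRAM latencies, the
# average latency slides toward the near memory as the hit rate rises.
for hr in (0.5, 0.9, 0.99):
    avg = effective_latency_ns(hr, 100, 130)
    print(f"hit rate {hr:.0%}: effective latency ~ {avg:.1f} ns")
```

The same weighting applies to the trace-power argument in the reply: every access kept on-package avoids the longer motherboard/DIMM round trip entirely.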

  • @aarcaneorg
    @aarcaneorg 8 months ago +1

    These RAM-on-CPU or L4-as-RAM (whichever you want to call it) solutions would be excellent for storage-only or Ceph OSD nodes. Excellent options for saving cost and power when all you really need is a few PCIe lanes and some compute.

    • @berndeckenfels
      @berndeckenfels 8 months ago +2

      That totally fits the strategy of making Ceph as expensive as possible to make it usable :)

    • @aarcaneorg
      @aarcaneorg 8 months ago

      @@berndeckenfels The idea is to save cost by getting a cheap motherboard with only 0 or 1 LRDIMM slots per CPU and using these as low-cost high-thread chips to power the cluster.

    • @berndeckenfels
      @berndeckenfels 8 months ago +1

      @@aarcaneorg They are not low cost (nor are there cheap mainboards for them).

    • @aarcaneorg
      @aarcaneorg 8 months ago

      @@berndeckenfels Yes, as with all fancy new hardware that solves low-cost problems, it launches at obscene prices, then eventually comes down in price. Eventually these, or their descendants, will become affordable and mainstream.

    • @berndeckenfels
      @berndeckenfels 8 months ago

      @@aarcaneorg That sounds rather unlikely; that's a specialized HPC model with an expensive production process and a special socket and BIOS... not like a mass-market Xeon D.

  • @benardmensah7688
    @benardmensah7688 8 months ago +1

    Apple Silicon, Intel Max CPU!! I feel AMD will be in trouble next year when this goes mainstream.

  • @uncrunch398
    @uncrunch398 8 months ago +1

    No sleep states are needed on any platform, though they are preferred when running on battery. A workstation or gaming PC benefits from disabling them, except for power-choking unused cores to boost the heavily used ones, or when cooling is insufficient, so sleep states are needed to help with that. Lacking them is not a reason not to use the same CPU as in this video for those workloads. What is always relevant is the performance per cost. Or just performance, if cost doesn't matter.