So if Hardware RAID is dead... then what?

  • Date added: 10 May 2024
  • It's been over a year now and the fact still remains: Hardware RAID is Still Dead. At least according to Wendell... I'm still just the editor. So... what do we do now?
    0:00 Background
    2:18 Hardware RAID is evolving
    4:23 Latency
    7:09 Blocks
    11:29 Hypothetical Performance
    14:53 Theoretically...
    16:02 Solution?
    18:18 Intel
    19:24 Parody Calculation
    20:08 Tangent
    21:11 Hot Plugs
    21:41 Vroc
    23:59 The future?
    25:19 Vroc and Linux
    26:39 Not Desktop
    27:37 Conclusion
    ********************************
    Check us out online at the following places!
    bio.link/level1techs
    IMPORTANT Any email lacking “level1techs.com” should be ignored and immediately reported to Queries@level1techs.com.
    -------------------------------------------------------------------------------------------------------------
    Music: "Earth Bound" by Slynk
    Edited by Autumn
  • Science & Technology

Comments • 560

  • @timetuner
    @timetuner Před 3 měsíci +367

    Damn. Man's lost some weight since that 2022 vid. good on him

    • @Level1Techs
      @Level1Techs  Před 3 měsíci +262

      First a bug bite, now cardio. New proper-fitting shirt from lttstore.com. Woooo

    • @boanerges5723
      @boanerges5723 Před 3 měsíci

      @@Level1Techs I know you guys have a relationship with LTT, but Linus shills more than enough that you don't have to help the creep out. I haven't forgotten that he asked "do you watch bestiality?" in an employee interview. What a gross and immature person.

    • @johnbeeck2540
      @johnbeeck2540 Před 3 měsíci

      Glad for you - your health is an important consideration for us to keep Wendell around longer! @@Level1Techs

    • @-FOXX
      @-FOXX Před 3 měsíci +41

      You look great Wendell.
      Sorry for the circumstances, but you seem like you feel better today than you did before, too.
      Best wishes.
      @@Level1Techs

    • @soldiersvejk2053
      @soldiersvejk2053 Před 3 měsíci +5

      I was thinking the same!

  • @TheJkilla11
    @TheJkilla11 Před 3 měsíci +80

    Cost would be the reason people still use spinning rust; not everyone can afford NVMe SSDs for 40+ TB.

    • @blackdevil72
      @blackdevil72 Před 2 měsíci +8

      The reality is more that very few businesses can afford AND need NVMe SSDs. A surprisingly high number still use tape for long-term storage and backup because the $/TB is so low, same with SAS HDDs, and when there is a need for some speed, SATA and SAS SSDs will do the trick.

    • @TheJkilla11
      @TheJkilla11 Před 2 měsíci +5

      @@blackdevil72 Right, this is one of the few times I feel like Wendell is almost out of touch with reality.

    • @Heathmcdonald
      @Heathmcdonald Před 2 měsíci +4

      No, he is still employed in the field; I think he just has higher-end clients, or maybe it's his own server farm. It's not typical for the person who knows the technology to own the servers, and the people who don't know anything except the dollars dictate that we stay in the stone age. But that's just my opinion; I don't claim to be an authority, to be sure.

    • @TheJkilla11
      @TheJkilla11 Před 2 měsíci +4

      @@Heathmcdonald I'm not saying this isn't where it's going, but I just feel like at this time it's not practical for 98% of the businesses and/or individuals out there because the cost of ownership is too high.

    • @mytech6779
      @mytech6779 Před 2 měsíci

      @@TheJkilla11 Not every storage server is providing simple bulk data backup; some are being used for high-demand databases, e.g. sometimes with a couple TB of RAM for caching.
      For simple bulk data, magnetic is obviously the way to go, but nobody doing that is particularly concerned with latency and throughput, and RAID is just used for its short-term pseudo-redundancy. Spinning drives for online failover, warm archives, or smaller quantities; tape for massive quantities of long-term cold archives... you need hundreds of terabytes per year to make the added equipment and process worthwhile, though shipping tapes offsite is lower risk and cost than shipping spinning drives, and it's mostly the sort of data kept to meet legal requirements, with 2 days of latency in the rare case it is ever accessed again.

  • @skirnir393
    @skirnir393 Před 3 měsíci +13

    Paradoxically, hardware RAID creates a single point of failure that directly affects your precious data. In case of failure you need an identical card, or even another identical server if the mobo has built-in RAID. Also, two identical cards can be incompatible if they don't have the same firmware version. What a nightmare!

  • @JeffGeerling
    @JeffGeerling Před 5 měsíci +126

    But if I'm using Ceph... and each node is an OSD... then in a sense, is that hardware RAID? 🤔

    • @ziggo0
      @ziggo0 Před 5 měsíci +23

      I'd recommend the Chewbacca Defense - the Internet is fierce.

    • @marksnethkamp8633
      @marksnethkamp8633 Před 3 měsíci +15

      Only if you run it on a pi

    • @omegatotal
      @omegatotal Před 3 měsíci +5

      distributed raid

    • @pewpewpew8390
      @pewpewpew8390 Před 3 měsíci +1

      Perfect Ceph node! 1 massive CPU, 1 Gen 5 NVMe, 2x 100Gbps.

    • @jttech44
      @jttech44 Před 3 měsíci +2

      I mean, by that measure, all raid is hardware raid.

  • @5urg3x
    @5urg3x Před 2 měsíci +13

    Every SSD is, by itself, a mini raid controller. There are tons of NAND chips, along with DRAM cache, and a controller IC. Just like a hardware RAID controller. The controller takes the NAND and presents it to the host as a virtual disk, or LUN, and then manages garbage collection, etc. intelligently in the background.

  • @eldibs
    @eldibs Před 3 měsíci +38

    "That's cool, sign off on this form here that says you're aware of that..." Oh man, that is excellent CYA, Wendell definitely has experience working for corporate.

    • @OCONTECH
      @OCONTECH Před 3 měsíci +1

      0:03 Shut it Wendell, Hardware RAID Will Never Die, unlike a crappy 007 movie... Peace, and don't let it...

    • @eldibs
      @eldibs Před 3 měsíci +8

      @@OCONTECH Fine, if you could just sign this form that you were made aware of the problems and decided to continue using that setup?

    • @OCONTECH
      @OCONTECH Před 3 měsíci

      I'll get back to you :P @@eldibs

  • @paulblair898
    @paulblair898 Před 3 měsíci +44

    Before anyone adopts VROC for any reasonably sized setup, look into the deferred parity calculation issue. Basically, VROC gives up on calculating parity and the volume will become inconsistent if you use more than 4 drives. This issue is so egregious it makes GRAID look good.

    • @marcogenovesi8570
      @marcogenovesi8570 Před 3 měsíci +4

      that's neat.

    • @Dan-uk5vv
      @Dan-uk5vv Před 3 měsíci +2

      @paulblair898 Where can I find more information on this deferred parity calculation issue?

    • @paulblair898
      @paulblair898 Před 3 měsíci

      @@Dan-uk5vv There isn't much info available about the issue because it's basically an undocumented deficiency, but the STH forums have about the most information on it you'll find; just search "VROC deferred parity" and it will be the first thing that comes up.
      The fact that more people haven't shed light on the problem tells me VROC is not popular for even modest-sized arrays.

    • @Anonymous______________
      @Anonymous______________ Před 2 měsíci

      Just use mdadm + LVM, or ReFS + Storage Spaces Direct, instead of Intel's VROC.

  • @TayschrennSedai
    @TayschrennSedai Před 3 měsíci +33

    14:22 Fun fact: even Microsoft and Pure Storage weren't aware that in Hyper-V with CSV read cache enabled (the default) you get significantly worse performance, even on writes but especially on reads. We had a Microsoft Premier review done, and they asked us why we had it disabled. It's because when you have an NVMe SAN, even running on 32Gb FC (not NVMe-oF, just SCSI FC), you get better performance fully bypassing all memory cache technology.
    To the point that the guest OS on Hyper-V by default enables cache and won't let you disable it, thus hurting performance. Very likely due to cache misses, honestly.
    We complained about the fact that we couldn't disable it, but they said it was for our safety (which makes no sense). Yet they then said SQL automatically bypasses cache...
    Thus, we moved to VMware, where the guest OS totally allows disabling cache.

    • @jaideepunique
      @jaideepunique Před 3 měsíci

      Can you explain it like I'm 11

    • @cpadinii
      @cpadinii Před 2 měsíci

      Where is that setting in VMware?

    • @TayschrennSedai
      @TayschrennSedai Před 2 měsíci +2

      @@cpadinii It's at the Microsoft layer: disk cache on the drive in Disk Management. You don't need to touch VMware, it just lets you do it. Microsoft doesn't in Hyper-V, though they claim that SQL bypasses it.

    • @TayschrennSedai
      @TayschrennSedai Před 2 měsíci

      @@jaideepunique Hyper-V sucks at high-performance FC, and as of yet doesn't support any NVMe over Fabrics. Though I'm pretty sure they'll go NVMe over RoCE or TCP, not FC. They're likely using that under the covers already in S2D and Azure Stack HCI.

    • @X0M9JKEEE
      @X0M9JKEEE Před 2 měsíci

      And now they are moving to subscription model plus raising prices. Time to move on. Again :)

  • @musiqtee
    @musiqtee Před 3 měsíci +50

    Small business, low-budget reality: I've chosen to use ZFS for some time. Twice, some fluke destroyed system HW parts. I was still able to move the drives to a bird's nest of USB, SATA and power, and rebuild the whole thing.
    Let's just say that said small business didn't do 3-2-1 replication, so had they used HW RAID, no spare controller would have been around either…

    • @boomergames8094
      @boomergames8094 Před 2 měsíci +8

      That's what I like about zfs also. I've moved drives from one hardware box to another many times.
      I've found ZFS more reliable and overall better than my old Adaptec or motherboard based raid, and better than Windows server software raid.

  • @jsebean
    @jsebean Před 3 měsíci +12

    Just to be clear, from someone who does extensive testing of btrfs for my own projects: btrfs does not solve the write hole issue yet on raid5/6, as sub-stripe writes are not atomic as they should be. It does solve the consistency issues with regard to raid1/mirrors/raid10, just not parity. This is due to design flaws early on in the filesystem's development; however, recent kernels are seeing a major redesign to get zoned (HM-SMR) device raid support that will also solve the raid5/6 write hole issue. It's almost coincidental that zoned support will also fix raid5/6: since SMR devices can't overwrite in place without write amplification, the idea is to basically make all writes COW and then reclaim unused blocks later by rewriting things sequentially as old blocks get freed up. Bit of a write amplification threat for certain workloads when you need to start reclaiming blocks (btrfs will do this via a "balance" of the "chunks" in question), but it is a solution nonetheless.
    So ZFS remains the only good solution for this right now, imo; it has variable-width stripes to keep writes atomic with raidz, unless you count the MD RAID journal, which is its own form of write amplification: you need to write everything twice to keep it consistent. Hopefully very soon these changes will fix the btrfs raid5/6 write hole issue (it's called the raid stripe tree, for anyone following development). There's also bcachefs, which has its own solution, though it too is very early days. The bcachefs approach is to write in replicas (copies) first until a full stripe is written, then write out parity blocks when a full stripe is complete and dump the second copy. I can't speak to what it does when holes in files appear or what sort of reclaim functionality it has; the erasure coding support isn't even included in the latest releases, last I heard.
    Then there's the issue of poor scrub performance on btrfs raid5/6, but that's a separate issue lol.
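
    For anyone wanting the write-hole-free option described above, here's a minimal ZFS sketch (device names are placeholders; raidz uses variable-width, copy-on-write full stripes, so there is no RAID5/6-style write hole to journal around):

    ```
    # hypothetical devices; ashift=12 assumes 4K-sector drives
    zpool create -o ashift=12 tank raidz2 /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1 /dev/nvme4n1
    zpool status tank    # confirm the raidz2 vdev is online
    ```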

    • @marcusjohansson668
      @marcusjohansson668 Před 2 měsíci

      AFAIK btrfs was not intended to be a raid fs to begin with, so nothing of what you type should be very surprising.
      btrfs is getting there though, but I am not sure you ever want to use that for raid since it was not created for that intended use.
      For max stability (enterprise use) ZFS is the choice.
      The strength of btrfs is the snapshotting and COW not raid.

    • @satunnainenkatselija4478
      @satunnainenkatselija4478 Před 2 měsíci

      Hardware RAID is still dead because OneDrive is more secure and reliable.

  • @abavariannormiepleb9470
    @abavariannormiepleb9470 Před 3 měsíci +73

    My whole desire for RAID is a system where its operation isn’t interrupted by a drive hardware failure, that includes the OS drives if you choose to use Windows.
    Not wasting the performance of modern NVMe drives would be a nice bonus.

    • @timothygibney159
      @timothygibney159 Před 3 měsíci +3

      The big reason is to prevent corruption in a power or system failure. RAID cards are slower than mobo RAID since they're not integrated into the CPU, but they have batteries, so a transaction in an Oracle database can still finish or be undone. It's not for PCs.
      Today, though, a SAN with iSCSI hosts the important enterprise stuff, and a simple hard drive replacement can fix the OS on the server with the data intact.

    • @abavariannormiepleb9470
      @abavariannormiepleb9470 Před 3 měsíci +3

      @@timothygibney159 Have never had any file system corruption since switching over to ECC memory in 2015 and a little later SATA AHCI as well as PCIe NVMe SSDs (U.2/3 easy, M.2 harder) where the drives themselves have individual powerloss protection.

    • @bobingabout
      @bobingabout Před 3 měsíci +6

      Completely agree. My primary desire to use RAID is for the redundancy protection against hardware failure. While "software" RAID can do the job, I'd much rather have a controller doing it all in hardware, so the CPU is free to just do its job.

    • @bosstowndynamics5488
      @bosstowndynamics5488 Před 3 měsíci +14

      @@bobingabout The problem with this is that hardware RAID winds up creating another single point of failure - the card itself. An Epyc processor with ECC RAM, twin PSUs and a reasonable UPS with software RAID will give you excellent uptime already, I'm not convinced adding an additional singular controller in between can actually improve that...

    • @whateverrandomnumber
      @whateverrandomnumber Před 3 měsíci +6

      It all depends on the scale.
      For home data hoarders and small offices, one or two computers running ZFS is still the best option (controller - and almost system - agnostic, with redundancy. Zfs nowadays also has optimisations for hybrid rust and flash drives).
      For bigger projects, SAN is the solution. SAN uses some software witchery to decide physically the best place to put redundancy, but you need at least three servers - five being the recommended minimum for performance.
      Hardware raid has been dead for a long time.

  • @huboz0r
    @huboz0r Před 3 měsíci +28

    These kinds of overview and deep-dive videos are just the best content. I bet half of the storage community is anxious and the other half is excited whenever one comes out. More of these deep insight / explainer / field update videos please! Been waiting for this follow-up since 2022. :)

  • @spiralout112
    @spiralout112 Před 5 měsíci +7

    Great video, must have been a lot of work putting all that together!

  • @vincei4252
    @vincei4252 Před 3 měsíci +11

    0:20 The South Park-esque animated "RAID is dead" speech shows the editor is on fire 🔥🔥🔥 :)

  • @joelnrs
    @joelnrs Před 3 měsíci +59

    That RAID graphic was hilarious! 0:23

    • @nathanddrews
      @nathanddrews Před 3 měsíci +13

      I had no idea RAID was Canadian!

    • @gregz83
      @gregz83 Před 3 měsíci

      czcams.com/video/zEmfsmasjVA/video.html@@nathanddrews

  • @rodhester2166
    @rodhester2166 Před 3 měsíci +9

    As long as those TPS reports are on my desk first thing in the morning.

  • @kahnzo
    @kahnzo Před 3 měsíci +3

    What about DPUs? Will Data Processing Units change how PCI-e lanes are used at some point?

  • @JohnSmith-yz7uh
    @JohnSmith-yz7uh Před 3 měsíci +7

    So we solved RAID on a local server, but how do we connect it to multiple hosts? Is iSCSI or Fibre Channel still relevant? Would love to see a video comparing FC32/64 to iSCSI, or a cluster like Ceph.

  • @FrenziedManbeast
    @FrenziedManbeast Před 5 měsíci +2

    Great video that answered many questions I had as a follow-up to the previous video on hardware RAID, thanks Wendell! I have no professional hardware enterprise experience but I like to learn about these technologies as often we plebs get some trickle down tech for HomeLab/Gaming use.

  • @bill_and_amanda
    @bill_and_amanda Před 3 měsíci +23

    Some of us can't afford fancy U.2 SSDs, WENDELL :D

  • @timmywashere1164
    @timmywashere1164 Před 3 měsíci +9

    Also, look on the bright side of death: at least the hardware RAID side has Monty Python references.

  • @ivanmaglica264
    @ivanmaglica264 Před 3 měsíci +1

    Hi Wendell, one question: is there an external JBOD enclosure solution for fast U.2? You can, after all, put only 24 drives in a 2U chassis. More generally, what is the solution if you outgrow your 2U box? Ceph on top of U.2 seems like a waste of money.

    • @mytech6779
      @mytech6779 Před 2 měsíci

      I have seen chassis with 2- or 3-drive-deep caddies, so you could have 48 or 72 drives. Can't recall where I saw that chassis though.

  • @johndoh5182
    @johndoh5182 Před 3 měsíci +4

    So, a point I was trying to make elsewhere. It's really about latency, but also about the limitations of the AMD CPU-chipset link, which uses PCIe lanes (currently PCIe Gen 4 x4), when you have very fast NVMe connected via the chipset. Moving to Gen 5 would double the bandwidth, BUT it would also cut latency, because both PCIe Gen 4 and Gen 5 use NRZ, so in moving to Gen 5 the clock speed doubles. With a doubling of clock speed, IF that were your only latency, you would reduce the latency by half.
    This isn't much of an issue for many users because they aren't hitting multiple disks, but if you have 2 NVMe drives off the chipset and are hitting both of them at the same time, my feeling is that, depending on how fast those disks are, the latency is going to start to matter. Maybe it's noticeable, maybe it's not. But if you have 2 Gen 5 drives which would be operating at Gen 4 speed, the more expensive disks are going to handle those intermittent requests faster, so at some point the latency of a Gen 4 link is going to have an effect on data transfers to the CPU.
    How fast the NVMe disks have to be before that happens, I don't know. But certainly, if you're moving large blocks of data with 2 NVMe disks off the chipset and each disk can exceed the link speed (the link being 8GB/s, the 2 NVMe rated for 12GB/s), there is CERTAINLY a bottleneck there, and if anything else has to move across the CPU-chipset link, your system can become unresponsive.

    • @FutureChaosTV
      @FutureChaosTV Před 3 měsíci

      Have you actually tested this thesis or are you talking out of your behind?

    • @CHA0SHACKER
      @CHA0SHACKER Před 3 měsíci

      These drives are connected directly to the CPU though and not the chipset

    • @johndoh5182
      @johndoh5182 Před 3 měsíci

      @@CHA0SHACKER Uhhhh no.
      X570 can have 3 NVMe disks. 1 runs directly to the CPU and the other two via the chipset. They are both gen4 slots, you know like ever since 2019 when X570 came out? The same can be true for X670 and X670E. Depends on MB configurations, but almost all X570 boards that have 3 NVMe ports have 1 off the CPU (it's impossible for there to be more unless the MB maker takes away lanes from the PCIe slots) and two off the chipset.
      I can give you a short list and you can validate off their specs or look at a block diagram for the boards:
      Gigabyte X570 AORUS Master
      Gigabyte X570 AORUS Ultra
      Gigabyte X570S AORUS Elite
      Gigabyte X570S AORUS Master
      Plus any of the boards in a Wifi version
      MSI X570 Unify
      MSI X570 Godlike
      MSI X570 ACE
      Asrock X570 Taichi
      There are more, I'm not going to bother looking them up as I don't build with any other brands and I typically don't use Asrock.
      Basically ANY X570 board that I have chosen to work with or do builds with has that configuration: 3 NVMe ports, 1 off the CPU and 2 off the chipset.
      Back when gen4 first got into the mainstream market, it wasn't much of a concern that two NVMe ports that are PCIe gen4 have to communicate to the CPU via a SINGLE PCIe gen4 X4 link. But that was SO 2019 and now that gen5 NVMe can EXCEED 8GB/s and even their queue depth of 1 speeds are much faster, at least the better quality ones, the statement/question I made is a valid one.
      It should be pretty apparent that if you're moving data from 2 NVMe disks off the chipset where both can run at around 8GB/s, max for the PCIe gen4 X4 interface, if both disks off the chipset are moving data to the CPU or elsewhere this would exceed the chipset's bandwidth when dealing with very large files since that's when disks tend to be at their fastest since the request for many pieces is being made at the same time.
      In fact it would be VERY simple for this to happen. Simply put those two disks into a RAID, and read a very large 4K video file from that RAID. If both are newer disks that exceed 8GB/s, they're being bottlenecked. However, and this is also key, the destination has to be able to receive data fast enough for it to make a difference. If a person were to use one X16 slot for the GPU and the other X16 (8 physical lanes) for an NVMe adapter, with once again 2 NVMe that exceeds 8GB/s each AND you also put those in another RAID, both the source and destination will now exceed the speed of the CPU-chipset interface.
      I personally won't validate it. I have a dual-OS system, and one NVMe off the chipset is used for Windows gaming while the other holds Linux. As I said, this was a discussion elsewhere. I didn't want people to come here and drill me about it, though; I was curious to see if someone actually has experience with this. It's apparent that both comments so far do not.
      And just FYI, before Zen 4, there was only a single NVMe connected to the CPU. I believe all the way back to first gen Ryzen there has been 24 lanes, 16 to the PCIe slots, 4 for an NVMe, and 4 for the chipset. Moving to Zen 4 AMD added 4 lanes for 28 total. SOME MBs use all 28 lanes, but when I looked at the launch boards that came out with Zen 4, many didn't use those extra 4 lanes and STILL use the chipset for more than 1 NVMe. I KNOW there are some boards (AM5) that use both 4 lane groupings off the CPU for 2 NVMe. Don't feel compelled to list some. I KNOW it, I've read their specs.

    • @CHA0SHACKER
      @CHA0SHACKER Před 3 měsíci +1

      This video is about enterprise server SSDs though, which are most likely used in Epyc or Xeon systems. Epyc systems don't even have a chipset anymore, so all SSDs are going to be connected to 4 of the 128 available PCIe Gen 5 lanes.
      The RAID discussion isn't about the client platform of any manufacturer.

    • @CHA0SHACKER
      @CHA0SHACKER Před 3 měsíci +2

      @@FutureChaosTV dude is for some reason talking about client systems which the discussion about hardware vs software RAID doesn’t even apply to

  • @TinyHomeLabs
    @TinyHomeLabs Před 3 měsíci +51

    Love the holy grail reference 😂

    • @user-sd3ik9rt6d
      @user-sd3ik9rt6d Před 3 měsíci +6

      Run a raid, run a raid, run a raid.

    • @johnmijo
      @johnmijo Před 3 měsíci +2

      It is the Rabbit Hole, Look at all the BLOCKS :p

    • @xXx_Regulus_xXx
      @xXx_Regulus_xXx Před 3 měsíci +1

      @@johnmijo It's got I/O a MILE wide!

  • @jdg7327
    @jdg7327 Před 3 měsíci +2

    I've fought tooth and nail with my higher-ups about doing away with hardware RAID, and pushed for flash cluster storage. It was so hard to get across the point that NVMe is so much faster than SATA/SAS is, and that hardware RAID was absolutely abysmal for our next-gen storage project.

  • @MazeFrame
    @MazeFrame Před 3 měsíci +6

    I like my BTRFS setup, even though it is only SATA. Feels way better than "hopefully this used RAID-card does not die"
    Would be bad if the backup-target was in need of backing up all the time.

    • @kellymoses8566
      @kellymoses8566 Před 3 měsíci

      SATA is obsolete

    • @bosstowndynamics5488
      @bosstowndynamics5488 Před 3 měsíci

      ​@@kellymoses8566Not until spinning rust starts shipping with NVMe interfaces (or flash gets *a lot* cheaper (expected to get a lot more expensive this year due to fab bottlenecks resulting in shortages)) and consumer platforms start getting 100+ PCIe lanes. Until then, unfortunately, SATA is still state of the art for bulk storage in home labs and smaller scale systems

    • @gordslater
      @gordslater Před 3 měsíci

      @@kellymoses8566 exactly - everybody should just grow extra money to make them rich

    • @abel4776
      @abel4776 Před 2 měsíci

      @@kellymoses8566 What supersedes it for spinning disks?

  • @user-co8vc5nd7l
    @user-co8vc5nd7l Před 3 měsíci +1

    I just built a RAID-less Storage Spaces Windows server with all-flash storage.
    It's grotesque how fast this thing is; I absolutely love it.

  • @zodwraith5745
    @zodwraith5745 Před 3 měsíci +13

    I've been using RAID 0 for decades, but switched to software several years ago. It's just so easy to set up that I don't miss dealing with hardware RAID drivers every time I need to reinstall Winders.

    • @-ADACOR-
      @-ADACOR- Před 3 měsíci +1

      I don't know if I just got lucky or if I misunderstood but if you need raid drivers then it's not real hardware raid.

    • @zodwraith5745
      @zodwraith5745 Před 3 měsíci +1

      @@-ADACOR- Most likely you'll need raid drivers for hardware raid if you install Windows *_ON_* said raid array. How else would Windows start loading if the motherboard doesn't know to pull from multiple disks?
      My old motherboard based hardware raid always flashed a quick message where it loaded the raid drivers before loading Windows, and I had to preinstall those drivers from the motherboard disk any time I needed to reinstall a fresh copy of Windows. Maybe there were boards with BIOS where raid drivers were built in, but I haven't bought a motherboard with hardware raid in a bit.
      Haven't you ever noticed the option to preload drivers during an OS install? That's for stuff exactly like this. Besides that, why would you ever need _drivers_ for software raid? _Anything_ that requires drivers should automatically mean hardware. So obviously it _would_ be "real" hardware raid.

    • @-ADACOR-
      @-ADACOR- Před 3 měsíci

      @@zodwraith5745 I'll have another look at my old servers. I only ever recall having driver problems with motherboard software raid, all my lsi cards have been plug and play other than setting them up in the preboot environment. I figured windows shipped with drivers for real hardware raid cards and that built in or software raid cards were just a gamble because of how they offloaded overhead to the CPU through their own drivers.
      Want DAS with zfs running on top and can't afford a proper SAN,

  • @cdoublejj
    @cdoublejj Před 5 měsíci +2

    excited to watch this, also still using hardware raid 10 in the home nas with a dell perc card.

  • @ElliottVeares
    @ElliottVeares Před 3 měsíci

    What do you think the way forward is for big players like NetApp with their proprietary RAID systems like RAID-DP, which is like RAID 6 but with two dedicated parity disks, one for horizontal striping and one for diagonal striping?

  • @n.stephan9848
    @n.stephan9848 Před 3 měsíci

    What do you do if you want RAID within one machine (so no NAS) and still want to be able to access it on a system that has 2 operating systems?
    I do realize a NAS would probably be best, but if you don't exactly have space for another machine, what do you do?

    • @sceerane8662
      @sceerane8662 Před 3 měsíci +1

      If the two operating systems are Windows and Linux, you can format the disks using Windows software RAID and use ldmtool to recognize and mount the array on the Linux side of things.
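
      As a rough sketch of that workflow (the volume name is just an example, and ldmtool covers classic LDM/dynamic-disk arrays rather than newer Storage Spaces pools):

      ```
      sudo ldmtool scan                # list disks carrying Windows LDM (dynamic disk) metadata
      sudo ldmtool create all          # create device-mapper targets for the discovered volumes
      ls /dev/mapper/                  # e.g. ldm_vol_MyArray shows up here
      sudo mount -o ro /dev/mapper/ldm_vol_MyArray /mnt/winraid
      ```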

    • @Mr.Leeroy
      @Mr.Leeroy Před 3 měsíci +2

      Virtualize both systems in a third (Proxmox) and, as a bonus, gain the ability to run them simultaneously.

  • @paulwratt
    @paulwratt Před 3 měsíci

    When Wendell did the initial "RAID is Dead" video, I was looking at filesystems for other needs, and came up with a 520-byte filesystem that worked on 512-byte & 4K block sizes, where the driver could (also) be implemented in hardware (FPGA?). Technical details provided in this video say that the "hardware driver" should be paired with a PCIe bridge, to allow for direct U.2-to-CPU reads. The only real question is whether there is an FPGA / PLC that is fast enough to _not_ bottleneck PCIe 5 throughput on writes. If there were other _Optane-like_ devices, onboard memory / cache would be a non-issue. It seems the recent stink on the kernel list about _inodes_, and the fact that Linux is moving to _user space filesystem drivers_, should dictate a different way for kernels to provide _data-to-apps_ and _filesystem-structure-to-app_ while still allowing for _device-direct-to-memory_ type operations.

  • @Spiral6SM
    @Spiral6SM Před 3 měsíci +10

    I certify drives for a living at a certain major vendor, working with a certain major hypervisor, and I'll mention this: drives are just getting better, and better, and better. IOPS are also going up and up, and what you've mentioned about the interrupts of physical interaction being faster than software is very true. In the enterprise space, hotplug support is the biggest thing that people want, particularly from NVMe. Easy to maintain and replace. With certain enterprise chassis offerings extending to even just 36 E3.S drives in a single chassis, the density, hotplug and price are enough to really cement that NVMe is the future over SAS and SATA. Anyway, you generally hit the nail on the head: no more hardware RAID, because RAID support is hamstrung by relying on the drives to report errors correctly (unless it's ZFS, etc.).

    • @kevinerbs2778
      @kevinerbs2778 Před 3 měsíci +1

      My problem is the low-queue-depth random speed, which isn't that much better than a spinning-platter drive. It's only about double on most modern NVMe SSDs, while 3D XPoint Optane drives are insanely fast on random, up to 10x faster on random read & write at the lowest queue depths.

    • @timramich
      @timramich Před 3 měsíci +2

      In my eyes, SSDs are getting crappier. When they first came out they said in like 10 years that they would be cheaper than spinning disks per GB. That hasn't happened, and they're getting crappier, what with upping the levels per cell. It doesn't matter one bit for enterprise people with money to piss away, to have to just replace them all the time.

    • @ericneo2
      @ericneo2 Před 3 měsíci

      Out of curiosity what file system and software RAID would you use?

  • @edwarddejong8025
    @edwarddejong8025 Před 3 měsíci +1

    My LSI RAID cards have run for 8 years flawlessly. When a drive fails, I pop in a new one and about 6 hours later the RAID 60 cluster is rebuilt. It is awfully convenient not to have to do anything at all with the OS; no CLI operations are usually needed. It is automatic. I understand the new NVMe drives are super fast compared to my mechanicals. But they don't last as long at all; I have written 36k TB to my hard drives, and I would have burned out all of the SSDs in a few years vs. the 9 years (and counting) my mechanicals deliver. The real bottleneck I have is the 1 Gbit Ethernet.
    The beauty of a hardware controller like the LSI (now owned by Avago/Broadcom) is that you may lose some performance but you gain convenience of maintenance, so that failures can be handled by remote hands, with a red blinking light on the bad drive.

    • @eDoc2020
      @eDoc2020 Před 3 měsíci

      With software RAID you can also rebuild automatically, you just need to configure the system for it. I'm sure there are tons of software packages which help set this up. Same with lighting up the failure indicator.

  • @satsuke
    @satsuke Před 3 měsíci +2

    I've got 4PB of NVMe in one cluster. It's still organized as hardware RAID because there was a need for encryption at rest, which has a performance penalty in software, and because it makes a lot of system dimensioning tasks easier when I can predict exactly how it will behave, even if an entire enclosure is down or if more than one stripe group is rebuilding from a failure.

    • @NatesRandomVideo
      @NatesRandomVideo Před 3 měsíci +1

      Almost all businesses who take anything seriously have an encryption at rest requirement. You’re not wrong. The number of places that ignore it is significant. They believe they’re safe because they think they have physical access security - which fails under attack over and over and over again.

    • @cpadinii
      @cpadinii Před 2 měsíci

      What is the hardware you are running that on?

  • @richardheumann1887
    @richardheumann1887 Před 3 měsíci +2

    Maybe a stupid question, but is data rot a thing in normal day to day computing? And would something like ZFS help there? Obviously I know nothing about servers and Enterprise computing....

    • @jsebean
      @jsebean Před 3 měsíci +3

      Absolutely, disks lie and have issues all the time. Using a checksumming solution to verify data, whether it's ZFS, Btrfs, or dm-integrity as Wendell mentioned, is the only way to detect this before it's too late. However, keep in mind that for most home users the most common causes of "bitrot" are bad memory, cosmic rays, etc. Almost every bus and connector in your PC has some sort of verification to detect errors (i.e. CRC), except memory. Using ECC memory is the only way to prevent this, even in the presence of something like ZFS. Unfortunately, Intel in their infinite wisdom still doesn't allow ECC use with consumer-level hardware; only AMD does.
      And btw, DDR5 doesn't solve the issue even with its on-die ECC. It's not ECC from the CPU/memory controller up, it's only on the RAM chip itself. So don't think your DDR5 system somehow solves this issue; you still need proper ECC.
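
      For what it's worth, detection on the filesystem side comes down to periodic scrubs; a minimal sketch (pool and mount names are examples):

      ```
      zpool scrub tank && zpool status -v tank                      # ZFS: re-read all blocks and verify checksums
      btrfs scrub start /mnt/data && btrfs scrub status /mnt/data   # Btrfs equivalent
      ```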

    • @jsebean
      @jsebean Před 3 měsíci +1

      I will say, though, that dm-integrity has consistency issues: the checksums it generates need to be journaled or they can end up inconsistent after a crash. It does have a bitmap mode where it will recalculate csums after a crash for a range of "dirty" blocks it was working on, but this means you'll be missing crash consistency in your csums, which isn't an issue with something like ZFS or Btrfs.

    • @lunalicrichard
      @lunalicrichard Před 3 měsíci

      @Jsebean ; Thank you!

  • @hallkbrdz
    @hallkbrdz Před 3 měsíci +1

    Great update on what's going on in storage. I'm always interested from the database side where fast read IOPS from storage and lots of RAM solve most problems. Writes are rarely the problem.

  • @brainthesizeofplanet
    @brainthesizeofplanet Před 2 měsíci +1

    OK, stupid question:
    when I have a 24-bay NVMe chassis, where do I plug in all the cables? Which mainboard has so many connectors?

    • @Level1Techs
      @Level1Techs  Před 2 měsíci +1

      Check out the tyan server review we did for amd epyc. Takes all 100 pcie lanes right to the front of the chassis

  • @MiddleSiggyGames
    @MiddleSiggyGames Před 2 měsíci

    First, great video... I think (and people have already said this) that cost is the new issue. For example, I just built a 240TB RAID array for around $3k; to do the same thing with the drives you mentioned would run around $24k, which puts it out of budget when speed is not the issue, but rather mass storage of data that will most likely sit for an eternity. I am also curious what the lifespan of the new SSD/NVMe drives is; I have hard drives that are now 30 years old, and I struggle with the CPU working while the drive keeps ticking on. Maybe one day we can all get the Microsoft glass technology. As an old-time geek... loved the video, keep up the great work!!!

  • @puschelhornchen9484
    @puschelhornchen9484 Před 3 měsíci +37

    I am disappointed this video was not sponsored by Raid Shadow Legends😂

    • @ICanDoThatToo2
      @ICanDoThatToo2 Před 3 měsíci +2

      I mean he just said Raid is dead.

    • @handlemonium
      @handlemonium Před 3 měsíci

      Pssst..... they're secretly using hardware RAID

  • @markkoops2611
    @markkoops2611 Před 3 měsíci +2

    What about OS independent redundancy?

  • @-Good4Y0u
    @-Good4Y0u Před 3 měsíci +2

    I will say that with hardware RAID it's harder to deal with failure. The RAID controller can be a single point of failure, so it's one extra part that could fail compared to software RAID, which cuts that part out. It's all cool stuff.
    Now, what I wish people would stop saying is that data centers are dead. 1. All cloud providers are running hyperscale data centers. 2. If you're storing static data long term, it's cheaper to run your own DC. 3. Cloud resources for AI jobs are expensive, which is why a lot of companies are running their own servers for it.

  • @ZPanic0
    @ZPanic0 Před 3 měsíci +1

    As a layman, I'm struggling with latency stacking. Why is checking the cache and the storage happening in series? Why isn't this a first across the finish line situation? Does each individual read have to happen in series on some level, making a parallel check blocking?

    • @marcogenovesi8570
      @marcogenovesi8570 Před 3 měsíci

      most controllers and drives have multiple queues for read/write so you can run multiple operations in parallel. The performance is obviously shared

  • @pt9009
    @pt9009 Před 3 měsíci +2

    For those who don't need a solution that accomplishes higher speed or redundancy, but rather simply the appearance of one big volume, MergerFS is great!
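
    For example, a minimal mergerfs invocation might look like the following (paths and the create policy are illustrative assumptions, not a recommendation):

    ```
    # pool two existing filesystems into one mountpoint;
    # category.create=mfs places new files on the branch with the most free space
    mergerfs -o category.create=mfs,allow_other /mnt/disk1:/mnt/disk2 /mnt/pool
    ```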

  • @willkern6
    @willkern6 Před 3 měsíci +6

    Wendell I usually listen to y'alls videos so it's been a while since I actually saw you on screen. I got a chance to watch this one on my lunch break and was surprised at how good you are looking. Glad you are doing well.

  • @vgernyc
    @vgernyc Před 3 měsíci

    Does AMD have an equivalent to VROC for EPYC?

  • @brianburke808
    @brianburke808 Před 3 měsíci

    What about hardware controllers for SATA hdd's? Am I missing something?

  • @BrianKellyA2
    @BrianKellyA2 Před 3 měsíci +1

    I can’t remember the last time I learned so much in a tech video that I was compelled to comment _solely_ to thank the creator. Wendell, thank you. You’re a terrific communicator: Concise, thorough, practical, and funny! I hope you keep making these.

  • @blackwell68
    @blackwell68 Před 2 měsíci

    Great video; above my head, yet I felt like I was following it.
    Editing comment:
    when talking to the 2 cameras, on cuts where we see you turn away from the other camera, it feels like you are turning away from us, like you don't want to talk to us anymore. On the cuts where you turn towards the camera, it feels like you are turning to us to engage with us.
    Again, great video.

  • @mcmormus
    @mcmormus Před 3 měsíci

    Can anyone tell me about the speed and latency of an SSD cached HDD NAS versus direct access to the HDDs within the same device?
    I'm considering either building a HomeLab that combines everything in one device or alternatively two devices: A NAS with 1x SSD + 5x HDD and a ThinClient with strong CPU/GPU power but I'm worried about the data connectivity of the separate device solution.

    • @marcogenovesi8570
      @marcogenovesi8570 Před 3 měsíci

      If you move data through a network, you add large amounts of latency just because of the networking step. Also consider that, say, 10Gbit networking means you are limited to 1.25 GB/s of actual throughput. Networking is bad and should be avoided if you can.
      If you have a device that has a strong CPU and a bunch of RAM anyway, it's not a big performance impact to just have it run the storage too.
      Imho, in your case you should build a single big device with storage and CPU/GPU.

  • @pleappleappleap
    @pleappleappleap Před 2 měsíci

    How much does a CM7 cost per gig vs spinning rust?

  • @dawbrapl
    @dawbrapl Před 3 měsíci +1

    So buying Adata drives (as they're cheap and have cache) and doing RAID with them for gaming space is useless?

    • @TheDuzx
      @TheDuzx Před 3 měsíci +3

      Why do you want redundancy for your Steam games? Or is it a speed issue? For speed I've personally opted to get SSDs on sale. They're like 3 times as expensive per GB compared to the sales price on HDDs, but they're also 10 times faster than HDDs.

    • @FutureChaosTV
      @FutureChaosTV Před 3 měsíci +2

      Short: yes.
      Today's NVMe SSDs are so fast that compiling shaders and uncompressing files takes more time than delivering the data to the CPU.

  • @deafno
    @deafno Před 3 měsíci

    16:10 What kind of DBMS solutions offer this kind of configuration? I know Elasticsearch can be configured with a list of data paths, but it's not really an optimal solution, because they are not balanced (the path.data: [list_of_paths] config option).

  • @flyguyes
    @flyguyes Před 3 měsíci

    He said towards the end that you don't need hardware RAID for high-speed devices. Does that mean that for older, slower drives, HW RAID still has benefits?

  • @Ironic-Social-Phobia
    @Ironic-Social-Phobia Před 3 měsíci +5

    Wendell fibs, I see him carrying 125 blocks... 08:56 :D

  • @markski7716
    @markski7716 Před 3 měsíci

    Does AMD have any answer for raid?
    By raid card, does that include HBA cards? If so what kind of hardware would you use to be the backplane for 24+ drives?
    Does Wendell have a video on cluster storage for say home lab? Say for persistent storage for DBs on Kubernetes or s3 equivalents like minio or just general clustered NAS? The thing I find most complex and confusing with clustered setups is the level of redundancy and knowing which solutions to choose at each level (file system, ceph/glusterfs, or the application level like minio or hdfs storing replicated shards or something). To make it even more frustrating sometimes different solutions support only one kind of "write mode" such as blob vs block storage even further making a robust easy to manage setup harder (using less hardware/machines).

  • @chrisslaunwhite9097
    @chrisslaunwhite9097 Před 3 měsíci +2

    Put that Drive in a regular ATX system, so we can see how its done. then run some benchmarks on it and maybe some big games like SC :)

  • @c128stuff
    @c128stuff Před 3 měsíci

    This has been true for quite some time now, even on older systems with a whole bunch of SATA SSDs, Linux md can outperform any hardware raid solutions. Since kernel 4.1, you can use a write journal for md writes, which closes the write hole. The one 'issue' with it is that md's write journal and md's write intent bitmap are mutually exclusive. This means you either get to close the write hole, or get fast resyncs in case of a temporary drive unavailability. I've moved to closing the write hole quite a while ago, as a resync between a bunch of SSDs is fast enough as it is.
    Ideally, you have a write journal, and use a stripe cache in system ram. The system ram stripe cache helps with reads for data which has been written to the journal, but not yet to the underlying array, to prevent having to go to the journal device for those.
    Using a nvme device as write journal is obviously preferable, using nvme as underlying storage is nice as well, but if you only have pcie 3.x or 4.x, sata devices may work just fine as long as you can have a dedicated pcie lane for each device.
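
    A minimal sketch of the journaled setup described above (device names are examples; as noted, the journal takes the place of the write-intent bitmap):

    ```
    mdadm --create /dev/md0 --level=5 --raid-devices=4 \
          --write-journal /dev/nvme4n1 \
          /dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1
    echo 8192 > /sys/block/md0/md/stripe_cache_size   # enlarge the in-RAM stripe cache
    cat /proc/mdstat                                  # confirm the array and journal are active
    ```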

  • @remghoost
    @remghoost Před 3 měsíci

    3:20
    This is quite a fascinating way to think of it.
    One of the limitations of modern locally hosted AI models is throughput.
    I'd be really curious to see how well an LLM or SD could perform just reading the model off of one of these CM7 drives (without having to load the model into RAM/VRAM for inference).

  • @HarunAlHaschisch
    @HarunAlHaschisch Před 3 měsíci

    I wonder... I'm an admin in a small company that other small (10-100 employee) businesses outsource their IT to. I'm really interested in this stuff, but it's way beyond what I, my colleagues, or my customers deal with. We do 1-4 on-premise hardware servers, virtualize with VMware to maybe get an HA cluster going, and throw in a small NetApp FAS, and I never see anyone think about these kinds of issues. So my question is: at what kind of scale do you start thinking about these issues?

  • @TheKev507
    @TheKev507 Před 3 měsíci +2

    Only place I still use it is for boot drives. OS mirroring of boot drives is sketchy or entirely unsupported by folks like VMware.

    • @TheKev507
      @TheKev507 Před 3 měsíci

      Intel also tried to kill off VROC last year so I can’t trust it long term

  • @RandomTorok
    @RandomTorok Před 3 měsíci

    I like my raid for data protection, I've recently had 2 drives fail in my raid and I simply bought another drive and popped it in and we are good to go. If we get rid of raid what do we do about data protection?

  • @fwiler
    @fwiler Před 3 měsíci +1

    One of the best episodes I've seen. Thank you.

  • @KuramaKitsune1
    @KuramaKitsune1 Před 3 měsíci

    hey wendell,
    what would you say the ULTIMATE gaming system would be ?
    fully virtual windows install completely in a ramdisc ?
    bare metal install on Xpoint boot drive ?
    14GB/s gen5 nvme boot ?

  • @nathanddrews
    @nathanddrews Před 3 měsíci +9

    I mostly enjoy my current setup, a Windows 10 PC running 12-ish drives managed through Stablebit Drivepool. It's so easy to use, select which folders to duplicate across multiple drives and how many times. Been running with the same setup for many years and it has successfully migrated files off of dying drives using the SMART warnings, pulling out old drives, throwing in new drives... it's really been nice. That said, I'm thinking of my next build being unraid, not sure yet.
    Also, the chapter about parity calculations is labeled "Parody Calculations". 🤣

    • @Bob-of-Zoid
      @Bob-of-Zoid Před 3 měsíci +2

      Windows 10, or Windows in general is something I could never be satisfied with.

    • @marcogenovesi8570
      @marcogenovesi8570 Před 3 měsíci +6

      Unraid is very good and user-friendly

    • @nathanddrews
      @nathanddrews Před 3 měsíci +1

      @@marcogenovesi8570 It's come a long way for sure. My favorite feature of Stablebit is that any drive I remove from the pool can be read normally by another PC, likewise any drive with files and folders on it already can be added to the pool and duplicated with one click.

    • @nathanddrews
      @nathanddrews Před 3 měsíci

      @@Bob-of-Zoid After I basically nuked all of its ability to update and feed back telemetry, it's been extremely reliable for my home server/Blue Iris/Plex setup needs.

    • @Bob-of-Zoid
      @Bob-of-Zoid Před 3 měsíci

      @@nathanddrews I can't remember how many times I have heard that "disabling and removing" all the spying and what not, is just people following false flags and it doesn't actually work, and you cannot really turn off or disable much of anything! Even if you could, updates can and often do undo all of those changes and turn it all back on again! BTW: Telemetry is only a very small part of it, like literally uploading the entire content of your drives to Microsoft (It's in your contract that they will, but of course you didn't read it, like most Windows users).
      I nuked Windows altogether over a decade ago, and that took a few minutes! It's a matter of principle, and you are not only wasting valuable time trying to keep M$ out of your business, and now even control of your hardware, but by using it encouraging them to do more of the same and even worse!
      Besides all that: Linux is the de facto networking OS, and what you can do with it out of the box, goes way beyond everything you can do with Windows and a bunch more additional proprietary software, and that software's breaches of security and privacy, including the fact that you are paying them to abuse you! Shit, even security software for Windows spies on users!!
      Sorry, but there's no way you can convince me that there is any advantage to using Windows!

  • @xXJNTXx
    @xXJNTXx Před 3 měsíci +1

    Is it dead for SATA SSDs too? I have a bunch of low-capacity SATA SSDs and thought, before I sell them, that I might get a PCIe RAID controller to run them in RAID 0 for maximum random read/write performance.
    :-D Or does it just relate to server hardware, with those fast NVMe SSDs?

    • @marcogenovesi8570
      @marcogenovesi8570 Před 3 měsíci +3

      No, it's not dead for SATA, because those are limited to ~550MB/s per disk by the SATA protocol, so you can join quite a bunch of SATA SSDs before it's a problem.

    • @zodwraith5745
      @zodwraith5745 Před 3 měsíci +1

      I ran hardware RAID0 for a decade. I was skeptical of software RAID0 but now I run it on several systems and Microsoft is shockingly competent at it with plain old Windows. I use it for mass storage of a grip of games that don't really care about speed and can easily be redownloaded, while I put new demanding games on a 4.0 NVMe. You'd be surprised how many games don't care about anything over 500mb/sec because the system isn't asking for more while loading. I saw ZERO reduction in load times in GTA5 moving from the 550ish raid array to a near 5000mb/sec NVMe.
      I've yet to have a raid failure but that may be because I use refurbed enterprise drives. Raid 0 can even keep your old spinny boys useful.

  • @forbiddenera
    @forbiddenera Před 3 měsíci +1

    ZFS would be awesome with its feature set, but it too was designed for spinning rust and hurts NVMe performance; however, they're working on it, so perhaps soon it will be a feasible option.

    • @forbiddenera
      @forbiddenera Před 3 měsíci

      21:15 How can you list ZFS as an option in the context of this video, which seems to be about performance, given the current limitations of ZFS? By default ZFS literally batches writes every 5 seconds because it was designed for spinning rust. Sure, they're working on it, but strictly in terms of performance I'm not sure it's much more feasible an option (today) than HW RAID. Performance aside, it's definitely a more flexible option, especially since it could be run on MD or even HW RAID, and once they fully update it for NVMe performance you'd be ready. But not acknowledging that ZFS has the same core problem as HW RAID (being that it was designed for hard drives) seems odd here.

  • @TheAnoniemo
    @TheAnoniemo Před 3 měsíci +1

    Was wondering if you were going to mention the VROC debacle where Intel tried to cancel the whole thing beginning of last year and then swiftly turned around that decision after backlash.

  • @Michaelmaertzdorf
    @Michaelmaertzdorf Před 3 měsíci

    Are you planning on testing the speed (NVMe disks) with Windows Server 2025 as well? Apparently they are using/going to be using a new 'native NVMe' driver that can perform up to 190% better than the current Server 2022 variant.

  • @frantzs1077
    @frantzs1077 Před 3 měsíci +1

    Hi. What about budget SATA SSD RAID? At the moment 4TB SSDs are 250 to 300 EUR. Just ordered 2 more. Already running 2x 4TB SSDs in software RAID 0. So far it works great. Cheap NVMes are not working for me; after the cache is full they drop to 80MB/s.

    • @marcogenovesi8570
      @marcogenovesi8570 Před 3 měsíci

      SATA SSDs have caches too, but they are slower because of the SATA port, so maybe you can get away with more. If you want SSDs that don't drop performance like that, you need to buy used server NVMe drives (or SAS).

    • @frantzs1077
      @frantzs1077 Před 3 měsíci

      @@marcogenovesi8570 It is the other way around. Cheap (not the cheapest) SSDs in RAID 0 can sustain 1GB/s transfers, while cheap NVMes can't.

  • @brainthesizeofplanet
    @brainthesizeofplanet Před 2 měsíci

    What about AMD's RAIDXpert2 - has anyone tried it?
    Sounds similar to VROC.

  • @JMetz
    @JMetz Před 3 měsíci +1

    @Level1Techs RAID for NVMe is about redundancy and protection against failed drives more than performance, so HW RAID doesn't make much sense. Unfortunately, your "worker bee" graphic about queue and block transfers is completely wrong. Queue depth refers to NVMe *commands*, which send a request for either physical region pages (PRP) or scatter-gather lists (SGLs), which are then transferred via PCIe or NVMe-oF data transfer mechanisms. There is no "1 sec pause" between commands. A Completion Queue Entry (CQE) is returned to the host driver that updates the Submission Queue Head Pointer that indicates the queue entry in the submission queue should be vacated and ready for another entry. But that's completely different than block data transfer.

  • @cwspod
    @cwspod Před 3 měsíci

    what about Stablebit Drivepool with m.2 L2 PrimoCache ?

  • @gamingmarcus
    @gamingmarcus Před 3 měsíci +1

    I don't do anything with storage. Why do I still find it fascinating to hear you speak about this?
    14:56 you mean...creating a RAID RAID? RAIDCEPTION

    • @boomergames8094
      @boomergames8094 Před 2 měsíci

      A recommended method is to mirror all drives, then stripe those mirrors, making a 1+0. That gives you 1/2 capacity and 4x read speed with 4 drives. ZFS does this easily.
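
      A minimal example of that layout in ZFS (device names are placeholders); each "mirror" group is its own vdev and ZFS stripes across vdevs automatically:

      ```
      zpool create tank mirror /dev/sda /dev/sdb mirror /dev/sdc /dev/sdd
      zpool status tank    # two mirror vdevs striped together: the RAID 10 equivalent
      ```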

  • @bluekkid
    @bluekkid Před 2 měsíci

    He said ReFS right! I'm so amazed

  • @bgeneto
    @bgeneto Před měsícem

    You've always praised Intel Optane for its excellent low-queue random I/O performance. The Kioxia CM7 seems to be the opposite, with poor RND4K Q1T1 results. Now, imagine the potential RND4K Q1T1 performance drop with four CM7 drives in RAID.
    I'm curious about how you trace/track your server (or desktop) disk access patterns. A common scenario for boot drives involves 90% reads / 10% writes, 90% random / 10% sequential access, with approximately 50% of all operations being 4K R/W. Given this, it's hard to see how this VROC would significantly improve the performance of such a boot drive, perhaps only in extremely particular (mostly sequential IO) circumstances.

  • @JohnClarkGaming
    @JohnClarkGaming Před 2 dny

    looking good mate!

  • @declanmcardle
    @declanmcardle Před 3 měsíci +3

    Redundant Array of Inexpensive Parrots...

  • @GreensladeNZ
    @GreensladeNZ Před 3 měsíci +3

    I don't even know why I'm watching this. I'm running used eWaste SAS drives in a G8 DL380p
    Probably getting about 69 IOPS

    • @eDoc2020
      @eDoc2020 Před 3 měsíci

      I noticed the cache card in my DL380p died literally the night before this video came out. This lowered the performance and made it impossible to change the logical drive configuration. I used the opportunity to back up everything, switch the P420i to HBA mode, and rebuild it with LVM and ZFS. With this setup I can use some of my SSD space for HDD caching but also use some space for dedicated fast storage, something the P420i does not natively support. I also get the resiliency features of ZFS.
      Perhaps the biggest reason not to use hardware RAID is so that your array isn't locked to one vendor. If your server dies you _need_ another HP card to recover your data. With HBA mode and software RAID I can plug my drives into any brand's controller and access all my data.
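
      A rough sketch of the ZFS half of that layout (partitioning and names are hypothetical; the LVM side is omitted):

      ```
      zpool create tank mirror /dev/sda /dev/sdb   # HDDs as a mirrored pool
      zpool add tank cache /dev/sdc1               # SSD partition as L2ARC read cache for the HDDs
      zpool create fast /dev/sdc2                  # remaining SSD partition as a dedicated fast pool
      ```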

    • @Nostalgia_Realm
      @Nostalgia_Realm Před 2 měsíci

      nice

  • @shaunhall6834
    @shaunhall6834 Před 2 měsíci

    This brought me back to the days when I had my C64 and we used cassette tapes for our data storage. I'm blown away by how far we have come in my lifetime.

  • @davidgrishko1893
    @davidgrishko1893 Před 3 měsíci +1

    The best kind of videos right here ladies and gentlemen. We need more of these @Level1Techs

  • @cph5022
    @cph5022 Před 3 měsíci

    The PERC H965i in my new Dell PowerEdge servers with NVMe drives would like a word

  • @kenzieduckmoo
    @kenzieduckmoo Před 3 měsíci

    I'd like to see you do a video on ReFS and what it's about, why to use it, all that stuff. Last time I went to install my desktop it asked if I wanted to use it and I said no, but if I can take 2-3 M.2 drives and stitch them together for space like with ZFS, that'd be great.

  • @nathangarvey797
    @nathangarvey797 Před 3 měsíci +1

    As someone who works in an enterprise datacenter as an architect of various infrastructure, I would say that Wendell is mostly right, but still partially wrong in 2024. Hardware RAID is DYING, but not fully dead. I can say from first hand experience that I have still spec'd out hardware RAID in late 2023, because NVMe is not appropriate for every situation due to cost or function. Sometimes, even today, spinning rust is the right solution (just like AWS Glacier is a thing). Likewise, SAS SSDs are kind of that crossover point where you really need to know your workload to determine if hardware RAID is the right call, and sometimes you want those CPU cores to handle OTHER workloads (like massive data processing and IO, or de-duplication, etc).
    I honestly think that it will take the NEXT technology beyond NVMe that somehow makes NVMe the slow, high capacity option before hardware RAID can truly be buried. Something like putting storage directly on the infinity fabric, or whatever, that can make hardware RAID no longer a real option. Just my 2 cents.

  • @1MTEK
    @1MTEK Před 2 měsíci

    I'm interested in the Kioxia CM7 for my personal workstation (AM5/7950X), but its connector/form factor is frustrating. A PCIe Gen5 U.3 adapter is $695 and I'm not desperate or brave enough to go that route.

  • @j340_official
    @j340_official Před 3 měsíci

    Great wealth of knowledge bro. Thanks!

  • @lukemcdo
    @lukemcdo Před 3 měsíci

    Pardon my borderline stupidity, but is the premise here that the host OS disk is configured in such a way that it is easily brought back up, so it doesn't need true redundancy? Boot from network? Hardware RAID 1 for just that? My presumption is that the EFI partition has to live somewhere.
    Thanks for constructive comments and/or links to resources.

    • @eDoc2020
      @eDoc2020 Před 3 měsíci

      You can have duplicate EFI partitions on each drive. This is what Proxmox does with ZFS installs.

  • @solidreactor
    @solidreactor Před 3 měsíci

    How about implementing CXL devices and using CXL + system RAM as *pools* of tiered storage (or tiered cache) to address the latency and throughput aspects?

  • @shaung2965
    @shaung2965 Před 3 měsíci

    This channel deserves way more subscribers. Awesome content always.

  • @grizant
    @grizant Před 3 měsíci +1

    I'm pretty interested to see how bcachefs development progresses now that it's been merged into the mainline 6.7 kernel. It's currently missing some key features from ZFS I feel are necessary (scrub, compression, etc), but they're on the roadmap.

    • @marcogenovesi8570
      @marcogenovesi8570 Před 3 měsíci

      in another 5 years it might get close to what Btrfs is now.

    • @grizant
      @grizant Před 3 měsíci

      @@marcogenovesi8570 Maybe! Fortunately, there's room enough in my heart to love more than one filesystem.

  • @dangerwr
    @dangerwr Před 3 měsíci +1

    I never got to experience hardware RAID, but I know that software RAID has come a very, very long way from what it used to be, and I'm all for it getting even better. Not needing a RAID PCIe card is fine by me. I would still like to see SAS get passed down to consumer hardware.

    • @BobBobson
      @BobBobson Před 3 měsíci

      This is the exact issue I had. Moved from a Dell T7500 with native SAS support to consumer B550. Had to get a RAID card or lose out on the majority of my storage. Definitely not a good solution.

  • @AcidSugar1414
    @AcidSugar1414 Před 3 měsíci +2

    A++ work, Wendell!

    • @Bob-of-Zoid
      @Bob-of-Zoid Před 3 měsíci

      I remember wanting to get A+ certification and downloading the learning material. It was so outdated that instead of following through with it, I asked potential employers whether they were looking for someone to refurbish old computers or someone to work on the latest and greatest in tech. Meanwhile I found out the whole thing is just a money-grabbing scam by an "organization" that only organized a way to funnel money to itself by creating a false sense of benefits for computer repair and maintenance businesses.
      Of course I also found out that most computer "repair" is no such thing, but rather undoing user errors and correcting for their ignorance, only to have them come back a week later with eight browser toolbars, viruses, malware, messed-up file systems (all the crap I had removed), and the antivirus software and updates disabled because they were annoyed by the popups, and then trying to blame me for it, despite the fact that I explained everything in minute detail and even gave them a printed list of dos and don'ts and procedures for when and how to do things right. So I decided it was not the right job for my mental health!
      If Linux had been a viable option back then, it would have been my first recommendation. Even for myself, in Windows you had to be on top of things like flies on shit, and spend more time (and much frustration) maintaining the OS than using it to get work done; and that's aside from the high cost just to have a PC, an OS, and a bunch of proprietary bloatware that wasn't cross-compatible, fighting each other on my hardware with total disregard for me as the user!

  • @joshuaspires9252
    @joshuaspires9252 Před 2 měsíci

    For that parity math, use a GPU; server CPUs are coming with GPU sections for fast math work anyway.

  • @kingyachan
    @kingyachan Před 3 měsíci

    I might be misremembering but I'm pretty sure that was a Discworld 2 reference at the start there 😅

  • @pXnEmerica
    @pXnEmerica Před 3 měsíci

    Been stuck with LizardFS; RAIDs suck when you need to rebuild them.

  • @fteoOpty64
    @fteoOpty64 Před 3 měsíci +1

    Only Wendell and Patrick (from STH) are doing such hardcore technical deep dives on high-end hardware. The rest of the PC YouTubers just do the normal testing and benchmarking. But Linus is now building a LAN centre (on a badminton court!).

  • @MStrong95
    @MStrong95 Před 3 měsíci +1

    Mostly out of curiosity, what stops someone from doing a RAID 55 or 66? I'm not saying it's a practical thing, just curious, since it's common enough to have a RAID 10.

    • @blahorgaslisk7763
      @blahorgaslisk7763 Před 3 měsíci +1

      RAID 5 & 6 are more about getting cost-effective storage while keeping redundancy efficient when a drive fails than what you get with RAID 0 & 10. Having said that, I've seen RAID 50 mentioned sometimes, and if that's doable then RAID 60 should be a possible configuration. Are they reasonable? I don't know, but someone probably has a use case where they thought those were a good choice.
      All RAID levels are a compromise. RAID 0 doubles the chance that you lose all data in a drive crash. RAID 1 halves the available capacity while providing one drive of redundancy. RAID 10 is a compromise, halving the capacity and theoretically doubling read/write performance while guaranteeing one drive of redundancy and, if you are lucky, surviving a second failed drive. RAID 1 actually has a possible latency improvement, but that's really a spinning-rust thing. RAID 3 & 4 aren't often mentioned and were generally superseded by RAID 5 and later 6. RAID 5 requires a minimum of 3 storage devices but works better with more. It provides total storage equal to the number of drives minus one and guarantees redundancy for one drive. A dead device in a RAID 5 array means lower read and write performance, as the missing data has to be reconstructed from the remaining devices. RAID 6 is the same, using mostly the same techniques, but it uses the capacity of two devices for parity storage, so two devices can fail and the array can still be used while the failed devices are replaced and rebuilt.
      So, in my opinion, it's hard to see a case where a RAID 55 or RAID 66 construction would be a good idea. That doesn't mean nobody has tested it. I once built a RAID 00 array, that is, two RAID 0 arrays coupled into another RAID 0. This was a long time ago; I think it was 32 drives, and I managed to produce a benchmark of over a gigabyte per second. It doesn't sound amazing today, but it had our contact at Adaptec very enthusiastic. Of course it wasn't for practical use, but it was technically interesting as a test of how much data the cards could push to the spinning-rust drives.
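
      A quick sketch of the capacity and guaranteed fault-tolerance rules described above (simple formulas only; it ignores hot spares, rebuild windows, and URE math):

      ```python
      def raid_summary(level, drives, drive_tb):
          """Usable capacity and guaranteed drive-failure tolerance for common levels."""
          if level == 0:
              return drives * drive_tb, 0
          if level == 1:
              return drive_tb, drives - 1          # n-way mirror
          if level == 10:
              return (drives // 2) * drive_tb, 1   # guaranteed; may survive more if lucky
          if level == 5:
              return (drives - 1) * drive_tb, 1
          if level == 6:
              return (drives - 2) * drive_tb, 2
          raise ValueError("unsupported level in this sketch")

      for level, drives in [(0, 4), (1, 2), (10, 4), (5, 4), (6, 6)]:
          usable, tolerance = raid_summary(level, drives, drive_tb=8)
          print(f"RAID {level:>2} with {drives} x 8 TB: {usable} TB usable, "
                f"survives {tolerance} guaranteed failure(s)")
      ```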

    • @marcogenovesi8570
      @marcogenovesi8570 Před 3 měsíci +1

      RAID 50 and RAID 60 exist and most decent RAID controllers support them, but at that scale dedicated storage appliances start to look appealing.
      Doing a RAID 55 or a RAID 66 is insane: you are doubling the performance impact from parity (or double parity for RAID 6), and I've never seen cards support that.

    • @eDoc2020
      @eDoc2020 Před 3 měsíci +1

      With hardware RAID you can't do RAID55 because nobody is going to write the functionality in the firmware for such a cursed feature. With software RAID (or software on top of hardware) it's easy.
      On a "practical" note 9 drives is the smallest for RAID 55, giving you 4 drives worth of data. No two drives failing could result in data loss and I really don't want to calculate the chances of a third failure causing data loss.

  • @TechyBen
    @TechyBen Před 3 měsíci

    Basically RAID is being done in the flash/chips now?

  • @benwest5293
    @benwest5293 Před 2 měsíci

    This looks fantastic. I'm hoping it turns out to be the faithful adaptation implied by the trailer

  • @kelownatechkid
    @kelownatechkid Před 3 měsíci

    I love Ceph; the latency perf isn't great, but it's well worth the tradeoffs