Getting data off my failing RAID array

  • Added on 27 June 2024
  • My main server had a failing ZFS RAID array, and in this vlog-style video I try to fix the array and get all the data off it. I go over my process and all the little issues I run into along the way.
  • Science & Technology

Comments • 35

  • @idle_user 29 days ago +16

    The knowledge you have of the steps you take is astounding.
    I personally have to look up every step of the way with every new error.

    • @ricsip 27 days ago

      The problem here is that if it was meant to be an educational video rather than an entertaining one, all the steps could have been explained with a short pause and maybe a drawing or diagram. Otherwise it remains entertaining for other ZFS experts and offers no educational content to those who aren't ZFS experts.

    • @ElectronicsWizardry 26 days ago +4

      I think I was aiming for entertaining, and as you pointed out, people with ZFS knowledge are a pretty slim crowd. Teaching how to fix these types of issues can be difficult, as there can be multiple things wrong, and a general guide is difficult to make. I thought this video would be an interesting case study, for some, of one specific issue I had.

    • @Mr.Leeroy 24 days ago

      @ricsip How about pausing the video at each unfamiliar moment and doing your own homework?
      If you are unable or unwilling to do at least that, no guide or tutorial will ever download a critical thinking diskette into your brain as they did in The Matrix.
      This video has all you need to get coherent info on the case study.

  • @marc3793 28 days ago +4

    Man, you have a lot of drives (and other hardware) hanging around! 😄

    • @ElectronicsWizardry 26 days ago +1

      Yea, I have really accumulated a pile of drives over the years. I probably should get rid of some of it, but I am a bit of a hoarder.

  • @Skukkix23 28 days ago +4

    ElectronicsWizardry: I lost 2 drives, it's a raidz2 it's fine!

  • @johnmerryman1825 28 days ago +2

    Great video, love the vlog style! You inspired me to replace the degraded boot-pool on my homelab TrueNAS server.

  • @gg-gn3re 29 days ago +2

    0:28 "I been a bit lazy with my personal stuff": you and me both, brother. Years ago I ended up just Syncthing-ing several of my family members' stuff all together, giving each of us access to only our own stuff, and calling it a day.

  • @MStrong95 29 days ago +2

    Glad it mostly worked out and that you had a backup copy as well for some extra data protection and redundancy. I'm currently trying to work with customer service on a Seagate Barracuda 8TB hard drive that failed inside the warranty period. Its warranty expires sometime in 2025.

  • @lifefromscratch2818 29 days ago +1

    Definitely cool seeing real world stuff like this.

  • @magmaxgus 29 days ago +2

    Great Stuff. Keep it up!

  • @silversword411 29 days ago +1

    More of these videos, please; it's good to see some tips.
    sudo !!
    That was a new one to me.
    You also mentioned some tools at 17:48. What are those? Don't forget to mention the names of stuff! :) More pls

    • @ElectronicsWizardry 29 days ago +9

      The tool I used there was btop. I should do a video on different usage monitoring tools on Linux.
      I have to balance video length against the amount of content, but I'm glad to hear people like the extra info.

  • @moebius2k103 27 days ago +2

    With so many spare drives, just swap the problematic ones straight away next time. Copy all the data off first, though: the read IO from making the copy is preferable to rebuild IO, and if the rebuild goes bad you've got the copy. Then you can read the writing on the wall, throw some cash at a whole new pool of drives, and retire the old ones for good.

  • @Tumleren 28 days ago +4

    I may have missed it, but do you have a video on monitoring and alerts? Like for knowing when a disk fails or there's a problem with the pool?

  • @truckerallikatuk 29 days ago +3

    The 3TB drives with known issues were the big red flag there. Personally, I'd have been slowly swapping them out for 4TB+ drives over time, just to avoid this exact situation.

    • @ElectronicsWizardry 29 days ago +5

      The strange part to me is that both of my working ST3000DM001 drives (I had 5 originally back in the day) didn't seem to have any issues during this rebuild, and they pass badblocks tests now. I'll keep a close eye on them, but I'm curious how much longer they will last. I think the drives in this array were averaging 50k hours, and 'trying to use up the old drives' isn't a good idea for the main fileserver.

    • @Van-l2r 29 days ago +1

      A few years ago, I tossed any mechanical drives under 10TB. I wish I still had those now!

    • @truckerallikatuk 29 days ago +1

      @ElectronicsWizardry The typical bathtub curve of failures... they die most at the beginning and end of life.

  • @peteradshead2383 27 days ago +1

    My Synology DS918+ told me the other day "drive 2 has failed", reported I/O errors, and said the drive had 906 bad sectors and should be replaced.
    So I switched the NAS off and ordered a new drive. The new drive came the next day, so I switched the NAS back on and all drives showed as healthy. I then told drive 2 to run a SMART short test and it passed; the extended test passed, the IronWolf test passed, and so did a scrub.
    So I went from "the sky is falling" to the drive passing all tests just by switching it off. I didn't pull the drive or anything.

  • @jeromehage 19 days ago

    Thank you for this nice video

  • @byrd203 25 days ago

    Drives going crazy happens on SMR RAID too; they will make other drives fail.

  • @byrd203 27 days ago

    You probably had SMR drives; they fail on rebuild. You want to check the spec sheet and use CMR drives in RAID, like IronWolf Pros.

    • @ElectronicsWizardry 26 days ago

      I'm pretty sure my drives were SMR, looking at the models of the drives I had. I have also done a few rebuilds with SMR drives in the past, and they're normally just extremely slow; they don't fail the rebuild.

    • @byrd203 25 days ago +1

      @ElectronicsWizardry Failing on rebuild is a common issue with SMR drives. WD and Seagate got sued over this; look it up, it was a whole mess, and there are even videos on YouTube about it. Do not use SMR, ever.

  • @darthkielbasa 29 days ago

    The moniker is true.

  • @Van-l2r 29 days ago

    My computer decided to dump my Windows Storage Space on me. I was able to copy about half of it. I went back to cold storage.

  • @shetho1 28 days ago +1

    I couldn't be bothered with that command line crap. I would rather have a GUI, it's so much easier; the command line just seems like too much hassle.

    • @ElectronicsWizardry 26 days ago

      Yea, the command line takes a lot of learning, and I get why many people want an easy-to-use GUI. Unfortunately, a lot of tools are command line only.