ZFS Deduplication in TrueNAS

  • Published 21 Oct 2021
  • Thanks to Manscaped for sponsoring today's video. Head over to Manscaped.com/CraftComputing to get 20% off your Performance Package 4.0 + Free International Shipping!
    ZFS Deduplication is a technology that is difficult to find useful information about. Those who use it won't share their secrets. Those who think they know everything tell you NOT to use it in the most demeaning and uninformative ways possible on every internet forum. So, what's the real story? How does ZFS Deduplication work, and should you use it?
    But first... What am I drinking???
    From Terrapin Brewing (Athens, GA) comes the Wake N Bake Chai Latte Imperial Oatmeal Coffee Stout, and every single one of those words accurately describes what is in this delicious, breakfast-inspired brew.
    Links to items below may be affiliate links for which I may be compensated
    Check out some of the parts from my storage servers:
    8TB HGST He8 Helium SAS: amzn.to/2X7Xb7C
    HGST 7K4000 4TB SAS Drives: amzn.to/3g6MysB
    HP Gen8 - Gen10 3.5" Drive Trays: amzn.to/3egChsf
    Xeon E5-2660v3 - ebay.to/3i8JmMG
    Xeon E5-2678v3 - ebay.to/3eh1yTb
    TrueNAS iSCSI + Steam Library Setup: • Use your NAS as a Stea...
    Get yourself a pint glass over at craftcomputing.store
    Follow me on Twitter @CraftComputing
    Support me on Patreon or Floatplane and get access to my exclusive Discord server. Chat with myself and the other hosts on Talking Heads all week long.
    / craftcomputing
    www.floatplane.com/channel/Cr...
  • Science & Technology

Comments • 237

  • @TrueNAS
    @TrueNAS 2 years ago +250

    Best dedupe video on the net...

    • @FlaxTheSeedOne
      @FlaxTheSeedOne 2 years ago +4

      Maybe you have the answer:
      What happens if the dedupe information gets lost? If the dedupe devices fail, for example?
      Can the information be restored from the actual data?
      Or am I going to lose file information?

    • @sku2007
      @sku2007 2 years ago +9

      @@FlaxTheSeedOne It is a vdev, so like all vdevs, if it gets lost, the pool is lost.

    • @roberant7
      @roberant7 2 years ago

      LMFAO!!!

    • @TheBinklemNetwork
      @TheBinklemNetwork 2 years ago +4

      @@FlaxTheSeedOne thanks for asking, I didn't know either
      @sku2007 Thanks for answering!

    • @bfth121
      @bfth121 6 months ago

      Best thumbnail on the net lmao

  • @TechHut
    @TechHut 2 years ago +97

    That intro segment was amazing

  • @RubyRoks
    @RubyRoks 2 years ago +250

    That might be the best intro to a tech video I've ever seen

    • @jamess1787
      @jamess1787 2 years ago +5

      And he didn't skip a beat. (Or beat his meat) 🤜🍖

    • @DigitEgal
      @DigitEgal 2 years ago +1

      i would like to give a like but i cannot change that 101

  • @tormaid42
    @tormaid42 2 years ago +65

    One of the better thumbnails I’ve ever seen

  • @mike__durrett
    @mike__durrett 2 years ago +51

    10/10 thumbnail. The Jeff-verse is real and no one can tell me otherwise.

    • @5Breaker
      @5Breaker 2 years ago +2

      Now there's two of them!

    • @binarymatrixlol8197
      @binarymatrixlol8197 2 years ago

      @@5Breaker but one is just a vector pointing to the real one

    • @5Breaker
      @5Breaker 2 years ago

      @@binarymatrixlol8197 No the real one is behind the camera.

  • @jafizzle95
    @jafizzle95 2 years ago +56

    I watch a lot of 'techtuber' channels, and this might be the only one that nails clever comedy.

  • @jjdawg9918
    @jjdawg9918 2 years ago +33

    You have done your job well and convinced me that I have absolutely no need for dedup and the memory, CPU, and ssd costs that go with it. Thanks!

  • @jason-budney7624
    @jason-budney7624 2 years ago +17

    The intro, the ad, the "forum post", absolutely classic! Amazing job Jeff!!! Deduplication isn't for me, but it's great learning about these features the proper way!

  • @jamztiberius68
    @jamztiberius68 1 year ago +1

    Your videos on everything about TrueNAS are awesome. I've configured my entire home server based almost entirely on your videos.

  • @Jamesaepp
    @Jamesaepp 2 years ago +4

    I played around with the dedup role in Windows server recently which is not at all like ZFS (no experience myself with ZFS dedup). Some things I thought important to point out:
    * Dedup is often a one-way route. If you have a 1TB disk storing 2TB of logical data, you aren't disabling it lest you kill your pool. If you start with a pool with dedup enabled and, while troubleshooting a problem months or years later, go "huh, maybe the problem is dedup related?", you aren't necessarily going to be able to test that theory.
    * It is harder to free up disk space with dedup. Say you have dedup enabled and are running out of free space. Because dedup is trying its hardest to minimize storage consumption, you are not necessarily able to free up space as easily as before. You might find a folder with 100GB of "data" on it that you don't need or have archived elsewhere. You could delete it and not see the free space on your pool budge, probably because that data is referenced somewhere else.
    * This is more of a Windows thing, but you are **technically** introducing risk into your data storage: by deduping, you are storing fewer copies of your data. Make sure your pool is configured with the redundancy you require. Make sure your backups are working too. Windows combats this by setting up a popularity area (there's a technical term that escapes me) where any 32KB chunk that is referenced a certain number of times (I think 50 or 100 by default) gets stored in the popularity area with parity. Not sure if ZFS does something similar, but at the end of the day: redundancy and backups.
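
    For anyone wanting to gauge this on a ZFS pool before committing, OpenZFS has a few read-only commands; the pool name "tank" below is a placeholder:

        zpool list -o name,size,alloc,dedupratio tank   # current overall dedup ratio
        zpool status -D tank                            # dedup table (DDT) histogram
        zdb -S tank                                     # simulate dedup on existing data without enabling it

    zdb -S in particular answers "would dedup even help?" non-destructively, which matters given the one-way nature described above.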

  • @gordslater
    @gordslater 2 years ago +30

    Swollen storage is the main reason I need Manscaped

  • @LubomirGeorgiev
    @LubomirGeorgiev 2 years ago +9

    Would love to see a long-term report on this project. I have only seen complaints when ZFS dedup is mentioned.

  • @BioToxin
    @BioToxin 2 years ago +3

    14:50 So I watched this a while back, thought it was a nifty idea, and thought I'd give it a try recently, only to get this far and realize "in a month or so" apparently didn't happen? Where are the automated updates, lancache, CoW snapshots, iSCSI replication tasks, PXE/PCoIP thin-client booting, etc.? Did you ever get this going?

  • @chris_hertford
    @chris_hertford 2 years ago

    Set up my first TrueNAS yesterday thanks to your videos!

  • @rett.isawesome
    @rett.isawesome 2 years ago +7

    I told you about my cousin from Trinidad in confidence.

  • @praecorloth
    @praecorloth 2 years ago +17

    ZFS being a memory hog is also an old admin's tale. First, 8GB of memory isn't a lot of memory. Your browser is likely using 8GB of memory right now. Second, ZFS will use as much memory as is available. Like literally any other file system you're going to use. EXT4, NTFS, etc., will all use memory, AND AS WELL THEY FSCKING SHOULD.
    People freak out about ZFS and memory. It's a file system designed for a specific purpose. If you were spinning up a Windows file server, because I dunno, maybe you've given up on life or something, you would want as much of the memory on that system dedicated to serving up files as possible.

    • @eDoc2020
      @eDoc2020 2 years ago +1

      IMO 8GB is a lot of memory, even though plenty of people have that available. To get that much RAM you need a relatively beefy "real" computer while only needing 256MB would mean you could run on a cheap and low power "embedded" system. While this difference is minor in a large system with numerous drives it can make a huge difference in a smaller setup.
      The issue is it's hard to know how much RAM ZFS _actually_ needs. What happens if you use less than the recommended amount? If the penalty is a few tens of milliseconds, that's perfectly acceptable for many applications. If it means it takes 5 seconds to get a directory listing, then it's a different story.

    • @praecorloth
      @praecorloth 2 years ago +6

      @@eDoc2020 I mean yeah, what do you want? Do you want an embedded system, with a standard file system and 5-second directory listing times, or do you want a file server?
      Also, the memory requirements for ZFS are far from confusing, and the results of breaking those requirements are not nearly as bad as people FUD over.
      If you don't have enough memory for ZFS, it means you're going to disk for data more often than not. You're going to do that with an embedded system with next to no memory, and you're going to do that with a traditional file system with crappy FIFO caching.
      8GB of memory. It doesn't even need to be a bleeding-edge system. DDR4 too expensive? Build a DDR3 system.
      Don't want ZFS eating up all of the memory in a dual-purpose system like Proxmox? Set the ZFS ARC to something reasonable. For my own workloads, I assign 10GB for ARC, and I only have 32GB of RAM in my Proxmox system. I do that for larger setups too, like 64GB and 128GB of memory. A quick 10GB of ARC, and your VMs are off to the races.
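
      For reference, the ARC cap described here is a module tunable rather than a zfs command. A sketch for Linux OpenZFS (the 10 GiB value is just an example; TrueNAS CORE on FreeBSD uses the sysctl vfs.zfs.arc_max instead):

          # effective immediately, until reboot
          echo 10737418240 > /sys/module/zfs/parameters/zfs_arc_max
          # persistent across reboots
          echo "options zfs zfs_arc_max=10737418240" >> /etc/modprobe.d/zfs.conf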

    • @Mr.Leeroy
      @Mr.Leeroy 2 years ago +1

      That's because 99% of ZFS users never tweak their datasets, and the FreeNAS devs are not fans of tiered storage.
      ZFS defaults mean not only consuming all free RAM, but caching ALL data as well.
      Just learn to "zfs set primarycache=metadata" on the root dataset and set primarycache=all only where it is really required, like a MySQL DB dataset, or a jail/VM config datastore, or similar, and not your whole damn Plex library.
      My system takes ~5GB of RAM for services and the remaining 12GB is for ZFS cache. That is for 21.8TB of raw disks, 17.7TB usable storage.
      And guess what, there is like no difference at all in performance between the full 12GB cache and a 1GB cache after a reboot. The only really required memory is for services, and that is usually under 8GB per the devs' minimum requirements.
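
      A minimal sketch of the tuning described above, assuming a pool named "tank" with one hot database dataset (names are examples):

          zfs set primarycache=metadata tank    # root dataset: cache metadata only; children inherit
          zfs set primarycache=all tank/mysql   # hot dataset: cache file data too
          zfs get -r primarycache tank          # verify what each dataset inherited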

  • @051egan
    @051egan 1 year ago +4

    I set up a pool with deduplication and added 2 iSCSI drives to the pool according to the other video. I added each drive to a different Windows machine and copied my Steam library onto both (same files on each drive). But when I check "zpool list" in the shell, my deduplication factor is only 1.05x. Shouldn't it be closer to 2.00x?
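
    One possible explanation: block-level dedup only matches blocks that are identical and identically aligned, so two zvols with different volblocksize values, or NTFS partitions starting at different offsets, can hold the same files yet share few blocks. Some commands for digging in (pool/zvol names are examples):

        zfs get volblocksize tank/steam1 tank/steam2   # both zvols should match
        zpool status -D tank                           # how many blocks actually have refcount > 1
        zdb -S tank                                    # the ratio dedup could achieve on this data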

  • @M1America
    @M1America 2 years ago +1

    I use this on the striped-mirrors SATA SSD pool that holds my Steam library. Works great. Note that you can turn dedup off and the blocks that are already deduped will stay that way, and it will stop gobbling up all your RAM and CPU. Dedup and filesystem compression actually make something like this faster, because the bottleneck is reading from disk.
    If you are on a few mechanical disks, it's possible that gzip is faster than lz4 because of the extra compression ratio. You can also have an unencrypted dataset on an encrypted one with ZFS. Excellent for maintaining an unencrypted Steam library alongside an encrypted home folder, for instance.
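
    Both properties mentioned here are per-dataset settings that only affect new writes; a quick sketch with example dataset names:

        zfs set dedup=off tank/steam           # already-deduped blocks stay deduped
        zfs set compression=lz4 tank/steam     # fast, cheap default
        zfs set compression=gzip-6 tank/cold   # higher ratio, more CPU; can win on slow disks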

  • @camerontgore
    @camerontgore 2 years ago +12

    This is a very ball heavy episode... 🤣

  • @charliekim7302
    @charliekim7302 2 years ago +3

    This is a very good video about iSCSI and TrueNAS. Thanks. I have a question. I am not sure if you have heard of the virtual disk software called CCDISK, a Windows-based server application that uses iSCSI to present files to clients as a local disk, for something like a gaming cafe. Client PCs can read and write as a normal disk, but the source files stored on the server are read-only and any writes are saved in a writeback cache that is wiped on reboot. Is this something that can be done with FreeNAS + iSCSI?

  • @SecurityCard1
    @SecurityCard1 1 year ago

    Hey Jeff, great video!
    But one question: have you done a follow-up video on the iSCSI replication thing?

  • @ewenchan1239
    @ewenchan1239 2 years ago +8

    What would be the performance if you DIDN'T use SSDs for the dedup tables?
    The other test would be if your system is starved for RAM AND the vdev is 90% full AND the dedup table resides on the spinning rust.
    THAT would be an interesting test/result to compare against.

  • @wallydisc
    @wallydisc 2 years ago +1

    You are doing a very good job with this stuff, man. Keep it up.

  • @hakovatube
    @hakovatube 2 years ago +3

    Apart from being a very informative video, the intro deserves 5 stars for the humor. Cheers Jeff!

  • @lucavignati2958
    @lucavignati2958 2 years ago +1

    Great video. I want to know whether dedup prevents RAID-Z2 from making copies of the same file on other drives, or if it operates at an "upper layer" only.

  • @shanghaiultra
    @shanghaiultra 2 years ago

    The most entertaining and amusing sponsored content section yet seen on YouTube - bar none.

  • @tinybolt8130
    @tinybolt8130 2 years ago +8

    Some of these phrases sound like they belong in Virus Alert by "Weird Al" Yankovic.

  • @hescominsoon
    @hescominsoon 2 years ago

    That intro was absolutely awesome...:)

  • @denvera1g1
    @denvera1g1 2 years ago +1

    I have 4 110GB Optane drives for dedup and write cache, works great. And if anyone was wondering, TrueNAS doesn't really care whether you're using Intel or AMD when implementing Optane.

  • @nemtudom5074
    @nemtudom5074 11 months ago

    Okay that intro was hilarious!

  • @TheBobcat1978
    @TheBobcat1978 1 year ago

    Hey Jeff. I've been doing a NAS project for over a year now. I have a Core i3 10100k, 16GB of standard DDR4 RAM, and an Intel X540, all in a Silverstone CS351 (which could be designed better). I've been struggling with figuring out which OS to use. I'd like to run Plex, keep my music and photos on it, play ROMs from it, and back up my Steam library.
    I have narrowed it down to TrueNAS, unRAID, and Windows 10. unRAID is easiest to use; TrueNAS is a bit complicated with the different-size drives, but I'm willing to use it.
    So since I am new to running a home server, which OS would you recommend for my needs? Thanks in advance.

  • @UnwittingSweater
    @UnwittingSweater 2 years ago +7

    Great tutorial but I just wanted to say you give better tasting notes than some dedicated food channels.

  • @uberchemist
    @uberchemist 2 years ago

    Hi Jeff! Cheers dude, keep up the great work!

  • @WizardNumberNext
    @WizardNumberNext 2 years ago +4

    Actually, thinking about deduplication:
    in a busy environment there would be huge RAM savings, as there would be no need to cache each instance of the same data; a single cached copy instead of many.

  • @jackwright7014
    @jackwright7014 2 years ago +1

    I use it for backups of game worlds, as there are a LOT of duplicates. Then I have two cloud sync jobs that sync the worlds to cloud storage on a daily and a weekly basis.

  • @brierepooc8987
    @brierepooc8987 2 years ago

    This guy has the best YouTube channel. Still a huge fan

  • @georgeashmore9420
    @georgeashmore9420 1 year ago

    What is dedup like for datasets? I have been trying it out, but with dedup stats only being recorded at the pool level I'm struggling to gauge the advantages.

  • @borewiq
    @borewiq 1 year ago

    How would I check whether my dedup works across 2 zvols in a dataset? Do filesystems matter, e.g. will it work if one is ext4 and the other NTFS?

  • @MrDAndersson
    @MrDAndersson 2 years ago +1

    As for read performance, is there any major difference between dedup and non-dedup?

  • @Squinoogle
    @Squinoogle 2 years ago +1

    Been a while since you've had a good beer, and this one sounds like a doozy!
    And, you can tell it's testing time when you pull out those trusty 3TB Seagates :D

  • @phantom7802
    @phantom7802 1 year ago

    Hey, I don't know if I'll get a response buuuut: with deduplication and using a NAS as my Steam library, would it be possible to have multiple computers run the same game install, since each sees its own install?

  • @jamess1787
    @jamess1787 2 years ago +1

    Can you post the bloopers for that advertisement?

  • @UntouchedWagons
    @UntouchedWagons 2 years ago +2

    So were the two Rosewill cases supposed to be used in some sort of dedup metaphor or what?

  • @JamieStuff
    @JamieStuff 2 years ago +3

    A Like solely for that intro...

  • @ejbully
    @ejbully 1 year ago

    Thank you. This put things into practical perspective. Appreciated 🙏🙏

  • @wyfyj
    @wyfyj 1 year ago

    Love the content. Need more nature b rolls!

  • @ewenchan1239
    @ewenchan1239 2 years ago

    Stupid question - so I am trying to actually execute the process as stated here where you are installing the same game to multiple machines.
    Does that mean that you point all three machines to the same iSCSI volume, and when you format said iSCSI volume, you are technically formatting it thrice with NTFS?
    I'm not sure that I really follow how you are able to download a game once, but install it on three separate machines without each of the machines downloading the data again because that is what Steam appears to be doing on my systems.
    A video tutorial on this would be greatly appreciated.
    Thank you.

  • @FlaxTheSeedOne
    @FlaxTheSeedOne 2 years ago +17

    I have a question on the dedupe tables. What happens if that device fails? So if your SSDs fail? Are all the links to the same files lost? Or can they be rebuilt?

    • @intheprettypink
      @intheprettypink 2 years ago +2

      I'm running this exact setup, and you can mirror the dedupe vdevs across multiple devices just like regular "storage" vdevs. Pretty much every vdev you can create or add to a zpool can be mirrored. So definitely mirror your dedupe vdevs.
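
      A sketch of what that looks like from the shell, assuming a pool named "tank" and two spare SSDs (device names are examples; check zpool status first):

          zpool add tank dedup mirror /dev/sdb /dev/sdc
          zpool status tank   # the pair now appears under its own "dedup" class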

    • @FlaxTheSeedOne
      @FlaxTheSeedOne 2 years ago +1

      @@intheprettypink So I can have a vdev for dedupe as shown and then replicate that vdev to disk on a nightly basis, for example, to keep it safe in case of failure?
      Or can I just put them in a mirrored pool?
      I mean, there's always the risk of device failure or a faulty controller, and I would not like to lose the table that stores differences and paths.

    • @intheprettypink
      @intheprettypink 2 years ago

      @@FlaxTheSeedOne I think you misunderstood how vdevs and replication work. Vdevs are just the raid-whatever groups of storage devices you put in the pool. This can be the main storage vdevs, dedup, special, etc. In ZFS you can "raid1" mirror the dedup vdevs. This means one dedup vdev actually has two or more storage devices mirroring each other. However, you do not replicate the dedup drives. The dedup vdevs are invisible to everything but ZFS. On top of the pool are things called datasets. These datasets are the filesystems you interact with, and they are the things you replicate using the zfs send/receive commands. Dedup is completely invisible to the datasets. Using zfs send/receive will replicate all data in the dataset (on the pool vdevs) to wherever you send it. You can even just use rsync to replicate the files on the ZFS filesystem. If you use the dedup vdev feature, just make sure you use two or more storage devices in a mirror and you will be set. Just remember, mirroring inside the vdev is not your actual backup; it is intended for uptime.
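
      A minimal example of the dataset-level replication described above (pool, dataset, and host names are placeholders):

          zfs snapshot tank/data@nightly
          zfs send tank/data@nightly | ssh backuphost zfs receive backup/data
          # later, send only the delta between two snapshots
          zfs send -i @nightly tank/data@nightly2 | ssh backuphost zfs receive backup/data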

    • @FlaxTheSeedOne
      @FlaxTheSeedOne 2 years ago +1

      @@intheprettypink I know all that, but my question still remains: what happens when your dedupe vdev hits the fan?
      My follow-up was: can I replicate the dedupe vdev onto the main storage vdev, or a zvol on there, so when the dedupe vdev bites the dust I can just replicate the data back?
      (My guess is no, or at least not through the GUI.)
      Then I am still wondering what happens when your dedupe information is completely lost.

    • @intheprettypink
      @intheprettypink 2 years ago

      @@FlaxTheSeedOne What happens when your dedupe vdev dies? Your pool dies. You lose all your data.
      Can I replicate the dedupe vdev... No. What you would have to do is turn off dedup, copy the data to a different spot on the pool, and then delete the first copy to remove dedup data from the pool, and even then you can't remove the dedup vdev. You would be better off moving your data off the pool and recreating it without the dedup vdev. Again, it sounds like you don't understand how ZFS works with vdevs and storing your data. You do not touch the drives anymore after you add them to the pool, unless you are replacing dead ones.
      How would you avoid losing the dedup vdev? Same as the main storage vdevs: use the raid features of ZFS. I personally have a 4x1TB mirror of NVMe SSDs for a dedup vdev. All four of those drives would need to die before the pool is lost. If one dies, you pull it out, replace it, and add it back to the dedup vdev.
      There is nothing to wonder about when you lose the dedup vdevs: you lose your data. The feature is essentially keeping the dedup tables on storage devices such as SSDs for faster-than-spinning-rust access.
      It sounds more like you should just not use this feature if you are this worried about replicating the dedup vdevs.

  • @brycedavey1252
    @brycedavey1252 2 years ago

    Best intro yet!!

  • @mimimmimmimim
    @mimimmimmimim 9 months ago +1

    Deduplication works at the block level. Does it show benefits on virtual disk files? Blocks inside the virtual disks would have to line up with blocks on the outside. I will try this myself...
    One additional note: back in 2012 or so, I remember upgrading the RAM of a server at work in order to make it boot after an unsafe shutdown. It was stuck organising, or maybe scrubbing, at boot time; with the deduplication load the memory was not enough, and it failed repeatedly until the upgrade was done.

  • @prodeous
    @prodeous 2 years ago

    Another great overview for a noob like myself. Now to just get a rack NAS and let the fun begin :)

  • @lxst-in-trvnslvtixn
    @lxst-in-trvnslvtixn 2 years ago

    what happens when the disks storing the dedupe table get corrupted or die?

  • @MrJohnnnZ
    @MrJohnnnZ 2 years ago +53

    ZFS got my girlfriend pregant

    • @gordslater
      @gordslater 2 years ago +18

      you better pray that dedup works or it's gonna be sextuplets

    • @JoaoSilva-jr9ez
      @JoaoSilva-jr9ez 2 years ago +11

      @@gordslater If dedup works maybe the baby is born with only one eye, ear, hand, and so on... That might not be ideal.

    • @edwardallenthree
      @edwardallenthree 2 years ago +3

      I want to make a reiserfs joke but it is in really poor taste.

    • @5Breaker
      @5Breaker 2 years ago +1

      @@JoaoSilva-jr9ez But since they are mirrored and not, e.g. two left foots, it still should be fine. LOL

    • @AshtonSnapp
      @AshtonSnapp 2 years ago

      *prrrregante*

  • @ProtoXoa
    @ProtoXoa 2 years ago +3

    Has to be one of your funniest intros!

  • @ViratKadaru
    @ViratKadaru 2 years ago +2

    Love that intro

  • @Boltran
    @Boltran 10 months ago +1

    Maybe the reason not too many people know this is that the dedup vdev is a fairly new feature. I might be mistaken, but it may only be available on Linux, and therefore not available on TrueNAS CORE. I am not sure, though. Anyone know the answer?

  • @apaskiewicz
    @apaskiewicz 2 months ago

    This would be really nice for massive games like MW2, Ark Survival, and COD Cold War installed on multiple machines. 10GbE is fast enough to load a game over the NIC, and four or five copies of those games installed on machines would be in the multiple-TiB range. Last I checked, Ark with all maps is almost 500GB, MW2 is a few hundred GiB, and Cold War is also a few hundred GiB. Having those games installed on a Samsung EVO in four machines would cost you a fortune that could be offloaded to an iSCSI drive with dedup. Each client would be presented with its own "copy" and video settings, but they would only take up the space of one copy.

  • @ragtop63
    @ragtop63 9 months ago +3

    Interesting. If I understood this correctly, if you have a 1TB zvol and you fill it up with 900GB of stuff, some of it duplicates, the OS will still report that 900GB is used? So if, in the case of an iSCSI scenario, the deduped data is still being reported to the OS as used space, there is no "space savings" benefit for that specific scenario, since the OS will not let you write more data to the drive if it reports itself as full, even though, physically, it is not full.

    • @acenio654
      @acenio654 2 months ago

      Yeah this confused me a bit as well. If both OSes report the drive as logically full what would the benefit actually be?

  • @SureshotCyclonus
    @SureshotCyclonus 2 years ago +5

    Contact your doctor if you have deduplication tasks lasting longer than 4 hours.

  • @Jademalo
    @Jademalo 11 months ago +2

    If both TrueNAS and Windows report the full amount of data on the disk, how exactly does this save space?
    If I have a 6TB volume with 3TB of the same data twice, wouldn't everything report as full even though I theoretically have 3TB of free space? How would I use and access that free space?

  • @danielfisher1515
    @danielfisher1515 2 years ago +7

    "Erectile hyperfunction"?! LOL!

  • @falazarte
    @falazarte 2 years ago +1

    You are hilarious! Kind of hard to find good humor among geeks like us.

  • @RyouConcord
    @RyouConcord 2 years ago

    Always wanted to know if it actually... You know... Worked. Thanks for the upload!

  • @HcgRandon
    @HcgRandon 2 years ago

    Heck yeah been waiting for this one

  • @thumbwarriordx
    @thumbwarriordx 2 years ago

    For a single-user setup or mostly a single user setup running something like fslint and linking all the duplicate files together is a lot more practical for long-term storage of static files.
    But that still leaves me using deduplication in several little places where it simply won't do.

  • @feartogail6933
    @feartogail6933 2 years ago

    Oh my. Only thing better than that thumbnail is that intro. 5/7 would chuckle snort again

  • @RockNLol2009
    @RockNLol2009 2 years ago

    the thumbnail is gold 😂

  • @brucoder
    @brucoder 2 years ago

    This was worth it for the Manscaped commercial!

  • @patrickprafke4894
    @patrickprafke4894 1 year ago

    Can you get a 32gb Optane drive to work with C602 chipset systems?

  • @TehJumpingJawa
    @TehJumpingJawa 2 years ago

    When copying data that already exists on the ZFS share, is the checksum generated on the client?
    If so, you'd also see massive reductions in network bandwidth usage, and the maximum attainable speed would be largely dependent upon the speed at which the checksum could be created by the client (i.e. the read speed of the local storage)
    Much like CPU cache latency exploits, there must be security implications for having deduplication enabled, as clients would be able to indirectly query the contents of the ZFS share by engineering the contents of files and measuring the resultant effective write speeds.

    • @Mr.Leeroy
      @Mr.Leeroy 2 years ago

      No.
      The host running ZFS is the one doing the check-summing, of course.

  • @guspaz
    @guspaz 2 years ago +3

    I tried ZFS deduplication years ago. It really *did* slow everything down immensely, and barely saved any space. Because it works on the block level, it may not actually help at all on an almost identical file if there is even a one-byte offset between them (exception: ZFS is copy-on-write, so inserting a byte in the middle of a file won't cause an offset). And the RAM requirement... At the time, I had a 20TB filesystem, and certainly didn't have an extra 20GB of free memory to spare. I eventually migrated the ZFS filesystem to a non-deduplicated one, and enjoyed a massive performance improvement. I do have a ton more RAM in there now, but it's also now a 92TB filesystem, so yeah. Don't make my mistake, avoid ZFS deduplication. It's very hard to undo because you can't turn it off, you can only disable it for future writes, not previous writes. Maybe look into file-based dedupe instead, which can work on any filesystem by looking for identical files and replacing one copy with a hard link. The downside is that it's now a single file on disk and modifications from either location will affect the other, the upside is that it still saves the duplicated space and has zero performance/ram/etc overhead.
    EDIT: There are use cases where deduplication makes sense, but for the vast majority of people, it doesn't. Having that 1GB of RAM per 1TB of storage avoids the huge performance penalty, but that's a huge amount of RAM in any reasonably sized NAS. Perhaps storing the tables on SSDs might be an easier option, but that wasn't an option when I used dedupe.
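
    For the file-based approach suggested at the end, two common tools can replace duplicate files with hard links (assuming they are installed; the path is an example, and the shared-edit caveat above applies):

        rdfind -makehardlinks true /tank/media   # scan, then hard-link duplicates
        jdupes -r -L /tank/media                 # recursive scan, hard-link duplicates in place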

  • @LNSFLIVE
    @LNSFLIVE 29 days ago

    that MMMBop from Hanson gets me every time

  • @manslayerdbzgt
    @manslayerdbzgt 2 years ago

    Keep rocking and rolling

  • @supercj70
    @supercj70 2 years ago

    Best intro ever!!

  • @dangingerich2559
    @dangingerich2559 2 years ago +2

    I once used the deduplication built into Windows 2012 (not R2) and all the files I had stored on it ended up empty. So, not all dedup technology is created equal.
    My big question is this: if I create a TrueNAS VM on a separate storage LUN and then connect iSCSI LUNs from it to all my other VMs, can I use the built-in Windows Backup to back up to those and get reliable, free backups with dedup?

    • @lucavignati2958
      @lucavignati2958 2 years ago

      I'd love to know too

    • @dangingerich2559
      @dangingerich2559 2 years ago

      I'm part of the way there. I have my TrueNAS VM created on my 6X4TB RAID 10 set. I got that done last night. Gotta create the shares now.

  • @chad2304
    @chad2304 23 days ago

    That beer sounded really good

  • @Cpuboye11
    @Cpuboye11 2 years ago

    So I don't use TrueNAS. I'm using the latest Ubuntu LTS build with ZFS installed on it. I've been trying to find the correct syntax, or any way, to put the dedup table onto an SSD and not in memory. Like he stated in the video, I found very troublesome, false, and incorrect information on forum posts.
    Obviously TrueNAS is using ZFS, considering he just used the zpool command in the terminal. My question is: can someone point me in the right direction, or if possible just give me the command, to set up a dedup table on an SSD and not in RAM?
    I guess the next logical question is: if this is done, will ZFS still use RAM as a cache for incoming and outgoing data?
    Thank you
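
    A hedged sketch of the command being asked for: on Ubuntu, the dedup table is moved to SSDs by adding a dedup allocation-class vdev to the pool (requires OpenZFS 0.8 or newer; pool, dataset, and device names are examples):

        zpool add tank dedup mirror /dev/nvme0n1 /dev/nvme1n1   # DDT blocks now land on the SSDs
        zfs set dedup=on tank/mydata                            # dedup is enabled per dataset

    And to the follow-up: yes, the ARC still caches regular data in RAM as usual; the dedup vdev only changes where the dedup table is stored.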

  • @richardleaneagh4274
    @richardleaneagh4274 2 years ago

    Really love the Hanson Brothers in Slap Shot

  • @Raymond6494
    @Raymond6494 2 years ago

    keep up the great work!

  • @mikeyd4308
    @mikeyd4308 2 years ago

    that mmmbop bug sounds like a feature, sign me up!

  • @voodoovinny7125
    @voodoovinny7125 2 years ago

    There is nothing like watching a video and having your OCD kick in. When you were dividing 208,255,868,928 by 2, you said 104,128,353,780 bytes while we were seeing 104,127,934,464.

  • @monty00701
    @monty00701 2 years ago +1

    haha - loved this health ad intro!!!

    • @monty00701
      @monty00701 2 years ago

      haha - cousin in Trinidad with swollen testicles - spat out my drink laughing!!

  • @damzelfly
    @damzelfly 2 years ago

    Is there a way to remove the dedup vdevs, or is it a one-way thing?

    • @sinisterpisces
      @sinisterpisces 2 years ago +1

      I had the same question. My understanding is that you'd need to create new vdevs that aren't dedup'd, copy all the data over to the new vdevs, then delete the old ones.

  • @gordoncreAtive
    @gordoncreAtive 2 years ago +1

    I've had serious issues with deduplication - horrible slowdowns and undeletable ghost files. It might be that my system (RAM + CPU, no dedicated dedup table storage) is not up to the task, but I think dedup might not be as straightforward as other ZFS features.

    • @UnreasonableSteve
      @UnreasonableSteve 2 years ago +1

      It absolutely isn't. The best part is that after everyone tells you ZFS supports dedupe and it works great, if you ask for troubleshooting help with dedupe, you'll get nothing but derision for ever thinking you could turn it on.
      Top it off with the fact that disabling dedup on a pool doesn't remove the dedup tables or truly un-deduplicate anything. If you want your issues to go away, you have to destroy your pool and restore from backup. Not exactly something to "try out".

  • @cheplays2482
    @cheplays2482 1 year ago

    Actually being from Trinidad, that Nikki reference really caught me off guard 🤣

  • @andrewbrady8564
    @andrewbrady8564 2 years ago

    Just thank you!

  • @Kapisketo
    @Kapisketo 2 years ago +7

    I can see dedup working on a mail server, or in a company where all the computers netboot, but for home users I think it's better that each user has their games installed locally and uses lancache to help with the downloads.

  • @coletraintechgames2932

    Do you have any interest in discussing a metadata vdev?
    And comparing with dedupe?
    Thank you so much, btw, enjoy your channel!

  • @edwardallenthree
    @edwardallenthree 2 years ago +6

    Before watching the video: No! Don't do it!!! Now let's watch and hopefully change my mind. Update pending.
    Edit: it's still not for me. I have one Steam library, a media library, and a bunch of Linux VMs. You would think the duplication in the Linux VMs would be enough, but this is such a small part of my total storage that it doesn't seem worth it.

    • @colonelangus7535
      @colonelangus7535 2 years ago +1

      You're an unRAID candidate.
      I'm not personally looking at TrueNAS for at least another year.

    • @edwardallenthree
      @edwardallenthree 2 years ago +1

      @@colonelangus7535 I use Ubuntu and ZFS because of hardware compatibility. I have various pools with various properties based on need (e.g. one of the pools has 16 write-intensive 200GB SSDs in mirrored pairs ("RAID10")). I also have a pool with 16 2TB 3.5" SAS drives, and one with 8 2TB SAS drives that have a firmware flaw making them only usable at 1.9TB.

    • @UnreasonableSteve
      @UnreasonableSteve 2 years ago +1

      Yep, deduplication almost certainly isn't for you. If your ZFS supports it, though, I definitely recommend making sure zstd compression is enabled

  • @GameCyborgCh
    @GameCyborgCh 1 year ago

    In theory, with deduplication, Windows and TrueNAS could see more data used than what's physically available.
    It makes sense that if you write data that already exists in a storage pool, then all that needs to happen is comparing the checksum to confirm it's already in the pool, so you don't need to write it again, making that much faster. But why is writing data for the first time faster? Doesn't ZFS need to write the file and calculate and write the checksum? Wouldn't that make writing files for the first time slower?

  • @jonas-fc5nq
    @jonas-fc5nq 2 years ago +1

    Maybe I just didn't get it, but what real advantage do I get from deduplication if I can't access the saved space, since the file system can't see it? 🤔

    • @eusebiusthunked5259
      @eusebiusthunked5259 2 years ago

      If the file system can see it, it isn't working... rather, your filesystem will show more capacity than it should if duplicate data is recognized

  • @jabolko1k
    @jabolko1k 2 years ago +1

    What is that 2U case? Is it possible to buy it?

    • @acenio654
      @acenio654 2 months ago

      2 years late, but the case is the one showcased in his other video "A Three Server HomeLab for less than $1,000!"

  • @Fay7666
    @Fay7666 2 years ago

    My plan is to use a 400GB Intel 750 for a 4TB mirrored pool. It's mostly SQL and vhdx files that should all be the same. I'm wondering if I should use it as L2ARC or as a dedup drive. Opinions?

    • @Fay7666
      @Fay7666 2 years ago

      SQL and vhdx _backups_ may I add, nothing actually running off it.

    • @UnreasonableSteve
      @UnreasonableSteve 2 years ago

      If it's only one SSD, use it as L2ARC (and maybe even partition it so that you're not using the full 400GB for L2ARC; there are concerns that if you use too much RAM to store headers pointing at the L2ARC, you won't be storing as much cache data directly in RAM anymore).
      The reason I say use it for L2ARC is simple: your dedup device (or SLOG, etc.) should be as resilient as your main vdev, if not more so. Data loss there can mean unrecoverable losses.
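
      A sketch of the L2ARC option (pool and partition names are examples). Unlike a dedup vdev, a cache device is safe to lose; the pool's data stays intact:

          zpool add tank cache /dev/nvme0n1p1
          zpool iostat -v tank   # the device shows up under "cache"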

  • @DavidThorarinsson
    @DavidThorarinsson 2 years ago +3

    Thanks for good content and most of all the good laughs!

  • @threepe0
    @threepe0 2 years ago

    1GB of RAM per TB of storage is not insignificant if you have a large pool. Yes, there's a lot of misinformation, but I think this is a barrier for some people. Dedup can be scripted and handled in other ways as well. That being said, it was good to see some hard performance examples, and I'm taking another look at enabling deduplication now. Thanks!

  • @amp888
    @amp888 2 years ago +1

    How about a VM performance comparison between using ZFS deduplication and using Proxmox's VM template feature (converting an existing VM to a template) combined with the linked-clone storage option?

  • @felderup
    @felderup 2 years ago

    I've been looking into Borg; have you given it a thought?

  • @llpolluxll
    @llpolluxll 2 years ago +2

    My home lab has become my zen garden. Albeit a grossly expensive one.

    • @malbeth8700
      @malbeth8700 2 years ago +1

      Aren't all zen gardens grossly expensive? XD

  • @jakeyod
    @jakeyod 2 years ago

    I really like the building on knowledge style of video

  • @JoaoSilva-jr9ez
    @JoaoSilva-jr9ez 2 years ago

    Does anyone know if the data from the pool can be recovered in the event of the dedup vdev failing? Like if I use a single drive as a dedup vdev and it fails, is all of my data gone?

    • @htwingnut
      @htwingnut 2 years ago +1

      If it's the only copy of the dedup table then I'd say yes. I guess that's why redundancy is important.

    • @UnreasonableSteve
      @UnreasonableSteve 2 years ago +1

      You'll want your dedup vdev to be as resilient as the main pool, at least.