Setting Up Proxmox High Availability Cluster & Ceph

  • Published 15 Jun 2024
  • Setting up Proxmox HA Cluster and Ceph Storage from scratch.
    ○○○ LINKS ○○○
    pve.proxmox.com/wiki/High_Ava...
    • Proxmox Stuff
    ○○○ SHOP ○○○
    Novaspirit Shop ► teespring.com/stores/novaspir...
    Amazon Store ► amzn.to/2AYs3dI
    ○○○ TIMECODE ○○○
    0:00 Intro
    0:16 High Availability & Ceph
    1:13 My Proxmox Setup
    1:45 Setting Up Proxmox Cluster
    3:59 Installing Ceph
    6:14 Setup Ceph OSD
    6:55 Ceph Monitors Setup
    7:25 Ceph Pool
    7:54 HA Group Settings
    9:09 Setting Up Container
    10:49 Testing Migration
    12:51 Testing Failover
    16:11 Conclusion
    ○○○ SUPPORT ○○○
    💗 Patreon ► goo.gl/xpgbzB
    ○○○ SOCIAL ○○○
    🎮 Twitch ► / novaspirit
    🎮 Pandemic Playground ► / @pandemicplayground
    ▶️ novaspirit tv ► goo.gl/uokXYr
    🎮 Novaspirit Gaming ► / @novaspiritgaming
    🐤 Twitter ► / novaspirittech
    👾 Discord chat ► / discord
    FB Group Novaspirit ► / novasspirittech
    ○○○ Send Me Stuff ○○○
    Don Hui
    PO BOX 765
    Farmingville, NY 11738
    ○○○ Music ○○○
    From Epidemic Sounds
    patreon @ / novaspirittech
    Tweet me: @ / novaspirittech
    facebook: @ / novaspirittech
    Instagram @ / novaspirittech
    DISCLAIMER: This video and description contain affiliate links, which means that if you click on one of the product links, I’ll receive a small commission.
  • Science & Technology

Comments • 41

  • @romayojr
    @romayojr 4 months ago +16

    I've been running a Proxmox cluster with a Ceph pool on 3 Dell 7060s in my environment for about 6 months now. It's been working great and hasn't had any failures as of yet. I highly recommend doing this if you have the resources.

    • @techadsr
      @techadsr 4 months ago

      The i5-8500 has 2 more cores than the N100, giving your cluster 6 more cores than mine. How much usage, and how many cores, is Ceph consuming on your cluster?

  • @Cpgeekorg
    @Cpgeekorg 4 months ago +6

    At 10:50 there is a cut, and for folks who may be following along with this video I want to clarify what happened, because it's a gotcha that people new to Proxmox really should understand (I'm not trying to undermine Don here, he did a fantastic job with this video demo). The way the test container is shown being configured (specifically using the default storage location, which in this case is local storage) is INCORRECT for this configuration. You *must* choose the shared storage (in this case the ceph pool that was created, called "proxpool") if you want to configure HA (it doesn't let you do it otherwise, because the container isn't backed by shared storage). Do not despair, amigos: if you accidentally configure your VM or container on local storage, have already started deploying your workload, and then decide you want this VM/CT to be part of your HA config, you can just move the storage from local to shared storage by:
    1. click on the ct you accidentally created on local storage
    2. click on "resources"
    3. click on "root disk"
    4. click on the "volume action" pull-down at the top of the resources window
    5. click on "move storage"
    6. select the destination (shared) storage you want to move it to
    Repeat this for all disks that belong to this container, and HA will work once all disks attached to the container are on shared storage. This procedure works the same for VMs as well, but you'll find the storage configuration under the "Hardware" tab instead of the "Resources" tab - then you just click on each disk and do the same "Volume Action" -> "Move Storage" as with CTs (a CLI equivalent is sketched at the end of this comment).
    *pro tip*: Proxmox, at the time of this writing, does NOT support changing the "default" storage location offered when you create new VMs and CTs. HOWEVER, this list is ALWAYS (at the time of this writing) in alphabetical order, and it defaults to the first storage location in the alphabet, so if you wish to set the default, name whatever storage you like such that it sorts first alphabetically. (For lots of people I've seen this be something like "ceph pool", BUT for some strange reason Proxmox prioritizes storage IDs that are capitalized, so I call my ceph pool NVME (because it's built on my NVMe storage) and it shows up at the top of the list, and is thus the default when I create storage.) Note: unfortunately you can't just change a storage ID, because your VMs won't know where their storage is. If you need to rename your storage, your best bet is to create a new ceph pool with the new name (based on the same storage - don't worry, ceph pools are thin provisioned), go to each VM/CT's storage, and move the storage to the new pool. When there is nothing left on the old pool (you can verify this by looking at the pool in the storage section and making sure there aren't any files stored there), you can remove it from the cluster's storage section, then remove the pool from the Ceph options.
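    For anyone who prefers the CLI, roughly the same move can be done with pct/qm - a minimal sketch, assuming CT 101, VM 100 and a shared Ceph-backed storage named "proxpool" (double-check the exact subcommand names on your PVE release, they have been renamed between versions):
      # move a container's root disk from local storage onto the shared Ceph pool
      pct move-volume 101 rootfs proxpool --delete 1
      # same idea for a VM disk (older releases call this "qm move_disk")
      qm disk move 100 scsi0 proxpool --delete 1
    As with the GUI procedure, repeat this for every disk attached to the guest before adding it to the HA config.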

    • @remio0912
      @remio0912 4 months ago +1

      Would you be willing to make a video?

    • @Cpgeekorg
      @Cpgeekorg 4 months ago +1

      @remio0912 I'd be happy to, but it'll probably be a couple of weeks for that one. Currently rebuilding my network.

  • @GreatWes77
    @GreatWes77 3 months ago

    Excellent presentation. Didn't over- or under-explain anything IMO. I appreciate it!

  • @ronm6585
    @ronm6585 4 months ago

    Thanks for sharing Don.

  • @dijkstw2
    @dijkstw2 4 months ago +1

    Just thinking about it and you're posting this 😂🎉

  • @LukasBradley
    @LukasBradley 8 days ago +1

    At 6:20, you say "you'll see the disk that I add." Where did this disk get defined? My config doesn't have an available disk.

  • @karloa7194
    @karloa7194 4 months ago

    Hey Don, are you RDP-ing in to manage your infrastructure? I noticed that there are two mouse cursors. If you're using some sort of jump box or bastion host, could you share how you're connecting to it and what your bastion host is?

  • @fbifido2
    @fbifido2 4 months ago +1

    @4:35 - please go deeper into the cluster networking part:
    on VMware vSAN you have a storage network, VM network, fail-over network, etc ...
    what's the best way, network-wise, to build a Ceph cluster with 3 or more hosts?

    • @Darkk6969
      @Darkk6969 4 months ago +2

      CEPH is very flexible in terms of network redundancy. I had two dedicated 10 gig network switches just for CEPH. Hell, if both switches fail it can use the LAN public network as backup.
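      For reference, Ceph itself distinguishes a public network (monitors and clients) from a cluster network (OSD replication/heartbeat), and on Proxmox you normally set both when initializing Ceph. A rough sketch with made-up addressing (192.168.1.0/24 as the LAN/public side, 10.10.10.0/24 as a dedicated storage network on second NICs):
        # run once, before creating monitors/OSDs; this writes public_network and
        # cluster_network into /etc/pve/ceph.conf
        pveceph init --network 192.168.1.0/24 --cluster-network 10.10.10.0/24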

  • @techadsr
    @techadsr 4 months ago +1

    I added a USB NVMe enclosure with a 1TB SSD to each node in my N100 3-node Proxmox cluster. The nodes have USB 3.0 ports. Installed a Ceph pool using those three SSDs. Ceph and NFS (Synology DS1520+) use the second adapter on each node and on the NAS, so the storage network traffic is isolated from regular traffic. I moved a debian VM to NFS and timed a migrate from node 2 to 1, then repeated that migrate with the same debian on Ceph. Like Don, I was pinging the default router from that debian VM during the migrations. Never lost a ping. The timing for the 48G debian machine migration on NFS was 19 sec with 55 ms downtime. For Ceph, the timing was 18 seconds with 52 ms downtime. Migration speed for both was 187.7 MiB/s. The HP 1TB EX900 Plus NVMe SSD is Gen3, but the SSK SHE-C325 Pro NVMe Gen2 enclosure is USB 3.2.
    Not much of a performance difference in my config for NFS vs Ceph. At least there's the benefit of not having the NAS as a single point of failure.
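    For anyone who wants to repeat the test, a rough sketch of the method (IDs and addresses illustrative; the reported downtime comes from the migration task log, since the guest never drops a ping):
      # inside the guest: ping the default gateway with timestamps during the migration
      ping -D 192.168.1.1
      # on a cluster node: live-migrate the VM and note the downtime the task reports
      qm migrate 102 pve1 --online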

    • @ultravioletiris6241
      @ultravioletiris6241 3 months ago

      Wait, you can have high availability with migrations using Synology NFS? And it works similarly to Ceph? Interesting

    • @techadsr
      @techadsr 3 months ago

      @ultravioletiris6241 The difference is that Ceph doesn't have a single point of failure like a Synology NAS does.

    • @NatesRandomVideo
      @NatesRandomVideo 27 days ago

      @ultravioletiris6241 All Proxmox HA cares about is whether the storage is shared.
      (Not even that, really. You can set up ZFS root during node creation and replicate a VM to one or more nodes, and migration is screaming fast - but you can lose data. Shared storage eliminates the data-loss window between replication runs.)
      There are ways to get / build HA SMB/CIFS storage and HA NFS storage - with as much redundancy as you like and the wallet can afford.
      That said, a single Synology isn’t that. So it is the SPOF in using either SMB or NFS shared storage for Proxmox cluster HA.
      Quite a few home gamers going for "decent" recoverability may use a shared storage system with a UPS and the built-in "nut" UPS monitoring in Linux to distribute UPS power status to various things so they can all do a graceful shutdown before the battery fails (a small client-side sketch follows at the end of this comment).
      It’s not protection against a NAS hardware failure - but it covers the most common outage that most people see. Power.
      Another thing to consider when using a consumer-grade NAS for shared storage is how badly it slows down during disk replacement and recovery. Many find their VMs' performance to be extremely poor during that time.
      You can go very deep down this rabbit hole. Some even run into similar problems when their CEPH needs to rebuild after a failure. Giving it its own fast network for just itself is one of the ways to mitigate that.
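      If anyone wants to go the nut route, the client side is small - a rough sketch, assuming the NAS or a Pi runs the NUT server at 192.168.1.10 and exports a UPS named "ups" (names and password are placeholders; newer NUT releases say "secondary" where older ones say "slave"):
        # /etc/nut/nut.conf
        MODE=netclient
        # /etc/nut/upsmon.conf - shut this node down when the UPS is on battery and low
        MONITOR ups@192.168.1.10 1 upsmon mypassword secondary
        SHUTDOWNCMD "/sbin/shutdown -h +0"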

  • @dzmelinux7769
    @dzmelinux7769 4 months ago

    So does this work with LXC containers too? They don't start after migration, but HA isn't considered migration.

  • @Lunolux
    @Lunolux 4 months ago

    Thanks for the video.
    6:22 I'm a little confused - where is this storage? On the Proxmox server? Remote storage?

    • @techadsr
      @techadsr 4 months ago +1

      Don created the OSD on /dev/sdc. But his three-node Proxmox cluster was itself virtualized, so I'm not sure whether /dev/sdc was also virtual.
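      If you're reproducing the lab nested like that, a rough sketch of where such a disk can come from (IDs illustrative): give each nested PVE guest an extra virtual disk on the outer host, and it shows up inside the node as an unused /dev/sdX that Ceph can claim:
        # on the outer host: attach a spare 32G disk to the nested PVE VM (here VMID 201)
        qm set 201 --scsi2 local-lvm:32
        # inside that nested node: turn the new, unused disk into an OSD
        pveceph osd create /dev/sdc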

  • @remio0912
    @remio0912 4 months ago +1

    I got messed up when you created an OSD. I had no other disks available to use on any of the nodes.

  • @CaptZenPetabyte
    @CaptZenPetabyte 4 months ago

    Just to confirm, as I think I missed something: each of 'prox1, prox2, prox3, prox.n' would be a different machine running Proxmox on the same network? I shall rewatch and maybe have a bit of a play around with the documentation. Thanks for all the recent Proxmox tutes mate, they have been very helpful indeed!

  • @mikebakkeyt
    @mikebakkeyt 4 months ago +2

    I run an HA cluster atm with two identically named ZFS pools, and so long as I put the CT disks in that pool it allows replication and full HA functionality. I don't see any need to add the extra complexity of Ceph just for HA. Ceph seems awesome, but it's an order of magnitude more complex than ZFS...

    • @Darkk6969
      @Darkk6969 4 months ago

      I use ZFS with replication for production servers at work. The biggest benefit of CEPH is real-time replication, while ZFS is based on periodic snapshots that then get sent to the other node in the cluster. The reason I am not using CEPH is performance. Newer versions of CEPH may have gotten a lot better, so I may have to revisit it at some point.

    • @mikebakkeyt
      @mikebakkeyt 4 months ago +1

      @Darkk6969 Agreed. I have prototyped Ceph, but it needs to be run at a larger scale to make sense, and managing it is very complex, with opportunities to mess up data. I want to use it, but I need dedicated nodes and a faster network, so for now it waits.
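      For anyone curious what the ZFS route looks like in practice, the snapshot-and-send replication being described is the built-in pvesr scheduler - a minimal sketch (guest 101 replicating to node pve2 every 15 minutes; IDs and node names illustrative):
        # create replication job 101-0 for guest 101, targeting node pve2
        pvesr create-local-job 101-0 pve2 --schedule "*/15"
        # check replication state and last sync times
        pvesr status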

  • @fbifido2
    @fbifido2 4 months ago

    @8:30 - can you have a Proxmox host/node for Ceph storage only, not for running VMs?
    e.g. you have your 3 compute nodes but are running low on storage - can you add a node or two with lots of storage to expand your Ceph cluster?

    • @Darkk6969
      @Darkk6969 4 months ago +1

      Yes you can. You'll need to install CEPH tools on all the nodes. Just create the OSDs on the nodes you want to dedicate to CEPH.
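      A sketch of that, assuming the storage-only node has already joined the cluster and its data disk is /dev/sdb (adjust to your hardware):
        # on the new node: pull in the Ceph packages
        pveceph install
        # create OSDs only on the nodes meant to carry storage
        pveceph osd create /dev/sdb
        # compute-only nodes keep the Ceph client bits but simply get no OSDs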

  • @shephusted2714
    @shephusted2714 4 months ago

    This is one you needed to do, but please explore other network filesystems too. Also, what if you wanted to combine all the PCs and GPUs to look like one PC - can you explain or DIY that in a follow-up? #HA 40g #no switch #load balancing

  • @diegosantos9757
    @diegosantos9757 2 months ago +1

    Hello dear,
    Would it work with Pimox too??
    Thanks for the great videos!

  • @robert-gq5th
    @robert-gq5th 4 months ago

    What do I do if, when I add a drive, it makes a new LVM volume and I can't use it for an OSD?

  • @Cpgeekorg
    @Cpgeekorg 4 months ago +1

    3:32 "as far as migrating through here, you cannot do that yet until you set up a ceph" - this is incorrect. In this state, you CAN migrate VMs from one node to another; they just have to be paused first. All that's required for moving VMs node to node is a basic cluster. HOWEVER, because the storage isn't shared between them, it does take longer to move VMs between nodes in this state, because the entirety of the storage needs to move from one node to the next. The way it works if you have Ceph (or another shared storage - it doesn't have to be Ceph, it could be an external share or something; Ceph is just a great way to set up shared storage with node-level (or otherwise adaptable) redundancy) is that instead of moving full disk images when you migrate, the destination node accesses the shared storage volume (so the storage doesn't have to move at all). That means the only thing that needs to be transferred between nodes is the active memory image, and this is done in 2 passes to minimize latency in the final handoff: it transfers all blocks of the active VM RAM, then it suspends the VM on the source node, copies any memory blocks that have changed since the initial copy, and then the VM resumes on the destination node. On a fast network connection this final handoff can be done in under a couple of milliseconds, so any users of the services on the VM being transferred are none the wiser - you can start a ping, migrate a VM mid-request, and the VM will respond in time at its destination (maybe adding 2-3 ms to the response time). It's FANTASTIC!
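    A minimal sketch of the two cases described above (VM ID and node name illustrative):
      # no shared storage: works, but the whole disk image is copied to the target node
      qm migrate 100 prox2 --with-local-disks
      # with shared storage (e.g. the "proxpool" Ceph pool): only RAM is transferred,
      # using the two-pass copy described above, so the cutover is near-instant
      qm migrate 100 prox2 --online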

  • @primenetwork27
    @primenetwork27 3 months ago

    I have 3 nodes in Ceph - is it possible to add 1 more server?

  • @LVang152
    @LVang152 1 month ago

    Can you delete the local-lvm?

  • @JohnWeland
    @JohnWeland 4 months ago

    I have 2 nodes running myself (2 Dell R620s), waiting for some lower-end CPUs to arrive in the mail before I bring the third node online. It came with some BEEFIER CPUs, and that means jet-engine screams (1U server fans).

  • @leachimusable
    @leachimusable 4 months ago

    1:15 - which system is that in the VM?

  • @fbifido2
    @fbifido2 4 months ago +1

    @7:20 - it would be nice if you explained what each field is for, or what the best practices are.

  • @puyansude
    @puyansude 4 months ago

    👍👍👍

  • @techadsr
    @techadsr 4 months ago +1

    How impactful is Ceph on CPU resources in a three-node, N100-based Proxmox cluster? I put off trying Ceph when the resource requirements mentioned dedicating CPUs.
    So far, I have NFS storage on a separate network connected to the 2nd NICs on the Synology and the Proxmox nodes.
    It looked like Ceph could be set up with its management traffic on a separate network as well. But with only 12 cores available on my cluster, maybe Ceph isn't for me.
    Thoughts?

    • @ewenchan1239
      @ewenchan1239 4 months ago +2

      So, I have a three-node HA Proxmox cluster (each node only has a 4-core N95 processor, so the less performant version of the N100 that you have), with 16 GB of RAM and a 512 GB NVMe in each node.
      When I installed Proxmox, I had to redo it a few times because I needed to create the 100 GB partition for Proxmox itself (on each node) plus an 8 GB swap partition, so the rest of the drive could be used for Ceph.
      In terms of CPU usage -- Ceph RBD and CephFS themselves actually don't take much on the CPU side of things.
      It is HIGHLY recommended that you have at least a second NIC for all of the Ceph synchronisation traffic (my mini PC has dual GbE NICs built in), which works "well enough" given the rest of the system is only a 4-core N95 with 16 GB of RAM.
      Of course, Ceph isn't going to be fast with a GbE NIC in between them, but given what I am using my HA cluster for (Windows AD DC via turnkey linux domain controller, DNS, and Pi-Hole), it doesn't really matter to me.
      Nominal CPU usage on my cluster is < 2%, even when there's traffic going through the Ceph network.
      Nominal memory usage is whatever Proxmox already consumes (any additional memory consumption is negligible/imperceptible).
      What will matter more in terms of CPU/memory will be what you plan on running on it.
      And I have effectively the same setup as you, but with slower processors (also 12 cores total, and 48 GB of RAM total, spread out amongst the nodes), which means that I can't do anything TOO crazy, because it isn't like a VM can span multiple nodes, so everything has to run as if there were only one node available anyway.
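      If you want to sanity-check that on your own nodes, the Ceph daemons are ordinary processes, so something like this on each node shows what they actually cost:
        # per-daemon CPU/memory of the Ceph services running on this node
        ps -C ceph-osd,ceph-mon,ceph-mgr,ceph-mds -o pid,%cpu,%mem,cmd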

    • @techadsr
      @techadsr 4 months ago

      ​@@ewenchan1239 Sounds like the resource requirements doc had me concerned too much. I wonder how ceph compares to NFS running on Synology.
      I hadn't thought about explicitly creating partitions for the Proxmox OS and swap. The install creates separate LVM logical volumes (see **** below) for Proxmox swap and root. Will LVM volumes for swap/root not work for Ceph?
      root@pve2:~# fdisk -l
      Disk /dev/sda: 953.87 GiB, 1024209543168 bytes, 2000409264 sectors
      Disk model: NT-1TB 2242
      Units: sectors of 1 * 512 = 512 bytes
      Sector size (logical/physical): 512 bytes / 512 bytes
      I/O size (minimum/optimal): 512 bytes / 512 bytes
      Disklabel type: gpt
      Device Start End Sectors Size Type
      /dev/sda1 34 2047 2014 1007K BIOS boot
      /dev/sda2 2048 2099199 2097152 1G EFI System
      /dev/sda3 2099200 2000409230 1998310031 952.9G Linux LVM
      **** Disk /dev/mapper/pve-swap: 8 GiB, 8589934592 bytes, 16777216 sectors
      Units: sectors of 1 * 512 = 512 bytes
      Sector size (logical/physical): 512 bytes / 512 bytes
      I/O size (minimum/optimal): 512 bytes / 512 bytes
      **** Disk /dev/mapper/pve-root: 96 GiB, 103079215104 bytes, 201326592 sectors
      Units: sectors of 1 * 512 = 512 bytes
      Sector size (logical/physical): 512 bytes / 512 bytes
      I/O size (minimum/optimal): 512 bytes / 512 bytes
      Disk /dev/mapper/pve-vm--102--disk--0: 48 GiB, 51539607552 bytes, 100663296 sectors
      Units: sectors of 1 * 512 = 512 bytes
      Sector size (logical/physical): 512 bytes / 512 bytes
      I/O size (minimum/optimal): 65536 bytes / 65536 bytes
      Disklabel type: dos
      Device Boot Start End Sectors Size Id Type
      /dev/mapper/pve-vm--102--disk--0-part1 * 2048 98662399 98660352 47G 83 Linux
      /dev/mapper/pve-vm--102--disk--0-part2 98664446 100661247 1996802 975M 5 Extended
      /dev/mapper/pve-vm--102--disk--0-part5 98664448 100661247 1996800 975M 82 Linux swap / Solaris
      Partition 2 does not start on physical sector boundary.
      Disk /dev/mapper/pve-vm--1000--disk--0: 8 GiB, 8589934592 bytes, 16777216 sectors
      Units: sectors of 1 * 512 = 512 bytes
      Sector size (logical/physical): 512 bytes / 512 bytes
      I/O size (minimum/optimal): 65536 bytes / 65536 bytes
      Does the ceph pool need to be created before creating VMs and LXCs or downloading ISOs or CTs?

  • @da5fx
    @da5fx 3 months ago

    I'm going to talk only about my experience with Proxmox, which I have tried to use several times. I do not like it; I think there are better options. My setup is a 3-node i5 13th-gen cluster with Ceph, 32 GB RAM, and two NICs - one for Ceph traffic and the other for all other traffic. I think it's very slow, and there is a problem, in my opinion, with stopping VMs when they get stuck or you made a mistake with the VM in some way. Templates can only be cloned from the node they are attached to, unlike VMware - though of course you can migrate the templates. Installing a Linux VM the traditional way takes a long time, something like 4 hours or more. The Ceph speed on SSD was around 13 MB/s. I ran a test by moving all 10 of my VMs from 3 nodes onto only 2, to test the speeds on the third node. Maybe it's me and I'm not used to this kind of solution, because I was a VCP on 5.5 and 6; I normally prefer Fedora KVM because of Cockpit, but that doesn't provide any way to cluster 2/3 machines. In sum, I got tired of it and installed Harvester HCI, and now a VM installs in 5 minutes or a bit more, and Longhorn gives speeds around 80 MB/s.
    This is just my latest experience and the previous ones. I hope this helps someone. Thank you.
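    For anyone who wants to put a number on their own pool before drawing conclusions, Ceph ships a simple benchmark - a rough sketch (pool name illustrative):
      # 10-second write benchmark against the pool, keeping the objects for the read test
      rados bench -p proxpool 10 write --no-cleanup
      # sequential read benchmark, then clean up the benchmark objects
      rados bench -p proxpool 10 seq
      rados -p proxpool cleanup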