High Availability/Disaster Recovery (HA/DR) Basics

  • Added 8 June 2021
  • Welcome to ScienceLogic Symposium 2021. My name is Jesse Triplett. I'm a Senior Support Engineer with ScienceLogic Support, and in this video, we're going to cover the basics of HA and DR clusters. So let's get started.
    This video will be covering the basics of a heartbeat network, services and the resources that make them up, virtual IPs (VIPs), and the differences between HA and DR clusters.
    In short, the heartbeat network is merely a network through which the primary and secondary nodes pass a token back and forth. Each node expects that token to come back around in a configured amount of time. If that token does not come back around, then the cluster will begin a failover process. This is the top-level health check for your cluster.
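As a rough illustration of the token timeout described above (not SL1's actual implementation), the check can be sketched in Python. The class name `HeartbeatMonitor`, its fields, and the 3-second timeout are hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatMonitor:
    """Tracks when the token was last seen and flags an overdue token."""

    timeout_seconds: float      # hypothetical configured token timeout
    last_token_at: float = 0.0  # timestamp of the most recent token arrival

    def token_received(self, now: float) -> None:
        # Record that the token came back around at time `now`.
        self.last_token_at = now

    def should_fail_over(self, now: float) -> bool:
        # The token is overdue if it has not been seen within the timeout.
        return (now - self.last_token_at) > self.timeout_seconds


monitor = HeartbeatMonitor(timeout_seconds=3.0)
monitor.token_received(now=10.0)
on_time = monitor.should_fail_over(now=12.0)  # token last seen 2.0 s ago
overdue = monitor.should_fail_over(now=14.5)  # token last seen 4.5 s ago
```

Timestamps are passed in explicitly rather than read from a clock so that the timeout logic can be checked without waiting in real time.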
    Services are made up of resources. These resources can be anything that an application needs to run. That includes mounted storage, file systems, the software for the application itself, really anything. For SL1, we specifically use Distributed Replicated Block Device (DRBD) storage, a backend MariaDB database, the processes and parts of the SL1 application itself, as well as a virtual IP. The VIP is a secondary IP that is applied to the public-facing primary NIC on the node that is currently running the SL1 stack. What this means is that any system that accesses that VIP is sure to reach the node running the SL1 stack and not the non-active node. It is, in that way, a very simple load balancer.
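The way a VIP follows the active node can be sketched with a toy model. This is only an illustration under stated assumptions: the class `TwoNodeCluster`, the node names, and the address `192.0.2.10` (from a range reserved for documentation) are hypothetical, and a real cluster moves the address at the network level rather than in application code.

```python
class TwoNodeCluster:
    """Toy model: one VIP that always maps to whichever node is active."""

    def __init__(self, vip: str, primary: str, secondary: str) -> None:
        self.vip = vip
        self.nodes = [primary, secondary]
        self.active = primary  # the VIP starts on the primary node

    def fail_over(self) -> None:
        """Move the VIP to the other node."""
        if self.active == self.nodes[0]:
            self.active = self.nodes[1]
        else:
            self.active = self.nodes[0]

    def resolve(self, address: str) -> str:
        """Return the node that answers for the given address."""
        if address != self.vip:
            raise ValueError(f"{address} is not the cluster VIP")
        return self.active


cluster = TwoNodeCluster(vip="192.0.2.10", primary="node-a", secondary="node-b")
before = cluster.resolve("192.0.2.10")  # answered by node-a
cluster.fail_over()
after = cluster.resolve("192.0.2.10")   # same address, now answered by node-b
```

Clients keep using the same address before and after the failover, which is the point the talk makes about the VIP.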
    When it comes to HA versus DR, the easiest way to describe it is that HA is meant for and focuses on immediate and automated failover from one node to another. This means that if, for some reason, one node cannot run the service, the other node will pick up the slack. What makes DR, or disaster recovery, clusters different is that instead of focusing on server-level issues that would prevent a server from being able to run the application, such as a hypervisor failure, disaster recovery focuses on less common, but more impactful, issues that may bring down an entire data center. For that reason, you have a primary node and a secondary node that is offsite in a separate data center.
    Because of the distance and latency between these two nodes, a failure situation will require some user intervention. That is the difference you are most likely to notice between running an HA cluster and a DR cluster. It's also important to note that you can run an HA cluster with a DR node for maximum stability and uptime.
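The operational difference between the two cluster types can be summarized in a small decision sketch. The names `ClusterType` and `failover_action`, and the returned strings, are hypothetical and only mirror the behavior described in the talk: HA fails over automatically, while DR waits for a person to act.

```python
from enum import Enum


class ClusterType(Enum):
    HA = "ha"  # nodes close together, automated failover
    DR = "dr"  # secondary node in a separate data center


def failover_action(cluster_type: ClusterType, operator_confirmed: bool = False) -> str:
    """Describe what happens when the active node can no longer run the service."""
    if cluster_type is ClusterType.HA:
        # HA: the other node picks up the slack without human involvement.
        return "fail over automatically"
    if operator_confirmed:
        # DR: failover proceeds only once a person has initiated it.
        return "fail over (operator initiated)"
    return "wait for operator"


ha_result = failover_action(ClusterType.HA)
dr_waiting = failover_action(ClusterType.DR)
dr_confirmed = failover_action(ClusterType.DR, operator_confirmed=True)
```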
    I hope this video made clear that the heartbeat is the primary cluster-level health check of the nodes themselves, and that highly available services are really just combinations of all the bits and pieces that an application needs to run. The virtual IP is a simple load balancer that allows the application to be available no matter which node it is running on. And finally, HA clustering focuses on quick failover for more common, but less impactful, issues, while DR focuses on issues that could bring down an entire data center.
    Thank you so much for listening to this video. I hope you got a lot from it, and I hope you have a great symposium.
  • Science and technology

Comments • 1

  • @GodIsWithin3, 9 months ago, +1

    This was great info, thank you!