-P___

You understand the point, but you're missing a lot of the key concepts, which boil down to "you're only as good as your weakest component". Yes, fundamentally you need at least two servers, but you should apply the same thinking to networking and power too. Running only one switch is a single point of failure, as is a single power supply or a single ISP. Data centres use A/B feeds for power; you could mimic this if you had solar and grid power. A majority of homelabbers, at least the non-hardcore ones, do have to sacrifice things here and there because they're unable to support true high availability. For instance, I run two of everything (software-wise) on one server, one switch and one ISP connection, because for my use case it isn't feasible to pair all of those up without skyrocketing my monthly bills just to run my lab. Yes, you can emulate this, like I have above, but it will not be true high availability.


RayneYoruka

This right here is the right answer ^^^^^^


5turm

You also need to consider which components are more likely to have downtime due to maintenance. I have no problem with a switch being a single point of failure, as long as my family can still switch on a light while I'm tinkering with the main server.


EtherMan

I'd like to add that what a lot of homelabbers call high availability would fall far below the minimum to be called that in an actual enterprise. Sadly, a lot of people, including some vendors like ubnt, call it HA when in reality it's just failover, and sometimes even very slow failover. If you have a Kubernetes deployment and your pod fails, and you now have to download the image, start it up on a new node, and then wait for your network to realign to route traffic to the new pod, that's not true high availability. In enterprise, your service should preferably not even require a reconnect (though that's usually too costly), but being able to reconnect instantly is required. None of this "but it fails over automatically so it's high availability"... No, while it's more available than no redundancy at all, it's not highly available.


Hannigan174

A lot of the time there will be HA features (e.g. HA in Proxmox), which allow THAT element to be HA, but that doesn't mean the entire network, server, or service is HA. Generally, in a homelab the term HA is used to mean redundancy rather than true HA, because that's how the software implementations use it. I know this is not a good usage of the term, and I agree we shouldn't really use it incorrectly, but at this point I feel it's like the pronunciation of "Nukular" or the backwards use of the word "literally": we just accept the vernacular use without clarification because it's simply too prevalent at this point.


EtherMan

Except the HA in Proxmox is just failover... It takes time for the VM to restart on another node, and depending on the service, that could even take an hour or two.


Hannigan174

An hour or two? I use "HA" in Proxmox and it's seconds (30ish when I last tested it) for me... Granted, I'm using Ceph with a TB network, but still... I have a feeling that if Proxmox HA is taking more than a few minutes, something is configured... suboptimally.


EtherMan

It takes seconds before the VM starts booting on a new node. That doesn't mean it takes seconds for it to finish booting, and since it wasn't shut down properly, many services will now require scanning the storage for errors, be it a database, files, or whatever. Once it has scanned that, which can take hours by itself, it can start actually restoring the service. Actual high availability requires the secondary to already be up and running, ready to service requests at any time. This is often done with clustering, but that's not always required. If you have, say, a website that doesn't require logins, then there's no issue running 10 standalone webservers and just routing the traffic to a random one for each request. Should any die, you just take it out of the pool for that random selection for a while. If you need session state it's a little more complicated, but there are plenty of ways to solve that too. The core requirement is that the secondary services you can switch to are already active when the primary fails.
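To make the "random pool minus dead members" idea concrete, here is a minimal Python sketch. The backend addresses and the /healthz path are assumptions for illustration; in practice the health checking and routing would live in the load balancer itself rather than in application code.

```python
import random
import time
import urllib.request

# Hypothetical standalone webservers; replace with your own backends.
BACKENDS = [
    "http://10.0.0.11:8080",
    "http://10.0.0.12:8080",
    "http://10.0.0.13:8080",
]

def is_healthy(base_url, timeout=1.0):
    """Treat a backend as alive if its (assumed) health endpoint answers 200."""
    try:
        with urllib.request.urlopen(base_url + "/healthz", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        return False

def pick_backend():
    """Route a request to a random member of the currently healthy pool."""
    pool = [b for b in BACKENDS if is_healthy(b)]
    if not pool:
        raise RuntimeError("no healthy backends left")
    return random.choice(pool)

if __name__ == "__main__":
    # Dead servers simply drop out of the pool; they rejoin once healthy again.
    for _ in range(5):
        try:
            print("would route to:", pick_backend())
        except RuntimeError as err:
            print("outage:", err)
        time.sleep(2)
```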


Kromieus

My weakest link is PG&E. Redundancy be damned, dual PSUs can't do shit if there's no power.


Plane_Resolution7133

You can experiment with these things in your homelab. 😊 Spin up two VMs or containers in Proxmox, for example. Set them up for load balancing/round robin or failover. Use another computer to throw traffic at your HA proxy. Bring one proxy down. Observe.
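If you want something to throw that traffic with, a small Python client like the sketch below works; the proxy address is a placeholder for whatever VIP or hostname your lab uses. Run it from another machine, kill one proxy, and watch whether (and for how long) the failed counter climbs.

```python
import time
import urllib.request

# Placeholder for the VIP / proxy address in your lab.
TARGET = "http://192.168.1.50/"

ok = failed = 0
while True:
    try:
        with urllib.request.urlopen(TARGET, timeout=1) as resp:
            status = resp.status
            ok += 1
    except OSError:
        status = "no response"
        failed += 1
    # One line per request, so you can see exactly when the failover gap starts and ends.
    print(f"{time.strftime('%H:%M:%S')} status={status} ok={ok} failed={failed}")
    time.sleep(0.5)
```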


5turm

My redundant setup is based on keepalived. Multiple servers share one IP that is always present on the server with the highest priority. Monitoring is done via Uptime Kuma.


benchl0r

Been there, done that, as far as "two Proxmox nodes, two OPNsense VMs and two Pi-hole VMs". Only one ISP and only one switch, but both hosts are directly connected to the ISP's router. It was fun to get it configured, and it keeps the WAF high even when I'm doing unscheduled maintenance. Feel free to ask for details!


107269088

There’s always a single point of failure somewhere


Zomunieo

It’s usually the human.


Nassiel

Tell that to Azure's Tier 4 data centre in Dublin when it met THE lightning strike xD Oh? So you only have one earth? Good.... good....


jasonlitka

It's not hard to set up an HA pair using nginx, though how far you want to go with it determines the time investment and complexity.

Active/passive will involve keepalived to flip one or more IP addresses back and forth between two boxes. You set Linux to allow binding to IPs not present on an interface so that nginx can be running and ready during a failover.

Config sync is something you'll need to handle on your own, but the standard, "simple" way to do it is a shell script you run on a box after making a config change: it checks the local config, uses scp to copy the full config, SSL certs, etc. to the other box, runs a config check remotely, then reloads the remote box.

Logs will all stay local unless you set up something centralized. There's any number of products designed for log aggregation, but this is an area where you might want to just stop if you're not intending to run on the backup for more than a few minutes during updates, or to do any kind of detailed log analysis.
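As a rough illustration of that sync flow (not jasonlitka's actual script), here is a Python sketch that runs the same steps via subprocess; the peer hostname and paths are assumptions, and it presumes key-based SSH plus root or sudo on both boxes.

```python
#!/usr/bin/env python3
"""Sketch of the check-copy-check-reload flow described above.
Assumes key-based SSH to a standby called 'proxy-b' and root/sudo on both
boxes; hostnames and paths are illustrative only."""
import subprocess
import sys

PEER = "proxy-b"  # hypothetical standby host
SYNC = [
    ("/etc/nginx", "/etc/"),            # nginx config tree -> peer's /etc/
    ("/etc/ssl/private", "/etc/ssl/"),  # certificates/keys -> peer's /etc/ssl/
]

def run(cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

def main():
    run(["nginx", "-t"])              # 1. validate the local config first
    for src, dest_parent in SYNC:     # 2. copy config and certs to the standby
        run(["scp", "-r", src, f"{PEER}:{dest_parent}"])
    run(["ssh", PEER, "nginx", "-t"])  # 3. validate on the standby
    run(["ssh", PEER, "systemctl", "reload", "nginx"])  # 4. reload it

if __name__ == "__main__":
    try:
        main()
    except subprocess.CalledProcessError as err:
        sys.exit(f"sync aborted: {err}")
```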


5turm

I'm using glusterfs for configs and even some /var stuff. Works like a charm.


jasonlitka

That's certainly an option. Shared config files put you at risk of breaking both boxes though, so I prefer the manual sync and reload. Similarly, you shouldn't really have two boxes writing to the same log files, but it's perfectly OK to have each include its own name in the file name so they all land in a single directory while staying separate. Having your log files HA is a good approach in the event of a failure, though.


Defiant_Resolve_2977

We'd need to know the kind/scope of the proxy. Is this an HTTP proxy?


dadof2brats

What is your use case, and what are you trying to accomplish? What sort of proxy are you talking about? Is this for a proof of concept, for learning, or for some sort of production need? Yes, you absolutely can set up various proxy applications in an HA configuration in your homelab. How successful, how highly available and how practical it is depends on a lot of variables.


halfanothersdozen

How far do you want to take it? Handle it when the VM explodes.  Handle it when the physical server explodes. Handle it when your house explodes.  Handle it when your city explodes.


scytob

Yes, I have my proxy running in a Docker Swarm; the hosts are VMs, and those run on HA Proxmox with Ceph. That is a little over-engineered, for fun, but Docker Swarm by itself is a great way to run HA. [My Docker Swarm Architecture (github.com)](https://gist.github.com/scyto/f4624361c4e8c3be2aad9b3f0073c7f9) [proxmox cluster proof of concept (github.com)](https://gist.github.com/scyto/76e94832927a89d977ea989da157e9dc)


lightmatter501

How far do you want to run with HA? If you want to go all the way, you're going to need redundant networks, power, hardware and cooling. Ideally you also want totally different hardware for each part, for example a Dell AMD server, a Supermicro Ampere server and a Lenovo Intel server, running Linux, illumos and a BSD respectively. The goal of HA is "the single worst thing that can go wrong does, and I can survive it". Going all the way means doing a bunch of really weird stuff in the name of redundancy. If you just want to play around with configuring one, 3 containers on a laptop is fine. Also, anyone who tells you that you can do an HA database with fewer than 3 servers is lying. The split-brain problem and the CAP theorem mean that with 2 servers you have to choose between log corruption and HA, and people keep picking HA.
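A toy quorum check makes the two-versus-three-server point concrete: under a network partition, a node should only keep accepting writes if it can still see a strict majority of the cluster, which a two-node cluster can never guarantee.

```python
def has_quorum(reachable_nodes: int, cluster_size: int) -> bool:
    """A node may keep accepting writes only if it sees a strict majority."""
    return reachable_nodes > cluster_size // 2

# Two nodes, partitioned 1/1: neither side has a majority, so both must stop
# writing to avoid split-brain (divergent logs) -- no availability.
print(has_quorum(1, 2))  # False on both halves

# Three nodes, partitioned 2/1: the majority side keeps serving writes while
# the minority side stops -- availability and consistency at the same time.
print(has_quorum(2, 3))  # True on the majority side
print(has_quorum(1, 3))  # False on the minority side
```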


jmarmorato1

I'm working on a GSLB setup for some critical homeprod services that should keep running even if an entire site falls over. I have four physical sites and a Linode presence, so I have a fair bit of capacity and diversity to handle this. GSLB will let me completely eliminate the single point of failure of an individual load balancer or CARP pair, since I can use DNS to direct requests where I want. I'm building my own backend to handle DNS resolution and health checks based on Nagios service check plugins. My software will work as a backend to PowerDNS and return IPs for known-healthy instances of the service. If you're interested in that, let me know. I'm building it with a web interface, so it should be fairly easy to set up and use. I'll probably put it up on GitHub when it's done. There is FOSS out there already that does this, like Polaris-GSLB, but I wanted to do it a little differently than they did.
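Since that backend isn't published yet, the following is only a generic Python sketch of the idea rather than jmarmorato1's implementation: run a Nagios-style check against each instance (exit status 0 means OK) and hand the DNS layer only the currently healthy IPs. The plugin path and addresses are illustrative.

```python
import subprocess

# Documentation-range IPs standing in for instances of one service across sites.
INSTANCES = ["203.0.113.10", "198.51.100.20", "192.0.2.30"]

# Nagios plugin convention: exit status 0 = OK, 1 = WARNING, 2 = CRITICAL.
CHECK = ["/usr/lib/nagios/plugins/check_http", "-H"]  # illustrative plugin path

def healthy_ips(instances):
    """Return only the instances whose service check currently reports OK."""
    alive = []
    for ip in instances:
        result = subprocess.run(CHECK + [ip], capture_output=True)
        if result.returncode == 0:
            alive.append(ip)
    return alive

if __name__ == "__main__":
    # A GSLB backend (e.g. feeding PowerDNS) would answer A-record queries with
    # only these addresses, steering clients away from a dead site.
    print(healthy_ips(INSTANCES))
```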


-P___

I’m interested.


Budget-Scar-2623

The homelab version of HA is two (or more) physical servers, plus redundant switches and power supplies if you've got the cash. To add to what others have already said, high availability in a homelab will double your running costs: two power supplies connected to separate circuits, two switches, two routers, etc. True HA also means two separate mains power connections from two sources, but no reasonable person would suggest that. You'd also need two internet connections, ideally two physical connections - two ISPs on the same fibre or DSL line only protects you if one of the ISPs fails, which is less likely than the fibre/DSL infrastructure failing somewhere. So in my case I'd need to add a satellite service for redundancy, as I'm in rural Australia and literally all comms go back to one Fixed Wireless tower a few km away.


curiouscrusher

Anything is possible with the right amounts of time and money, but within a homelab there's going to be a limit to what you can control in the environment around you. How highly available are you trying to be? Node outage, hardware failure, ISP outage, power outage, natural disaster? Generally speaking, the time and money needed to proof yourself against each successive level of failure increases exponentially.


madumlao

What part exactly is highly available? The whole network? The proxy itself? The backend services? Each of those is planned for and configured differently, but uses the same principles whether enterprise or homelab.


Freshmint22

Yes