I'm not sure, I guess it depends what you're looking for.
I'm a hobbyist, with no real professional experience in server admin, so I am probably missing some important things.
But most of this can be replicated with ZFS and FreeBSD jails (or, on Linux, Btrfs and LXC containers).
>> 1. A solution for unlimited scheduled snapshots without affecting performance.
You can very comfortably have instant and virtually unlimited snapshots with zfs/jails (they only occupy space when files change). This is very easy to automate with cron and a shell script.
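To make the "cron and a script" part concrete, here is a rough sketch in Python rather than shell. The dataset name, the `auto-` prefix, and the retention count are all made up for illustration; only the `zfs snapshot`, `zfs list`, and `zfs destroy` subcommands are real. Treat it as one way to do it, not a tested backup tool.

```python
import shutil
import subprocess
from datetime import datetime, timezone

DATASET = "zroot/jails/www"  # hypothetical dataset name
PREFIX = "auto-"             # hypothetical prefix marking scripted snapshots
KEEP = 48                    # hypothetical number of snapshots to retain


def snapshot_name(dataset, now):
    """Build a name that sorts chronologically, e.g. ...@auto-20240102-0304."""
    return f"{dataset}@{PREFIX}{now:%Y%m%d-%H%M}"


def to_prune(existing, dataset, keep):
    """Return the scripted snapshots of `dataset` older than the newest `keep`."""
    auto = sorted(s for s in existing if s.startswith(f"{dataset}@{PREFIX}"))
    return auto[:-keep] if keep > 0 else auto


def main():
    now = datetime.now(timezone.utc)
    subprocess.run(["zfs", "snapshot", snapshot_name(DATASET, now)], check=True)
    listing = subprocess.run(
        ["zfs", "list", "-H", "-t", "snapshot", "-o", "name", "-r", DATASET],
        check=True, capture_output=True, text=True,
    ).stdout.split()
    for snap in to_prune(listing, DATASET, KEEP):
        subprocess.run(["zfs", "destroy", snap], check=True)


# Only touch anything on a machine that actually has the zfs tool installed.
if __name__ == "__main__" and shutil.which("zfs"):
    main()
```

A crontab entry pointing at this file once an hour would then keep roughly two days of snapshots with the values above. Manually named snapshots are never pruned, because only names carrying the prefix are considered.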
>> 2. API access for managing configuration, version updates/rollbacks, and ACL.
>> 3. Close to immediate replacement of identical setup within seconds of failure.
There are a lot of choices for configuration management (SaltStack, Chef, Ansible, ...).
I run a shell script in a cron job that takes a temporary snapshot of the jail's
filesystem, copies it to a directory, and makes an off-site backup.
A rollback is as simple as stopping the server, renaming a directory, and restarting it.
It's probably more than a couple of seconds, but not by much.
I think I'd be uncomfortable exposing an API with root access to my systems to the internet, but I'm not sure how these systems work.
I don't think it would be hard to set one up with Flask if you wanted it, though.
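For what it's worth, the stop / rename / restart rollback I described could look something like the sketch below. The function name, the paths, and the stop/start commands are placeholders I picked for the example, and it assumes the live directory and the backup copy sit on the same filesystem so the renames are cheap.

```python
import os
import subprocess
import time


def rollback(live, backup, stop_cmd=None, start_cmd=None):
    """Swap `backup` into the place of `live`, keeping the old tree aside."""
    live = os.path.normpath(live)
    backup = os.path.normpath(backup)
    if not os.path.isdir(backup):
        raise FileNotFoundError(backup)
    if stop_cmd:
        subprocess.run(stop_cmd, check=True)  # stop the server first
    aside = f"{live}.broken-{int(time.time())}"
    os.rename(live, aside)   # keep the failed state around for inspection
    os.rename(backup, live)  # the backup copy becomes the live directory
    if start_cmd:
        subprocess.run(start_cmd, check=True)  # bring the server back up
    return aside
```

On FreeBSD the call might be something like `rollback("/jails/www", "/jails/www.backup", ["service", "jail", "stop", "www"], ["service", "jail", "start", "www"])`, with the jail name and paths being hypothetical. Moving the broken tree aside instead of deleting it means you can still look at what went wrong afterwards.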
>> 4. No underlying OS management.
I don't know what this is, but I'm curious and looking it up :D.
In most of the posts I'm reading here, people have really beefy rigs.
But you could do this on the cheap with a 2000s-era laptop if you wanted (that was my first server).
> You can very comfortably have instant and virtually unlimited snapshots with zfs/jails
Yes, but that both requires scripting it manually and remains local to the server. Compare that to scheduled RDS backups, which go to S3 with all its consistency guarantees.
> There are a lot of choices for configuration management (SaltStack, Chef, Ansible, ...)
Sure, those are an improvement over doing things manually. But for recovery they can only do so much. Basically, think about how fast you can restore service if your rack goes up in flames.
> I don't know what this is, but I'm curious and looking it up :D
It means: who deals with kernel, SSL, storage, etc. updates? Who updates the firmware? Who monitors SMART alerts? How much time do you spend on that machine that is not 100% related to the database's behaviour?
I wasn't recommending everyone use RDS. If your use case is OK with laptop-level reliability, go for it! You simply can't compare the cost of RDS to the monthly cost of a colo server; they're massively different things.
Thank you very much for this reply! Those are all very good points. You’re right, this is a service, and not having to worry about hardware or scripts at all is valuable when you have a million other things to manage.