Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

The RDS replication that stopped working after 5min messing with it, with no alert of failed health check is a bit worrying...


Obviously the devil's in the details, and it's almost impossible to troubleshoot from a screencast, but my experience has been that AWS is generally pretty liberal with the CloudWatch Metrics, but does place the onus upon the user to dig through the 150++ of them to read the docs to find the one that matters. They also claim <https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_...> there's a console table cell for the replication status, but my experience with the console is that often one must opt-in to having that column shown which is suboptimal :-(

That "shared responsibility model," they lean on it heavily


I can assure you, you can not trust any AWS health checks to be a primary alert for something down. You have to do it all yourself, on host, or inside the container.

AWS/Rackspace support just say: "It's your problem as we don't manage what is inside the AWS service".




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: