Ask HN: Is DNS Failover a Problem?

I've been in IT for quite a while, mostly in a single shop, and I'm exploring the idea of a small SaaS product for automatic DNS failover. Before diving in, I’d love to gauge how much demand there might be for something like this.

The service would monitor the status of your primary and backup endpoints. If the primary fails, it would automatically update the DNS record via API to point to a backup. It would also send alerts and continue monitoring backup endpoints. Naturally, API credentials would be encrypted for security.
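
To make that concrete, here is a rough sketch of the monitoring loop in Python. It is only an illustration under assumptions, not a finished design: `Endpoint`, `FailoverMonitor`, `http_check`, and the `update_dns` and `alert` callbacks are hypothetical names, and the real DNS provider call is deliberately left as a pluggable function because every provider's API differs.

```python
import http.client
import urllib.request
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Endpoint:
    name: str
    ip: str
    health_url: str


def http_check(endpoint: Endpoint, timeout: float = 5.0) -> bool:
    """Return True if the endpoint answers its health URL with a 2xx status."""
    try:
        with urllib.request.urlopen(endpoint.health_url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (OSError, ValueError, http.client.HTTPException):
        return False


class FailoverMonitor:
    """Hypothetical monitor: repoints DNS after repeated failures of the active endpoint."""

    def __init__(
        self,
        primary: Endpoint,
        backups: List[Endpoint],
        update_dns: Callable[[str], None],  # assumed wrapper around a provider API
        alert: Callable[[str], None],       # assumed notification hook
        check: Callable[[Endpoint], bool] = http_check,
        failures_needed: int = 3,
    ) -> None:
        self.primary = primary
        self.backups = backups
        self.update_dns = update_dns
        self.alert = alert
        self.check = check
        self.failures_needed = failures_needed
        self.active = primary
        self.failures = 0

    def tick(self) -> Endpoint:
        """Run one monitoring round and return the endpoint DNS should point to."""
        endpoints = [self.primary, *self.backups]
        healthy = {e.name: self.check(e) for e in endpoints}

        # Standby endpoints are monitored too, so a dead backup is noticed early.
        for e in endpoints:
            if e is not self.active and not healthy[e.name]:
                self.alert(f"standby endpoint {e.name} is unhealthy")

        if healthy[self.active.name]:
            self.failures = 0
            return self.active

        # Require several consecutive failures before touching DNS.
        self.failures += 1
        if self.failures < self.failures_needed:
            return self.active

        for candidate in endpoints:
            if healthy[candidate.name]:
                self.update_dns(candidate.ip)
                self.alert(f"failed over from {self.active.name} to {candidate.name}")
                self.active = candidate
                self.failures = 0
                return candidate

        self.alert("no healthy endpoint available; DNS left unchanged")
        return self.active
```

Calling `tick()` on a timer would be the whole service in miniature. Requiring several consecutive failures before switching is a guess at a sensible default to avoid flapping on a single dropped check; this sketch also does not fail back to the primary automatically and would repeat the standby alert on every round.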

I’d offer a free tier, with a paid monthly plan for advanced features. While some DNS providers (like Cloudflare and Route 53) already offer similar functionality, others—such as GoDaddy—don’t seem to. I’m thinking this could be useful for smaller MSPs, solo developers, or businesses that need a simple, independent solution.

Does this sound like something people would find useful? I'd appreciate any feedback!

3 points | by kbench 6 hours ago

2 comments

  • mtmail 6 hours ago
    Nice coincidence, I was looking for such a solution last week. We use Cloudflare DNS and their traffic feature. It monitors whether our (redundant) endpoints are up, and if one is down it removes that DNS entry and alerts us. DNS is free, the traffic feature isn't. I was hoping DNSimple, a smaller SaaS provider, would offer this, but they don't. We thought about building a script ourselves, but that would be another piece of our infrastructure that could fail, so it's better to outsource it. Cloudflare works for us, but I think their traffic feature is too expensive for our use case long-term. If I understand correctly, your solution would be independent and would update, for example, Cloudflare DNS via their API.
  • p_ing 6 hours ago
    Solved problem via [geo] load balancers such as Azure Front Door.

    The thing to keep in mind is that you need to assume a given DNS server (server side) doesn't respect TTL. TTL should be considered a value that is ignored unless you operate both the DNS server and the clients. So while you update your TTL, some client in a remote-region-in-the-middle-of-nowhere is leveraging a DNS server that sets all TTL values to 10 days. Now you've got a client that cannot connect.

    • kbench 2 hours ago
      This is a really good point. That's the nature of DNS, of course. I guess my API calls would try to cap the TTL at some maximum to minimize the downtime. A remote region in the middle of nowhere that sets all TTLs to 10 days? Well, that sucks.
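
To put rough numbers on the TTL concern, here is a back-of-the-envelope sketch in Python. It is a simplified model under stated assumptions, not a measurement: the function name and all parameters are made up for illustration, the 30-second check interval, 3 required failures, 5-second API delay, and 60-second TTL are arbitrary example values, and client-side or browser caches are ignored. Only the 10-day figure comes from the comment above.

```python
def worst_case_stale_seconds(
    check_interval: int,
    failures_needed: int,
    api_delay: int,
    record_ttl: int,
    resolver_min_ttl: int = 0,
) -> int:
    """Upper bound on how long a client may keep hitting the dead endpoint.

    Assumes the outage starts right after a successful check and that a
    resolver cached the old answer just before the record was updated.
    """
    # Time until enough consecutive failed checks have accumulated.
    detection = check_interval * failures_needed
    # A resolver that clamps TTLs upward overrides whatever the record says.
    effective_ttl = max(record_ttl, resolver_min_ttl)
    return detection + api_delay + effective_ttl


TEN_DAYS = 10 * 24 * 60 * 60  # 864000 seconds

# Well-behaved resolver that honors a 60 second TTL.
print(worst_case_stale_seconds(30, 3, 5, 60))            # 155
# Resolver that forces every TTL up to 10 days.
print(worst_case_stale_seconds(30, 3, 5, 60, TEN_DAYS))  # 864095
```

With a resolver that honors the record, the bound is dominated by detection time plus the TTL, so a low TTL helps. With a resolver that forces 10 days, the record's own TTL barely matters, which is the case a DNS-only failover cannot fix.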