Last week we got a call from a somewhat frantic customer (I am not going to say who they are), asking us to assess whether they had been hacked. After a little investigating, we decided that the answer was no, but so began a story that lasted four days as we tried, in vain, to get their vendor to help them.
What is the single point of failure? DNS.
Years ago companies used to run their own DNS servers and put them on their network. But there is a problem with that. If their server or network goes down, so does their DNS and that means that email will start bouncing and their web sites will go dark.
So, what many organizations have done for years is outsource DNS to companies like Network Solutions, GoDaddy, Cloudflare, Amazon and others.
And that is where this story begins. Please be patient.
In this case, their domain registrar was Network Solutions, but their DNS vendor was GoDaddy. While this is slightly unusual, it is not that out of the ordinary and there is no reason why it should not work.
The first clue that they had a problem was when a key business partner told them that emails to them were bouncing. After some investigating, the partner told them that there was a problem with their DNS Security records, or DNSSEC. DNSSEC is a way to digitally sign your DNS records so that they can’t be easily forged or hijacked by an attacker. The flip side is that a resolver that validates those signatures will reject records whose signatures do not check out, which is why the partner’s mail system could not find the client.
Our customer went to GoDaddy, which found that, for about two weeks, this client’s DNSSEC records had been signed with an invalid key. This should be a very quick fix, but not for GoDaddy. They claimed that they fixed that problem, but what we soon discovered is that, in the process, they somehow corrupted the DNS “zone file” and the zone was no longer providing DNS information to the rest of the Internet.
The client tried to escalate this within GoDaddy, but for a day they refused to admit they were the cause of the problem, instead blaming Network Solutions, which was completely wrong.
What is really infuriating is that the fix only takes a minute, assuming you can talk to a competent tech: delete the corrupted DNS zone file and recreate it. Problem solved. Instead, the only thing they were willing to do was open a ticket, which they said could take three days to resolve.
In the meantime, the client’s website was dark and their email was bouncing. Anything else that was DNS-based was broken as well.
At one point the client asked to speak to the tech’s supervisor and was told that the tech didn’t have a supervisor.
Eventually the client did get to speak to a supervisor, but the supervisor claimed that there was no way to prioritize the ticket in the system, even though the client was completely, 100% down.
To make a long story short, the customer moved to a different DNS provider, manually recreated the DNS zone file and was back on the air in minutes. At this point we still have no resolution from GoDaddy.
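Recreating a zone by hand goes much faster if you already have your own copy of the records. Here is a minimal sketch of that idea, assuming you keep the records in a simple list and want them written out in the standard zone file text format that most DNS providers can import. The function name, the domain and the addresses are made up for illustration; they are not this client's data.

```python
def render_zone(origin, records, default_ttl=3600):
    """Render (name, type, value) records as zone-file text.

    This is a simplified sketch: a complete zone also needs SOA and NS
    records, which the new DNS provider normally supplies on import.
    """
    origin = origin.rstrip(".")
    lines = [f"$ORIGIN {origin}.", f"$TTL {default_ttl}"]
    for name, rtype, value in records:
        # One resource record per line: name, class, type, data.
        lines.append(f"{name}\tIN\t{rtype}\t{value}")
    return "\n".join(lines) + "\n"


# Hypothetical records for a placeholder domain (documentation addresses).
EXAMPLE_RECORDS = [
    ("@", "A", "192.0.2.10"),
    ("www", "CNAME", "example.com."),
    ("@", "MX", "10 mail.example.com."),
    ("mail", "A", "192.0.2.25"),
]
```

Calling `render_zone("example.com", EXAMPLE_RECORDS)` returns text you can save somewhere that does not depend on your DNS vendor, so that a move to a new provider is an import rather than an archaeology project.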
If you want more details, reach out to me.
There is a point here.
For almost all organizations, DNS is both critical and a single point of failure.
Equally important, since it is a service that is outsourced to a cloud vendor, users have very little control over fixing it. They are totally dependent on their vendor.
In this case GoDaddy completely failed this customer and has shown no indication that they accept responsibility.
Hopefully there will be no long-term effects for this customer from the four-day outage, during which they were effectively sent back to the digital stone age, but there certainly was a lot of stress.
Every organization needs to consider how they will handle a situation like this. In part, it is a function of how critical their web sites and email are to their business.
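Part of handling a situation like this is finding out quickly. In this story the first clue came from a business partner, not from the client's own monitoring. Below is a minimal sketch of a check you could run on a schedule, assuming a list of hostnames that matter to you. The function name and the hostnames in the usage note are hypothetical. Keep in mind that it asks whatever resolver the machine is configured to use, so cached answers or a resolver that does not validate DNSSEC can hide a problem that other people are seeing.

```python
import socket


def check_names(names, resolve=socket.getaddrinfo):
    """Return a dict mapping each hostname to True if it resolves, else False.

    `resolve` is injectable so the check can be exercised without touching
    the network; by default it uses the system resolver.
    """
    results = {}
    for name in names:
        try:
            resolve(name, None)
            results[name] = True
        except OSError:
            # socket.gaierror (lookup failure) is a subclass of OSError.
            results[name] = False
    return results
```

For example, `check_names(["www.example.com", "mail.example.com"])` could feed an alert whenever any value comes back `False`. Running it from a network outside your own gives a view closer to what partners and customers see.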
There are ways to make DNS very resilient, but they require work and some extra cost on the part of the company, so it becomes a business risk decision.
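One such approach, and I am picking it as an example rather than claiming it is the only one, is to have your domain served by nameservers from two independent DNS providers, so that one vendor's mistake does not take you completely offline. Doing this together with DNSSEC takes extra coordination between the providers. The sketch below is a rough way to check a list of nameserver hostnames for that kind of diversity. The provider names are placeholders, and grouping by the last two labels of the hostname is a naive assumption: it misjudges names under suffixes like `co.uk`, and some providers spread their nameservers across several domains.

```python
def providers(nameservers):
    """Group nameserver hostnames by their last two labels (naive heuristic)."""
    groups = set()
    for ns in nameservers:
        labels = ns.rstrip(".").lower().split(".")
        groups.add(".".join(labels[-2:]))
    return groups


def has_provider_diversity(nameservers):
    """True if the nameservers appear to belong to at least two providers."""
    return len(providers(nameservers)) >= 2


# Hypothetical nameserver sets for illustration.
SINGLE_VENDOR = ["ns1.provider-a.example.", "ns2.provider-a.example."]
TWO_VENDORS = ["ns1.provider-a.example.", "ns1.provider-b.example."]
```

With these placeholders, `has_provider_diversity(SINGLE_VENDOR)` is `False` and `has_provider_diversity(TWO_VENDORS)` is `True`. Treat the result as a prompt to look closer, not as proof either way.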
For us, GoDaddy is now on our list of not recommended vendors for our customers. While this was an unusual situation, the way they handled it was, in our opinion, unacceptable.
Feel free to contact us for more information and for assistance.
And feel free to share your experiences with DNS failures. I know that a lot of people use GoDaddy and have not had bad experiences, but where the rubber meets the road is how the vendor performs when you have a critical outage.