UPDATE: The Sun, not always the most reliable information source, is saying the outage and its trickle-down effects affected 300,000 passengers and may cost the airline $300+ million. The CEO, Alex Cruz, when warned earlier about the new system installed last fall, allegedly said that it was the staff’s fault, not the system’s, that things were not working as desired. Cruz, trying to contain the damage, told staff in an email to stop talking about what happened. Others have said that the people at Tata did not have the skills to start up and run the backup system – certainly not the first time that replacing on-shore resources with much lower-paid off-shore resources has produced a bumpy situation, especially when those resources have zero history in the care and feeding of that particular, very complex system. Even if the folks at Tata were experienced at operating some complex computer system, no two systems are the same, and there is so much chewing gum and baling wire holding systems together in the airline industry that, without legacy knowledge of that particular system, likely no one could make it work right.
Of all of the weekends for an airline to have a computer systems meltdown, Memorial Day weekend is probably not the one that you would pick.
Unfortunately for British Airways, they didn’t get to “pick” when the event happened.
Early Saturday, British Airways had a systems meltdown. This really was a meltdown: the website and mobile apps stopped working, passengers could not check in, and employees could not manage flights, among other things.
Passengers at London’s two largest airports – Heathrow and Gatwick – were not getting any information from the staff. Likely this was due to the fact that the systems that the staff normally used to get information were not working.
Initially, BA cancelled all flights out of London until 6 PM on Saturday, but later extended the cancellation to the entire day.
Estimates are that 1,000 flights were cancelled.
Given that this was a holiday weekend, likely every flight was full. If you conservatively assume 100 passengers per flight, cancelling 1,000 flights affected 100,000 passengers. And since the flights were all full, even if BA wanted to rebook people, there probably weren’t available seats during the next couple of days. That means a lot of these passengers are going to have to cancel their trips. Since the airline couldn’t blame the weather or other natural disasters, it will likely have to refund passengers their money. This doesn’t mean giving people credit towards a future trip, but rather writing them a check.
In Britain, airlines are required to pay penalties of up to 600 Euros per passenger, depending on the length of the delay and the length of the flight.
In addition they are required to pay for food and drinks and pay for accommodations if the delay is overnight – and potentially multiple nights.
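The compensation rules referenced above come from EU Regulation 261/2004 (which applied in the UK at the time). As a rough, illustrative sketch – the real regulation also conditions on delay length at arrival, re-routing offers, and "extraordinary circumstances," all omitted here, and the even long/medium/short-haul split is purely a hypothetical assumption – the tiers and the post’s conservative passenger estimate can be combined into a back-of-envelope exposure figure:

```python
def compensation_eur(distance_km: float) -> int:
    """Simplified EU 261/2004 per-passenger compensation by flight
    distance (delay and re-routing conditions deliberately omitted)."""
    if distance_km <= 1500:
        return 250
    if distance_km <= 3500:
        return 400
    return 600

# Back-of-envelope exposure: 1,000 cancelled flights x 100 passengers
# (the post's conservative estimate), hypothetically split evenly
# across short-, medium-, and long-haul routes.
passengers = 1_000 * 100
avg_comp = (compensation_eur(1_000) + compensation_eur(2_500)
            + compensation_eur(5_500)) / 3
print(f"~EUR {passengers * avg_comp / 1e6:.0f} million")  # ~EUR 42 million
```

Even this crude estimate lands in the tens of millions of euros from statutory compensation alone, before refunds, hotels, meals, and lost goodwill – consistent with the $100–200 million estimates below.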
Of course there are IT people working around the clock trying to apply enough Band-Aids to get traffic moving again.
Estimates so far are that this could cost the airline $100 million or more; another estimate says close to $200 million. Hopefully they have insurance for this, but carrying $200 million in business interruption (BI) coverage is unlikely, and many BI policies have a waiting period – say, 12 hours – before the policy kicks in.
But besides this being an interesting story – assuming you were not travelling in, out or through London this weekend – there is another side of the story.
First, one of the unions blamed BA’s decision to outsource IT to a firm in India (Tata). BA said that was not the problem. It is true that BA has been trying to reduce costs in order to compete with low-cost carriers, so who knows. In any case, when you outsource, you really do need to make sure that you understand the risks, and that is true whether the outsourcer is local or across the globe. We may hear in the future what happened, but, due to lawsuits, we may only hear about it inside a courtroom.
Apparently, the disaster recovery systems didn’t come online after the failure as they should have. Whether that was due to cost reduction and its associated secondary effects, we may never know.
More importantly, it is certainly clear that British Airways’ disaster recovery and business continuity plans were not prepared for an event like this.
At one point, the CEO of BA was forced to say, in the public media, that people should stay away from the airport. Don’t come. Stay home. From a branding standpoint, it doesn’t get much worse than that. Fly BA – please stay home.
As part of the disaster recovery plan, you need to consider contingencies. In the case of an airline, that includes how, when you cancel flights, you get bags back to your customers. Today, two days later, people are saying that they still don’t have their luggage and that they can’t get BA to answer the phones. BA is now saying that it could be “quite a while” before people get their luggage back – and if they don’t get it back, that is more cost for BA to cover.
One has to assume that the outcome of all of this will be a lot of lawsuits.
From a branding standpoint, this has got to be pretty ugly. You know that there has been a lot of social media chatter about the horror stories. In one article that I read, a passenger on a trip from London to New York talked about all the money they were going to lose on things they had planned to do when they got to New York. Whether BA is going to have to pay for all of that is unclear, but likely at least some of it.
You also have to assume that at least some passengers will book their next flight on “any airline, as long as it is not BA”.
To be fair to BA, there have been other large airline IT systems failures in the last year, but this one is a biggie. Likely these failures are, at least in part, due to the complex web of automation that the airlines have cobbled together after years of cost cutting and mergers. Many of these systems are so old that the people who wrote them are long dead, and the computer languages – notably COBOL – are considered dead languages.
The fact that there were no plans (at least none that worked) for how to deal with this – how to manage tens of thousands of tired, hungry, grumpy passengers – is an indication of how much work they have to do.
But bringing this home: what would happen to your company if the computers stopped working and it took you a couple of days to recover? I know that in retail, where all the cash registers are computerized and nothing has a price tag on it any more, an outage forces businesses to close the store. We saw a bigger version of that at the Colorado Mills Mall in Golden earlier this month. In that case, likely a number of businesses will fail, and people will lose their jobs and their livelihoods.
My suggestion is to get people together, think about likely and not so likely events and see how well prepared your company is to deal with each of them. Food for thought.
Information for this post came from the Guardian here and here, The Next Web and Reuters.