You will probably have been affected by the Facebook, Instagram, and WhatsApp outage, but what can you do to make sure a failure of your own site doesn’t lead to a full-blown crisis for your firm?
Cover off the basics
It seems that Facebook misconfigured the systems that announce its network's location on the Internet. This is a bit like removing your number from the phone book and not telling anyone. It falls under a category of tasks, like ensuring your website address is registered, that are small, annual or biannual, and can fall between the cracks or be taken for granted.
Make sure you know your regular tasks and test plans, who is responsible for each one, when it will be carried out and, most importantly, that you receive advance notification before it is done.
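As a rough illustration of the point above, a task register with owners, due dates and notice periods can be kept in something as simple as a short script. The sketch below is a minimal example only: the task names, owners, dates and notice periods are hypothetical placeholders, not recommendations, and a real register would more likely live in a calendar or ticketing tool.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class RecurringTask:
    name: str
    owner: str
    due: date
    notice_days: int  # how far ahead of the due date a reminder should go out


def tasks_needing_notice(tasks, today):
    """Return tasks whose due date falls inside their notice window, soonest first."""
    upcoming = [
        t for t in tasks
        if today <= t.due <= today + timedelta(days=t.notice_days)
    ]
    return sorted(upcoming, key=lambda t: t.due)


def overdue_tasks(tasks, today):
    """Return tasks whose due date has already passed."""
    return [t for t in tasks if t.due < today]


# Hypothetical register entries, for illustration only.
TASKS = [
    RecurringTask("Renew domain registration", "IT lead", date(2025, 3, 1), 60),
    RecurringTask("Renew TLS certificate", "Web team", date(2025, 1, 20), 30),
    RecurringTask("Run disaster recovery test", "Operations", date(2025, 9, 15), 14),
]

if __name__ == "__main__":
    today = date.today()
    for task in tasks_needing_notice(TASKS, today):
        days_left = (task.due - today).days
        print(f"{task.due}: {task.name} ({task.owner}), due in {days_left} days")
    for task in overdue_tasks(TASKS, today):
        print(f"OVERDUE since {task.due}: {task.name} ({task.owner})")
```

Run on a schedule, something like this gives each owner and the wider business sight of a task before it is due, rather than after it has been missed.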
Plan for the worst
In this case, it looks like the teams responsible for fixing the issue fell victim to a Catch-22, where those who could fix it couldn't access the servers, and those who could access the servers couldn't fix it.
Make sure you have a disaster recovery and business continuity plan in place and that your people know what their roles are. Have a clear sense of the communications you will send and the channels you will use, and mitigate the risk of damage by acting clearly and decisively.
Not all crises are due to sudden developments. This one certainly looks like it could have been avoided, but there are also external factors that you can pre-empt. Two areas of particular interest are legislation and cyber attack. The first is seen in failures to meet GDPR, PECR, and accessibility compliance standards, and in the fines that follow. Cyber attack, whether ransomware or data theft, requires constant vigilance and proactive security measures.
Make sure that you are kept apprised of these two areas and keep them on the risk register as standing items, so there is proper organisational oversight.
Your site may range from purely informational to housing key transactions or services, but in all cases, losing your site due to human or technical error, or cyber attack, is going to impact brand reputation, customer trust, and stock value. Yesterday’s news proves that hindsight is a great teacher, but there are lessons we can all learn to ensure that our sites remain running tomorrow:
A schedule for regular infrastructure, security and maintenance tasks
A human-readable test plan that the entire business can understand
Clear Disaster Recovery and Business Continuity plans for site outages
An Emergency Communications plan that acknowledges some of the regular channels will be unavailable
Staff who know their role and responsibility in a crisis, and can flag any potential issues
A Risk Register that tracks longer-term, persistent issues