

Powered by CompTIA


Testing is Key Part of IT Disaster-Recovery Plans

This article is the second in a two-part series about disaster recovery.

Having a disaster-recovery plan in place is a good start for businesses small and large, but the plan is worthless if its effectiveness is never tested. Consider this scenario: a hurricane is bearing down on your business’s data center. Your information technology (IT) team has spent countless hours crafting a disaster-recovery plan but never got around to testing it. What happens when a real-life disaster strikes? It’s either sink or swim.

Pure Survival
While having a disaster-recovery plan is a matter of pure survival, Lee Walkky, services practice manager at Irvine, Calif.-based Vision Solutions Inc., said testing that plan is a matter of common sense. Testing, he said, assures “peace of mind that your plans and solution meet the stated requirements of the business. You hope to never use it, but it is good to know you can if necessary. You do not want your first ‘test’ to be a live disaster. I have seen that before and the results are never good. The ones who survive disasters are the ones who can execute with no uncertainty. The ones who do not survive are the ones who panic because they are unsure of what they are doing.”

As services practice manager at one of the world’s leading providers of high-availability, disaster-recovery and data-management solutions for the IBM Power Systems market, Walkky is responsible for a team of solution architects, solution engineers and project managers whose mission is to design and implement high-availability and disaster-recovery solutions for customers. Failing to test the disaster plans his team develops for clients would be irresponsible.

A disaster-recovery plan can be complex, which may intimidate some companies when it comes to testing. Having a defined solution process for recovery is one thing, but being able to test it and test it sufficiently is another. “Let’s say you have an offsite recovery location where the plan is to recover from tape,” Walkky said. “Rebuilding the environment is critical. However, many times connectivity to the outside world cannot be fully tested since it may be impossible to put those servers on a live network. Even the best-designed plans and tested plans may never be 100-percent tested in some cases. In addition, you have to make time and resources available for the testing. That is why it can be so complex.

“But regardless of the complexity or thoroughness of any testing, it is better than having nothing at all.”

Test And Test Again
It is not sufficient to test once and assume all will run smoothly. Organizations change almost constantly. New technologies are always being introduced, more threatening viruses are being created, and even employee turnover in an IT department can be a reason to conduct disaster-recovery testing.

The IT department at Bristol West Holdings Inc., a leading provider of liability and physical damage insurance focusing exclusively on private passenger automobiles across the United States, is familiar with the importance of testing its disaster-recovery plan. With its main data center, located in south Florida, susceptible to hurricanes, the IT department tests the disaster-recovery plan—moving production from its Davie, Fla., headquarters to the secondary data center in Independence, Ohio—often and thoroughly.

Ralph Czekalinski, director of technical services at Bristol West, said the company is always looking for ways to improve the efficiency of its disaster-recovery plan. “When we complete a disaster-recovery test, all issues are reviewed and prioritized so that they can be resolved as quickly as possible,” he said. “(The issues are) definitely corrected prior to the next test. This issue list also is the source to look at for new technologies to improve the overall plan.”

This methodical disaster testing proved priceless when Hurricane Wilma hit Florida in 2005. The Davie location was in the hurricane’s direct path. “There was little warning and almost immediate loss of communication at our primary site,” Czekalinski said. “However, because of our testing, the disaster-recovery location was brought up and was providing services to our customers within five hours. Disaster recovery is important to the business, but it is more important to your customers.”

Bristol West tests its plan twice a year. During the most recent test in May, the company moved all production applications from the Davie location to the Independence data center, where they ran for 24 hours before being switched back to Davie. The test was a success: while production ran from Independence, the company booked 654 new policies, representing more than $500,000 in premiums. Without this testing, the IT department would not know that the disaster-recovery plan has more than 750 tasks and that each switch takes just under four hours. These are important numbers to have when faced with an actual disaster. Although the test itself is imperative, the review process after the test is just as important. Czekalinski said the team will spend weeks reviewing the results to refine the process.
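For readers who want to capture the same kind of numbers from their own tests, here is a minimal sketch in Python. It is not Bristol West's tooling: the task names, durations and issue note are invented for illustration, and it assumes tasks run one after another so that the switch time is simply the sum of the task durations.

```python
from dataclasses import dataclass


@dataclass
class TaskResult:
    name: str        # hypothetical task label from the runbook
    minutes: float   # duration measured during the test
    issue: str = ""  # left empty when the task ran cleanly


def summarize(results):
    """Return the task count, total hours and tasks that raised issues."""
    total_hours = sum(r.minutes for r in results) / 60
    issues = [r for r in results if r.issue]
    return {"tasks": len(results), "hours": total_hours, "issues": issues}


# Illustrative data only; a real plan may have hundreds of tasks.
run = [
    TaskResult("redirect network traffic", 12),
    TaskResult("restore database", 95, issue="restore needed a retry"),
    TaskResult("start application servers", 40),
]

summary = summarize(run)
print(f"{summary['tasks']} tasks, {summary['hours']:.2f} hours")
for r in summary["issues"]:
    print(f"review before next test: {r.name} ({r.issue})")
```

The issue list produced this way could feed the post-test review, where each item is prioritized and resolved before the next test.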

Like Insurance
Fortunately, disasters do not strike regularly. But, as Walkky said, having a good, thoroughly tested disaster-recovery plan is like insurance. “You hope to never use it, but if you have to, you will know what to do.”