What is Recovery Testing? An Introduction

Software undergoes extensive testing to prevent failures in production, but failures still occasionally happen. Recovery testing examines what happens when they do: how the system fails and how it recovers.

Recovery testing intentionally causes the system to fail in order to determine whether it can recover, how long recovery takes, and whether the crash causes data loss or other damage.

Software is used in almost all sectors. In some places, its purpose is to reduce work effort, while in others, it supports mission-critical tasks, as in the defense industry, healthcare, and banking. For example, digitization in healthcare has made it easy to store and retrieve patient information and to deliver test results instantly. Likewise, automation implemented through software enables the banking industry to be available to its customers around the clock.

These sectors have seen numerous benefits from implementing software applications. However, it becomes a serious concern for these industries when a system is down for an extended period or does not recover fast enough.

Is it challenging to perform Recovery Testing?

The nature and scope of this kind of testing can present specific challenges to testers, and it requires adequate planning before execution.

Extensive Knowledge and Preparedness

Unlike functional testing, where testers review functional requirements and design straightforward test cases, recovery testing requires determining the points of failure. That demands a good knowledge of how the product integrates and works, as well as an understanding of the system architecture.
Before performing recovery testing, the tester needs to study the system, communicate with the different teams, determine the scope of testing, and then come up with a plan.
Significant effort is also needed to learn the common errors and how to handle each error scenario.

Time Consuming

Recovery testing involves a lot of study, analysis, and test setup, which makes the process very time-consuming.

Expensive Process

Recovery testing requires several automation tools, a proper test environment, and a data backup mechanism. Together these make for a costly setup and an expensive process overall.

What to test?

Deciding which tests to include in recovery testing is difficult. It is not feasible to test every possible combination of potential failures, and a test environment can never fully replicate a real production environment.
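
To get a feel for why exhaustive coverage is out of reach, the following minimal sketch counts the combinations of failures that could occur together. The failure names and the function name are illustrative assumptions, not a list taken from any particular system.

```python
from itertools import combinations


def failure_combinations(failure_types):
    """Return every non-empty combination of the given failure types."""
    combos = []
    for size in range(1, len(failure_types) + 1):
        combos.extend(combinations(failure_types, size))
    return combos


# Hypothetical failure types, chosen only for illustration.
types = ["hardware", "network", "power", "disk full"]
print(len(failure_combinations(types)))  # 15
```

With n independent failure types there are 2^n - 1 non-empty combinations, so ten types already give 1,023 scenarios before timing or ordering is considered. In practice, the test plan has to prioritize the failures that matter most.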

Some critical factors to look at during a recovery test include:

Data loss – A crucial factor to test during recovery testing is the potential for data loss due to failures. Whenever a crash affects data, the first thing to verify is that the data can be restored from backup.
Impact time – Recovery testing also shows how long a system is down after a failure and whether it becomes operational again on its own or needs intervention.
Identify the cause of failure – A system can fail due to hardware issues, network issues, power failures, and numerous other reasons. Knowing what caused the failure is an essential output of any recovery test. Depending on the finding, the response can include replacing hardware, improving the network connection, or establishing a new backup mechanism.
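
The three factors above can be captured in a single test report. The sketch below is a toy simulation only: `ToyStore`, its methods, and `run_recovery_test` are hypothetical names invented for illustration, and a real test would crash and restore an actual service rather than an in-memory dictionary.

```python
import time
from dataclasses import dataclass, field


@dataclass
class ToyStore:
    """Hypothetical in-memory store with a snapshot-style backup."""
    data: dict = field(default_factory=dict)
    backup: dict = field(default_factory=dict)
    up: bool = True

    def write(self, key, value):
        if not self.up:
            raise RuntimeError("store is down")
        self.data[key] = value

    def take_backup(self):
        self.backup = dict(self.data)

    def crash(self):
        # Simulated failure: live data is lost and the service goes down.
        self.data = {}
        self.up = False

    def recover(self):
        # Restore from the most recent backup and bring the service back.
        self.data = dict(self.backup)
        self.up = True


def run_recovery_test(store, cause="simulated power failure"):
    """Crash the store, recover it, and report loss, downtime, and cause."""
    expected = dict(store.data)  # state just before the failure
    store.crash()
    started = time.monotonic()
    store.recover()
    downtime = time.monotonic() - started
    lost = sorted(k for k in expected if store.data.get(k) != expected[k])
    return {
        "cause": cause,
        "recovered": store.up,
        "downtime_seconds": downtime,
        "lost_keys": lost,
    }


store = ToyStore()
store.write("a", 1)
store.take_backup()
store.write("b", 2)  # written after the backup, so not protected by it
report = run_recovery_test(store)
print(report["lost_keys"])  # ['b']
```

The downtime measured here is close to zero because nothing real is being restarted; the point is the shape of the report, which records data loss, impact time, and the cause of failure side by side.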

How to do the testing?

As mentioned above, recovery testing requires a fair bit of preparation. Anyone involved needs to learn about the data, its accessibility, and the overall restore procedure. In addition, the assigned testers need to review the acceptance criteria and make themselves aware of all the critical failures in the system. Establishing a proper test environment is another prerequisite for a successful recovery test. It is advisable to run as many tests as possible on real devices using recovery testing tools, because manual testing alone cannot provide comprehensive coverage. Lastly, during testing, it is essential to document each step so that the record can support analysis when the system encounters a problem.
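
Documenting each step can be as simple as appending timestamped entries to a log as the test runs. The following is a minimal sketch, assuming a hypothetical `StepLog` helper; the step names and observations are made up for illustration.

```python
import json
from datetime import datetime, timezone


class StepLog:
    """Hypothetical helper that records each test step with a UTC timestamp."""

    def __init__(self):
        self.entries = []

    def record(self, step, observation):
        self.entries.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "step": step,
            "observation": observation,
        })

    def to_json_lines(self):
        # One JSON object per line, which is easy to append to a file and search.
        return "\n".join(json.dumps(entry) for entry in self.entries)


log = StepLog()
log.record("take backup", "backup completed")
log.record("cut power to node", "service stopped responding")
log.record("restore from backup", "service reachable again")
print(log.to_json_lines())
```

Keeping the log in a structured format makes it easier to reconstruct the sequence of events afterwards and to attach it to a report.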

What to do after finding a problem?

When testers discover issues, they report them as bugs. However, recovery testing requires more than just reporting bugs. Recovery testers need to document every step and observation in detail, then submit a feedback report. The team then discusses the findings to define the list of acceptable risks and recommendations for improvement. The problems identified during the tests are not usually straightforward to resolve. Some might require additional setup, cost, and resources, and may need further analysis and implementation over time.

Conclusion

Even though recovery testing is costly and requires a lot of resources, it is highly beneficial in establishing preparedness. In addition, it helps with planning what steps to take when a failure occurs. Overall, recovery testing helps ensure that operations can continue without data loss and with the least amount of downtime.