Run a Test Failover

Run test failovers regularly to check that your disaster recovery protection is configured correctly and that the VMs are replicated correctly at the recovery site.

 

During a test, real-time replication does not need to be stopped and the production workload remains protected. There are two types of failover:

  • Test failover: a clean shutdown (the final sync with the source completes cleanly), followed by failover.
  • Real failover: a non-clean shutdown (everything is dropped and the source is not checked), followed by failover. See Run a Real Failover.

 

It is also recommended that you test all the VPGs being recovered to the same cluster together. For example, the high availability configuration of a cluster includes admission control, which prevents VMs from being started if they would violate availability constraints. Testing the failover of every VPG configured for recovery to this cluster at the same time shows whether those constraints are violated.
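The reason for testing together can be illustrated with a minimal Python sketch. It is not how admission control is actually implemented (real admission control policies are more involved); it only assumes that each VPG has a total CPU and memory demand and that the cluster has a fixed amount of free capacity. The class, function name, and figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Vpg:
    name: str
    cpu_mhz: int    # total CPU demand of the VPG's VMs (hypothetical figure)
    memory_mb: int  # total memory demand of the VPG's VMs (hypothetical figure)


def fits_together(vpgs, free_cpu_mhz, free_memory_mb):
    """Return True if all the given VPGs fit within the free capacity at once."""
    total_cpu = sum(v.cpu_mhz for v in vpgs)
    total_memory = sum(v.memory_mb for v in vpgs)
    return total_cpu <= free_cpu_mhz and total_memory <= free_memory_mb


vpgs = [Vpg("web", 4000, 8192), Vpg("db", 6000, 16384)]

# Each VPG passes when tested on its own...
alone = [fits_together([v], 8000, 20000) for v in vpgs]
# ...but the combined demand exceeds the free capacity.
together = fits_together(vpgs, 8000, 20000)

print(alone, together)
```

In this made-up example each VPG fits when tested separately, but the two do not fit when started at the same time, which is the situation a combined test is meant to reveal.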




The Test Failover Process

The table below describes what happens during a test failover:

 

Stage Description
Start the Test Failover

Single or multiple VPGs can be tested. The test VMs are:

  • Created at the remote site without CD-ROM drives, even if the protected VMs have CD‑ROM drives.
  • Connected to the network specified for testing in the VPG settings.
  • Configured to the checkpoint specified for the recovery, and
  • Powered on, making them available to the user. If applicable, the boot order defined in the VPG settings is used to power on the VMs.

 

Note: By default, test VMs start with the same IP addresses as the protected VMs at the protected site. To avoid clashes, configure the VM NIC properties in the VPG so that different IP addresses are assigned to the test VMs when they start. If the test VMs are configured to receive different IP addresses, the re-IP can only be performed after each machine has started: virtual replication changes the machine's IP address and then reboots the machine with the new address.
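As a rough illustration of the clash described in the note, the following Python sketch compares the addresses planned for the test VMs with those of the protected VMs. It assumes you have already collected both sets of addresses yourself; the function name, VM names, and addresses are hypothetical and nothing here is read from the portal.

```python
import ipaddress


def find_ip_clashes(protected_ips, test_ips):
    """Return the names of test VMs whose IP is also used by a protected VM."""
    protected = {ipaddress.ip_address(ip) for ip in protected_ips.values()}
    return sorted(
        vm for vm, ip in test_ips.items()
        if ipaddress.ip_address(ip) in protected
    )


# Hypothetical addresses: web01 keeps its production IP, db01 has a test IP.
protected_ips = {"web01": "10.0.0.10", "db01": "10.0.0.20"}
test_ips = {"web01": "10.0.0.10", "db01": "192.168.50.20"}

clashes = find_ip_clashes(protected_ips, test_ips)
print(clashes)  # ['web01']
```

Any VM listed in the result would need different NIC settings for testing in the VPG before the test is started.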

During the Test

The VMs in the VPG are created as test machines in a sandbox. They are powered on for testing using the test network specified in the VPG definition and use the virtual disks managed by the VRA.


All writes made during testing go to scratch volumes. The longer the test runs, the more scratch volume space is used, until the maximum size is reached, at which point no further testing can be done. The maximum size of all the scratch volumes is determined by the journal size hard limit and cannot be changed. The scratch volumes reside on the storage defined for the journal. Using scratch volumes makes cleaning up the test failover more efficient.
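To get a feel for how long a test can run before the scratch volumes fill up, a back-of-the-envelope estimate can be sketched in Python as below. It assumes a constant write rate inside the test VMs, which is a simplification, and the function name and figures are hypothetical.

```python
def hours_until_scratch_full(hard_limit_gb, used_gb, write_rate_gb_per_hour):
    """Estimate the hours of testing left before the scratch limit is reached."""
    if write_rate_gb_per_hour <= 0:
        return float("inf")  # nothing is being written, so the limit is never hit
    remaining_gb = max(hard_limit_gb - used_gb, 0)
    return remaining_gb / write_rate_gb_per_hour


# Hypothetical figures: 150 GB hard limit, 30 GB already used, 5 GB written per hour.
remaining_hours = hours_until_scratch_full(150, 30, 5)
print(remaining_hours)  # 24.0
```

Real write patterns are rarely constant, so treat the result as a rough guide only.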

 

While a test is running:

  • The VMs in the VPGs continue to be protected throughout the test.
  • You must not delete, clone, migrate to another host or change the disk properties of any of the test VMs.
  • You can add checkpoints to the VPGs, and if necessary fail over the VPGs.
  • You cannot take a snapshot of a test machine, since the VM volumes are still managed by the VRA and not by the VM. Using a snapshot of a test machine will create a corrupted VM.
  • You cannot move the VPGs being tested.
  • You cannot initiate a failover while a test is being initialised or closed.
Stop the Failover Test

The test VMs are powered off and removed from the inventory.


Start a Test Failover

Follow these steps to start a test failover:

 

1. From your target datacentre, open the Silver-lining DR self service portal. At the bottom right-hand side of the screen, set the operation to Test and click Failover.

 

 

2. The Select VPGs screen appears. Select the name(s) of the VPGs to test, then click Next. By default, all VPGs are listed. At the bottom of the screen, the selection details show the amount of data and the total number of VMs selected. The Direction arrow shows the direction of the process: from the protected site to the peer recovery site.

 

 

3. The Execution Parameters screen appears. By default, the latest checkpoint added to the journal is displayed. If you want to:

  • use this checkpoint, click Next and go to Step 7 below.
  • use one of the checkpoints from the last 3 days, click on the checkpoint that is displayed.

 

 

4. The Operations Checkpoints screen appears. Select the checkpoint you want to use for the test and click OK.

 

 

5. To locate a specific checkpoint, select from the following filter options.

 

Filter option Description
Latest

The recovery, or clone, is to the latest checkpoint. This ensures that data is crash-consistent for the recovery or clone. If a checkpoint is added between this point and starting the failover or clone, the later checkpoint is not used.

Latest Tagged Checkpoint

The recovery operation is to the latest checkpoint created manually. Checkpoints added to the VM journals in the VPG by the Zerto Virtual Manager ensure that data is crash-consistent to this point. If a checkpoint is added between this point and starting the operation, this later checkpoint is not used.

Select from all available checkpoints

By default, this option displays all checkpoints in the system. Refresh the list.
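The difference between the Latest and Latest Tagged Checkpoint options can be sketched in Python as follows. This is only an illustration of the selection rule described in the table, assuming that a manually created checkpoint is one that carries a tag. The Checkpoint class, the pick_checkpoint function, and the option names are hypothetical and do not correspond to anything in the portal.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Checkpoint:
    timestamp: datetime
    tag: Optional[str] = None  # assumed to be set only for manual checkpoints


def pick_checkpoint(checkpoints, option):
    """Pick a checkpoint using the 'latest' or 'latest_tagged' rule."""
    if option == "latest":
        candidates = list(checkpoints)
    elif option == "latest_tagged":
        candidates = [c for c in checkpoints if c.tag is not None]
    else:
        raise ValueError(f"unknown option: {option}")
    if not candidates:
        return None
    return max(candidates, key=lambda c: c.timestamp)


checkpoints = [
    Checkpoint(datetime(2024, 3, 1, 9, 0)),
    Checkpoint(datetime(2024, 3, 1, 10, 0), tag="before patching"),
    Checkpoint(datetime(2024, 3, 1, 11, 0)),
]

latest = pick_checkpoint(checkpoints, "latest")
latest_tagged = pick_checkpoint(checkpoints, "latest_tagged")
print(latest.timestamp, latest_tagged.tag)
```

The choice is made once from the list as it stands, which mirrors the statement that a checkpoint added after the selection is not used.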

 

6. The Execution Parameters screen reappears, showing the selected checkpoint. Click Next.

 

7. The Failover Test screen appears. The topology shows the number of VPGs and vApps being tested for failover to each recovery site. For example, one VPG containing 2 vApps might be failed over to the WPD2 site.

 

 

8. Click Start Failover Test. The test begins an initialisation period, during which the vApps are created in the target (recovery) site.

Note: Changes made in production continue to be replicated to the target site during the test. You can still perform a live failover, which will include any changes made up to that point.

 

9. The Silver-lining DR self service portal shows 'Testing Failover' in the Operation column of the VPGs being tested, and at the bottom left of the screen.

 

 

10. Once the initialisation has completed, the vApps appear in the target site. They are powered on but isolated so that they cannot affect the live workload. You can interact with and make changes to these vApps, but any changes are lost when the failover test is stopped. Selecting Stop Failover Test deletes the vApps and removes any changes made during the test. See Stop a Test Failover below.

 

Source protected vApp during a failover test:

 

vApp at target site during a failover test:

 

A notification will appear in the Recent Events panel on the CloudCreator dashboard.

Monitor a Test Failover

Follow these steps to monitor a test failover:

 

1. In the Silver-lining DR self service portal, click the VPGs tab to monitor the status of a failover test.

 

2. In the General view, the Operation field displays 'Testing Failover' and the completion percentage when a failover test is being performed.

 

 

3. Click on the name of a VPG you are testing. A dynamic tab is created displaying the specific VPG details including the status of the failover test.

 

 


Stop a Test Failover

Follow these steps to stop a test failover:

 

1. In the Silver-lining DR self service portal, select the VPGs tab, then click the Stop icon in the Operation column.

 

 

2. The Stop Test screen appears. Use the table below as a guide to complete the fields, then click Stop.

 

 

Field Description
Result Specify whether the test succeeded or failed.
Notes

Add a description of the test (optional). For example, you can specify where the external files are saved which describe the tests performed. Notes are limited to 255 characters.

 

3. After stopping a test, the following events occur:

  • The vApps in the recovery site are powered off and removed.
  • The checkpoint that was used for the test has the following tag added to identify the test: 'Tested at StartDateAndTimeOfTest'. This checkpoint can be used to identify the point in time to use to restore the VMs in the VPG during a failover.
  • In vCloud Director, the Testing Recovery vApp will be removed, including the testing VMs inside it. The original VMs remain untouched and unchanged.
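If you keep your own record of test results alongside the portal, the constraints above can be sketched in Python as follows. This is an illustration only: the function names, the dictionary layout, and the date format used in the tag are assumptions, since the exact format of StartDateAndTimeOfTest is not stated here.

```python
from datetime import datetime

MAX_NOTES_LENGTH = 255  # Notes are limited to 255 characters


def build_stop_record(succeeded, notes=""):
    """Build a record of the Stop Test fields, enforcing the notes limit."""
    if len(notes) > MAX_NOTES_LENGTH:
        raise ValueError(f"notes exceed {MAX_NOTES_LENGTH} characters")
    return {"result": "Success" if succeeded else "Failure", "notes": notes}


def checkpoint_test_tag(test_started):
    """Format the 'Tested at ...' tag (the date format is an assumption)."""
    return "Tested at " + test_started.strftime("%d/%m/%Y %H:%M:%S")


record = build_stop_record(True, "Results saved in the DR test log")
tag = checkpoint_test_tag(datetime(2024, 3, 1, 9, 30, 0))
print(record, tag)
```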

 

A notification will appear in the Recent Events panel on the CloudCreator dashboard.

View Test Results

The date and time of the last test are displayed in a column in the VPGs and VMs tabs.


 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
