Storage VMotion fails with the error: Failed to unstun VM after disk reparent

Details

  • Storage VMotion fails for a virtual machine
  • You see the error:

    Failed to unstun VM after disk reparent.
     
  • The virtual machine is partially migrated and powered off. Generally, the virtual machine cannot be powered on again.
  • This issue affects:
    • Virtual machines converted in-place (not deployed or cloned) from template virtual machines with compact disks.
    • Virtual machines cloned from a virtual machine of the above type (as well as virtual machines cloned from those virtual machines, and so on).
    • Virtual machines with disks created with the following command: 

      vmkfstools -i -d thin
       
    • Virtual machines created through the SDK with at least one disk created using the RelocateSpec.transform parameter set to sparse.

Solution

This issue affects these types of virtual machines because of how their disks are copied. When a disk that is thin-provisioned (or is flat but was cloned from a virtual machine that has thin-provisioned disks) is copied to the target datastore as part of Storage VMotion, the disk's content ID (CID) value is not preserved, although the content of the disk is copied correctly. When the virtual machine attempts to open the disk, it finds that the CID differs from the value it expects and fails to resume because it believes the disk is corrupted. In reality, the disk is not corrupted; only the CID is incorrect. As a result, the virtual machine ends up powered off and cannot be powered back on, because the CID is still incorrect.
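
To make the mismatch concrete, here is a minimal Python sketch. It assumes the usual text VMDK descriptor layout, in which a disk records its own value in a CID= line and a child disk records the value it expects from its parent in a parentCID= line. The article does not describe the exact check ESX performs or which disk plays which role during Storage VMotion, so the descriptor contents and the function names read_cids and chain_is_consistent are hypothetical.

```python
import re

# Matches descriptor lines such as "CID=1a2b3c4d" or "parentCID=ffffffff".
_CID_LINE = re.compile(r"\s*(CID|parentCID)\s*=\s*([0-9a-fA-F]+)\s*$")


def read_cids(descriptor_text):
    """Return (CID, parentCID) as lowercase hex strings; a missing field is None."""
    fields = {}
    for line in descriptor_text.splitlines():
        match = _CID_LINE.match(line)
        if match:
            fields[match.group(1)] = match.group(2).lower()
    return fields.get("CID"), fields.get("parentCID")


def chain_is_consistent(parent_descriptor, child_descriptor):
    """True when the child's parentCID equals the parent's own CID."""
    parent_cid, _ = read_cids(parent_descriptor)
    _, expected_cid = read_cids(child_descriptor)
    return parent_cid is not None and parent_cid == expected_cid


# Hypothetical descriptors: the child expects its parent to carry CID 1a2b3c4d.
child = "version=1\nCID=5e6f7a8b\nparentCID=1a2b3c4d\n"
parent_before_copy = "version=1\nCID=1a2b3c4d\nparentCID=ffffffff\n"
# Same disk content after the copy, but with a different CID recorded.
parent_after_copy = "version=1\nCID=0badf00d\nparentCID=ffffffff\n"

print(chain_is_consistent(parent_before_copy, child))  # True
print(chain_is_consistent(parent_after_copy, child))   # False
```

The second call returns False even though the disk data is unchanged, which mirrors the situation described above: the content is intact, but the recorded CID no longer matches what is expected.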

 

Patch Fix

Patch ESX350-200802411-BG fixes this issue. For more information about the patch, see VMware ESX Server 3.5, Patch ESX350-200802411-BG: Enhanced Validation Checks, and Fixes for Storage VMotion and Lab Manager (1003458). Download the patch at http://www.vmware.com/download/vi/vi3_patches_35.html.

 

Pre-Patch Workaround

Step 1: Download the Required Scripts and Copy Them to the Appropriate Location
 
This article includes as attachments the following scripts:
  • isVMAffectedByKB1003874.pl checks whether the virtual machine has a mismatched CID value and can therefore be fixed with the next script. Copy this script to the Remote CLI host, into the location where you run svmotion.pl.
    • For Windows Remote CLI installations, the default location is C:\Program Files\VMware\VMware VI Remote CLI\bin.
    • For Linux Remote CLI installations, the default location is the home directory, or /usr/sbin for the appliance.
     
  • completeSVMForVMAffectedByKB1003874.pl fixes the CID issue on the virtual machine disk file. Run this script after checking if your system is affected, using the previous script. Copy this script to the service console of your ESX host in the /root directory.

To download these files:

  1. Find the links to the scripts under the Attachments section at the bottom of this article.
  2. Right-click each link and save the file to your machine. 

    Note: If the host where you installed the Remote CLI has a Web browser, you can save the files directly to that host. You then need to copy only one file to another system (the ESX host).

To copy the files to the Remote CLI host (a Linux host or the appliance) or to the service console from a different machine, use scp or WinSCP. If the Remote CLI is installed on a Windows host, use a file transfer method with which you are familiar (for example, FTP or a network file share).
 
 
Step 2: Verify Your Virtual Machine Is Affected
 
To verify your virtual machine belongs to the affected group of virtual machines described in the Details section of this article, run isVMAffectedByKB1003874.pl from the same directory where svmotion.pl is located.
 
For example, on a Windows Remote CLI host if you copied the script to the default location:
  1. Open a Command Prompt window.
  2. Change to the directory:

    cd "C:\Program Files\VMware\VMware VI Remote CLI\bin" 
     
  3. Run the command:

    isVMAffectedByKB1003874.pl

Similarly, on a Linux Remote CLI host, if you copied the files to the default Remote CLI scripts location for the appliance, run the command:

perl /usr/sbin/isVMAffectedByKB1003874.pl
 
The output looks similar to the following example for an affected virtual machine:

************************** Attention! ****************************
The virtual machine "Compact WinXP 2" IS AFFECTED by KB 1003874!

If you attempted to Storage VMotion this VM, refer to KB 1003874
for instructions on completing the Storage VMotion and enabling
your VM to power on again.

Otherwise, do not Storage VMotion this VM until you have installed
patch bundle ESX350-200802411-BG. If you have already installed
this patch bundle, then your VM is safe to Storage VMotion and you
can disregard this message.
******************************************************************

The output looks similar to the following example for an unaffected virtual machine:

The virtual machine "VirtualCenter XP" is NOT affected by KB 1003874.
You can safely Storage VMotion this VM. 


Notes:
  • You can also run this script as a precautionary measure on virtual machines that have not encountered this problem, before moving them with Storage VMotion.
  • If the script indicates that the virtual machine is not affected by this issue, the next step does not fix the problem. You need to perform additional analysis and other troubleshooting steps with the help of documentation, the forum community, or Technical Support (if appropriate). For related information, see Snapshot Operations Submitted Directly to an ESX Server Host During Storage VMotion Corrupt Virtual Machine Data (1003114).
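
If you run the check on many virtual machines as a precaution, it can help to classify the script output automatically. The following Python sketch assumes the output always contains one of the two sentences shown in the examples above; the function name classify_check_output is hypothetical, and how you capture the script output is left to you.

```python
import re

# Patterns taken from the two example outputs shown in this step.
_AFFECTED = re.compile(r'The virtual machine "(?P<name>.+?)" IS AFFECTED by KB 1003874')
_NOT_AFFECTED = re.compile(r'The virtual machine "(?P<name>.+?)" is NOT affected by KB 1003874')


def classify_check_output(output):
    """Return (vm_name, affected); both are None if neither sentence is found."""
    match = _AFFECTED.search(output)
    if match:
        return match.group("name"), True
    match = _NOT_AFFECTED.search(output)
    if match:
        return match.group("name"), False
    return None, None


affected_output = (
    "************************** Attention! ****************************\n"
    'The virtual machine "Compact WinXP 2" IS AFFECTED by KB 1003874!\n'
)
unaffected_output = (
    'The virtual machine "VirtualCenter XP" is NOT affected by KB 1003874.\n'
    "You can safely Storage VMotion this VM.\n"
)

print(classify_check_output(affected_output))    # ('Compact WinXP 2', True)
print(classify_check_output(unaffected_output))  # ('VirtualCenter XP', False)
```

Returning (None, None) for unrecognized output keeps an unexpected message, such as a connection error, from being mistaken for "not affected".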
 
Step 3: Repair the Problem Virtual Machine
 
To complete the Storage VMotion process and enable your virtual machine to power on again:
  1. Log in to the service console of the ESX Server host where you copied the script.
     
  2. Run the script completeSVMForVMAffectedByKB1003874.pl, specifying the name of the affected virtual machine as a command-line argument, surrounded by single quotes. 

    For example:

    [root@esxhost root]# perl completeSVMForVMAffectedByKB1003874.pl 'Compact WinXP 2'

    The output looks similar to the following:

    Configuring your virtual machine to complete Storage VMotion...

    **************************** Success! ******************************
    The script has modified the VM so that it is able to complete the
    Storage VMotion, as described in KB 1003874.

    Please follow the rest of the steps in the KB to complete the
    Storage VMotion and power-on your VM.

    Do not attempt to Storage VMotion this VM again until you have
    installed patch bundle ESX350-200802411-BG.

    Original copies of all modified files have been placed in
    /vmfs/volumes/477e63a9-a760d731-eecc-000e0c819981/Compact WinXP 2_1/backups.
    ******************************************************************** 


  3. Using VMware Infrastructure Client, create a snapshot of the virtual machine.
     
  4. In the snapshot manager, select Delete All. This collapses the snapshot you just took with any remaining DMotion-* files on the source datastore into the disks on the target datastore.
     
  5. Power on the virtual machine. This powers on the virtual machine using all the files on the target datastore, allowing the virtual machine to complete its migration.
Warning: This issue still affects the virtual machine even after running the script on it. Do not attempt to move this virtual machine again with Storage VMotion until you have downloaded and installed patch ESX350-200802411-BG (now available). After you have installed the patch, you can safely move this virtual machine with Storage VMotion. 
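
The repair script reports where it stored the original copies of the files it modified. If you log the script output, a small Python sketch such as the following can record that location. It assumes the output matches the example in step 2 above; the function name parse_repair_output is hypothetical.

```python
import re

# The backup path follows "have been placed in" and ends with a period.
_BACKUP_DIR = re.compile(r"have been placed in\s+(/.*?)\.\s*$", re.MULTILINE)


def parse_repair_output(output):
    """Return (succeeded, backup_dir); backup_dir is None if it is not reported."""
    succeeded = "Success!" in output
    match = _BACKUP_DIR.search(output)
    return succeeded, (match.group(1) if match else None)


sample_output = (
    "**************************** Success! ******************************\n"
    "Original copies of all modified files have been placed in\n"
    "/vmfs/volumes/477e63a9-a760d731-eecc-000e0c819981/Compact WinXP 2_1/backups.\n"
    "********************************************************************\n"
)

succeeded, backup_dir = parse_repair_output(sample_output)
print(succeeded)   # True
print(backup_dir)  # /vmfs/volumes/477e63a9-a760d731-eecc-000e0c819981/Compact WinXP 2_1/backups
```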

Based on VMware KB 1003874