From time to time I hear something strange from practicing engineers: that VMDK, VHD and VHDX are completely different, almost closed virtual disk formats, and that converting from one to another is a long and painful process. Today I’ll show that this is not so, explain how these formats relate to each other, and show how to convert quickly when migrating from Hyper-V to VMware and vice versa.
A bit of theory. From the point of view of properties, virtual disks are divided into two types:
- thin (dynamic disk) and
- thick (fixed disk).
Everything else (differencing, thick provisioned lazy-zeroed) is just a variation.
I will not dwell on this in detail. I will only say that from here on we will talk about thick disks.
Disk formats
RAW is a “raw” image of any disk. It is a plain container with no specific headers or footers that represents the disk image “as is”. If we open such an image in a hex editor, we will immediately see the GPT/MBR and/or file system headers. Exactly the same image is produced by the dd command on Linux. In this regard, RAW is absolutely honest with us.
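The hex-editor check described above can also be scripted. The following is a minimal sketch, assuming 512-byte logical sectors; the function names are made up for this illustration. It only looks for the MBR boot signature at the end of the first sector and for the GPT header signature at the start of the second one.

```python
SECTOR = 512  # assumed logical sector size


def detect_partition_table(image_head: bytes) -> str:
    """Return 'GPT', 'MBR' or 'unknown' for the first sectors of a raw image."""
    if len(image_head) < SECTOR:
        return "unknown"
    # An MBR (including the protective MBR of a GPT disk) ends with 0x55 0xAA.
    has_mbr_signature = image_head[510:512] == b"\x55\xaa"
    # The GPT header sits in LBA 1 and starts with the ASCII string "EFI PART".
    has_gpt_header = image_head[SECTOR:SECTOR + 8] == b"EFI PART"
    if has_gpt_header:
        return "GPT"
    if has_mbr_signature:
        return "MBR"
    return "unknown"


def read_head(path: str, count: int = 2 * SECTOR) -> bytes:
    """Read the first `count` bytes of an image file (hypothetical helper)."""
    with open(path, "rb") as image:
        return image.read(count)
```

For a flat VMDK or a dd image, `detect_partition_table(read_head(path))` should report the partitioning scheme directly, because nothing precedes the guest data.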
VMDK. On VMware ESXi this is an ordinary RAW image whose disk geometry is described in a plain text file, the descriptor. It is the descriptor's name that we see in the vSphere Client when we attach a virtual disk to a virtual machine or browse the contents of a directory on a datastore. VMware ESXi does nothing with the image itself. Absolutely nothing. The disk just sits there as is and expands as needed. In the best VMware tradition, the descriptor format is very simple:
    # Disk DescriptorFile
    version=1
    encoding="UTF-8"
    CID=fffffffe
    parentCID=ffffffff
    isNativeSnapshot="no"
    createType="vmfs"

    # Extent description
    RW 15122560 VMFS "disk-example-flat.vmdk"

    # The Disk Data Base
    #DDB
    ddb.adapterType = "lsilogic"
    ddb.geometry.cylinders = "941"
    ddb.geometry.heads = "255"
    ddb.geometry.sectors = "63"
    ddb.longContentID = "4f5dc83d0a5270bee54e2d85fffffffe"
    ddb.uuid = "60 00 C2 93 b4 38 ed dd-a3 85 88 48 68 40 2f c0"
    ddb.virtualHWVersion = "13"
And it is not only simple but also functional: editing the descriptor file is enough to expand the virtual disk to any supported size. This also makes it possible to zero-fill a disk or mark it as thin without keeping geometry information in the disk headers.
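Because the descriptor is plain text, it is easy to read programmatically. Below is a minimal sketch of such a reader; the function name and the returned structure are my own choices for illustration, not a VMware API. It treats lines starting with an access type as extent lines (RDONLY is accepted as an assumption next to the RO spelling used in this article) and everything of the form key=value as a field.

```python
ACCESS_TYPES = ("RW", "RO", "RDONLY", "NOACCESS")  # RDONLY is an assumption


def parse_descriptor(text: str):
    """Split a VMDK descriptor into a dict of fields and a list of extent lines."""
    fields = {}
    extents = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and section comments
        first_word = line.split(None, 1)[0]
        if first_word in ACCESS_TYPES:
            extents.append(line)
        elif "=" in line:
            key, _, value = line.partition("=")
            # Values may or may not be quoted; store them without quotes.
            fields[key.strip()] = value.strip().strip('"')
    return fields, extents
```

With the example descriptor above, `fields["createType"]` would be `vmfs` and `extents` would hold the single `RW ... VMFS ...` line.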
Below are some standard values for all sections of the descriptor:
Section | Parameter | Description | Value |
Header (# Disk DescriptorFile) | version | Version number of the descriptor. Usually does not change. | 1 (default) |
Header (# Disk DescriptorFile) | CID | Content ID: a random 32-bit disk identifier involved in building the snapshot tree. It is the parentCID for child delta disks. | A random 32-bit value generated at creation time. |
Header (# Disk DescriptorFile) | parentCID | CID of the parent disk. If there is no parent disk, the CID_NOPARENT flag (ffffffff) is set. | ffffffff (CID_NOPARENT) or the CID of the parent disk. |
Header (# Disk DescriptorFile) | createType | A pointer to the type of disk described in the descriptor (it may be a physical disk, a differencing disk, or even an array of VMDK disks). For ESXi, the set of values is limited. | For ESXi: vmfs (for a virtual disk) or vmfsRawDeviceMap and vmfsPassthroughRawDeviceMap (for RDM). |
Header (# Disk DescriptorFile) | isNativeSnapshot | Marks how the snapshot is taken: by VMkernel or by the storage (VAAI). | no (VMkernel), yes (VAAI) |
Extents (# Extent description) | | The section contains the disk path, access type and size, in the format <access type> <size> <extent type> <path to the VMDK file or to the device> <offset>. | |
Extents (# Extent description) | Access | Type of disk access. | RW (read/write), RO (read only), NOACCESS (access denied). |
Extents (# Extent description) | Size | Disk size. | The number of logical sectors of the virtual disk, calculated as <size in bytes> / <logical sector size>. |
Extents (# Extent description) | Type of extent | Pointer to the disk mode. | FLAT, SPARSE, ZERO, VMFS, VMFSSPARSE, VMFSRDM, VMFSRAW. |
Extents (# Extent description) | Filename | The path to the VMDK file. | |
Extents (# Extent description) | Offset | Used if you need to specify the start offset of the guest OS data. For virtual disks it is usually 0 (or not specified). For RDM it may be nonzero. | The offset in bytes from the beginning of the disk to the start of the data block. |
Disk Database (# The Disk Data Base) | | Describes the geometry of a virtual disk. | |
Disk Database (# The Disk Data Base) | ddb.adapterType | Type of the VM's virtual SCSI adapter. | Only 3 types are supported. The VMware Paravirtual adapter is always marked as lsilogic. |
Disk Database (# The Disk Data Base) | ddb.geometry.cylinders, ddb.geometry.heads, ddb.geometry.sectors | The number of cylinders, heads and sectors describing the geometry of the virtual disk. | For example, heads = "255", sectors = "63". |
Disk Database (# The Disk Data Base) | ddb.thinProvisioned | Thin disk flag. | 1: the disk is thin; 0 or absent: the disk is thick. |
Disk Database (# The Disk Data Base) | ddb.uuid | Descriptor ID. | |
Disk Database (# The Disk Data Base) | ddb.virtualHWVersion | Virtual hardware version. | |
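To make the extent format and the size formula concrete, here is a small sketch that splits an extent line into its parts and converts the sector count back into bytes. The `Extent` type and the function names are hypothetical, and a 512-byte logical sector is assumed.

```python
import shlex
from typing import NamedTuple, Optional

SECTOR_SIZE = 512  # assumed logical sector size in bytes


class Extent(NamedTuple):
    access: str
    sectors: int
    kind: str
    filename: Optional[str]
    offset: int


def parse_extent(line: str) -> Extent:
    """Parse '<access> <size> <extent type> <path> <offset>' into an Extent."""
    parts = shlex.split(line)  # shlex keeps a quoted file name as one token
    access, sectors, kind = parts[0], int(parts[1]), parts[2]
    filename = parts[3] if len(parts) > 3 else None
    offset = int(parts[4]) if len(parts) > 4 else 0  # usually 0 or absent
    return Extent(access, sectors, kind, filename, offset)


def size_in_bytes(extent: Extent) -> int:
    """Invert the table's formula: sectors = size in bytes / sector size."""
    return extent.sectors * SECTOR_SIZE
```

For the extent line from the example descriptor, 15122560 sectors correspond to 7742750720 bytes, which is the size the flat file should have.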
VHD. A thick VHD is the same RAW, but with a 512-byte footer that describes the geometry of the disk. Microsoft Hyper-V does not use a separate descriptor file. The description of the disk geometry takes 4 bytes. The format also limits the disk size to about 2 TB.
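The footer is easy to inspect. The sketch below reads the last 512 bytes of a file and pulls out a few fields. The field offsets (cookie at 0, current size at 48, geometry at 56, disk type at 60, all big-endian) follow my reading of Microsoft's published VHD specification, and the names `VhdFooter`, `parse_vhd_footer` and `read_vhd_footer` are made up for this example. The checksum is not verified.

```python
import struct
from typing import NamedTuple

FOOTER_SIZE = 512
DISK_TYPES = {2: "fixed", 3: "dynamic", 4: "differencing"}


class VhdFooter(NamedTuple):
    current_size: int
    cylinders: int
    heads: int
    sectors_per_track: int
    disk_type: str


def parse_vhd_footer(footer: bytes) -> VhdFooter:
    """Decode selected fields of a 512-byte VHD footer (checksum not checked)."""
    if len(footer) != FOOTER_SIZE or footer[:8] != b"conectix":
        raise ValueError("not a VHD footer")
    (current_size,) = struct.unpack_from(">Q", footer, 48)
    # The 4-byte geometry field: 2 bytes cylinders, 1 byte heads, 1 byte sectors.
    cylinders, heads, sectors = struct.unpack_from(">HBB", footer, 56)
    (type_code,) = struct.unpack_from(">I", footer, 60)
    disk_type = DISK_TYPES.get(type_code, f"unknown ({type_code})")
    return VhdFooter(current_size, cylinders, heads, sectors, disk_type)


def read_vhd_footer(path: str) -> VhdFooter:
    """Read and decode the footer from the end of a VHD file."""
    with open(path, "rb") as vhd:
        vhd.seek(-FOOTER_SIZE, 2)  # 2 = seek relative to the end of the file
        return parse_vhd_footer(vhd.read(FOOTER_SIZE))
```

For a fixed VHD, everything before these 512 bytes is the raw disk image, which is why the footer can be ignored or cut off without touching the guest data.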
Most interesting of all, if you create a descriptor file and hand ESXi a VHD disk together with its footer, the VMware hypervisor will ignore the footer and accept the VHD as a native disk.
When Storage vMotion converts the disk to thin, it simply cuts off this footer, and the output is the same RAW without zeros at the end. When converting to a thick disk, we get honest RAW. This is what I am going to demonstrate a little later.
VHDX. All information about the disk geometry is stored in the first 4096 KB of the virtual disk, in the header area.
What does this area look like? It contains two copies of the header with their logs; the BAT and the metadata area are shared.
Only one copy of the header is active at any given time. This provides a certain level of header fault tolerance in the event of unplanned interruptions of read/write operations. After each I/O operation, the copy is replicated and a switch is made to it.
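The choice of the active header copy can be sketched as follows. In my reading of the VHDX specification, the two headers sit at offsets 64 KB and 128 KB, each starts with the signature `head`, and an 8-byte little-endian sequence number follows at offset 8; the copy with the higher sequence number is the current one. This sketch deliberately skips the CRC-32C validation that a real implementation needs, and the function names are hypothetical.

```python
import struct
from typing import Optional

HEADER_OFFSETS = (64 * 1024, 128 * 1024)  # header 1 and header 2


def read_sequence_number(image: bytes, offset: int) -> Optional[int]:
    """Return the header's sequence number, or None if the signature is absent."""
    if image[offset:offset + 4] != b"head":
        return None
    (sequence,) = struct.unpack_from("<Q", image, offset + 8)
    return sequence


def active_header_offset(image: bytes) -> Optional[int]:
    """Pick the header copy with the highest sequence number (no CRC check)."""
    best_offset = None
    best_sequence = -1
    for offset in HEADER_OFFSETS:
        sequence = read_sequence_number(image, offset)
        if sequence is not None and sequence > best_sequence:
            best_offset, best_sequence = offset, sequence
    return best_offset
```

Passing the first 192 KB of a VHDX file to `active_header_offset` is enough, since both header copies lie within that range.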
To convert VHDX to RAW, we just need to cut off the first 4096 KB.
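As an illustration only, cutting off the header area can be written as a copy that starts 4096 KB into the file. This sketch assumes a fixed-size VHDX whose payload follows the header area contiguously, as described above; it copies the data instead of editing the file in place, so it is slower than the file-system tricks discussed next. The function name `copy_payload` is hypothetical.

```python
import io
import shutil

VHDX_HEADER_AREA = 4096 * 1024  # 4096 KB, the header area size used in this article


def copy_payload(src, dst, prefix: int = VHDX_HEADER_AREA) -> None:
    """Copy everything after the first `prefix` bytes from src to dst.

    Both arguments are binary file objects, e.g. from open(path, "rb") / "wb".
    """
    src.seek(prefix)
    shutil.copyfileobj(src, dst, 1024 * 1024)  # copy in 1 MB chunks


# Tiny in-memory demonstration with an 8-byte stand-in for the header area.
demo_source = io.BytesIO(b"HEADER__" + b"raw disk data")
demo_target = io.BytesIO()
copy_payload(demo_source, demo_target, prefix=8)
```

With real files, the same call would look like `copy_payload(open("disk.vhdx", "rb"), open("disk-flat.vmdk", "wb"))`, where both file names are placeholders.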
An attentive reader will of course say: OK, but can you convert RAW to VHDX? My answer: it depends on the file system and on whether it lets you write data to the beginning of a file. On NTFS, this can be done manually by moving the beginning of the file 4 MB forward in the MFT and writing the header into that space.
The vhdxtool.exe utility works on the same principle. However, this conversion does not give us a neat picture of a 4 MB header followed by RAW. The disk will be visible and will even work correctly as VHDX, but it will also contain a lot of zero-filled “garbage” left over from the offset manipulations. The disk will not be optimized. It is recommended to migrate VMs with such a disk to another volume or to optimize the disk with Convert-VHD or Optimize-VHD. If this is not done, the disk will take up more space than it should and may work more slowly.
However, in migration scenarios from VMware to Hyper-V this utility is indispensable, because it converts in place, without reading the source disk byte by byte and creating a copy next to it. Any rough edges will be smoothed out at the first Storage Live Migration.
Conclusion: thick disks in the VMDK, VHD and VHDX formats are actually not much different from each other. They are all based on RAW with various additives. Using the same hex editor or the OS functions for working with the file system, we can turn a 10 TB VMDK or VHDX into a disk for the target hypervisor in a couple of seconds.
Let’s take a look at how VMware ESXi deals with VHD.
As an example, I created a Windows Server image using Convert-WindowsImage, with VMware drivers injected and the following parameters:
- OS Version: Windows Server 2019 Standard,
- Disk Type: Fixed,
- Disk Layout: GPT,
- Disk Size: 30 GB.
Then I did the following:
- Rename the disk to Win2019-test2-flat.vmdk in order to upload it to the ESXi datastore.
- Create an empty VM in VMware ESXi with a Thick (Eager Zeroed) disk, so that the VMDK descriptor is created automatically and the cylinders do not have to be calculated manually.
- Connect to the host via WinSCP and replace the existing file.
- Turn on the VM and see that the OS boots without any problems. All that remains is to install VMware Tools, which is simple because Convert-WindowsImage let us inject the device drivers.
- Move the disk to another datastore via Storage vMotion, converting it to a thin disk.
- Check the size: the disk has become thin.
- If we convert it back to a thick disk or migrate the VM to file storage, we get the purest RAW without headers.
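In the steps above I let ESXi generate the descriptor so that I did not have to calculate the geometry. For comparison, here is a rough sketch of doing it by hand. It assumes 512-byte sectors and the 255 heads / 63 sectors per track seen in the example descriptor earlier, and it leaves out ddb.uuid, ddb.longContentID and ddb.virtualHWVersion. I have not checked whether ESXi accepts a descriptor in exactly this form, so treat it as an illustration of the arithmetic rather than a tested tool; `build_descriptor` is a hypothetical name.

```python
SECTOR_SIZE = 512        # assumed logical sector size
HEADS = 255              # as in the example descriptor
SECTORS_PER_TRACK = 63   # as in the example descriptor


def build_descriptor(flat_name: str, size_bytes: int) -> str:
    """Build a minimal descriptor text for a thick disk stored in flat_name."""
    sectors = size_bytes // SECTOR_SIZE
    # Whole cylinders only: any remainder of the last cylinder is dropped.
    cylinders = sectors // (HEADS * SECTORS_PER_TRACK)
    lines = [
        "# Disk DescriptorFile",
        "version=1",
        'encoding="UTF-8"',
        "CID=fffffffe",
        "parentCID=ffffffff",
        'isNativeSnapshot="no"',
        'createType="vmfs"',
        "",
        "# Extent description",
        f'RW {sectors} VMFS "{flat_name}"',
        "",
        "# The Disk Data Base",
        "#DDB",
        'ddb.adapterType = "lsilogic"',
        f'ddb.geometry.cylinders = "{cylinders}"',
        f'ddb.geometry.heads = "{HEADS}"',
        f'ddb.geometry.sectors = "{SECTORS_PER_TRACK}"',
    ]
    return "\n".join(lines) + "\n"
```

For the 30 GB disk from this example, `build_descriptor("Win2019-test2-flat.vmdk", 30 * 1024**3)` yields an extent of 62914560 sectors and 3916 cylinders. The size passed in is the virtual disk size, without the 512-byte VHD footer.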
The same trick works for RAW images created with dd, and even in the opposite direction. This way you can see that VMware ESXi accepts third-party disks, whether with a footer or plain RAW.
If you do not want tricks, then you can use the tools below.
Source format | Target format | Tool | Command example |
VHD | VHDX | vhdxtool.exe | vhdxtool upgrade -f <file name>.vhd |
VMDK (RAW) | VHD | vhdtool.exe | vhdtool /convert <file name flat>.vmdk |
VMDK (RAW) | VHDX | vhdtool.exe, vhdxtool.exe | vhdtool /convert <file name flat>.vmdk |
VHDX (RAW) | VHDX | vhdtool.exe, vhdxtool.exe | vhdxtool upgrade -f <file name>.vhd |
To summarize
The different formats of thick virtual disks are not so different. At the heart of them all is RAW with various “additives.”
Converting virtual disk formats is not scary, and, as I have shown, sometimes you can do without it.
The main benefit of all this is shorter migration time from Hyper-V to VMware and vice versa, and less VM downtime during migration. At DataLine, we do this with VM downtime of less than 30 minutes. The record is 40 seconds of VM downtime during migration between hypervisors.
Just remember that when migrating between different hypervisors, conversion alone is not enough. At a minimum, you must first install the integration components of the target hypervisor, remove or disable the components of the source hypervisor, remove the virtual devices of the source hypervisor, and so on.