Since the introduction of VAAI storage acceleration for VMware, we have seen the “big guys” blogging about the performance boost this technology provides. For those of you with real lives, VAAI (vStorage APIs for Array Integration) is a collection of T10 SCSI commands designed to accelerate performance-critical operations by offloading them from the host to the SAN.
VAAI provides these fundamental operations:
- Atomic Test & Set (ATS), which is used during creation and locking of files on the VMFS volume
- Clone Blocks/Full Copy/XCOPY, which is used to copy or migrate data within the same physical array
- Zero Blocks/Write Same, which is used to zero-out disk regions
- Thin Provisioning in ESXi 5.x and later hosts, which allows the ESXi host to tell the array when space previously occupied by a virtual machine (whether deleted or migrated to another datastore) can be reclaimed on thin-provisioned LUNs.
- Block Delete in ESXi 5.x and later hosts, which allows for space to be reclaimed using the SCSI UNMAP feature.
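On the ESXi side, the first three primitives are each governed by a host advanced setting, which is one common way to switch VAAI on and off for a before-and-after comparison. The sketch below only builds the `esxcli` command strings; it does not run them. The setting paths are the ones VMware documents for ESXi hosts, but check them against your ESXi version. The helper name `vaai_toggle_commands` and the dictionary keys are hypothetical, and this is not necessarily how the tests in this post were configured.

```python
from typing import List

# Assumed mapping of VAAI block primitives to ESXi advanced settings.
# Verify these paths against the documentation for your ESXi release.
VAAI_SETTINGS = {
    "ATS": "/VMFS3/HardwareAcceleratedLocking",
    "XCOPY": "/DataMover/HardwareAcceleratedMove",
    "WRITE_SAME": "/DataMover/HardwareAcceleratedInit",
}


def vaai_toggle_commands(enabled: bool) -> List[str]:
    """Build (but do not execute) the esxcli commands that enable or
    disable the three block primitives on a host."""
    value = 1 if enabled else 0
    return [
        f"esxcli system settings advanced set --int-value {value} --option {option}"
        for option in VAAI_SETTINGS.values()
    ]


if __name__ == "__main__":
    # Print the commands for the "VAAI Off" configuration.
    for command in vaai_toggle_commands(enabled=False):
        print(command)
```

Generating the commands rather than executing them keeps the sketch safe to run anywhere; the output can be reviewed and then pasted into an ESXi shell.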
In this blog, I will discuss the impact of VAAI on VDI provisioning and cloning. I will also give you some performance comparisons between some of the “big iron” and our IO Offload Engine (IOOE).
To illustrate the effect of VAAI, we performed the following operations over 10G iSCSI, first with VAAI enabled and then with it disabled:
- add a new 100GB thick-provisioned hard disk to the VM, with cluster support (zeroed)
- clone the VM (with the added disk) within the same LUN
- clone the VM to another LUN
- Storage vMotion the VM
| Operation | VAAI On (mm:ss) | VAAI Off (mm:ss) |
| --- | --- | --- |
| 100GB VMDK creation with cluster support (zeroed) | 2:05 | 3:46 |
| Clone VM (and disk) same datastore | 2:02 | 10:36 |
| Clone VM (and disk) second datastore | 2:39 | 9:58 |
| Storage vMotion | 1:59 | 5:19 |
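To put a number on each row, the timings can be converted to seconds and divided. This is a minimal sketch that assumes the table values are minutes:seconds; the names `to_seconds`, `speedups` and `RESULTS` are illustrative only.

```python
def to_seconds(mmss: str) -> int:
    """Convert a 'minutes:seconds' string such as '2:05' to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


# (VAAI on, VAAI off) timings copied from the table above.
RESULTS = {
    "100GB VMDK creation (zeroed)": ("2:05", "3:46"),
    "Clone VM, same datastore": ("2:02", "10:36"),
    "Clone VM, second datastore": ("2:39", "9:58"),
    "Storage vMotion": ("1:59", "5:19"),
}


def speedups(results):
    """Return the VAAI-off / VAAI-on time ratio for each operation."""
    return {
        operation: to_seconds(off) / to_seconds(on)
        for operation, (on, off) in results.items()
    }


if __name__ == "__main__":
    for operation, factor in speedups(RESULTS).items():
        print(f"{operation}: {factor:.2f}X")
```

With these timings, the VAAI speedup ranges from roughly 1.8X for zeroed disk creation to roughly 5.2X for a clone within the same datastore.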
These are the results you would expect: VAAI significantly accelerates every one of these operations. How do we compare to other systems from the major vendors?
The exact same test was performed and published using a NetApp FAS3210 array with Fibre Channel connectivity.
We summarize the comparison below:
| Benchmark | GreenBytes (VAAI Enabled) | NetApp FAS3210 (VAAI Enabled) | Improvement |
| --- | --- | --- | --- |
| 100GB VMDK creation with cluster support (zeroed) | 2:05 | 4:49 | 2.32X |
| Clone VM within datastore | 2:02 | 9:25 | 4.79X |
| Clone VM between datastores | 2:39 | 9:29 | 4.50X |
| Storage vMotion | 1:59 | 10:44 | 5.41X |
On average, the GreenBytes IOOE was about 4.25X faster than the similarly priced NetApp FAS3210 (the mean of the four improvement factors above). What is especially interesting is the ability of the IOOE to run parallel operations across both controllers. When both controllers are active, cloning proceeds in parallel, improving on the NetApp time to clone between datastores by 10X (dual parallel clones).
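The average quoted above appears to be the plain arithmetic mean of the Improvement column. A minimal sketch to reproduce it, taking the table values as given (the name `average_improvement` is illustrative):

```python
from statistics import mean

# Improvement factors copied from the comparison table above.
IMPROVEMENT = {
    "100GB VMDK creation (zeroed)": 2.32,
    "Clone VM within datastore": 4.79,
    "Clone VM between datastores": 4.50,
    "Storage vMotion": 5.41,
}


def average_improvement(factors) -> float:
    """Arithmetic mean of the per-benchmark improvement factors."""
    return mean(factors.values())


if __name__ == "__main__":
    print(f"Average improvement: {average_improvement(IMPROVEMENT):.3f}X")
```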
Clearly, the IOOE is in a different performance class. Stay tuned for my next entry:
Attack of the Clones: Performance and Data Reduction at Scale.
