Copying/moving files inside a VM is dramatically slow

Moderators: anton (staff), art (staff), Max (staff), Anatoly (staff)

superludox
Posts: 12
Joined: Thu Apr 29, 2010 1:16 pm

Thu Apr 19, 2012 9:31 am

Hello, I will try to be as precise as possible.

2-node Hyper-V cluster, StarWind Native SAN 5.8, 1 TB Enterprise edition.

Here is the configuration of the host servers:
Windows Server 2008 R2 Enterprise, 24 GB RAM
StarWind Native SAN 5.8
Intel E5620
HP SmartArray P410
1 RAID1 300 GB SAS 15K for OS and hypervisor (C:\)
1 RAID10 600 GB SAS 15K for storing StarWind .img virtual disks (HA device) (S:\)

Network:
1x 10Gb NIC dedicated to iSCSI traffic + heartbeat, MTU 9000 (directly connected to the other Hyper-V server's 10Gb NIC), 10.0.0.0/8, MPIO
1x 10Gb NIC for StarWind synchronization, MTU 9000 (directly connected to the other Hyper-V server's 10Gb NIC), 20.0.0.0/8
2x 1Gb NIC (HP Team) for CSV traffic, MTU 9000 (directly connected to the other Hyper-V server), 30.0.0.0/8
2x 1Gb NIC (HP Team) for VM LAN traffic, MTU 1500 (attached to the production switch) (Hyper-V switch not shared with the OS)
1x 1Gb NIC for live migration, MTU 9000 (directly connected to the other Hyper-V server), 40.0.0.0/8
1x 1Gb NIC for host LAN, MTU 1500 (attached to the production switch), 192.168.34.0/24

StarWind:
HA device, Write-back caching, 64, 5000 (default)

Microsoft CSVs are enabled (so each VM VHD is stored on one of them, and each CSV is a StarWind .img target):
1 HA device (280 GB) presented as CSV1 for the hosts
1 HA device (200 GB) presented as CSV2 for the hosts
1 HA device (2 GB) presented as quorum disk for the hosts


Everything works (HA, live migration, failover…) as far as Microsoft Hyper-V failover clustering is concerned.

Problem:
Copying files inside the VM is dramatically slow. The VM disks (VHDs inside the StarWind .img HA device, which sits on the host RAID10) are fixed-size VHDs but exhibit really poor performance.

Tested:
copying files inside the VM onto the same disk = 15 MB/s on average; it starts fast, then drops!
copying from Hypervisor OS RAID1 to a VM share = starts at 150 MB/s, then drops to 15 MB/s

copying from Hypervisor OS RAID1 (C:\) to same volume (on itself) = 150-200 MB/s
copying from Hypervisor OS RAID1 to the same Hypervisor RAID10 (S:\) volume = 250-300 MB/s
copying from Hypervisor OS RAID10 to the same Hypervisor RAID10 (S:\) volume = 200-250 MB/s
copying from Hypervisor OS RAID10 to the other Hypervisor RAID10 (S:\) volume = 250-300 MB/s on the iSCSI network (10.0.0.0/8)
copying from Hypervisor OS RAID10 to the other Hypervisor RAID10 (S:\) volume = 100-150 MB/s on the LAN network (192.168.34.0/24)

I can say the iSCSI network is OK between the hosts and the architecture is generally OK.
But I cannot explain the poor disk performance inside the VM. It is as if the StarWind .img files were the bottleneck?!
Please help me.
BRGDS
superludox

Thu Apr 19, 2012 9:58 am

For information, I notice the same problem when copying a 50 GB VHD, for instance, from the coordinator node to its CSV2 folder (c:\ClusterStorage\volume2).
As inside the VM, it begins at 150-170 MB/s, then drops step by step to 20 MB/s!
Is it a problem with the Intel X520-T2 10Gb NIC or with StarWind synchronization?
I am a bit lost.

BRGDS
Max (staff)
Staff
Posts: 533
Joined: Tue Apr 20, 2010 9:03 am

Thu Apr 19, 2012 4:32 pm

Did you have a chance to run a big sequential write test against the target, and then try random read operations on it?
The problem may lie in the controller performance.
By the way, what is the block size/allocation unit size:
on the SAN,
on the client machine partition where the CSV is created,
on the VM itself?
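
A big sequential write followed by random reads, as suggested here, could be approximated with a small script run on the volume under test. The sketch below is only an illustration: the function names, the 1 MiB write chunk, the 64 KiB read block, and the example path are assumptions, not StarWind tooling, and a dedicated benchmark tool would give more reliable numbers. Note that the reads go through the OS cache unless the test file is much larger than RAM.

```python
import os
import random
import time

CHUNK = 1024 * 1024  # 1 MiB per write call (assumed value, not a vendor recommendation)


def sequential_write(path, total_mib, report_every_mib=256):
    """Write total_mib MiB sequentially; return (MiB written so far, MiB/s of last window)."""
    buf = os.urandom(CHUNK)
    samples = []
    window_start = time.perf_counter()
    with open(path, "wb", buffering=0) as f:
        for i in range(1, total_mib + 1):
            f.write(buf)
            if i % report_every_mib == 0 or i == total_mib:
                os.fsync(f.fileno())  # flush the OS cache so the window reflects real writes
                elapsed = time.perf_counter() - window_start
                window_mib = i % report_every_mib or report_every_mib
                samples.append((i, window_mib / max(elapsed, 1e-9)))
                window_start = time.perf_counter()
    return samples


def random_read(path, reads=1000, block=64 * 1024, seed=0):
    """Read `reads` blocks at random block-aligned offsets; return reads per second."""
    blocks = os.path.getsize(path) // block
    if blocks == 0:
        raise ValueError("test file is smaller than one read block")
    rng = random.Random(seed)
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:
        for _ in range(reads):
            f.seek(rng.randrange(blocks) * block)
            f.read(block)
    return reads / max(time.perf_counter() - start, 1e-9)


# Example use (hypothetical path on the volume under test):
# path = r"S:\bench.tmp"
# for mib, rate in sequential_write(path, 8192):
#     print(f"{mib:6d} MiB written, last window {rate:7.1f} MiB/s")
# print(f"{random_read(path):.0f} random reads/s")
```

Printing one sample per window makes a "starts fast, then drops" pattern visible, which a single average figure would hide.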
Max Kolomyeytsev
StarWind Software
superludox

Fri Apr 20, 2012 8:25 am

Hi Max, thanks for your reply. I will try to open a support case for this issue, since the whole migration plan at my customer is on hold!

To answer your questions:

"what is the block size/allocation unit size on the SAN": do you mean "what is the RAID configuration at the HP SmartArray controller"? The SAN consists of the RAID arrays of each partner (StarWind Native SAN for Hyper-V). At what level can I check this?

Regarding the CSVs, they are physically on the second, dedicated array (RAID10). The host machines have 6 SAS 300 GB 15K disks: 2 for the parent partition (Hyper-V + OS) in RAID1 and 4 for the data partition (StarWind) in RAID10. Both arrays share a single HP SmartArray P410i controller (HP ML350 G6).

At the host level it seems to be OK, but at the VM level it is not acceptable.
Maybe I am missing or misconfiguring something, but what?
BRGDS
superludox

Fri Apr 20, 2012 8:34 am

HP ACU says: stripe size 256k, RAID Accelerator "activated"
superludox

Fri Apr 20, 2012 9:40 am

Here is another test I made:
I created another target on one host, a normal StarWind device (60 GB, no caching by default) on the RAID10 volume (same as the CSVs), connected to the target via the host's (local) initiator, copied a sysprepped VHD onto it, created a VM on the local hypervisor, and attached the VM to the same teamed virtual switch for LAN access.

When I copy a 1.5 GB folder containing many small files from the host to the VM share: it starts at 150 MB/s, drops to 50 at the lowest, then finishes at 100. The folder is copied in a few seconds.
When I copy the same folder from the host (CSV1 coordinator, so local access) to the share of a VM located on CSV1: it starts at 150 MB/s, then drops to 50, then 40, then 30... down to 9 MB/s. The folder is copied in minutes!

StarWind HA device issue? Caching issue? MPIO misconfiguration?
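
To put the "seconds versus minutes" difference into numbers, the arithmetic can be sketched as below. This is only a rough estimate under stated assumptions: the helper name `copy_seconds` is made up for illustration, 1 GB is taken as 1000 MB, and the rate is treated as constant over the whole copy, which the observed copies are not.

```python
def copy_seconds(size_gb, rate_mb_s):
    """Seconds needed to copy size_gb (decimal GB) at a steady rate of rate_mb_s MB/s."""
    if rate_mb_s <= 0:
        raise ValueError("rate must be positive")
    return size_gb * 1000 / rate_mb_s


# Rates taken from the two copies described above.
for rate in (100, 50, 30, 9):
    print(f"1.5 GB at {rate:3d} MB/s: {copy_seconds(1.5, rate):6.1f} s")
```

At a steady 100 MB/s the 1.5 GB folder takes about 15 seconds, while at 9 MB/s it takes close to 3 minutes, which matches the orders of magnitude reported.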
Max (staff)

Fri Apr 20, 2012 10:46 am

I think the problem lies in the Hyper-V alignment. We can try to increase the speed by setting every layer (stripe size -> SAN volume allocation unit size -> client server allocation unit size -> VM HDD allocation unit size) to 64k. Some arrays don't support a 64k stripe size; you can use 128k instead.
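
Why alignment between layers matters can be shown with simple arithmetic: an I/O that straddles a stripe-unit boundary has to touch two units instead of one. The sketch below is an illustration only, with assumed names (`units_touched`, `split_fraction`) and assumed example values; the 32256-byte offset is the legacy 63-sector partition start, used here purely as an example of a misaligned layout, not as a diagnosis of this cluster.

```python
def units_touched(offset, length, unit):
    """Number of unit-sized blocks overlapped by the byte range [offset, offset + length)."""
    if offset < 0 or length <= 0 or unit <= 0:
        raise ValueError("offset must be >= 0, length and unit must be > 0")
    first = offset // unit
    last = (offset + length - 1) // unit
    return last - first + 1


def split_fraction(io_size, unit, partition_offset, count=1024):
    """Fraction of `count` back-to-back I/Os that span more than one unit."""
    split = sum(
        1
        for i in range(count)
        if units_touched(partition_offset + i * io_size, io_size, unit) > 1
    )
    return split / count


KIB = 1024
print(split_fraction(64 * KIB, 64 * KIB, 0))       # aligned 64k on 64k: 0.0
print(split_fraction(64 * KIB, 64 * KIB, 32256))   # misaligned 64k on 64k: 1.0
print(split_fraction(64 * KIB, 256 * KIB, 32256))  # misaligned 64k on 256k: 0.25
```

With matching 64k units and an aligned start, no I/O is split; with a misaligned start, every 64k I/O lands on two units, and even on a 256k stripe one in four does.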
Max Kolomyeytsev
StarWind Software
superludox

Fri Apr 20, 2012 1:51 pm

I will try this, but I thought that the larger the stripe size, the better the performance, especially since HP recommends larger stripes for RAID0, 1, and 10!

Just to confirm:
"SAN volume allocation unit size": you mean changing the stripe size from 256 KB to 64 KB on the HP controller for the RAID arrays?
"Client server allocation unit size": what do you mean technically here? Formatting the NTFS parent partition volumes with 64 KB?
"VM HDD allocation unit": same here, formatting the VHD as NTFS with 64 KB?

Thanks for your support
BRGDS
Anatoly (staff)
Staff
Posts: 1675
Joined: Tue Mar 01, 2011 8:28 am

Sat Apr 21, 2012 10:48 pm

"SAN volume allocation unit size": the NTFS volume where the StarWind image files are stored
"Client server allocation unit size": the NTFS volume where the VHD is stored
"VM HDD allocation unit": the NTFS volume on the VHD, inside the VM

Everything should be formatted the way my colleague suggested.

I hope it helps.
Best regards,
Anatoly Vilchinsky
Global Engineering and Support Manager
www.starwind.com
av@starwind.com