Poor disk performance when operating HA VM cluster

Posted: Fri Jun 17, 2011 11:22 am
by GGUser
We have two high-specification servers running a StarWind SAN and the clustered Hyper-V role. We have configured the servers with five NICs:

iSCSI Channel -> Crossover direct between servers
Sync Channel -> Crossover direct between servers
Heartbeat -> Crossover direct between servers
Host-LAN -> Via Switch
Virtual-LAN -> Via Switch

Cache is set to 4GB with a timeout of 10000ms.

We are benchmarking the disk performance of a VM running in the cluster. When StarWind is running in HA mode with the two servers fully synchronised, the performance of the SAN disk is far lower than it should be. All NICs are running at gigabit (and have been tested to be working correctly at gigabit speeds), yet the disk performance when running in HA is around 1/10th of wire speed. If we shut down one of the servers, so the SAN is running on only one server, we get full wire speed as expected. Bring the second one back up and it drops again.

We have checked and the traffic appears to be going down the correct NICs. We have rebuilt the system from scratch several times and always get the same result. When we do a full sync we get half wire speed as expected. Physical disk performance of the two servers is about 3 times wire speed, so we don't think that physical disk I/O is the problem.
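
For reference, the ratios above can be turned into rough MB/s figures. This is only a sketch: it assumes "wire speed" means the raw 1 Gbit/s line rate, so real iSCSI throughput will be somewhat lower once protocol overhead is counted. The function names are made up for illustration.

```python
# Rough conversion of the ratios quoted in the post into MB/s.
# Assumption: "wire speed" is the raw 1 Gbit/s line rate; actual iSCSI
# throughput is lower because of Ethernet/TCP/iSCSI overhead.
GIGABIT_BITS_PER_SEC = 1_000_000_000


def wire_speed_mb_per_sec(link_bits_per_sec=GIGABIT_BITS_PER_SEC):
    """Raw line rate in decimal megabytes per second."""
    return link_bits_per_sec / 8 / 1_000_000


def expected_rates(wire_mb_per_sec):
    """Map each scenario from the post to its approximate MB/s figure."""
    return {
        "single node (full wire speed)": wire_mb_per_sec * 1.0,
        "full sync (half wire speed)": wire_mb_per_sec * 0.5,
        "HA, both nodes up (about 1/10th)": wire_mb_per_sec * 0.1,
        "local physical disk (about 3x)": wire_mb_per_sec * 3.0,
    }


if __name__ == "__main__":
    for name, rate in expected_rates(wire_speed_mb_per_sec()).items():
        print(f"{name}: ~{rate:.1f} MB/s")
```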

Any help would be greatly appreciated.

Re: Poor disk performance when operating HA VM cluster

Posted: Fri Jun 17, 2011 2:17 pm
by anton (staff)
It's all about numbers. "Poor" and "should be" really don't say anything :) Start with checking (one-by-one):

1) TCP on the sync channel and from initiator to target is doing wire speed.

2) Local disk I/O working in the proper way. No RAID5/6, write back cache on controllers enabled, stripe size 64KB, partitions aligned, no gaps in benchmark reports.

3) Benchmark VM performance from inside Hyper-V.
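
If no dedicated network test tool is at hand, check 1 can be approximated with a few lines of Python. This is a minimal sketch, not a calibrated benchmark: `run_sink`, `measure_tcp_throughput` and `loopback_demo` are hypothetical helper names, and the demo runs over loopback only. For a real test you would run the sink on one node and point the sender at that node's sync or iSCSI channel address.

```python
import socket
import threading
import time


def run_sink(server_sock):
    """Accept one connection and discard everything it sends."""
    conn, _ = server_sock.accept()
    with conn:
        while conn.recv(1 << 16):
            pass


def measure_tcp_throughput(host, port, total_bytes=64 * 1024 * 1024, chunk=1 << 16):
    """Send about total_bytes to host:port and return the send rate in MB/s.

    Timing is taken on the sender side, so the last buffers may still be in
    flight when the clock stops; use a large total_bytes to keep that small.
    """
    payload = b"\0" * chunk
    sent = 0
    with socket.create_connection((host, port)) as sock:
        start = time.perf_counter()
        while sent < total_bytes:
            sock.sendall(payload)
            sent += chunk
        elapsed = max(time.perf_counter() - start, 1e-9)
    return sent / elapsed / 1_000_000


def loopback_demo(total_bytes=16 * 1024 * 1024):
    """Run sink and sender in one process over 127.0.0.1 (demo only)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    sink = threading.Thread(target=run_sink, args=(server,), daemon=True)
    sink.start()
    rate = measure_tcp_throughput("127.0.0.1", port, total_bytes)
    sink.join(timeout=5)
    server.close()
    return rate


if __name__ == "__main__":
    print(f"loopback: {loopback_demo():.1f} MB/s")
```

On a healthy gigabit link the figure should land somewhere below the raw 125 MB/s line rate; a result far under that on the sync channel points at the network rather than the disks.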

With the numbers from 1, 2 and 3 we should be able to tell what's broken.
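
The "partitions aligned" part of check 2 comes down to simple arithmetic: the partition's starting offset should be a whole multiple of the stripe size. A small sketch, assuming the 64KB stripe mentioned above and a starting offset in bytes read from whatever tool your OS provides; `is_aligned` is a hypothetical helper and the example offsets are illustrative, not taken from this system.

```python
STRIPE_SIZE_BYTES = 64 * 1024  # 64KB stripe, as suggested in check 2


def is_aligned(partition_offset_bytes, stripe_size_bytes=STRIPE_SIZE_BYTES):
    """True if the partition starts on a stripe boundary."""
    return partition_offset_bytes % stripe_size_bytes == 0


if __name__ == "__main__":
    # Illustrative offsets only: a 63-sector start (63 * 512 bytes) and a 1 MiB start.
    for offset in (63 * 512, 1024 * 1024):
        state = "aligned" if is_aligned(offset) else "NOT aligned"
        print(f"offset {offset} bytes: {state}")
```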

That's it :)

Anton