I have two ESXi 5.0 hosts and one storage server. Both ESXi hosts have two interfaces dedicated to iSCSI, and the storage has four. Each interface is configured in its own network segment. All connections go through two separate HP V1910 series switches (to get real failover), without jumbo frames (the results are the same when I use direct connections). My goal is to have both failover and load balancing, but... well... it works in a somewhat unexpected way. To clarify my issues I present two tests.
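For reference, each host sees the StarWind LUN over both of its iSCSI interfaces; a quick way to double-check the paths and the active policy is something like this (the naa.* device ID below is only a placeholder for the actual LUN identifier):
Code:
# list all paths to the LUN and their state (placeholder device ID)
esxcli storage core path list --device=naa.xxxxxxxxxxxxxxxx

# show which path selection policy (Fixed / Round Robin) is currently active
esxcli storage nmp device list --device=naa.xxxxxxxxxxxxxxxx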
TEST 1
I log in to ESXi host A over SSH and run the following command:
Code:
# sequential write of 10 GiB of zeros in 1 MB blocks
dd if=/dev/zero of=10GiB_v1.dat bs=1M count=10240
My expectation was something around 200 MB/s (two 1 GbE paths aggregated), but I get only ~90 MB/s, as can be seen in the following screenshots.
The first one shows the 5-second average write speed from esxtop; it ranges from 85 to 90 MB/s. The second one shows what is happening with the network cards: the data transfers complement each other! When one NIC is maxed out, the second one sits at zero... The third screenshot shows what happens when the same command is started with the multipathing policy set to Fixed and the policy is changed to Round Robin during the transfer. In the beginning one channel is used, which is expected. But when the policy changes, the transfer is split equally between both paths, yet the sum stays the same as for a single path (1 x GbE). I admit I expected the transfer speed to double...
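For completeness, the mid-transfer policy switch was done roughly with these commands (again, the device ID is a placeholder):
Code:
# start the test with the Fixed policy...
esxcli storage nmp device set --device=naa.xxxxxxxxxxxxxxxx --psp=VMW_PSP_FIXED
# ...and switch to Round Robin while dd is still running
esxcli storage nmp device set --device=naa.xxxxxxxxxxxxxxxx --psp=VMW_PSP_RR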
TEST 2
Finally, I ran the above test on both hosts simultaneously (starting with the Fixed policy and switching to Round Robin on both hosts during the transfer). [Unfortunately I cannot upload more than three attachments, so I will post those screenshots in the next post.] This test showed that two parallel transfers are possible without interfering with each other (neither dd command slowed down because of the other transfer). But the effect known from the single-host experiment is exactly the same: instead of 4 x 1 GbE I get 4 x 0.5 GbE. After that I'm pretty sure the limit is not the network but the ESXi round robin algorithm, or the way the StarWind Target handles round robin.
SUMMARY
In all tests the Round Robin policy was set to "IOPS" mode, with the I/O operations limit for switching paths set to 1.
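That limit was applied roughly like this (placeholder device ID once more):
Code:
# switch paths after every single I/O instead of the default 1000
esxcli storage nmp psp roundrobin deviceconfig set --device=naa.xxxxxxxxxxxxxxxx --type=iops --iops=1

# verify the resulting round robin settings
esxcli storage nmp psp roundrobin deviceconfig get --device=naa.xxxxxxxxxxxxxxxx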
What the heck is going on? In theory round robin works, but why do I get only half of the aggregated wire speed? ESXi is the most recent build (5.0.0 build 768111), and so is StarWind (5.8.2013). Is it a networking problem? I doubt it; the switch logs do not show any conflicts, dropped packets, etc. Why can't I get the aggregated wire speed? And I'm not sure where the problem actually is: the StarWind iSCSI Target or ESXi. Any help, comments, questions and discussion are appreciated.