horrible slow connection to ESX

Posted: Wed Nov 16, 2011 9:24 pm
by ghost
Hi

I installed StarWind at home, with everything running in VMware Workstation on Windows 7. Inside it I run a virtualized ESX5i and another VM with Windows 2003 and the free version of StarWind. Everything was fine: I was getting copy speeds of around 30-50 MB/s, which is more than OK given that everything sits on one desktop HDD with additional virtualization layers.

Then I installed the same configuration on physical hardware: a Dell T310 (ESX5i) and a Dell SC440 (Windows 2003 R2 with StarWind).

The big problem: when I run ghettoVCB against the iSCSI target, I get around 3 MB/s. Yes, that's right, THREE.
Also, dd from /dev/zero with 1 MB chunks takes around 10 times longer than in my test environment.
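For anyone who wants to repeat the dd-style test on the Windows side, where dd is not available, here is a minimal Python sketch of the same idea: write zeros in 1 MB chunks and time it. The function name `write_throughput`, the file name and the sizes are made up for illustration, and it is only a rough stand-in for dd or a real benchmark.

```python
import os
import tempfile
import time


def write_throughput(path, total_mb=16, chunk_mb=1):
    """Write total_mb of zeros in chunk_mb pieces and return MB/s."""
    chunk = bytes(chunk_mb * 1024 * 1024)  # a buffer of zeros, like /dev/zero
    count = total_mb // chunk_mb
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(count):
            f.write(chunk)
        f.flush()
        os.fsync(f.fileno())  # push data to disk so the OS cache does not flatter the result
    elapsed = time.perf_counter() - start
    return (count * chunk_mb) / elapsed


if __name__ == "__main__":
    # Hypothetical target: a temporary file; point it at the disk under test instead.
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "ddtest.bin")
        print(f"{write_throughput(target):.1f} MB/s")
```

Running it once against the local disk and once against a file on the iSCSI-backed volume should show whether the slowdown follows the disk or the network path.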

I played around with the NIC settings on Windows (Broadcom; there seems to be no jumbo frame option), no difference.
I disabled TCP delayed ACK on both ESX and Windows and also disabled Chimney and RSS on Windows, but it made no difference.

Both servers have Broadcom NICs, directly connected to each other with a CAT6 cable.

I don't see any errors in vmkernel.log.

Looking at the graphs, the speed seems to bounce around a lot.

I don't even know where to look next ...

Or is it just a broken NIC or cable?

Re: horrible slow connection to ESX

Posted: Wed Nov 16, 2011 9:51 pm
by anton (staff)
Do you happen to have any raw TCP performance charts as well?

Re: horrible slow connection to ESX

Posted: Wed Nov 16, 2011 10:15 pm
by ghost
Hello Anton

The only thing I found is the NIC statistics.

I also found that this NIC shares the same IRQ (16) as the SAS controller.

Thomas

Re: horrible slow connection to ESX

Posted: Wed Nov 16, 2011 10:26 pm
by anton (staff)
NIC stats don't help much (except that we now know for sure there are no transmit errors or collisions). Also, it's absolutely OK for PCI/PCI-X/PCIe devices to share an IRQ. What you need to run is some TCP tests from the place where you see the slow network results: NTttcp and/or iperf. Is that possible?
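If neither tool is at hand, the idea behind a raw TCP test can be sketched in a few lines of Python: one side discards whatever it receives, the other sends a fixed amount of data and times it. This is only an illustration, not a replacement for NTttcp or iperf; the names `run_sink` and `tcp_throughput` are made up here, and the demo below runs over loopback, so it says nothing about a real link until the sink is started on the other machine.

```python
import socket
import threading
import time


def run_sink(server_sock):
    """Accept one connection and discard everything it sends."""
    conn, _ = server_sock.accept()
    with conn:
        while conn.recv(65536):
            pass  # throw the data away; we only care about transfer speed


def tcp_throughput(host, port, total_mb=16):
    """Send total_mb MiB of zeros to host:port and return Mbit/s."""
    payload = bytes(1024 * 1024)
    start = time.perf_counter()
    with socket.create_connection((host, port)) as s:
        for _ in range(total_mb):
            s.sendall(payload)
        s.shutdown(socket.SHUT_WR)  # signal end of data
        s.recv(1)                   # returns once the sink has read everything and closed
    elapsed = time.perf_counter() - start
    return total_mb * 1024 * 1024 * 8 / elapsed / 1e6


if __name__ == "__main__":
    # Loopback demo; on real hardware run the sink on one box and the sender on the other.
    server = socket.socket()
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    port = server.getsockname()[1]
    t = threading.Thread(target=run_sink, args=(server,), daemon=True)
    t.start()
    print(f"{tcp_throughput('127.0.0.1', port):.0f} Mbit/s")
    t.join()
    server.close()
```

Because no disk is involved on either side, a low number here points at the NIC, cable or TCP settings, while a good number shifts suspicion toward storage.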

Re: horrible slow connection to ESX

Posted: Wed Nov 16, 2011 11:20 pm
by ghost
Another strange thing I just found: if I copy a very big file from the VM to Windows over CIFS, it starts fast but after a couple of seconds drops to 300 KB/s. But these are completely different NICs, on a switch.

I only have Windows clients and ESX, so I ran iperf between a Windows VM and the Windows machine with StarWind and got around 550 Mbit/s. But how does this test the iSCSI network?
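To put the iperf figure next to the 3 MB/s seen over iSCSI, a quick conversion helps. This is a back-of-the-envelope sketch that assumes 8 bits per byte and ignores protocol overhead; the helper name `mbit_to_mbyte` is made up for illustration.

```python
def mbit_to_mbyte(mbit_per_s):
    """Convert Mbit/s to MB/s, ignoring protocol overhead."""
    return mbit_per_s / 8


iperf_mb_s = mbit_to_mbyte(550)   # raw TCP rate reported by iperf
iscsi_mb_s = 3                    # rate seen with ghettoVCB over iSCSI
ratio = iperf_mb_s / iscsi_mb_s

print(f"iperf: {iperf_mb_s:.2f} MB/s, about {ratio:.0f}x the iSCSI copy rate")
```

So raw TCP on that path can carry far more than the iSCSI copy achieves. Since the iperf run went between a VM and the StarWind machine, it may not have used the same direct NIC-to-NIC link as the iSCSI traffic, so it does not by itself clear that link.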

I also ran ATTO on the old server; everything OK.

Maybe it's just the network cables?

Re: horrible slow connection to ESX

Posted: Thu Nov 17, 2011 5:53 pm
by ghost
I tracked it down further: on both CIFS and iSCSI, download was getting worse but upload was fine. So I checked Performance Monitor and saw that the disk queue length was still at 100% while it was copying at only 300 KB/s.

So it seems I have found the problem: the controller or the disks :?

Re: horrible slow connection to ESX

Posted: Thu Nov 17, 2011 5:58 pm
by anton (staff)
Does using another disk solve the issue? If it's a SATA disk, what does the SMART monitor say about I/O errors?

Re: horrible slow connection to ESX

Posted: Thu Nov 17, 2011 7:05 pm
by ghost
I did not try another disk because I have none.
These are 2 brand new SATA disks on a Dell PERC 5/iR SAS controller.

When I run ATTO alone with a 2 GB size, everything is OK. But as soon as there is also network traffic that involves disk writes, it gets very slow.

That probably also explains why the NFS server on Windows was so slow (2 MB/s).

Re: horrible slow connection to ESX

Posted: Thu Nov 17, 2011 9:26 pm
by anton (staff)
Can you turn ACPI off and try moving the hard disk and network controllers to other IRQs? Or use another NIC for now?

Re: horrible slow connection to ESX

Posted: Mon Nov 21, 2011 3:24 pm
by ghost
I found out that the disk cache was disabled, but this controller also has no BBU. After enabling the cache, I no longer had slowdowns when writing.
It could still be faster: 20 MB/s over iSCSI, but that is better than the 3 MB/s before.

What is strange is that ATTO had no slowdowns, but when I copied a big file locally it also slowed down.

Re: horrible slow connection to ESX

Posted: Mon Nov 21, 2011 3:27 pm
by anton (staff)
20 MB/sec still sucks.

Synthetic tests have their own issues. Make sure you run a bunch of them (add Iometer, SQLIO, etc.) to pin down the numbers.

Re: horrible slow connection to ESX

Posted: Fri Dec 02, 2011 10:18 am
by ghost
What is strange is that the speed varies a lot. One time it took 800 minutes for a 140 GB snapshot, and the next time it took only 60 minutes.

And when it's slow, I can see the NIC copy speed going up and down all the time between 10 and 20 MB/s.

Does this still sound like an IRQ issue?

I am a bit limited with tests because only ESX has access to that storage.
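Turning the two snapshot runs into average throughput makes the spread easier to see. This is a quick sketch that assumes 1 GB = 1024 MB and that the whole 140 GB was actually transferred each time; the helper name `mb_per_s` is made up for illustration.

```python
def mb_per_s(size_gb, minutes, mb_per_gb=1024):
    """Average throughput in MB/s for a transfer of size_gb taking the given minutes."""
    return size_gb * mb_per_gb / (minutes * 60)


slow = mb_per_s(140, 800)   # the 800-minute run
fast = mb_per_s(140, 60)    # the 60-minute run

print(f"slow run: {slow:.1f} MB/s, fast run: {fast:.1f} MB/s")
```

Under those assumptions the slow run averages about 3 MB/s, the same figure as in the first post, while the fast run averages about 40 MB/s.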

Re: horrible slow connection to ESX

Posted: Fri Dec 02, 2011 12:10 pm
by anton (staff)
I would blame the hardware. Can you put the NIC into another slot?

Re: horrible slow connection to ESX

Posted: Fri Dec 02, 2011 4:48 pm
by ghost
I could if I were physically there ...

I guess I will swap the cables between the NICs next time and use the other NIC for iSCSI. I have a feeling the NIC is defective.

Re: horrible slow connection to ESX

Posted: Fri Dec 02, 2011 10:34 pm
by anton (staff)
Yes, please use another route for iSCSI (with as many different components as possible) and let us know. Thank you!