Jumbo frame issue

oxyi
Posts: 67
Joined: Tue Dec 14, 2010 8:30 pm

Wed Sep 07, 2011 4:12 pm

I am having a problem with jumbo frames enabled on my NIC. I've tried three different switches that support jumbo frames (Cisco, HP, and Dell), and they all tested out fine. The network cables are all brand-new CAT 6a.

Here are some benchmark results with and without jumbo frames, the TcpNoDelay tweak, and a 64K allocation unit volume.
[screenshot: benchmark results with/without jumbo frames, TcpNoDelay, and 64K allocation unit]


The problem seems to be that whenever I have jumbo frames enabled, I see this speed decrease in my ATTO benchmark.
Without jumbo frames:
[screenshot: ATTO benchmark without jumbo frames]

With jumbo frames enabled:
[screenshot: ATTO benchmark with jumbo frames enabled]


IOmeter shows similar results.
Without jumbo frames I am getting 6476 IOPS and 202.3 MB/s over two 1 Gb MPIO paths (100% sequential read, 32K).

With jumbo frames on I am getting 2067 IOPS and 129 Mbps (100% sequential read, 32K).

Anyone know why I get such crappy performance with jumbo frames enabled?
anton (staff)
Site Admin
Posts: 4021
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands

Wed Sep 07, 2011 4:59 pm

This means something is broken ("Thank you, Capt. Obvious!" (c) ...)

In your place I'd start with the TCP/IP stack performance itself, because if it's a TCP issue it has nothing to do with StarWind. And since you've already tried three different switches, it sounds like a NIC or NIC driver problem. Try updating both the driver and the NIC firmware (if possible) to the most recent versions, and if you continue to have the same problem, dump the NIC and replace it with something known to work. BTW, what NIC brand and model do you use?
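For example, it's worth taking a raw TCP baseline between the two boxes with iperf before StarWind is even in the picture. Just a sketch; <server-ip> is a placeholder and 5001 is only the iperf default port:

    iperf.exe -s -p 5001                              (run on the target box)
    iperf.exe -c <server-ip> -p 5001 -t 30 -i 1 -P 4  (run on the initiator)

If that throughput already collapses with jumbo frames on, the problem sits below iSCSI and StarWind.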
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

oxyi
Posts: 67
Joined: Tue Dec 14, 2010 8:30 pm

Wed Sep 07, 2011 11:40 pm

Hi Anton,

Thanks for stating the obvious ~ lol !

Yes, I did troubleshoot all that: latest drivers from Intel and Broadcom, and no firmware update available for the Intel. Talked to Intel tech support and RMA'd the NIC as well. Same thing.

The NIC in question was an Intel PT quad-port Gb card. I ran iperf between the PT quad card and the StarWind client, which is just a regular Dell R710 server with a Broadcom NIC.
The speed I got from iperf was unreal: 0.004 MB/sec (iperf.exe -c 192.168.0.72 -P 1 -i 1 -p 5001 -f k -t 10).

After that pissed me off, I dumped the PT card and got an ET2 quad card, and iperf testing with jumbo frames enabled seemed decent: 60 MB/sec. I thought I was golden, but given the ATTO benchmark I posted, it doesn't seem like it. Any ideas?

Also, I'd like to find out: does formatting the drive on the StarWind server with a 64K allocation unit size improve the speed? It's a 32 TB RAID.

For my ATTO benchmarking I created an iSCSI target on the newly formatted 64K-allocation-unit NTFS volume, and once I connected with MPIO on the client, I formatted that drive with a 64K allocation unit size again. I did notice the speed increased a bit, but is it recommended to do that? Thanks!
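For reference, the 64K format I'm doing on both the StarWind volume and the mounted iSCSI disk is basically this (X: is just a placeholder drive letter here):

    format X: /FS:NTFS /A:64K /Q    (quick-format as NTFS with a 64K allocation unit)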
anton (staff)
Site Admin
Posts: 4021
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands

Fri Sep 09, 2011 9:16 pm

60 MB/sec for a 1 GbE connection sucks. Something around 100 MB/sec is what should be expected.

Playing with the allocation unit size to improve performance only makes sense once you've already tweaked TCP to perform flawlessly. You have not.

In general a 4 KB allocation unit (the so-called "small" memory page size) is OK. There isn't much fragmentation with NTFS in any case.

I'd suggest you find a working NIC first of all.
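One more quick check worth doing (assuming a 9000-byte MTU; adjust the payload size if yours differs): confirm jumbo frames actually pass end to end without fragmentation, e.g.

    ping <server-ip> -f -l 8972    (8972 = 9000 minus 28 bytes of IP/ICMP headers)

If that fails while a plain ping works, some device in the path is still at a 1500 MTU and the large frames are being dropped.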
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

oxyi
Posts: 67
Joined: Tue Dec 14, 2010 8:30 pm

Sat Sep 10, 2011 5:05 am

Regarding TCP/IP, I have already optimized it. With NTttcp I am getting around 940 megabits.

[screenshot: NTttcp results, ~940 Mbit/s]
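(By "optimized" I mean the usual netsh global TCP knobs, roughly along these lines; exact settings may differ per setup:

    netsh int tcp show global
    netsh int tcp set global autotuninglevel=normal
    netsh int tcp set global chimney=disabled
    netsh int tcp set global rss=enabled )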

Okay, this pertains to the HA setup. I have set up HA on two StarWind servers, and the client is connecting to both targets with MPIO.

The benchmark is somehow really crappy, but only with HA; it has nothing to do with TCP/IP speed, since a non-HA target was able to reach the optimized performance. Why does the HA target crap out?

EDIT: Sorry, I should've explained the benchmark more clearly. On the left, the iSCSI target is just a regular iSCSI target, not HA. On the right it's created as HA, but instead of mounting both HA targets (HA and HAPartner), I mounted only one to test the speed, and it's the same as when I mounted the full HA iSCSI target.
oxyi
Posts: 67
Joined: Tue Dec 14, 2010 8:30 pm

Wed Sep 14, 2011 3:40 am

Still no solution yet.

I worked with tech support and we thought it was a sync pipeline issue: since I am doing dual 1 Gb MPIO, my sync channel should be teamed with LACP.
But after I changed that and verified the speed of the sync channel, as soon as I mount an HA target over the two channels the benchmark just flies all over the place.

[screenshot: ATTO benchmark against the HA target]

The problem is, when it is a non-HA target, whether a simple iSCSI target or a snapshot iSCSI target, MPIO works perfectly fine and I don't get degraded performance. But as soon as it's an HA target, MPIO just craps out and the benchmark speaks for itself. Did I set something up wrong for the HA?
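In case it helps, this is roughly how I'm checking the MPIO side on the client (disk number 0 is just an example here):

    mpclaim -s -d        (list the MPIO disks and their load-balance policy)
    mpclaim -s -d 0      (show the paths for disk 0)
    mpclaim -l -d 0 2    (set disk 0 to Round Robin, policy 2)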
Anatoly (staff)
Staff
Posts: 1675
Joined: Tue Mar 01, 2011 8:28 am

Wed Sep 14, 2011 12:49 pm

Try changing the traffic priority from synchronization to client connections (right-click on the device).
Best regards,
Anatoly Vilchinsky
Global Engineering and Support Manager
www.starwind.com
av@starwind.com
oxyi
Posts: 67
Joined: Tue Dec 14, 2010 8:30 pm

Wed Sep 14, 2011 2:14 pm

How do I do that? Is that in StarWind, or where do I find it?
Anatoly (staff)
Staff
Posts: 1675
Joined: Tue Mar 01, 2011 8:28 am

Wed Sep 14, 2011 3:28 pm

Yes, in StarWind. As I mentioned before, click on the host in the management console, go to the Device tab, right-click on the device, and you will see the corresponding option.
Best regards,
Anatoly Vilchinsky
Global Engineering and Support Manager
www.starwind.com
av@starwind.com
Max (staff)
Staff
Posts: 533
Joined: Tue Apr 20, 2010 9:03 am

Wed Sep 14, 2011 4:20 pm

Issue resolved via remote session.
Reason: onboard iSCSI offload was enabled on the Broadcom cards.
Resolution: disabled iSCSI offload through BACS and reconnected the targets.
Max Kolomyeytsev
StarWind Software
anton (staff)
Site Admin
Posts: 4021
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands

Wed Sep 14, 2011 9:36 pm

So my initial statement about broken NICs was correct. OK, I'll take a beer from the fridge if you don't mind :)
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

oxyi
Posts: 67
Joined: Tue Dec 14, 2010 8:30 pm

Thu Sep 15, 2011 10:42 pm

lol, wish I could get you that beer from the fridge that fast.
After I rebooted my servers, the degraded performance is back :(
Max (staff)
Staff
Posts: 533
Joined: Tue Apr 20, 2010 9:03 am

Sat Sep 17, 2011 5:44 pm

But it still jumps around as soon as we use two cards on the client side...
Have you already contacted MS regarding that MPIO bug?
Max Kolomyeytsev
StarWind Software