MPIO Network failure stops all traffic.

ggalley
Posts: 9
Joined: Wed Mar 30, 2011 5:21 am

Wed Dec 28, 2011 4:42 pm

In Windows, if I disable even a single network adapter on any server, all traffic halts for 17 to 25 seconds. Is this normal? Is there a way to reduce this time, or to keep it from happening at all?

Configuration/Usage Below:
StarWind version 5.7.1807
Servers are all running Windows Server 2008 R2 Enterprise
I have StarWind configured in HA mode with 2x10GbE cabled to each switch.
The Hyper-V hosts currently have 4x1GbE dedicated to iSCSI.
This is on a dedicated storage network.

I have applied your recommended network settings to the StarWind servers:
http://www.starwindsoftware.com/forums/ ... t2293.html
I have jumbo frames enabled.
MPIO policy is set to use round robin.
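
For readers unfamiliar with the policy, round robin simply rotates requests across every path that is currently considered usable. The sketch below is a toy model only: the function name and the `healthy` flags are made up for illustration, the addresses are the Hyper-V NICs listed further down, and it is not how the Microsoft DSM is implemented. In particular, it does not model how long the initiator takes to notice that a path has died.

```python
from itertools import cycle


def round_robin(paths, healthy, n_requests):
    """Assign n_requests to paths in rotation, skipping paths marked unhealthy.

    `healthy` is a hypothetical dict of path -> bool used only for this sketch.
    """
    if not any(healthy.get(p, False) for p in paths):
        raise RuntimeError("no healthy path left")
    order = cycle(paths)
    assigned = []
    while len(assigned) < n_requests:
        p = next(order)
        if healthy.get(p, False):
            assigned.append(p)
    return assigned


paths = ["10.254.1.51", "10.254.2.51", "10.254.3.51", "10.254.4.51"]
healthy = {p: True for p in paths}

# All four paths up: requests rotate across all of them.
print(round_robin(paths, healthy, 8))

# One path marked failed: the remaining three keep rotating.
healthy["10.254.2.51"] = False
print(round_robin(paths, healthy, 6))
```

In this idealised model a failed path is skipped instantly, which is why any multi-second halt has to come from somewhere else, such as failure detection or timeouts.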

Using Bart's Stuff Test, I run a stream of data to the mounted iSCSI drive.
All network adapters light up, and I see network utilization on all storage server adapters and Hyper-V adapters.
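
To put a number on the halt independently of the test tool, a small script can time synchronous writes to the mounted volume and report any pause longer than a threshold. This is a minimal sketch, not part of StarWind or the Microsoft initiator: the function name, the file path and the one-second threshold are all assumptions chosen for illustration.

```python
import os
import time


def measure_gaps(path, duration_s=60.0, block_size=64 * 1024, threshold_s=1.0):
    """Rewrite one block with fsync for duration_s seconds.

    Returns a list of (seconds_since_start, gap_seconds) for every pause
    between two completed writes that is longer than threshold_s.
    """
    block = b"\0" * block_size
    gaps = []
    start = time.monotonic()
    last = start
    with open(path, "wb") as f:
        while time.monotonic() - start < duration_s:
            f.seek(0)             # overwrite the same block so the file stays small
            f.write(block)
            f.flush()
            os.fsync(f.fileno())  # force the write down to the disk
            now = time.monotonic()
            if now - last > threshold_s:
                gaps.append((round(last - start, 2), round(now - last, 2)))
            last = now
    return gaps


# Example call (the drive letter is hypothetical; use the mounted iSCSI disk):
#   for offset, gap in measure_gaps(r"E:\stall_test.bin", duration_s=120):
#       print(f"stall of {gap}s starting {offset}s into the run")
```

Start the script, disable the NIC partway through, and compare the reported gap with what the adapter graphs show.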

Starwind 1
10.254.1.50 -> switch A
10.254.2.50 -> switch B
Starwind 2
10.254.3.50 -> switch A
10.254.4.50 -> switch B

HyperV 1
10.254.1.51 -> switch A
10.254.2.51 -> switch B
10.254.3.51 -> switch A
10.254.4.51 -> switch B
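
As a sanity check on the addressing above, the sketch below pairs each Hyper-V NIC with the target portals that share its subnet. The /24 prefix is an assumption (the netmask is not given in the post), and the helper names are hypothetical.

```python
import ipaddress

PREFIX = 24  # assumption: the post does not state the netmask

targets = {
    "Starwind 1": ["10.254.1.50", "10.254.2.50"],
    "Starwind 2": ["10.254.3.50", "10.254.4.50"],
}
initiators = {
    "HyperV 1": ["10.254.1.51", "10.254.2.51", "10.254.3.51", "10.254.4.51"],
}


def subnet(ip, prefix=PREFIX):
    """Return the network that ip belongs to for the given prefix length."""
    return ipaddress.ip_network(f"{ip}/{prefix}", strict=False)


def pair_paths(initiators, targets):
    """List (initiator host, initiator IP, target host, target IP) for same-subnet pairs."""
    paths = []
    for ihost, iips in initiators.items():
        for iip in iips:
            for thost, tips in targets.items():
                for tip in tips:
                    if subnet(iip) == subnet(tip):
                        paths.append((ihost, iip, thost, tip))
    return paths


for ihost, iip, thost, tip in pair_paths(initiators, targets):
    print(f"{ihost} {iip} -> {thost} {tip}")
```

Under that /24 assumption each Hyper-V NIC reaches exactly one target portal, so a single NIC failure would be expected to remove one path and leave the other three available.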

Are any of you familiar with this Microsoft KB article?
http://support.microsoft.com/kb/2522766
Anatoly (staff)
Staff
Posts: 1675
Joined: Tue Mar 01, 2011 8:28 am

Thu Dec 29, 2011 11:00 am

Could you please provide us with a detailed network diagram and specify exactly which NICs you have disabled, and on which servers?

Thank you
Best regards,
Anatoly Vilchinsky
Global Engineering and Support Manager
www.starwind.com
av@starwind.com
ggalley
Posts: 9
Joined: Wed Mar 30, 2011 5:21 am

Thu Dec 29, 2011 4:29 pm

I have attached a picture of the network layout.

After I disable or unplug a single NIC on store02, every iSCSI network connection hangs for 10-30 seconds.

Is it expected with MPIO that a single lost connection makes all clients stop sending traffic?
Attachments
Lab network
Starwind_V1.jpg (80.06 KiB) Viewed 8621 times
Anatoly (staff)
Staff
Posts: 1675
Joined: Tue Mar 01, 2011 8:28 am

Fri Dec 30, 2011 11:11 am

Dear ggalley,

Since you are using two servers, I assume you are using HA, so please see the attached diagram with the recommended settings (I haven't specified all the IPs, but I think you will get the idea):
Attachments
Drawing1.jpg
Drawing1.jpg (32.18 KiB) Viewed 8590 times
Best regards,
Anatoly Vilchinsky
Global Engineering and Support Manager
www.starwind.com
av@starwind.com
ggalley
Posts: 9
Joined: Wed Mar 30, 2011 5:21 am

Fri Dec 30, 2011 6:29 pm

Is this gap normal for MPIO iSCSI failover after a single NIC failure?
NetworkIOGap.png
NetworkIOGap.png (2 KiB) Viewed 8571 times
I made the changes you requested, but I am still getting the same results. I have configured iSCSI with multipathing enabled and multiple sessions, and I have also configured it without multipathing enabled; the results are the same.

Network activity while Bart's Stuff Test runs on hyper01 against a single mounted iSCSI drive:
NetworkIO.PNG
NetworkIO.PNG (50.64 KiB) Viewed 8573 times
This is getting too urgent for one response a day.
Anatoly (staff)
Staff
Posts: 1675
Joined: Tue Mar 01, 2011 8:28 am

Mon Jan 02, 2012 1:25 pm

Dear ggalley,

First of all, as far as I can see your ASM is valid, so you are welcome to use it (you can email us at support@starwindsoftware.com), and you can expect a faster response that way.

I would also like to ask you to clarify on exactly which NIC you see the gap you described.
Best regards,
Anatoly Vilchinsky
Global Engineering and Support Manager
www.starwind.com
av@starwind.com
anton (staff)
Site Admin
Posts: 4021
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands

Tue Jan 03, 2012 12:39 pm

Run a remote session with them and publish the results here. That should save quite a bit of turnaround time. Thanks!
Anatoly (staff) wrote: Dear ggalley,

First of all, as far as I can see your ASM is valid, so you are welcome to use it (you can email us at support@starwindsoftware.com), and you can expect a faster response that way.

I would also like to ask you to clarify on exactly which NIC you see the gap you described.
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software
