StarWind Virtual SAN free - Performance problems

Software-based VM-centric and flash-friendly VM storage + free version

Moderators: anton (staff), art (staff), Max (staff), Anatoly (staff)

hoba
Posts: 28
Joined: Mon Sep 18, 2017 6:44 am

Wed Oct 04, 2017 8:03 am

Hi Sergey,

I ran the iperf tests. I can get 10 Gbit/s between all 10 GbE interfaces, in both directions, between the vSAN nodes. I can also push 10 Gbit/s from the VMware server to the interfaces of the vSAN nodes. However, I only get around 4 Gbit/s from any of the vSAN nodes towards the VMware server. That sounds like some kind of settings problem on the VMware server? Do you have any advice here? The NICs are Intel X540-AT2:

Code:

Name    PCI          Driver      Link Speed      Duplex MAC Address       MTU    Description
vmnic1  0000:03:00.1 ixgbe       Up   10000Mbps  Full   0c:c4:7a:df:0f:7f 9000   Intel(R) Ethernet Controller X540-AT2
The changes in starwind.cfg had no effect. I still see massively degraded performance when adding more than 2 MPIO paths.
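
For completeness, the per-direction tests were done roughly like this (just a sketch with placeholder IPs, assuming iperf3 on both ends; the -R flag reverses the direction without swapping server and client):

Code:

# on the receiving host (vSAN node or VMware server)
iperf3 -s
# from the sending host towards the placeholder IP, 4 parallel streams
iperf3 -c 192.168.10.10 -P 4
# same pair of hosts, traffic flowing the other way
iperf3 -c 192.168.10.10 -P 4 -R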

Thanks
Holger
Delo123
Posts: 30
Joined: Wed Feb 12, 2014 5:15 pm

Wed Oct 11, 2017 6:18 am

Any updates on this? I never trusted those X540-(A)T2 NICs. I remember some very old driver version seemed to work much better on the VMware side, but I can't remember exactly when this was...
hoba
Posts: 28
Joined: Mon Sep 18, 2017 6:44 am

Wed Oct 11, 2017 6:43 am

I have created a ticket with VMware. The driver is neither developed nor supported by Intel; VMware builds and compiles it on their own from the Intel open-source code. I'll let you know once they have looked into it and hopefully have a solution. I'm not sure if the MPIO problem (issues when adding more than 4 paths) is related as well, but I'll have them check that one too.
Delo123
Posts: 30
Joined: Wed Feb 12, 2014 5:15 pm

Wed Oct 11, 2017 6:50 am

OK, thanks... As I said, I remember there were some issues with these NICs, so we abandoned them for exactly the reason you are referring to.

Maybe the LUN is trying to use both vSAN nodes as active optimized paths? Ideally only one node should be active optimized for a single LUN, no? I'm not sure where to check that in recent builds.
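
On the ESXi side you can at least see how the paths for a given device are flagged with something along these lines (sketch only; the naa. identifier is a placeholder):

Code:

# show the path selection policy and SATP/ALUA configuration for the device
esxcli storage nmp device list -d naa.xxxxxxxxxxxxxxxx
# list every path to the device and its current state
esxcli storage core path list -d naa.xxxxxxxxxxxxxxxx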
Boris (staff)
Staff
Posts: 805
Joined: Fri Jul 28, 2017 8:18 am

Thu Oct 12, 2017 7:40 am

hoba,
Keep the community posted about the updates from VMware on your ticket. Thanks in advance.
hoba
Posts: 28
Joined: Mon Sep 18, 2017 6:44 am

Thu Dec 21, 2017 4:38 pm

Hi,

After my ticket at VMware got lost on the desk of an employee who went on vacation, and after going through several steps of escalation, I at least got a hint from the next support technician. Performance is now actually fine.

It all comes down to not putting two vmkernel interfaces on the same vSwitch backed by a single physical NIC. I had done that to be able to use MPIO with all 4 paths, and to still have 2 paths if one of the vSAN nodes fails. After adding another NIC to the VMware server and putting a vmkernel interface on each of the NICs, I see decent performance, though I wonder why writes exceed 10 Gbit/s while reads max out at 10 Gbit/s. In any case, if I run tests from 2 VMware servers against 2 SAN nodes, I see the full performance on both nodes (no degradation while running disk benchmarks inside VMs on both VMware servers, with the disk files hosted on the vSAN). :-)
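
For anyone who wants to replicate the working layout: one vmkernel interface per physical NIC, each on its own vSwitch and port group. The second path looks roughly like this (sketch only; the vSwitch/portgroup names, IP address and vmhba number are placeholders, and the last line only applies if you use software iSCSI port binding):

Code:

# new standard vSwitch backed by the second physical NIC
esxcli network vswitch standard add --vswitch-name=vSwitch-iSCSI-B
esxcli network vswitch standard uplink add --vswitch-name=vSwitch-iSCSI-B --uplink-name=vmnic2
esxcli network vswitch standard set --vswitch-name=vSwitch-iSCSI-B --mtu=9000
# dedicated port group and vmkernel interface for the second path
esxcli network vswitch standard portgroup add --vswitch-name=vSwitch-iSCSI-B --portgroup-name=iSCSI-B
esxcli network ip interface add --interface-name=vmk2 --portgroup-name=iSCSI-B --mtu=9000
esxcli network ip interface ipv4 set --interface-name=vmk2 --ipv4=172.16.20.11 --netmask=255.255.255.0 --type=static
# only with software iSCSI port binding: attach the new vmkernel port to the iSCSI adapter
esxcli iscsi networkportal add --adapter=vmhba64 --nic=vmk2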

However, during the migration I will have to set up a single vSAN node without HA first and add the HA node later, syncing the data from the existing non-HA node. I'm going to create a separate post for that, as it is not related to this issue.

Thanks so far!

Holger
Boris (staff)
Staff
Posts: 805
Joined: Fri Jul 28, 2017 8:18 am

Thu Dec 21, 2017 5:39 pm

Holger,
thank you for getting back on this matter. At least now we can see it was a misconfiguration on the VMware side.
Mitmont
Posts: 4
Joined: Thu Sep 29, 2016 6:22 pm

Fri Jan 05, 2018 7:58 pm

Hello, if I may add to this thread: in our case it was 2012 R2 with the UI, on Dell servers with high-throughput NICs. What we came across was an issue where, once we hit a certain number of VMs, we started to receive tons of Excessive Flush Cache messages. We saw this using Wireshark on one of the members. We had seen it in 2012 R2 Clustered Storage Spaces as well, but didn't know how to handle it back in 2014.

We did come across a fix that resolved our issue. There is an article by Ard-Jan Barnas called "Tuning Windows 2012 - File System Part 1". We chose to make the following change per that article: TreatHostAsStableStorage. That was the biggest one. We also found some other tuning specific to our environment that we continue to deploy each time we create a new StarWind HA node. I'll add that this is what worked for us... it may or may not be the correct action for every situation. Proceed at your own risk.
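
For illustration only, the change itself is a single registry value set from an elevated command prompt. The key path below is my assumption for where it lives (the iSCSI target's Parameters key); verify it against the article before applying anything:

Code:

:: assumed location - confirm against "Tuning Windows 2012 - File System Part 1" first
reg add "HKLM\SYSTEM\CurrentControlSet\Services\WinTarget\Parameters" /v TreatHostAsStableStorage /t REG_DWORD /d 1 /f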
Boris (staff)
Staff
Posts: 805
Joined: Fri Jul 28, 2017 8:18 am

Sat Jan 06, 2018 6:25 am

Mitmont, thanks for sharing this info with the community.