Software-based VM-centric and flash-friendly VM storage + free version
-
naas_it
- Posts: 3
- Joined: Tue Oct 06, 2015 9:57 am
Tue Oct 06, 2015 10:26 am
Hi
I've set up a two-node virtual SAN in VMware following the guide here:
https://www.starwindsoftware.com/techni ... Sphere.pdf
It's mostly working fine, but when I switch off one of the VMs that Starwind Virtual SAN is running on (SWiSCSI2), the storage (DS-iSCSI) goes "inactive" in vSphere, even though SWiSCSI1 is still running. To resolve, I have to rescan the storage adapter in vSphere which brings the storage back online. Switching SWiSCSI2 back on brings the storage online too.
I guess the storage adapter is not automatically failing over to an alternative path when one of the VMs running Starwind goes down. Any ideas why this might happen and what I could do to troubleshoot?
Thanks!
-
nohope
- Posts: 18
- Joined: Tue Sep 29, 2015 8:26 am
Tue Oct 06, 2015 11:15 am
Hi
Have you configured all the automatic storage rescan parameters correctly, as described in this guide? I would advise you to check them.
-
darklight
- Posts: 185
- Joined: Tue Jun 02, 2015 2:04 pm
Tue Oct 06, 2015 1:03 pm
Also check the multipath policy in ESX. By default it's MRU (Most Recently Used). Switch to Round Robin for additional performance and better failover.
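For reference, the policy can also be changed from the ESXi shell rather than the vSphere client. The following is a minimal sketch, not something taken from the StarWind guide: it only builds the `esxcli storage nmp device set` command line for one device. The helper name `round_robin_command` and the device ID are hypothetical; real device IDs can be listed with `esxcli storage nmp device list`.

```python
import shlex


def round_robin_command(device_id: str) -> list[str]:
    """Build the esxcli call that sets the Round Robin PSP on one device.

    The command is only constructed, never executed.
    """
    # ESXi device identifiers normally start with one of these prefixes.
    if not device_id.startswith(("naa.", "eui.", "t10.", "mpx.")):
        raise ValueError(f"unexpected device identifier: {device_id!r}")
    return [
        "esxcli", "storage", "nmp", "device", "set",
        "--device", device_id,
        "--psp", "VMW_PSP_RR",  # VMware's Round Robin path selection plugin
    ]


if __name__ == "__main__":
    # Hypothetical device ID, for illustration only.
    print(shlex.join(round_robin_command("naa.0000000000000001")))
```

The printed line would need to be run on each ESXi host, once per StarWind device, since the path selection policy is set per device and per host.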
-
naas_it
- Posts: 3
- Joined: Tue Oct 06, 2015 9:57 am
Wed Oct 07, 2015 10:01 am
Thanks, will give these suggestions a go and let you know how I get on.
My understanding was that if one of the paths goes down it should fail over after a short period of downtime. However, I waited around 5 minutes when I was testing and the storage didn't come back online until I re-scanned in vSphere. Is this expected behavior?
I would have expected a failover to happen even if MRU was selected and the automatic storage rescan was incorrectly configured. Configuring these options would just speed up the failover process.
-
nohope
- Posts: 18
- Joined: Tue Sep 29, 2015 8:26 am
Wed Oct 07, 2015 4:38 pm
Hi again
Switching to the round-robin multipath policy can indeed speed up failover and improve performance in general. Configuring the automatic storage rescan, however, affects only the availability of the storage after one of the StarWind VSAN nodes has crashed.
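To check by hand what the automatic rescan is supposed to do, a rescan can be triggered manually from the ESXi shell. The sketch below is an assumption-laden illustration, not the script from the guide (which is not shown in this thread): it builds an `esxcli storage core adapter rescan` command line. The helper name `rescan_command` and the adapter name `vmhba33` are hypothetical.

```python
import shlex
from typing import Optional


def rescan_command(adapter: Optional[str] = None) -> list[str]:
    """Build the esxcli call that rescans storage adapters.

    With no adapter given, all adapters are rescanned.
    The command is only constructed, never executed.
    """
    cmd = ["esxcli", "storage", "core", "adapter", "rescan"]
    if adapter is None:
        return cmd + ["--all"]
    return cmd + ["--adapter", adapter]


if __name__ == "__main__":
    print(shlex.join(rescan_command()))
    # Hypothetical adapter name for the software iSCSI initiator.
    print(shlex.join(rescan_command("vmhba33")))
```

If a manual rescan brings the datastore back but the scheduled one does not, the scheduled task itself is the likely place to look.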
-
naas_it
- Posts: 3
- Joined: Tue Oct 06, 2015 9:57 am
Thu Oct 08, 2015 8:38 am
Hi
Did some troubleshooting of the automatic storage rescan last night and found that my syntax on one of the server scheduled tasks was incorrect. Amended and tested by running the scheduled task and checking "recent tasks" in vSphere to confirm it was now working. Also, changed to round robin multi-path policy on both ESXi hosts as suggested.
Seems to have made a real difference. Tested powering off both StarWind VMs and ESXi hosts last night and there was no noticeable downtime at all.
Thanks for your help!