Am I missing an option on Synchronization?


mooseracing
Posts: 91
Joined: Mon Oct 11, 2010 11:55 am

Mon Oct 11, 2010 12:00 pm

I have set up Hyper-V clusters, but I notice that whenever I restart one of the StarWind HA servers (v5.4.1599), StarWind starts re-syncing on startup, which forces my connections to disconnect and then corrupts VMs.

Is what I am experiencing normal, or is there an option that I need to check? I would expect my clusters to fail over to the secondary storage partner and use that until the primary is resynced.

Thanks
Paul
Aitor_Ibarra
Posts: 163
Joined: Wed Nov 05, 2008 1:22 pm
Location: London

Mon Oct 11, 2010 2:39 pm

On the Hyper-V servers, you need to make sure that they are set up to have a connection to both StarWind boxes, for each target. If you stick with default names, for each HA target, there will be two targets listed in the MS iSCSI initiator: xyz and xyzpartner. You also need to make sure that multipathing is enabled when you create the connections, and you need to have Microsoft's MPIO feature installed and enabled for iSCSI. All this needs to be done for every Hyper-V server in your cluster. Have you done all that?
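
The checklist above can be sketched as a small script. This is only an illustration, assuming the default naming (each HA target plus the same name with a `partner` suffix); the function name and the hand-typed lists are hypothetical, and the connected target names would have to be copied from the MS iSCSI initiator on each Hyper-V server.

```python
def missing_paths(ha_targets, connected, partner_suffix="partner"):
    """Return {ha_target: [target names this host is NOT connected to]}.

    ha_targets: base names of the HA targets (e.g. "xyz").
    connected:  target names currently connected on one Hyper-V server.
    Assumes default naming: the partner target is the base name + "partner".
    """
    connected = set(connected)
    missing = {}
    for name in ha_targets:
        expected = {name, name + partner_suffix}
        absent = sorted(expected - connected)
        if absent:
            missing[name] = absent
    return missing
```

An empty result for every Hyper-V server would mean each HA target has a connection to both StarWind boxes; it says nothing about whether MPIO itself is enabled.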

If you have done that and still have problems, then you should look at the 5.5 beta. Problems like the ones you've been having were rare, but possible, with earlier versions of 5.x.

Hope this helps....
mooseracing

Mon Oct 11, 2010 2:45 pm

Aitor_Ibarra wrote: Have you done all that?
Yeppers.
Is there any way to verify that MPIO installed correctly? I remember enabling it and then having to reboot.

There are 3 virtual disks managed by StarWind, all set up with HA. Currently 2 are resyncing, as they are about 1TB each. The other is only a 5GB witness disk; it resynced quickly and its connections were restored.

The other 4 iSCSI connections (2 for each disk, and 2 for each partner) all fail with "The target name is not found or is marked as hidden from login".
hixont
Posts: 25
Joined: Fri Jun 25, 2010 9:12 pm

Mon Oct 11, 2010 3:39 pm

I have a Hyper-V cluster running with a StarWind HA SAN configuration and I have no problems at all. This does sound like a multipathing issue. I'm running StarWind 5.4 for my SAN and Windows 2008 R2 server for the Hyper-V hosts. I am also using a CSV disk configuration in my Hyper-V cluster. No problems or corruption encountered so far.
Last edited by hixont on Mon Oct 11, 2010 4:44 pm, edited 1 time in total.
Aitor_Ibarra

Mon Oct 11, 2010 4:28 pm

mooseracing wrote: Is there any way to verify that MPIO installed correctly? I remember enabling it and then having to reboot.
Yes. There are 2 things you need to do.

On each Hyper-V server, open the MPIO control panel from Administrative Tools. On the first tab, you should see a list of MPIO devices; one of them should be the Microsoft iSCSI one. It won't be there until you set up a connection in the iSCSI initiator that uses MPIO. You may need to recreate the connection if you set it up before MPIO was installed.

For each HA target, in the iSCSI Initiator control panel, select the target or its partner and click Devices, then click MPIO. For a target where both StarWind nodes are working and in sync, you should see two paths listed, one for each server.

If a StarWind server is down or doing a full sync, the iSCSI initiator will show "reconnecting" in the target list, and MPIO will only show the working path.

I don't know about 5.4, but in the 5.5 beta, if you use write-back cache then fast sync is disabled, and an auto resync will be done using full synchronisation. While a full sync is in progress, the node that is resyncing will not accept connections. So if you were to, say, reboot StarWind and then reboot a Hyper-V machine before the resync was complete, the connection from Hyper-V would not succeed, and the initiator would not try to reconnect until you reboot again. It should still connect to the other node OK though.
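
The rules in the last paragraph can be written down as a toy model. This is only a sketch of the behaviour as described for the 5.5 beta, not StarWind code; the class and function names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class StarWindNode:
    # Hypothetical model of one HA node; field names are illustrative only.
    name: str
    write_back_cache: bool = False
    full_sync_in_progress: bool = False


def fast_sync_available(node):
    """Per the description: write-back cache disables fast sync."""
    return not node.write_back_cache


def accepts_connections(node):
    """Per the description: a node doing a full sync refuses connections."""
    return not node.full_sync_in_progress


def usable_paths(nodes):
    """Names of the nodes an initiator could connect to right now."""
    return [n.name for n in nodes if accepts_connections(n)]
```

With one node in full sync and its partner healthy, `usable_paths` returns only the partner, which matches MPIO showing a single working path.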
mooseracing

Mon Oct 11, 2010 4:53 pm

Well, I would say MPIO is installed correctly then. In MPIO properties, MSFT2005iSCSIBusType_0x9 is the first Device ID listed on both of my Hyper-V servers.

Then, when looking at the iSCSI drive (the smaller witness disk), MPIO shows both StarWind servers connected and MPIO set up for Round Robin.

I won't be able to check the other drives since they are in resync. It probably won't be until tomorrow, as they are at 20% currently after about 4hrs.
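
As a rough check on the "until tomorrow" estimate, here is a minimal sketch that extrapolates linearly from the progress so far. It assumes the sync rate stays constant, which may not hold under load; the function name is hypothetical.

```python
def estimate_remaining_hours(percent_done, hours_elapsed):
    """Linear extrapolation of the remaining sync time.

    Assumes a constant sync rate, which is an approximation.
    """
    if not 0 < percent_done <= 100:
        raise ValueError("percent_done must be in (0, 100]")
    total_hours = hours_elapsed * 100 / percent_done
    return total_hours - hours_elapsed


# 20% after about 4 hours: 20 hours in total, so about 16 hours left.
print(estimate_remaining_hours(20, 4))  # 16.0
```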
mooseracing

Tue Oct 12, 2010 2:01 pm

It is kind of looking better :lol:

I logged on today to see that synchronization had completed, but the IMG targets had a red triangle over them, so I just deleted them and created new ones. What was odd was that out of the 3 that completed syncing, only my small witness disk had the log.dat file.

For the new one I created, I did NOT do a first sync. I added the storage to my cluster, then rebooted one of the StarWind HA boxes. The cluster did NOT lose the connection to the target. The HA box is up and resyncing and I still have a connection to it. Right now I am moving a virtual machine over to test it out more.
Max (staff)
Staff
Posts: 533
Joined: Tue Apr 20, 2010 9:03 am

Wed Oct 13, 2010 7:39 am

Aitor, many thanks for your help here :)
As for the synchronization process, it will switch to a full sync if the data difference exceeds the applicable limit (in this case a full sync is performed to be sure that every bit is correct).
Max Kolomyeytsev
StarWind Software
mooseracing

Tue Oct 19, 2010 12:03 pm

OK, so after recreating them MPIO seems to be working, but I am still having some odd issues with syncing.

I just upgraded to the latest minor release and am in the process of another resync. Either way, it seems like clockwork: when I come in in the morning, my servers are out of sync and I have to do a manual full sync. I don't understand why the StarWind software isn't doing a sync automatically.

I know there is a semi-high load on the one partner server at night since it is doing backups, but wouldn't StarWind still be able to do syncs?
Max (staff)

Tue Oct 19, 2010 2:57 pm

Have you updated from a version without autosync?
If so, you will need to recreate the HA device using the existing images, enabling autosync and specifying not to synchronize the virtual disks.
mooseracing

Fri Nov 05, 2010 1:34 pm

So things have been going well since I updated to 5.4 on both nodes, except that on Monday and last night I lost the connection to my partner target.

I get these in the event log:
The initiator could not send an iSCSI PDU. Error status is given in the dump data.
Connection to the target was lost. The initiator will attempt to retry the connection.
Initiator failed to connect to the target. Target IP address and TCP Port number are given in dump data.

Then, not even a second later, it reconnected.

It seems to drop out while the partner is running its backups of our data. Is it feasible that the server is under too much load to respond to StarWind?
anton (staff)
Site Admin
Posts: 4021
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands

Fri Nov 05, 2010 2:24 pm

Not really. This is a network layer failure. The upcoming V5.5 will have heartbeat, and V5.6 will have multiple connection channels, so you can mark this one as "on the way to South" (c) ...
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

mooseracing

Fri Nov 05, 2010 6:15 pm

Well, these are both directly connected, so I replaced the cord. I guess I will go from there and work my way up.
anton (staff)

Fri Nov 05, 2010 8:35 pm

Please keep us updated about your progress. Also we'd be happy if you'd take part in V5.5 Beta program.
mooseracing

Mon Nov 08, 2010 6:19 pm

After replacing the cable, I am now getting the same error on both iSCSI machines, whereas before it was only on the primary.

I hate to ask, but is there an easier way to track down which end the problem is on? I'm feeling now that I need to shut down one box and install another NIC. Then, if that doesn't fix it, shut down the other box and install a NIC. That is a lot of time in between, and downtime I don't really want, as I am behind as it is on this project.

I'd love to do the beta, but I have had no time for testing in the past year and won't in the upcoming one, since the dept has shrunk, which has created a lot of extra work :?
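
One low-effort option before swapping hardware might be a timestamped TCP probe run from each box against the other. The sketch below assumes the targets listen on the default iSCSI port 3260; the function names, the polling interval and the partner address are placeholders, and drops shorter than the polling interval can be missed.

```python
import socket
import time
from datetime import datetime

ISCSI_PORT = 3260  # default iSCSI TCP port; change if the target uses another


def can_connect(host, port=ISCSI_PORT, timeout=2.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def watch(host, port=ISCSI_PORT, interval=1.0, rounds=None):
    """Probe repeatedly and print a timestamped line for each failure.

    rounds=None loops until interrupted; returns the number of failures.
    """
    failures = 0
    done = 0
    while rounds is None or done < rounds:
        if not can_connect(host, port):
            failures += 1
            stamp = datetime.now().isoformat(timespec="seconds")
            print(f"{stamp} {host}:{port} unreachable")
        done += 1
        time.sleep(interval)
    return failures
```

Running something like `watch("<partner IP>")` on both machines overnight and comparing the timestamps with the event log and the backup window would not replace a NIC swap, but it may show whether the drops are one-directional or tied to a particular time.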

Off to bang my head some more :mrgreen: