HA Node went down hard - how do I bring it back online?

Anatoly (staff)
Staff
Posts: 1675
Joined: Tue Mar 01, 2011 8:28 am

Fri Feb 01, 2013 12:00 pm

I'm glad that you have found out how to change the IP of the Mgmt interface.

For now we do not have the documentation that you are asking for. But basically everything is pretty much simple - you just need to use Add Replica and point it at the new (healthy) node, and of course specify the SyncChannel and HeartBeat interfaces. That is it.
Best regards,
Anatoly Vilchinsky
Global Engineering and Support Manager
www.starwind.com
av@starwind.com
Max (staff)
Staff
Posts: 533
Joined: Tue Apr 20, 2010 9:03 am

Fri Feb 01, 2013 12:14 pm

StarWind is smart enough to take care of that. In any case, until the targets on the second SAN are synchronized, they won't connect to the client servers.
So it's just a matter of specifying the parameters of the new second node and choosing the currently alive node as the synchronization source.
Max Kolomyeytsev
StarWind Software
jhamm@logos-data.com
Posts: 78
Joined: Fri Mar 13, 2009 10:11 pm

Fri Feb 01, 2013 2:12 pm

I’m sorry – I need some more specifics:

“But basically everything is pretty much simple - you just need to use Add Replica and point it at the new (healthy) node”

But what steps do I need to do beforehand?

1) Do I need to remove the old replicas and targets on the bad node first before I add the replica back from the good node? Does the bad node need to have its old targets/devices wiped out first?

2) Do I need to remove the iSCSI connections from the HyperV pointing to the bad node prior to doing the above procedure?

And then, after the nodes are synced up, do I need to create new HyperV iSCSI connections?

Thanks!
Jeff
jeddyatcc
Posts: 49
Joined: Wed Apr 25, 2012 11:52 pm

Fri Feb 01, 2013 4:17 pm

I will try to help, but please understand that I am speaking from my own experience, so keep using common sense as you recover.

These are the steps I have taken in the past to "fix" this kind of thing when I purposefully broke it in testing:

1. Open the StarWind Management Console on the known good machine.
2. Remove the bad node as a replica for all of the HA devices.
3. Rebuild the new machine and provision storage as on the old dead one. I kept the IP addresses but changed the names.
4. Install the same version of StarWind that is on the known good system. (I haven't tried with different versions; it might work.)
5. Back in StarWind on the good machine, add replicas for the newly rebuilt server, creating new img files and choosing a full synchronization with the known good node as the source.
6. While it was synchronizing, I went to my cluster machines and rediscovered the iSCSI targets, deleted any favorites for the old machine, and added connections to the new targets. They will say "connecting" until synchronization finishes.

Overall, it is not so different from doing Windows updates or creating a new HA device; you just need to be careful which node you use as the source for the sync. As for Max and Anatoly's point about StarWind taking care of itself, they are correct: as soon as the other node went down, the good server should have become the primary.
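
For anyone who wants to keep the order of the steps above straight during a recovery, here is a minimal sketch of them as a checklist in Python. It does not talk to StarWind at all: the step names, the RecoveryPlan class, and the node names are all made up for illustration. The only rules it enforces are the ones from this thread, namely that the steps are done in order and that the known good node is the synchronization source.

```python
from dataclasses import dataclass, field

# Hypothetical step names mirroring the six steps listed in this thread.
# None of these are StarWind commands or API calls.
RECOVERY_STEPS = [
    "open_console_on_good_node",
    "remove_bad_node_replicas",
    "rebuild_and_provision_new_node",
    "install_matching_starwind_version",
    "add_replicas_with_full_sync",
    "rediscover_iscsi_targets_on_cluster_hosts",
]


@dataclass
class RecoveryPlan:
    good_node: str      # the surviving node that holds the current data
    new_node: str       # the rebuilt node that will receive the data
    sync_source: str    # node chosen as source when adding the replica
    completed: list = field(default_factory=list)

    def validate(self):
        # The thread's main caution: be careful which node is the sync source.
        if self.sync_source != self.good_node:
            raise ValueError("sync source must be the known good node")
        if self.new_node == self.good_node:
            raise ValueError("new node and good node must be different")

    def next_step(self):
        # Return the next unfinished step, or None when everything is done.
        if len(self.completed) < len(RECOVERY_STEPS):
            return RECOVERY_STEPS[len(self.completed)]
        return None

    def complete(self, step):
        # Refuse to record a step that is out of order.
        self.validate()
        expected = self.next_step()
        if step != expected:
            raise RuntimeError(f"expected {expected!r}, got {step!r}")
        self.completed.append(step)
```

Usage would be along the lines of RecoveryPlan(good_node="san1", new_node="san2", sync_source="san1"), then calling complete() with each step name as you finish it in the console. Picking the rebuilt node as sync_source raises an error before anything is recorded.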
Anatoly (staff)
Staff
Posts: 1675
Joined: Tue Mar 01, 2011 8:28 am

Mon Feb 04, 2013 10:39 am

I can confirm that the steps provided by jeddyatcc are correct. One thing I'd like to add: different builds will cooperate well, but we strongly recommend running the same build on all SAN boxes.
jhamm@logos-data.com
Posts: 78
Joined: Fri Mar 13, 2009 10:11 pm

Tue Feb 05, 2013 8:25 pm

Thanks jeddyatcc - your instructions worked perfectly. Glad to finally be back up with 2 HA nodes! 8)
jeddyatcc
Posts: 49
Joined: Wed Apr 25, 2012 11:52 pm

Thu Feb 07, 2013 12:58 pm

Good to hear!!
Max (staff)
Staff
Posts: 533
Joined: Tue Apr 20, 2010 9:03 am

Mon Feb 11, 2013 11:30 am

I don't think it's outlined in a pdf, although it's available in the help section of StarWind -
Working with devices -> High availability device ->