Xenserver DR Best Practices for HA
Posted: Mon Jul 19, 2010 7:07 pm
Please publish something on Xenserver best practices for HA. This seems to be a "broken" part of the software. If a server goes down, we need to know the steps to take for it to come back up.
Using two 4 TB iSCSI units:
Examples:
1. Storage server 1 needs to be rebooted for maintenance. HA properly fails over to storage server 2; however, multipath now shows only 1 of 1 path active, even after storage server 1 comes back online. If you try to do a sync at that point (or have the setup configured to sync automatically), it takes FOREVER, to the point that it renders the system unusable (1% after almost 2 hours?). Hence the only way to get it back online and in HA mode is to force remove the devices. What is the regular process for re-establishing the multipath environment and getting HA back up and functional?
2. We needed to power down either both storage servers or the XenServer hosts for maintenance. The servers come back online and the iSCSI virtual disk storage shows that it is not connected at all. No commands will allow it to reconnect properly, because it can no longer log in (it reports an iSCSI login failure). We may have to start over and recreate the entire setup. Again, what is the process here?
Apparently, any time we actually kick into HA mode, we have to start all over again with the console commands to get back up and running. There is no way to automatically reconnect to the original or even resync the devices in a reasonable timeframe to re-establish HA (it seems that when you resync, the entire storage array is offline until the sync completes?).
Please let me know what processes have been developed so that we can use this in a true HA environment in Xen.