dmansfield wrote: Good afternoon. Moments after the time change on Sunday, our 10 Gigabit sync channel dropped and our iSCSI connections were lost. Errors compounded from there and both servers initiated a reboot on their own. Because both servers rebooted at approximately the same time, both of our HA datastores went out of sync, and we were forced into an 8+ hour recovery. We have two StarWind HA Enterprise servers running primarily HA datastores. The servers are Dell T710s running Windows Server 2008 R2. We have contacted Dell; they scoured our log files and found no hardware errors. Please assist us in determining the cause of Sunday's issues so that we can prevent a recurrence. Thank you.
dmansfield wrote: By "Good", are you saying the changes Dell proposed are okay to implement and won't conflict with StarWind? I am not sure what Dell is referring to regarding the iSCSI settings. They did say that it may be fine, but it is not a setup they are used to seeing. Can we do a remote session with one of your engineers to take a quick look?
dmansfield wrote: Update for anyone running Intel 10Gb NICs for the sync channel. We were running two direct connections teamed in "Adapter Fault Tolerance" mode, with one NIC active and the other on standby. This teaming is what was causing the servers to crash. We removed the team and are now running with just one NIC enabled on each side, and everything works.
clayton@mcc911.org wrote: You still never said why the servers rebooted?