Understanding HA setup and performance
Posted: Thu Apr 08, 2010 9:07 pm
Hi All,
I am new here and this is my first post. I am testing HA on the latest release of StarWind iSCSI and have some questions about failover performance. My setup involves two storage servers (server1 and server2, acting as the HA pair) and a Hyper-V server that runs the VM. I followed the technical guide called "hyperv and high availability storage." I am not using Microsoft clustering in any way, since I only have one Hyper-V server. I set up HA storage on server1 and server2 and, using the Microsoft iSCSI initiator on the Hyper-V server, connected to the HA storage, which is on both server1 and server2. Server1 was set as primary and server2 as secondary.
I was able to export a Windows 7 VM to the shared HA storage from the Hyper-V server and then import it back into Hyper-V. The Windows 7 VM ran very smoothly while server1 was primary (MPIO is working and set to round robin, as suggested), and server1 and server2 were fully synced before I imported the VM.
So now it was time to test StarWind's HA. The Windows 7 VM was running and pinging google.com at this point. I then killed the StarWind process from Task Manager on server1 (to test HA), and Windows 7 continued without any issues or loss of performance. A couple of seconds later the StarWind software reported the loss of connection to server1. At that point I restarted the service on server1 and logged back on to it. Now I see a yellow exclamation mark indicating there is a problem with HA and sync. At this point Windows 7 is clearly running from server2, since I intentionally killed the process on server1.
I then selected "fully sync" to resync server1. First of all, it took forever to start. Then I started noticing performance problems in the Windows 7 VM: very slow responses on everything. For example, when I clicked X to close a window, it took a while to close.
So I wanted to investigate HA further and see what was going on. I initially set up the HA disk as 50 GB on both servers. On server2, the image file for the HA device was barely 1 GB. That surprised me, since I expected it to be at least 20 to 30 GB, which is the default install size of Windows 7. On the primary, server1, the image file was slightly more than 50 GB, which seems to make sense.
The sync was taking so long that I decided to switch to "fast sync," and surprisingly it was still slow.
So my question is: how does HA really work when it fails over? Is a performance hit this bad expected for one VM? What if there are many VMs on the shared HA storage? What is the performance hit, as a percentage or a number, on the failover server?
I have fairly decent systems for this testing, and the performance hit was bad enough to make me concerned about further testing for production use. The Hyper-V server is a dual quad-core Dell 2900 with a hardware RAID card, with RAID 1 and RAID 5 for data. Server1 and server2 are also 2900 boxes with 12 GB of memory and RAID 5 (Dell hardware RAID) for the data volume where the HA storage is.
So is this how HA works, with a slow failover, or is something wrong with my setup? Can anyone answer this? I am worried about running multiple production VMs on HA storage this slow.
Thank you in advance for your answers.