Poor performance

Software-based VM-centric and flash-friendly VM storage + free version


Chick3nfingding
Posts: 5
Joined: Mon Feb 13, 2017 1:07 am

Mon Feb 13, 2017 1:58 am

Hi Starwind Support,

We have 2x Dell R730s in a lab environment here with the following specs:

- 8x Intel SSDSC2BX40 (400 GB each):
  - 2x in RAID1 for the OS
  - 6x in RAID10 for the StarWind image file drive (D:)
- 2x Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz
- 128 GB RAM
- 1x Intel X710 10GbE card

We have MPIO configured for iSCSI, with 127.0.0.1 as the active path and the remote host as standby. I can confirm the loopback accelerator driver is working, as per the logs.
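For what it's worth, here is roughly how that layout can be double-checked from PowerShell on the StarWind host (generic Windows iSCSI/MPIO tooling, not StarWind-specific commands):

# List iSCSI sessions and which initiator portal each one uses
Get-IscsiSession | Select-Object TargetNodeAddress, InitiatorPortalAddress, IsConnected
# Show MPIO-claimed disks and the load-balance policy applied to each
mpclaim.exe -s -d
# Default MSDSM policy for newly claimed devices (FOO = Fail Over Only, LQD = Least Queue Depth)
Get-MSDSMGlobalDefaultLoadBalancePolicy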

We are seeing quite poor performance on non-replicated and replicated image files.

I've attached a screenshot showing raw disk performance for D:, where the image files sit. It is easily capable of 6 GB/s.

Replicated and non-replicated image files show similar results. I'm using the ATTO benchmark.
[Attachment: Raw.png - raw disk performance on D:]
Here is a replicated image file (thick-provisioned, 4 GB cache):
[Attachment: thick-prov.png - replicated thick-provisioned image file results]
Here is a replicated image file (LSFS, 4 GB cache):
[Attachment: lsfs.png - replicated LSFS image file results]
Let me know if you need any more information.
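If it helps, I can also cross-check the ATTO numbers from the command line with Microsoft's diskspd; something along these lines (the test file, size and flags are only an illustrative sketch, not the exact ATTO workload):

# Sequential 64K reads against the image-file volume, caching disabled (parameters are examples)
diskspd.exe -c10G -b64K -d30 -t4 -o8 -si -w0 -Sh D:\testfile.dat
# Same pattern, 100% writes
diskspd.exe -c10G -b64K -d30 -t4 -o8 -si -w100 -Sh D:\testfile.dat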
Chick3nfingding
Posts: 5
Joined: Mon Feb 13, 2017 1:07 am

Mon Feb 13, 2017 3:18 am

[Attachment: starwindtest.png - results from the StarWind test tool]
See above for results from your testing tool as well.
Michael (staff)
Staff
Posts: 319
Joined: Thu Jul 21, 2016 10:16 am

Wed Feb 15, 2017 5:31 pm

Hello!
Please update StarWind to the latest build. The main update steps are described in the KB article: https://knowledgebase.starwindsoftware. ... d-version/
Please check the RAID settings on both nodes. The following settings are recommended for SSDs (a command-line sketch follows the list):
Disk cache policy: default;
Read policy: No Read Ahead;
Write Policy: Write Through;
Stripe Size: 64 KB
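If perccli (Dell's storcli-based CLI for the PERC, assuming it is installed) is available, these policies can also be checked and applied from the command line; a rough sketch, with /c0/v1 as placeholder indexes for your controller and RAID10 virtual disk:

# Show the current policies for the virtual disk
perccli64 /c0/v1 show all
# Apply Write Through, No Read Ahead and default physical-disk cache
perccli64 /c0/v1 set wrcache=wt
perccli64 /c0/v1 set rdcache=nora
perccli64 /c0/v1 set pdcache=default
# Note: the 64 KB stripe size can only be chosen when the virtual disk is created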
First, create the devices without L1 cache, connect them in the iSCSI Initiator, run the StarWind test one more time, and share the results with us!
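If it helps, the loopback connection for such a test device can be scripted along these lines (the IQN is only an example; substitute the one shown in the StarWind console):

# Register the loopback portal and connect the test target with MPIO enabled
New-IscsiTargetPortal -TargetPortalAddress 127.0.0.1
Connect-IscsiTarget -NodeAddress "iqn.2008-08.com.starwindsoftware:yourserver-testdevice" -TargetPortalAddress 127.0.0.1 -IsPersistent $true -IsMultipathEnabled $true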
Chick3nfingding
Posts: 5
Joined: Mon Feb 13, 2017 1:07 am

Fri Feb 17, 2017 5:43 am

Hi Michael,

I've upgraded both servers to version 8.0.0.10547, which is the latest per your release notes.

Both servers are configured as follows on the RAID10 virtual disk:

Disk cache policy: default;
Read policy: No Read Ahead;
Write Policy: Write Through;
Stripe Size: 64 KB

Here is the same test re-run:
[Attachment: latest.png - StarWind test after the update and RAID policy changes]
Any other ideas?
Michael (staff)
Staff
Posts: 319
Joined: Thu Jul 21, 2016 10:16 am

Tue Feb 21, 2017 11:35 am

Hello!
Thank you for the feedback.
Please follow the steps below:
1. Restart the StarWind nodes one by one - check that the devices are synchronized on both nodes before each restart.
2. Change the settings for the StarWind device and target by assigning NUMA nodes in the StarWind configuration file (C:\Program Files\StarWind Software\StarWind\StarWind.cfg). A scripted sketch of steps 2.1-2.4 follows after this list.
It is recommended to set different nodes for devices and targets:
- the device should be assigned to Node 0;
- the target should be assigned to Node 1.
2.1. Stop the StarWind VSAN service on one node;
2.2. Open StarWind.cfg there, find the line for the corresponding device, and add the NUMA node information (see the example below):
<device file="My Computer\D\CSV1\CSV1.swdsk" name="imagefile1" node="0" />
2.3. Find the line for the corresponding target and add the NUMA node information as well:
<target name="iqn.2008-08.com.starwindsoftware:test-csv1" devices="HAImage1" alias="CSV1" clustered="Yes" node="1"/>
2.4. Save StarWind.cfg and start the StarWind VSAN service;
2.5. Wait until synchronization completes, then repeat steps 2.1-2.4 on the partner node.
3. Run performance tests one more time and let us know about the results.
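A rough PowerShell sketch of steps 2.1-2.4 on one node; the service name StarWindService is an assumption (verify with Get-Service *StarWind*), and the node="..." attributes themselves are still edited by hand:

# Assumed service name - verify with: Get-Service *StarWind*
$svc = "StarWindService"
$cfg = "C:\Program Files\StarWind Software\StarWind\StarWind.cfg"
Stop-Service -Name $svc       # 2.1 stop the StarWind VSAN service on this node
Copy-Item $cfg "$cfg.bak"     # keep a backup before editing
notepad $cfg                  # 2.2/2.3 add node="0" to the <device> line, node="1" to the <target> line
Start-Service -Name $svc      # 2.4 start the service again
# 2.5 wait for synchronization to finish, then repeat on the partner node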
Alexey Sergeev
Posts: 26
Joined: Mon Feb 13, 2017 12:48 pm

Wed Feb 22, 2017 7:50 am

Do you recommend these settings for all kinds of configurations, or only in this particular case?
What about flat devices on RAID 10 with spindle drives?
There is no information about separating NUMA nodes in your best practices.
Michael (staff)
Staff
Posts: 319
Joined: Thu Jul 21, 2016 10:16 am

Thu Feb 23, 2017 12:29 pm

Hello!
It makes sense to tune the performance only for fast (SSD/NVMe) storage. We will publish a separate document about it in the near future.
Chick3nfingding
Posts: 5
Joined: Mon Feb 13, 2017 1:07 am

Thu Feb 23, 2017 11:54 pm

Hi Michael

I've done what you requested. Here are the latest test results:

As you can see, it's still lacking. Do you have any other ideas?
[Attachment: latest.png - StarWind test after the NUMA node changes]
Michael (staff)
Staff
Posts: 319
Joined: Thu Jul 21, 2016 10:16 am

Fri Feb 24, 2017 3:22 pm

Hello!
Thank you for updating us.
Have you restarted StarWind nodes one by one?
Please try adding one more 127.0.0.1 session for the "local" targets. Additionally, try to set another MPIO policy - for example, Least Queue Depth.
I believe it should help.
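Something along these lines should add the extra loopback session and switch the policy; the IQN is only a placeholder, and 4 is mpclaim's code for Least Queue Depth:

# Add one more session to the "local" target over 127.0.0.1 (example IQN)
Connect-IscsiTarget -NodeAddress "iqn.2008-08.com.starwindsoftware:yourserver-csv1" -TargetPortalAddress 127.0.0.1 -IsPersistent $true -IsMultipathEnabled $true
# Make Least Queue Depth the default MPIO policy...
Set-MSDSMGlobalDefaultLoadBalancePolicy -Policy LQD
# ...or set it per MPIO disk, e.g. for disk 1:
mpclaim.exe -l -d 1 4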
Chick3nfingding
Posts: 5
Joined: Mon Feb 13, 2017 1:07 am

Tue Feb 28, 2017 4:26 am

Hi Michael

We're still not getting anywhere.

I've added additional local paths to 127.0.0.1 and changed the MPIO policy to Least Queue Depth, then back to Failover Only. There is no difference in performance.

Would one of your reps like to log on to the servers remotely to have a look?
Michael (staff)
Staff
Posts: 319
Joined: Thu Jul 21, 2016 10:16 am

Wed Mar 01, 2017 2:13 pm

Hello!
Yes, please submit a support case here: https://www.starwindsoftware.com/support-form.
Mitmont
Posts: 4
Joined: Thu Sep 29, 2016 6:22 pm

Mon Mar 27, 2017 3:42 am

I have the same type of issue, where the performance of our nodes is very poor. In Resource Monitor we see 300-1000 ms latency; in the VMs it is even worse, and the user experience is very poor. After moving the disks to the second member, rebooting, and returning the disks to the newly rebooted member, the same condition exists.

We've had a SW support engineer do a remote session. He tweaked the MPIO policy to Least Queue Depth with no change, then recommended upgrading to version 8.0.10779. We were also told that our configuration for L1 and L2 was incorrect and that best practices change weekly or every few weeks.

We're running Dell gen 13 servers with 768 GB RAM, dual procs, 12x 6 TB HDDs and 4x 1.6 TB NVMe drives. Our cache configuration is 225 GB L1 and 1.4 TB L2 for 16 TB HDD partitions; these seem to be roughly the 1% and 10% sizing recommended by SW last year. Our disk configuration is RAID5 with the L1 and L2 cache as described above. We have a high VM count in a converged Hyper-V environment.

Is anyone running large quantities of VM disks on SW? We're at about 40% free space on the 3 partitions. How much free space is everyone else running? What are everyone else's response times? One other thing: we even saw the same poor performance on a SW SSD disk, where SW showed over 400 ms on the one VM disk on the SSD. We have two other SW nodes that are running correctly, with response times of 1-3 ms, configured in the same way with the same type of loads.
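For reference, a quick back-of-the-envelope check of our sizing against the ~1% / ~10% guideline mentioned above, assuming 16 TB usable per partition:

# Rough L1/L2 cache sizing check (the 1%/10% rule and the 16 TB figure are assumptions)
$partitionGB = 16 * 1024
"L1 target (~1%):  {0:N0} GB (we run 225 GB)" -f ($partitionGB * 0.01)    # ~164 GB
"L2 target (~10%): {0:N0} GB (we run ~1433 GB)" -f ($partitionGB * 0.10)  # ~1638 GB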

Thanks
Ivan (staff)
Staff
Posts: 172
Joined: Thu Mar 09, 2017 6:30 pm

Thu Mar 30, 2017 7:38 pm

Hello Mitmont,
Please log another support case here https://www.starwindsoftware.com/support-form and one of our engineers will be more than happy to double check your configuration.
dfir
Posts: 10
Joined: Sat Apr 29, 2017 6:08 pm

Sat Apr 29, 2017 8:52 pm

Hi Chick3nfingding

Have you checked your iSCSI NIC settings within Windows on the StarWind server?
I saw a dramatic increase in throughput when I changed the NIC settings profile from the standard one, called "Standard Server", to "Low Latency" (within the advanced options for the NIC).
I have the X520 NIC, so I'm not sure if this is an option for you.

My setup is 8x SSDs in RAID5 without any caching on the controller or in the StarWind iSCSI. I'm only using 1x 10GbE, and it now saturates that link in one direction (not sure why read is slower than write).
The only two changes I made were to enable Jumbo Frames and switch to the Low Latency profile.
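For reference, the same kind of changes can be checked or made from PowerShell; the adapter name below is a placeholder, the exact DisplayName/DisplayValue strings vary per driver, and the Intel "Profile" setting may only be exposed through PROSet, so treat this as a sketch:

# List the advanced properties the driver exposes (look for "Profile" and "Jumbo Packet")
Get-NetAdapterAdvancedProperty -Name "iSCSI NIC"
# Enable Jumbo Frames (the value string differs per driver, e.g. "9014 Bytes" or "9014")
Set-NetAdapterAdvancedProperty -Name "iSCSI NIC" -DisplayName "Jumbo Packet" -DisplayValue "9014 Bytes"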

Local RAID:
[Attachment: local storage.PNG - local RAID benchmark]
Jumbo Frames enabled and Standard Server profile:
[Attachment: profile-standardserver.PNG - benchmark with Jumbo Frames and the Standard Server profile]
Jumbo Frames enabled and Low Latency profile:
[Attachment: profile-lowlatency.PNG - benchmark with Jumbo Frames and the Low Latency profile]
dfir
Posts: 10
Joined: Sat Apr 29, 2017 6:08 pm

Sat Apr 29, 2017 8:52 pm

[Attachment: NIC settings.PNG - NIC advanced settings]