Problem with the Flash Cache (L2 Cache) Performance

sa86
Posts: 2
Joined: Wed Aug 10, 2016 8:15 am

Wed Aug 10, 2016 1:24 pm

Hello

I'm testing StarWind Virtual SAN v8 on two retired servers with the Free license and have a problem with the L2 cache performance.
From what I read in the whitepapers, the L2 cache with default settings (write-through mode) should improve read operations, but in my tests this was not the case.
I hope this post is not too overloaded with information... ;)

Here are the Config Details:
Hardware:
  • iSCSI Target
    Server: HP SE326M1
    RAID Controller: HP Smart Array P410 with 1GB cache + BBU
    Disks: 24x 600GB SAS 10k, 1x 120GB SATA SSD (an old OCZ Vertex 3, just for testing the caching)
    CPU: 2x Xeon X5660 (6x 2.80GHz)
    RAM: 144GB
  • iSCSI Client
    Server: HP SE316M1
    RAID Controller: HP Smart Array P410 with 1GB cache + BBU
    CPU: 2x Xeon L5520 (4x 2.27GHz)
    RAM: 144GB
Storage Config:
  • 2x 600GB SAS 10k RAID1 for the OS
  • 22x 600GB SAS 10k RAID10, stripe size 64KB
  • 1x 120GB SATA SSD RAID0, stripe size 64KB (disk for cache testing)
Software:
  • iSCSI Server OS: Windows Server 2012 R2, fully patched
  • iSCSI Client OS: Windows Server 2012 R2, fully patched
  • StarWind Version: 8.0.0.9781
Network Config (Server/Client):
  • 2x 1GBit NICs directly connected from the target to the client
  • Everything except IPv4 is disabled on these NICs; WINS/NetBIOS is disabled
  • Jumbo frames are set to 9014 bytes on each NIC; offloading options are also active
  • Each NIC is in a different subnet
  • MPIO for iSCSI disks is activated and working (see the sketch below)
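For reference, the NIC and MPIO part of this setup can be scripted; here is a minimal PowerShell sketch (the adapter names, portal IPs and the target IQN below are placeholders, not my exact values):

# Jumbo frames (9014 bytes) on both iSCSI NICs -- adapter names are placeholders
"iSCSI1", "iSCSI2" | ForEach-Object {
    Set-NetAdapterAdvancedProperty -Name $_ -RegistryKeyword "*JumboPacket" -RegistryValue 9014
}

# Install MPIO and let the Microsoft DSM claim iSCSI devices (a reboot may be required)
Install-WindowsFeature -Name Multipath-IO
Enable-MSDSMAutomaticClaim -BusType iSCSI

# One portal per subnet, one session per path; MPIO then aggregates the sessions
New-IscsiTargetPortal -TargetPortalAddress 10.10.1.1
New-IscsiTargetPortal -TargetPortalAddress 10.10.2.1
Connect-IscsiTarget -NodeAddress "iqn.2008-08.com.starwindsoftware:target1" `
    -TargetPortalAddress 10.10.1.1 -IsMultipathEnabled $true -IsPersistent $true
Connect-IscsiTarget -NodeAddress "iqn.2008-08.com.starwindsoftware:target1" `
    -TargetPortalAddress 10.10.2.1 -IsMultipathEnabled $true -IsPersistent $true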
StarWind Disk Config:
  • 1x 500GB image on the 22x600GB RAID10, L1 cache 3GB write-through
  • 1x 500GB image on the 22x600GB RAID10, L1 cache 3GB write-through, L2 cache write-through: 50GB image on the RAID0 SSD
  • 1x 50GB image on the RAID0 SSD, L1 cache 3GB write-through
Benchmark Tests (IOMeter)

Disk Target Settings
  • 1 manager / 10 workers (on the host OS), or 5 managers (5 Hyper-V VMs) / 2 workers each
  • Maximum Disk Size: 20971520 sectors (10GB)
  • # of Outstanding IOs: 64
  • Ramp-Up Time: 0
  • Run Time: 2 minutes
Access Specifications
  • 4K; 100% Write; 100% Random
  • 4K; 100% Read; 100% Random
  • 512B; 100% Write; 100% Sequential (Max IOPS Write)
  • 512B; 100% Read; 100% Sequential (Max IOPS Read)
  • 512KB; 100% Read; 100% Sequential (Max Throughput Read)
  • 512KB; 100% Write; 100% Sequential (Max Throughput Write)
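For repeatable runs, each access spec can be saved as an .icf config and IOMeter launched unattended from PowerShell, roughly like this (paths and file names are placeholders):

# Run each saved IOMeter config unattended and collect the CSV results
# (paths and file names are placeholders)
$specs = "4k-rand-read.icf", "4k-rand-write.icf", "512k-seq-read.icf"
foreach ($spec in $specs) {
    $result = $spec -replace '\.icf$', '-result.csv'
    Start-Process -FilePath "C:\IOMeter\IOmeter.exe" `
        -ArgumentList "/c", $spec, "/r", $result -Wait
}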
To measure and compare performance, each test was performed on:
  • iSCSI target server: 100GB RAM disk
  • iSCSI target server host OS: DAS (direct-attached storage)
  • iSCSI target server Hyper-V VMs: DAS of the target server
  • iSCSI client host OS: on the iSCSI target
  • iSCSI client Hyper-V VMs: on the iSCSI target
To set up the VM benchmark environment (1x IOMeter master VM + 5x benchmark VMs), I used PowerShell to automate everything and to have the same settings for each benchmark (a rough sketch follows the list below).

Each VM has:
  • 2 vProcessors
  • 2GB RAM
  • 1x differencing disk for booting the OS from a Server 2012 R2 Core image
  • 1x disk located on the storage I want to test
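A rough sketch of how such provisioning can be done (VM names, paths and sizes below are placeholders, not my exact values):

# Create 5 identical benchmark VMs, each booting from a differencing disk
# based on a Server 2012 R2 Core parent image (names, paths and sizes are placeholders)
$parent = "D:\Images\2012r2-core.vhdx"
1..5 | ForEach-Object {
    $name   = "Bench$_"
    $osDisk = "D:\VMs\$name-os.vhdx"
    New-VHD -Path $osDisk -ParentPath $parent -Differencing | Out-Null
    New-VM -Name $name -MemoryStartupBytes 2GB -VHDPath $osDisk -SwitchName "Benchmark" | Out-Null
    Set-VM -Name $name -ProcessorCount 2

    # The second disk sits on the storage under test (RAM disk, DAS or iSCSI volume)
    $testDisk = "T:\$name-test.vhdx"
    New-VHD -Path $testDisk -SizeBytes 40GB -Fixed | Out-Null
    Add-VMHardDiskDrive -VMName $name -Path $testDisk

    Start-VM -Name $name
}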
Here are some of the benchmark results:
4K; 100% Read; 100% Random
[image: benchmark chart]

4K; 100% Write; 100% Random
[image: benchmark chart]

512B; 100% Read; 100% Sequential (Max IOPS Read)
[image: benchmark chart]

512B; 100% Write; 100% Sequential (Max IOPS Write)
[image: benchmark chart]

512KB; 100% Read; 100% Sequential (Max Throughput Read)
[image: benchmark chart]

512KB; 100% Write; 100% Sequential (Max Throughput Write)
[image: benchmark chart]
After looking at the benchmark results (sorry for the confusing naming conventions ;)), I see:

The SSD can provide ~21k 4K random read IOPS and ~11k 4K random write IOPS.
The 22x600GB SAS RAID10 gives me at most ~14k 4K random read IOPS and ~11k 4K random write IOPS.

I also get the same performance over iSCSI (with a little more response time).


But if I compare the SSD vs. the 22x600GB SAS RAID10 vs. the 22x600GB SAS RAID10 + SSD cache, there is a huge performance drop.
The volume with the L2 SSD cache should reach ~21k 4K read IOPS, or at least the ~14k of the RAID10, but it drops to ~1k. The write operations are also a little degraded, although with write-through these should never pass through the L2 cache, should they?

I also tested L2 in write-back mode, but it did not improve write IOPS (I know this will wear out my SSD faster, but that's OK).

Can anyone help me understand why a volume with an L2 cache is slower than one without the cache? Am I missing something?

Benchmark results - SSD vs. 22x600GB SAS RAID10 vs. 22x600GB SAS RAID10 + SSD cache
4K; 100% Read; 100% Random
[image: benchmark chart]

4K; 100% Write; 100% Random
[image: benchmark chart]
I would also like to submit a feature request:
Editing/adding/removing the cache settings of existing volumes should be possible via the GUI; editing XML files is not fun ;) (maybe you are already working on that).
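For anyone else editing these by hand in the meantime, the safest pattern is probably to stop the service, back up the header file, and only then touch the XML. A rough PowerShell sketch (the path, the node/attribute names and the service name are placeholders/assumptions, not the actual StarWind schema, so check your own header files first):

# Rough pattern for hand-editing a device header XML
# (path, node and attribute names are placeholders, NOT the real StarWind schema;
#  "StarWindService" is the assumed service name -- verify with Get-Service)
Stop-Service -Name "StarWindService"

$path = "C:\StarWind\imagefile1.swdsk"
Copy-Item $path "$path.bak"                                           # always keep a backup

[xml]$hdr = Get-Content $path
$hdr.SelectSingleNode("//cache").SetAttribute("size", "53687091200")  # placeholder edit
$hdr.Save($path)

Start-Service -Name "StarWindService"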

I have run a lot more tests, so let me know if you need more benchmark results, but I think that's enough for now :)

cheers
Al (staff)
Staff
Posts: 43
Joined: Tue Jul 26, 2016 2:26 pm

Fri Aug 12, 2016 3:53 pm

Hello sa86,

Thank you for your detailed explanation.

Could you please tell me which build number you are testing?
sa86
Posts: 2
Joined: Wed Aug 10, 2016 8:15 am

Sat Aug 13, 2016 8:19 am

Hello,

I only have the version number (already listed above) from the management console: StarWind version 8.0.0.9781 (are the last four digits the build number?)

cheers
Al (staff)
Staff
Posts: 43
Joined: Tue Jul 26, 2016 2:26 pm

Thu Aug 18, 2016 4:15 pm

Hello sa86,

At the moment we are testing an experimental L2 cache algorithm.

To enable it:
1. Choose the server on which you want to enable it.
2. Click the Configuration tab.
3. Choose "Experimental features" and click "Switch to experimental modules".

We would be pleased if you shared your results with the community.
RyanNW
Posts: 5
Joined: Fri Apr 29, 2016 3:21 pm

Fri Aug 26, 2016 5:28 pm

Hello,

I am facing the same situation. When using an IOMeter test file that fits into the L1 cache (e.g. a 500MB file with a 1GB L1 cache), the performance is >70k IOPS. But when using a file that cannot fit into the L1 cache (e.g. a 2GB file with a 1GB L1 cache and a 100GB L2 cache), the performance drops to ~1k.

However, when I switched to the experimental mode, the performance was boosted back to ~60k IOPS. Does anyone know when the new L2 algorithm will be released?

Thanks in advance
Ryan
Al (staff)
Staff
Posts: 43
Joined: Tue Jul 26, 2016 2:26 pm

Wed Aug 31, 2016 2:54 pm

Hello RyanNW,

The new L2 cache algorithm should be released in approximately two months.

Please keep an eye on our upcoming releases.

Thank you!