Write Cache. Bad Performance. [RUS]

Software-based VM-centric and flash-friendly VM storage + free version

Moderators: anton (staff), art (staff), Max (staff), Anatoly (staff)

User avatar
flysats
Posts: 12
Joined: Mon Oct 29, 2012 9:34 pm
Location: Sankt-Petersburg. CCCP

Mon Oct 29, 2012 9:48 pm

I want to ask a question about configuring caching.

I am running tests on the latest public version of StarWind.
Windows 2008 StarWind -> ESX 5 VMFS -> Windows 2008 test VM
Cache size on the target: 16 GB
Testing over the NTFS file system
Test file size: 4 GB
Benchmark: CrystalDiskMark (but that is not essential)
Cache on the target set to write-back (WB)

I do not see any acceleration of read or write speed. Why not?

My actual questions:
1. What is the write caching policy? Why, when the cache is larger than the dataset, does it not act like a RAM disk?
2. Why is cache assigned separately per target rather than globally for all targets on the host?
3. If synchronous mode is faster than asynchronous, why isn't it the default?
4. With a large cache and a 2- or 3-node cluster configured, is it safe to synchronize the caches? If so, I do not understand why question 1 should arise at all.

I really hope that I have misunderstood something and that these are not design flaws in the StarWind target.
This is my fourth attempt at StarWind (I first tried it in 2007), and I still cannot make it fit my needs.

Regards, Stanislav.
User avatar
anton (staff)
Site Admin
Posts: 4021
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands
Contact:

Mon Oct 29, 2012 11:13 pm

First of all, you really need to replace CrystalDiskMark with something more trustworthy, for example Intel's Iometer. Second, a 4 GB test size is not enough to test anything.

Back to your question. Synthetic tools generate a random pattern and make the cache dirty very quickly. So they should report WRITEs as faster and READs at disk speed,
because they either read non-cached data (if the address is not in memory there is no acceleration) or read cached data that has already been purged from the cache because it was
overcommitted with new data - the natural outcome for a sequential access pattern.
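
To illustrate the point with a rough sketch (not our actual cache code; the block counts below are arbitrary and not your configuration): with a uniformly random access pattern the read hit rate is roughly the cache size divided by the working-set size, so once the benchmark's working set outgrows (or gets purged out of) the cache, most reads go straight to disk.

Code: Select all
# Toy LRU model (illustration only, not StarWind code; sizes are arbitrary):
# with a uniform random pattern, read hit rate ~= cache blocks / working-set blocks.
import random
from collections import OrderedDict

CACHE_BLOCKS = 4_096       # blocks the cache can hold (assumed)
DISK_BLOCKS  = 16_384      # blocks the benchmark touches (assumed, 4x the cache)
READS        = 100_000

cache = OrderedDict()      # simple LRU: most recently used entries at the end
hits = 0
for _ in range(READS):
    addr = random.randrange(DISK_BLOCKS)
    if addr in cache:
        hits += 1
        cache.move_to_end(addr)
    else:
        cache[addr] = True                 # miss: read from disk, then cache it
        if len(cache) > CACHE_BLOCKS:
            cache.popitem(last=False)      # evict the least recently used block

print(f"read hit rate: {hits / READS:.1%}")   # ~25% for these sizes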

1) It should not. A cache is a non-addressable buffer, while a RAM disk is just a block device kept in RAM - fundamentally different things. We may consider
popping up a dialog suggesting it's better to stick with a RAM disk for testing purposes (if the assigned cache is larger than the virtual disk size).

2) Because that's the way caches work. You cannot take cache memory from one hard disk and assign it to another. The same applies to RAID controllers -
they use their own memory and don't share it with the CPU over the PCIe bus. To make a long story short: we don't know your system workload, so we're not in a position
to take memory from one device and re-assign it to another. We have, however, long been asked for a "dynamic" cache policy, and that's what you'll see
in the upcoming version.

3) Because they serve totally different purposes. Synchronous mirroring is for creating fault-tolerant clustered devices, while asynchronous mirroring is for
creating a copy of the data over a slow WAN connection - DR, disaster recovery.

4) It's unsafe NOT to synchronize the caches. If the caches run out of sync, all your data will be scrambled at some point.
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

User avatar
flysats
Posts: 12
Joined: Mon Oct 29, 2012 9:34 pm
Location: Sankt-Petersburg. CCCP

Tue Oct 30, 2012 3:55 am

OK, testing again with Iometer.

[Iometer result screenshots, added over several later edits]

:shock:
Last edited by flysats on Tue Oct 30, 2012 8:51 pm, edited 1 time in total.
User avatar
anton (staff)
Site Admin
Posts: 4021
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands
Contact:

Tue Oct 30, 2012 12:04 pm

Please provide some Iometer results for the underlying media - in this case, the disk you're putting the IMG files on.
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

User avatar
flysats
Posts: 12
Joined: Mon Oct 29, 2012 9:34 pm
Location: Sankt-Petersburg. CCCP

Tue Oct 30, 2012 2:00 pm

Test of the local target storage:
4 HDDs in Windows 2008 software RAID, RAID level 0.
Iometer, same pattern.

Result: 600 IOPS / 303 MB per second.

Do you need more tests?
Last edited by flysats on Tue Oct 30, 2012 8:51 pm, edited 1 time in total.
User avatar
anton (staff)
Site Admin
Posts: 4021
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands
Contact:

Tue Oct 30, 2012 2:52 pm

These are fine. The only thing that looks strange is the 512 KB transfer size (303 MB/s divided by 600 IOPS works out to roughly 512 KB per request). Can you run a short test for:

1) The physical array
2) The IMG mapped on it, a) with and b) without cache enabled

with ATTO Disk Benchmark from here:

http://www.attotech.com/products/produc ... od=70&sku=

and post the three corresponding graphs here. We'll see where the gap is. Thanks!
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

User avatar
flysats
Posts: 12
Joined: Mon Oct 29, 2012 9:34 pm
Location: Sankt-Petersburg. CCCP

Tue Oct 30, 2012 3:20 pm

ATTO Bench results:

[three ATTO benchmark screenshots]
Last edited by flysats on Tue Oct 30, 2012 4:03 pm, edited 2 times in total.
User avatar
anton (staff)
Site Admin
Posts: 4021
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands
Contact:

Tue Oct 30, 2012 4:00 pm

Got it, thank you! Indeed, pretty confusing... Investigating.

P.S. "Direct I/O" should be always enabled within this test as in other case it's Windows file cache messing the whole thing up.
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

User avatar
flysats
Posts: 12
Joined: Mon Oct 29, 2012 9:34 pm
Location: Sankt-Petersburg. CCCP

Tue Oct 30, 2012 4:09 pm

Do you need more tests, or remote access to the test VM?

----------

OK, continuing...

:shock: WOW! I can see the cache working
(while Iometer creates its test file)
Image

Then I run Iometer... and again... :oops:
Image
User avatar
flysats
Posts: 12
Joined: Mon Oct 29, 2012 9:34 pm
Location: Sankt-Petersburg. CCCP

Thu Nov 01, 2012 10:47 pm

Is there any news on this issue?
What do you think the cause is, and how long do you expect this to take?
User avatar
anton (staff)
Site Admin
Posts: 4021
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands
Contact:

Thu Nov 01, 2012 10:54 pm

We have a Native SAN release this week, so the software engineer assigned to look into this issue is a bit busy. He'll get back to you and pay closer attention after this weekend.
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

User avatar
Alex (staff)
Staff
Posts: 177
Joined: Sat Jun 26, 2004 8:49 am

Tue Nov 06, 2012 11:58 am

Stanislav,
I have reviewed the results of your tests; everything looks pretty normal except for the first ones.

1. Compare the ATTO results for the cached and non-cached device:
Image Image
The device with cache shows slightly better or equal performance at big packet sizes, and is more than two times faster on small packets (8 KB or less).
That is OK.

2. The next test results: IOMeter, 90% sequential, 64 KB writes.
This access pattern makes the cache nearly useless, as cache hits are too low. So we get a short peak at the start of the test and slightly lower performance in the steady state - that is the roughly 10% loss on the diagram.
Image

Some clarification on the length of the starting peak. The write-back cache starts writing data to disk when it is full and when data has not been accessed for a certain period of time - it is a lazy write algorithm.
So the start of the peak must be counted as full speed, not as (peak value - write speed).
It takes less than a minute to fill the cache at a speed of 180 MBytes/s, and this is what we see on the diagram.
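
As a toy sketch of that lazy-write behaviour (an illustration only, not our actual caching code; the capacity and idle timeout are made-up numbers): dirty blocks sit in RAM and are written out either when the cache is full or when a block has been idle for a while.

Code: Select all
# Toy lazy-write (write-back) cache: illustration only, not StarWind's implementation.
# Dirty blocks are flushed when the cache is full or after they have been idle a while.
import time

CACHE_CAPACITY = 4          # number of blocks the cache can hold (made-up)
IDLE_FLUSH_SEC = 5.0        # flush a dirty block after this much idle time (made-up)

class LazyWriteBackCache:
    def __init__(self, backing_store):
        self.backing = backing_store        # dict: block address -> data ("the disk")
        self.dirty = {}                     # address -> (data, last access time)

    def write(self, addr, data):
        if addr not in self.dirty and len(self.dirty) >= CACHE_CAPACITY:
            self._flush_oldest()            # cache is full: lazy write kicks in
        self.dirty[addr] = (data, time.monotonic())

    def tick(self):
        """Periodic housekeeping: flush blocks that have not been touched recently."""
        now = time.monotonic()
        for addr, (data, last) in list(self.dirty.items()):
            if now - last >= IDLE_FLUSH_SEC:
                self.backing[addr] = data
                del self.dirty[addr]

    def _flush_oldest(self):
        addr = min(self.dirty, key=lambda a: self.dirty[a][1])
        self.backing[addr] = self.dirty.pop(addr)[0]

Until the dirty set reaches capacity, writes are absorbed at memory speed; after that every new block forces a flush, so throughput settles to roughly the speed of the backing disk - hence the short full-speed peak at the start of the run.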

3. Results that do not fit the theory were shown only in the first series (http://www.starwindsoftware.com/forums/ ... tml#p17067).
Could you tell us what the difference in test conditions is between the test that shows bad results
Image
and the test that shows proper results (the last ones)?
Why are the bad test results not reproduced in the test that you referenced later (http://www.starwindsoftware.com/forums/ ... tml#p17074)?
I assume both tests were made with IOMeter. Maybe different access patterns? Some changes in the network connections? Anything else?
Best regards,
Alexey.
User avatar
flysats
Posts: 12
Joined: Mon Oct 29, 2012 9:34 pm
Location: Sankt-Petersburg. CCCP

Tue Nov 06, 2012 10:38 pm

1. The Iometer results contradict this optimistic conclusion.
2. A low percentage of cache hits on writes? What type of cache are we even talking about here? I'm afraid I (like many other specialists) am not able to follow your explanation while you are saying things that do not fit together logically. I hope you are just confusing writes with reads... :oops:
3. Absolutely identical conditions and hardware configuration; the difference is in the Iometer test pattern. The results are reproducible at any time.

My questions:
1. How do I change the lazy write algorithm to a more active mode that suits me better, where the cache does not wait until it overflows but starts writing as soon as it receives data?
2. Why have a cache that reduces disk performance?
User avatar
Alex (staff)
Staff
Posts: 177
Joined: Sat Jun 26, 2004 8:49 am

Wed Nov 07, 2012 8:39 am

2. Sorry, I used an imprecise term here. With the write-back cache, a write operation can address a block that is already allocated and resides in the cache - this is the case when the same address has been written to a short time before. If the address has not been accessed before, the write operation requires allocating a new block, which means flushing one of the existing blocks or allocating a new block from free memory.
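
Using the toy LazyWriteBackCache sketch from my previous post (again purely illustrative), the difference looks like this:

Code: Select all
# Write hit vs. write miss with the toy LazyWriteBackCache from the previous post:
disk = {}
cache = LazyWriteBackCache(disk)

cache.write(100, b"A")       # new address: a cache block is allocated
cache.write(100, b"B")       # same address written again soon after: a write hit,
                             # the block is updated in RAM and the disk is not touched
for addr in range(200, 200 + CACHE_CAPACITY):
    cache.write(addr, b"X")  # all-new addresses: every write is a miss, so the oldest
                             # dirty block has to be flushed to make room
print(sorted(disk))          # -> [100]: block 100 was pushed out to the backing store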

3. You have shown the following patterns:
for the first test: 512 KB writes, 10% random;
for the second test: 64 KB writes, 10% random.
We'll try to reproduce this in our lab.

1. How do I change the lazy write algorithm to a more active mode, where the cache does not wait until it overflows but starts writing as soon as it receives data?
Write-through cache mode works exactly as you have described.
2. Why have a cache that reduces disk performance?
Nothing is free in this world :)
With sequential writes of big packets, a write-back cache works like a useless FIFO buffer.
We are now working on the next version of the caching module; it implements a modification of ARC and has some other improvements. This version has reduced overhead on worst-case patterns and gives even more performance in real-world scenarios.
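
For contrast, write-through mode behaves roughly like this in the same toy model (again just a sketch, not our code): every write goes to the backing store immediately, and the cached copy only serves later reads - which is exactly the "write immediately" behaviour you asked about.

Code: Select all
# Toy write-through cache for contrast (illustration only, not StarWind code):
# every write is committed to the backing store right away, so nothing is ever dirty;
# the cached copy exists only to speed up subsequent reads of the same block.
class WriteThroughCache:
    def __init__(self, backing_store, capacity=4):
        self.backing = backing_store
        self.capacity = capacity
        self.clean = {}                      # address -> data, purely a read accelerator

    def write(self, addr, data):
        self.backing[addr] = data            # goes to disk immediately
        self.clean[addr] = data
        if len(self.clean) > self.capacity:
            self.clean.pop(next(iter(self.clean)))   # drop the oldest cached copy

    def read(self, addr):
        return self.clean[addr] if addr in self.clean else self.backing[addr]

With large sequential writes both modes end up limited by the disk, which is why the write-back cache degenerates into a FIFO buffer on that pattern.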
Best regards,
Alexey.
User avatar
flysats
Posts: 12
Joined: Mon Oct 29, 2012 9:34 pm
Location: Sankt-Petersburg. CCCP

Wed Nov 07, 2012 8:53 am

I conducted additional tests using caching software from another vendor. It doubled, and in some cases tripled, write performance.
My preliminary conclusion: the cache type and caching algorithms in StarWind conflict with software RAID in Windows Server. I also found settings in the additional caching software that produce a similar drop in performance.

You need to provide more options for configuring caching.

P.S. In question 2 of my previous post I was referring to the write-back cache. The additional caching software I am using now can do what I described.