VS8 free - Space reclaim and resize of LSFS LUNs

Software-based VM-centric and flash-friendly VM storage + free version

Moderators: anton (staff), art (staff), Max (staff), Anatoly (staff)

RGijsen
Posts: 5
Joined: Fri Feb 20, 2015 10:36 am

Fri Feb 20, 2015 10:49 am

Hi,
I'm currently testing VS8 free in combination with our Veeam 8 setup. Our policy requires one full backup each week and daily incrementals. That means I have lots of redundant data in the backups, while Veeam only provides per-job dedupe. Therefore I am testing VS8. However, I have an issue with the space reclaim, for which I made a little test setup:

I created a 30GB LUN on our backup SAN and mounted it on my StarWind test machine. I created a 10GB LSFS LUN on that and mounted it through iSCSI. It works fine and dedupes fine. But the space reclaim does not seem to work. In fact, the 30GB volume the LSFS volume resides on fills BEYOND the 10GB that the LSFS volume actually is. Now I've deleted everything on (formatted) the mounted LSFS volume, so it's practically 0 bytes in size. The 30GB volume the LSFS resides on still has 22GB in use, though. I let it sit there for a whole day, thinking maybe the reclaim is triggered at night or at idle moments, but nothing changed overnight. This makes it very hard to plan actual disk space allocation, as the disk requirements grow beyond the actual LSFS volume size, and that's with dedupe enabled. So it seems to miss the mark here, or I am missing something :) Another thing I ran into is that when LSFS is enabled, it does not seem possible to resize the LSFS volume.

So my two questions:

- How is the space reclaim supposed to work, or how do I avoid ending up using even more space than is stored or presented?
- Is it possible to resize an LSFS LUN, or do we need to create a new one and copy all data across?

[edit]
I've just noticed my LSFS device has snapshots set to YES when looking in the console. However, no snapshots exist as far as I can see. But I've read something about snapshots being created automatically every 30 minutes. Might that be my issue? How do I disable that?
Anatoly (staff)
Staff
Posts: 1675
Joined: Tue Mar 01, 2011 8:28 am
Contact:

Fri Feb 20, 2015 12:29 pm

Hi! Thank you for posting the questions! OK, so one by one:
- How is the space reclaim supposed to work or how to prevent to end up using even more space than is stored or presented?
First of all, a brief explanation of why LSFS grows: it records (logs) all the incoming data so that it can be restored. We currently have a pre-release build that limits LSFS device growth to 3x the maximum LUN capacity. In the current release, it will keep growing. BTW, here is the link to download the beta build: https://www.dropbox.com/s/hxw2ax3ycz71q ... 6.exe?dl=0
- Is it possible to resize a LSFS LUN or do we need to create a new one and copy all data across?
Currently it is not possible to resize an LSFS device, so you need to create a new one.
I've just noticed my LSFS device has snapshots set to YES when looking in the console. However, no snapshots exists as far as I can see. But I've read something about these automatically created snapshots each 30 minutes. Might that be my issue? How to disable that?
You need to take snapshots manually for them to appear.
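
For planning purposes, the 3x ceiling mentioned above can be turned into a quick back-of-the-envelope check. This is only a minimal sketch, assuming the ceiling applies to the presented LUN capacity; the function names and the `growth_factor` parameter are illustrative and are not part of any StarWind tool.

```python
def lsfs_worst_case_gb(lun_size_gb: float, growth_factor: float = 3.0) -> float:
    """Worst-case on-disk footprint of an LSFS device.

    Assumption: the backing files may grow up to `growth_factor` times
    the presented LUN size (3x per the pre-release build in this thread).
    """
    if lun_size_gb < 0 or growth_factor < 1:
        raise ValueError("lun_size_gb must be >= 0 and growth_factor >= 1")
    return lun_size_gb * growth_factor


def fits(backing_volume_gb: float, lun_size_gb: float,
         growth_factor: float = 3.0) -> bool:
    """True if the backing volume can hold the worst-case footprint."""
    return lsfs_worst_case_gb(lun_size_gb, growth_factor) <= backing_volume_gb


# Numbers from the test setup in this thread: 10 GB LSFS LUN on a 30 GB volume.
print(lsfs_worst_case_gb(10))   # worst-case footprint in GB
print(fits(30, 10))             # does the 30 GB volume cover it?
```

Under that assumption, the 22GB observed on the 30GB volume is still within the worst case for a 10GB LUN.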

Hope that helped.
Best regards,
Anatoly Vilchinsky
Global Engineering and Support Manager
www.starwind.com
av@starwind.com
RGijsen
Posts: 5
Joined: Fri Feb 20, 2015 10:36 am

Fri Feb 20, 2015 12:40 pm

Thanks for your quick answer. In the meantime I have been fiddling around a bit and was able to actually enlarge an LSFS volume by modifying its .swdsk file. It worked fine, although that's most probably not a good idea to do in production. I don't really understand the concept of all this, though. I know LSFS is a log-based filesystem, but still, what purpose does a deduped drive have if it might grow up to 3 times the actual stored data, or with the current build even more? Isn't the space reclaim supposed to do something about that? I agree it might grow beyond that while actually filling up the LUN, but in theory its housekeeping should keep the 'master volume' as small as the unique data, shouldn't it?
I've read some (old) documentation on the old iSCSI software with dedup in combination with Veeam V6. However, I don't see how to use dedup for any purpose, backup or not, if the actual space (and thus writes, if we are talking SSD) is far beyond the unique data.

In other words, does this render VS8 not a good solution for my needs which is an inline-dedup volume for my Veeam repository?
anton (staff)
Site Admin
Posts: 4021
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands
Contact:

Fri Feb 20, 2015 4:21 pm

Short answer is - no. It's a bad idea to use StarWind in-line deduplication for backups. And it's a bad idea to use off-line volume-based dedupe for backups as well. This is what I wrote recently:

https://social.technet.microsoft.com/Fo ... erverfiles

"See, there are two types of deduplication: in-line and off-line.
In-line deduplication: data is space-optimized in RAM and written to primary storage "dehydrated". This approach a) increases performance (the storage array writes LESS data in the same time "t") and b) prolongs flash life, as it makes SSDs do fewer "erase" cycles. But it requires storing hash tables in RAM (even storing them on flash slows down the main write path dramatically; only newer PMC Sierra NVRAM and NVMe flash cards from FusionIO get close to what RAM can do). These are GIGABYTES. For example, StarWind needs ±3.5GB of RAM for 1TB of served capacity with 4KB blocks, and in-line dedupe from ZFS is even more memory hungry.
Off-line deduplication: data is written to primary storage AS IS, and then an optimization process (scrubber) reads the previously written content, optimizes it and writes it back to primary storage. As you can see, you have an INCREASED amount of writes plus extra reads, so off-line dedupe is STEALING performance from the array. This means off-line dedupe can work effectively only when the load on the storage system is "pulsating" and the optimization process kicks in when there's no active I/O, so performance is not needed at that moment. Still, you can do nothing to decrease the amount of writes (many of them random, to update metadata), so off-line dedupe is not flash-friendly at all. But the good news is that it can keep hash tables on primary storage (at least flash), so it's not a RAM pig like in-line dedupe is.
As you can see, for spindle-based storage and "cold" data (backups), in-line dedupe is a BAD usage scenario and we don't recommend that customers use us for that. On the other hand, MSFT dedupe, being off-line, is exceptionally good with this type of data (I mean the data, not the idea of deduping backups). So what you see as competing products are actually complementary. (Also, we don't replicate blocks between multiple controller nodes, so with us errors don't get replicated like they would with a blind RAID1: (+) data broken on one node would be recovered from another, and (-) we'll use at least twice as much space for that config. That is still irrelevant in the context of this topic; there is no point in making backup storage fault-tolerant with high uptime.)"
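
To put the RAM figure quoted above into perspective, here is a rough worked calculation. It is only a sketch: it assumes binary units (1TB = 2^40 bytes, 1GB = 2^30 bytes) and that the requirement scales linearly with served capacity, neither of which is stated in the quote. The helper names are made up for illustration.

```python
TIB = 1024 ** 4   # assumption: "1TB" is read as 2**40 bytes
GIB = 1024 ** 3   # assumption: "3.5GB" is read as 3.5 * 2**30 bytes


def blocks_in(capacity_bytes: int, block_size: int = 4096) -> int:
    """Number of fixed-size blocks in a given capacity."""
    return capacity_bytes // block_size


def ram_per_block_bytes(ram_bytes: float, capacity_bytes: int,
                        block_size: int = 4096) -> float:
    """Average hash-table RAM spent per block of served capacity."""
    return ram_bytes / blocks_in(capacity_bytes, block_size)


def estimate_ram_gib(capacity_tib: float, gib_per_tib: float = 3.5) -> float:
    """Linear extrapolation of the quoted ~3.5GB per 1TB figure (assumption)."""
    return capacity_tib * gib_per_tib


print(blocks_in(TIB))                       # 4KB blocks in 1TB
print(ram_per_block_bytes(3.5 * GIB, TIB))  # bytes of RAM per block
print(estimate_ram_gib(8))                  # rough RAM need for 8TB served
```

Read this way, the quoted figure works out to roughly 14 bytes of RAM per 4KB block.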

Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

RGijsen
Posts: 5
Joined: Fri Feb 20, 2015 10:36 am

Fri Feb 20, 2015 10:29 pm

The question was not whether or not to use dedup at all for your backups (or any other data, for that matter). I share your views on that, and if my budget were bigger and I had lots of spindles, I'd not go for dedup at all. But we've been using dedup (StoreOnce with HP Data Protector) for years and it has served us very well. In addition, we have severe speed issues with reverse incrementals in Veeam V8 (multiple tickets are open for that), and until that's fixed I have no choice but to go for a regular full backup once a week and daily incrementals. Incremental forever is not an option for us. This means my backup store grew to over 4x the size anticipated.

So the question is: can I use StarWind to create a deduped disk that DOESN'T grow as badly in size as it seems to do now? There are some documents and white papers on the old StarWind iSCSI software in combination with Veeam, which seemed to do exactly what I want. Is that still possible with VS8?
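
As a rough illustration of why the weekly-full-plus-daily-incremental scheme described above inflates an undeduplicated repository, the following sketch compares it with a forever-incremental chain. All numbers (full size, incremental size, retention) are hypothetical and are not taken from this thread, and the model ignores compression and Veeam's own per-job dedupe.

```python
def weekly_full_repo_gb(full_gb: float, daily_incr_gb: float, weeks: int) -> float:
    """Repository size with one full plus six incrementals kept per week."""
    return weeks * (full_gb + 6 * daily_incr_gb)


def forever_incremental_repo_gb(full_gb: float, daily_incr_gb: float,
                                weeks: int) -> float:
    """Repository size with a single full followed by daily incrementals."""
    return full_gb + (weeks * 7 - 1) * daily_incr_gb


# Hypothetical example: 1000 GB full, 50 GB per daily incremental, 4 weeks kept.
weekly = weekly_full_repo_gb(1000, 50, 4)
forever = forever_incremental_repo_gb(1000, 50, 4)
print(weekly, forever, round(weekly / forever, 2))
```

The repeated fulls are where most of the redundant data comes from, which is the data a deduplicating target would be expected to collapse.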
RGijsen
Posts: 5
Joined: Fri Feb 20, 2015 10:36 am

Sun Feb 22, 2015 4:04 pm

I still don't get how this space reclaim feature is supposed to work. The Virtual SAN PDF clearly states:

StarWind’s in-line deduplication and thin provisioning with space reclaim ensure effective resources utilization, and allow storing 10 times more VM data on expensive flash or other primary storage.

I don't see how this works when unique writes in fact enlarge the LSFS filesystem. Can anyone clear that up?
Anatoly (staff)
Staff
Posts: 1675
Joined: Tue Mar 01, 2011 8:28 am
Contact:

Wed Feb 25, 2015 8:11 pm

Sure! If you go with LSFS, it includes three features: Deduplication, Thin Provisioning and Log Structuring itself.

Deduplication (DD) and Thin Provisioning actually work as you understood.

But LSFS records (logs) all the incoming data so that it can be restored. We currently have a pre-release build that limits LSFS device growth to 3x the maximum LUN capacity. In the current release, it will keep growing.

On the other hand, you can actually enable MS offline deduplication on the volumes where the StarWind Thick Provisioned devices reside.
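
The growth behaviour described above can be illustrated with a toy model of an append-only block log with content-hash dedupe. This is a generic sketch of the log-structuring idea, not StarWind's LSFS implementation: the class, its methods and the `reclaim` compaction step are all hypothetical.

```python
import hashlib


class ToyLogStore:
    """Append-only block log with content-hash dedupe (illustration only)."""

    def __init__(self):
        self.log = []       # append-only list of (digest, data) records
        self.index = {}     # digest -> position in the log
        self.lba_map = {}   # logical block address -> digest of current data

    def write(self, lba: int, data: bytes) -> None:
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self.index:          # dedupe: unique content stored once
            self.index[digest] = len(self.log)
            self.log.append((digest, data))
        self.lba_map[lba] = digest            # old data stays in the log as garbage

    def log_blocks(self) -> int:
        return len(self.log)

    def live_blocks(self) -> int:
        return len(set(self.lba_map.values()))

    def reclaim(self) -> None:
        """Hypothetical compaction: drop records no LBA points to any more."""
        live = set(self.lba_map.values())
        self.log = [(d, b) for d, b in self.log if d in live]
        self.index = {d: i for i, (d, _) in enumerate(self.log)}


store = ToyLogStore()
for generation in range(3):                   # overwrite 4 LBAs three times
    for lba in range(4):
        store.write(lba, f"gen{generation}-lba{lba}".encode())

print(store.log_blocks(), store.live_blocks())   # log size vs. live data
store.reclaim()
print(store.log_blocks(), store.live_blocks())   # after compaction
```

In this model the log holds three times the live data after unique overwrites, and only a compaction pass brings it back down, which matches the kind of growth discussed in this thread.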

I hope that helped.
Best regards,
Anatoly Vilchinsky
RGijsen
Posts: 5
Joined: Fri Feb 20, 2015 10:36 am

Thu Feb 26, 2015 8:55 am

That still doesn't clear it up. The marketing says we can store 'up to ten times more data', but in fact it COSTS 10 times more disk space (and with the new build, maybe 3 times) if you write to the volume long enough. I fully understand how a log-based filesystem works, and I think they are great for specific purposes. However, the fact (am I correct?) that StarWind only enables dedupe on LSFS volumes makes it consume MORE disk space than a regular volume without dedupe, unless you are very careful with your writes to it. So that still doesn't explain the space reclaim feature.

I still don't get the idea behind this, but it doesn't matter either. I ordered a bunch of new spindles and we're staying with 2012R2 dedupe. That's off-line, meaning I need to keep free almost 1:1 the space my prod environment consumes in order to do a full, but for now that's the best I can do. StarWind is not up to it (for us) and it's too much of a hassle to set up a dedicated dedup appliance for this. Thanks so far.
fbifido
Posts: 125
Joined: Thu Sep 05, 2013 7:33 am

Thu Feb 26, 2015 7:59 pm

RGijsen wrote: I still don't get the idea behind this, but it doesn't matter either. I ordered a bunch of new spindles and we're staying with 2012R2 dedupe. That's off-line, meaning I need to keep free almost 1:1 the space my prod environment consumes in order to do a full, but for now that's the best I can do. StarWind is not up to it (for us) and it's too much of a hassle to set up a dedicated dedup appliance for this. Thanks so far.
Hi RGijsen,

Have you tried StarWind iSCSI v6?
Most of the feedback on this page seems to be based on that version:
"http://www.starwindsoftware.com/customer-feedback"

Why not give it a try? I can't seem to find a copy anywhere!
Anatoly (staff)
Staff
Posts: 1675
Joined: Tue Mar 01, 2011 8:28 am
Contact:

Tue Mar 03, 2015 9:44 pm

@fbifido that is not an option: v6 DD didn't have the space reclaim feature.

@RGijsen I get where you're coming from. StarWind's DD is aimed at saving disk space while protecting the data. It is a bundle of solutions that comes in one package. The perfect design for us here is cheap, high-capacity SATA drives, with flash as the L2 cache and RAM as the L1 cache. As a result, you have no issues with space consumption while getting good speed.

Also, we actually did a good job of making our solution work with MS dedup, so if the LSFS way is not OK for you due to disk space consumption, it is totally OK to enable MS off-line dedup on the volumes where the StarWind virtual disks reside.
Best regards,
Anatoly Vilchinsky