Creating thin-provisioned disk without using LSFS

Software-based VM-centric and flash-friendly VM storage + free version


robnicholson
Posts: 359
Joined: Thu Apr 14, 2011 3:12 pm

Fri Aug 01, 2014 12:34 pm

How do you go about creating a thin-provisioned disk that doesn't use LSFS?

Also, if you upgrade a v6 thin-provisioned disk to v8, what happens to it? Are you forced onto LSFS? I'm not 100% happy about that because of the disk overhead of LSFS.

At least it's possible to quickly extend a flat disk in v8...

Cheers, Rob.
anton (staff)
Site Admin
Posts: 4021
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands

Fri Aug 01, 2014 9:12 pm

LSFS is the only container that does thin provisioning.

LSFS has no overhead compared to FLAT except memory consumption. But nothing comes for free: dedup and the raised IOPS have to come from "somewhere".

We'll add LUN size extension to LSFS pretty soon.
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

robnicholson
Posts: 359
Joined: Thu Apr 14, 2011 3:12 pm

Sat Aug 02, 2014 10:47 am

>LSFS has no overhead except memory consumption

But it has disk overhead, which is kind of self-defeating IMO. 2-3 times more, wasn't it?

The great advantage of thin provisioning IMO is that you don't have to worry too much about disk sizes within virtual machines or iSCSI-mounted disks. All you have to worry about/monitor is the storage on the SAN: monitoring one free space versus monitoring lots of individual spaces. We've used v6 thin-provisioned disks for years and they have worked fine.

But to use thin provisioning, we're potentially going to need three times as much disk space? That's a step forward?

I think it's a shame to lose a perfectly functional feature (thin provisioning without LSFS) in the scramble for performance. It's not always about performance :) There are PBs of old data sitting on disk systems that are rarely used. Sure, LSFS sounds like a great new feature on paper for performance but, stepping back, it looks like it's best suited to high-performance use. High-performance workloads (e.g. SQL databases) tend to be smaller in size and would therefore probably use a fixed-size disk anyway.

It's not totally the end of the world for us though - we use Hyper-V 2012 and that can use thin-provisioned VHDX files on top of StarWind. They even support TRIM as well.

But, as I said with deduplication, it's not always about performance. So please consider bringing back the old thin-provisioned format (while keeping the new option, of course) and maybe a lower-performance de-duplication option - the same algorithms, but with the hash on disk, not in memory. Sure, performance will be pants and yes, we personally can use Windows 2012 deduplication on the file server, but it would IMO be a nice addition for (wearing my developer hat) not that much work.
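Just to make the idea concrete, here's the sort of thing I have in mind - a hypothetical Python sketch, purely illustrative and nothing to do with StarWind's actual code; the only point is that the hash-to-block index lives on disk (a dbm file here) rather than in RAM:

# Hypothetical sketch only - not StarWind code.
# Block-level dedupe where the hash -> block-offset index is kept on disk
# (a dbm file) instead of in RAM, trading lookup speed for memory usage.
import dbm
import hashlib

class OnDiskDedupStore:
    def __init__(self, index_path, data_path):
        self.index = dbm.open(index_path, "c")   # on-disk hash index
        self.data = open(data_path, "a+b")       # append-only block store

    def write_block(self, block: bytes) -> int:
        """Store a block and return its offset; reuse an identical block if already stored."""
        digest = hashlib.sha256(block).hexdigest()
        if digest in self.index:                 # duplicate - nothing new written
            return int(self.index[digest])
        self.data.seek(0, 2)                     # append the new unique block
        offset = self.data.tell()
        self.data.write(block)
        self.index[digest] = str(offset)
        return offset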

Cheers, Rob.
Last edited by robnicholson on Sat Aug 02, 2014 10:54 am, edited 1 time in total.
robnicholson
Posts: 359
Joined: Thu Apr 14, 2011 3:12 pm

Sat Aug 02, 2014 10:54 am

>LSFS is the only container that does thin provisioning.

So what happens when we attempt to upgrade a v6 SAN to v8? We have a 17TB disk002_00000001.ibvd image. Do we need more disk space after this is upgraded to LSFS? Do we need lots more disk space in place during the upgrade, i.e. is the ibvd copied to a new format?

Is there a whitepaper somewhere on the new disk format? Call me old-fashioned, but I'm paranoid about the underlying disk system used to hold our data. A new disk structure is a pretty radical thing and it has to work, otherwise I'm out of a job. Corruption, poor performance and downtime are not an option.

Cheers, Rob.
anton (staff)
Site Admin
Posts: 4021
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands

Sun Aug 03, 2014 8:59 pm

There's definite confusion here. LSFS needs more space to run compared to FLAT because it does Redirect-on-Write (every new block is written to a new place and the old one is never overwritten). So yes, it requires more space initially, but a) LSFS will shrink back to a minimal size once the background optimization process kicks in, and b) V6 worked in exactly the same way: also Redirect-on-Write (not FLAT, of course). So the V6 vs. V8 comparison you're making here is... well... not quite how you see it :)

There's not much point in keeping multiple code paths (FLAT, LSFS and old stuff like TP and the old dedupe), so eventually we'll get rid of FLAT and keep LSFS only. *OR* we'll keep both like we do now, but we're definitely not going to have more than that. Simple - it's a pain to support and bugfix.
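If it helps, here is the principle in a few lines of toy Python - just an illustration of why space grows and then shrinks back, not our actual on-disk code:

# Toy illustration only - not StarWind code.
# Redirect-on-Write: every write is appended; an overwritten block's old copy
# stays on disk (so the image temporarily grows) until optimization drops it.
log = []    # append-only physical log of (logical_block, data) records
amap = {}   # logical block number -> index of its current copy in the log

def write(block_no, data):
    amap[block_no] = len(log)      # point the map at the new copy...
    log.append((block_no, data))   # ...the old copy is left untouched

def optimize():
    """Background optimization: keep only the copies the map still references."""
    global log, amap
    live = [(b, d) for i, (b, d) in enumerate(log) if amap.get(b) == i]
    log = live
    amap = {b: i for i, (b, _) in enumerate(log)}

write(101, b"old")
write(101, b"new")   # two physical copies of block 101 now exist
optimize()           # back down to one - only the live copy remains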
robnicholson wrote: >LSFS has no overhead except memory consumption

But it has disk overhead, which is kind of self-defeating IMO. 2-3 times more, wasn't it?
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

anton (staff)
Site Admin
Posts: 4021
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands

Sun Aug 03, 2014 9:01 pm

You need to migrate your old images to FLAT or LSFS.

You mean on-disk structures? Sure, we have them, but again... why do you need them? Or do you mean what LSFS is going to bring you? Then check the product page - there are two PDFs about LSFS: Turning TBs into IOPS and Eliminating I/O Blender.
robnicholson wrote: >LSFS is the only container that does thin provisioning.

So what happens when we attempt to upgrade a v6 SAN to v8?
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

robnicholson
Posts: 359
Joined: Thu Apr 14, 2011 3:12 pm

Mon Aug 04, 2014 1:46 pm

>There's a definite confusion
I think that is the understatement of the day :)

>LSFS needs more space to run compared to FLAT because it does Redirect-on-Write (every new block is written to a new place and never overwritten).
>V6 did work in exactly the same way: also Redirect-on-Write

Is that phrase totally accurate? It seems to contradict itself. To me, the phrase "every new block" means a block number that has never been written to the disk before. A newly formatted Windows disk will have used (say) the first 100 blocks. Block #101 has never been written. So the thin-provisioned bit is that the disk currently uses 100 blocks on disk. When block #101 is written, the thin-provisioned disk grows by a block and the new block is added to the end. Correct so far?

The same growth occurs when block #102 is written, etc. This is part of the reason that a thin-provisioned disk is lower performance than a flat one - the disk has to have the logic to keep growing.

PS. I'm talking about v6 implementation here - I realise that LSFS is radically different.
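To put my (quite possibly wrong) mental model into code - a hypothetical Python sketch of how I imagine the v6 container behaving, not the real ibvd format:

# My mental model only - hypothetical, not the actual ibvd format.
# Thin provisioning: physical space is only allocated the first time a logical
# block is written, so the image grows as the guest touches new blocks.
class ThinDisk:
    def __init__(self, block_size=4096):
        self.block_size = block_size
        self.mapping = {}   # logical block number -> physical block number
        self.blocks = []    # physical blocks actually allocated so far

    def write(self, logical_no, data):
        if logical_no not in self.mapping:   # first write -> the image grows
            self.mapping[logical_no] = len(self.blocks)
            self.blocks.append(bytearray(self.block_size))
        # my assumption for v6: a re-write lands on the existing block in place
        self.blocks[self.mapping[logical_no]][:len(data)] = data

    def physical_size(self):
        return len(self.blocks) * self.block_size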

So on v6, what happens when block #101 is re-written? My understanding was that it overwrote the existing block, but you're implying that the block is "never overwritten". That means for something like a SQL database (where blocks are routinely re-written), the thin-provisioned disk will keep growing and growing, filling up more and more with old blocks. So eventually the thin-provisioned disk will end up way, way bigger than the flat equivalent? If this is the case, then there was a big hole in my understanding of thin-provisioned disks in StarWind.

But I am still a little confused about LSFS. I realise that in order to try and convert random writes into sequential writes, all writes - whether they are a new block or a re-write of an existing block - are always written as new blocks. I can appreciate how this attempts to iron out expensive random I/O. However, on its own, it means that over time you'll run out of disk space, as old blocks are never overwritten. You seem to be implying there is some kind of cleanup operation that is carried out to recover those replaced blocks. Is that right?

Cheers, Rob.
robnicholson
Posts: 359
Joined: Thu Apr 14, 2011 3:12 pm

Mon Aug 04, 2014 3:31 pm

I assume that it's this "auto-de-fragmentation" option, as shown in the attached screenshot, which will free up those blocks that have since been re-written? Cheers, Rob.
Attachments
sshot-23.png
anton (staff)
Site Admin
Posts: 4021
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands

Tue Aug 05, 2014 11:35 am

V6 and V8 do work in exactly the same way. When block #101 is overwritten, the actual old content is never touched; instead, the new content is stored in a new place and the address translation map for the volume is updated. That's Redirect-on-Write. With Copy-on-Write, the old content would first be copied to a new location and the old block's data would then be replaced with the new one.

It does not work the way you think: you're confusing generic Redirect-on-Write with actual log-structuring. The system will not run out of free disk space because the space optimization process kicks in and does garbage collection (removing unused data from old snapshots, for example).
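Again, purely as an illustration (toy Python, not the real implementation) - the difference between the two schemes and what the space optimization step does:

# Illustration only - not actual V6/V8 code.
storage = {}      # physical address -> block data
amap = {}         # logical block number -> physical address (current version)
next_free = [0]

def allocate():
    addr = next_free[0]
    next_free[0] += 1
    return addr

def redirect_on_write(block_no, data):
    addr = allocate()
    storage[addr] = data     # new content goes to a brand-new location
    amap[block_no] = addr    # old copy is never touched, just left unreferenced

def copy_on_write(block_no, data):
    old = amap[block_no]
    storage[allocate()] = storage[old]   # old content copied aside first (e.g. for a snapshot)
    storage[old] = data                  # then the original block is overwritten in place

def space_optimization():
    """Garbage collection: drop physical blocks that no map entry references any more."""
    live = set(amap.values())
    for addr in list(storage):
        if addr not in live:
            del storage[addr]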
robnicholson wrote:
>V6 did work in exactly the same way: also Redirect-on-Write
Is that phrase totally accurate? It seems to contradict itself.
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

anton (staff)
Site Admin
Posts: 4021
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands

Tue Aug 05, 2014 11:36 am

Defrag does a bit more, but in general, yes, you're correct.
robnicholson wrote: I assume that it's this "auto-de-fragmentation" option, as shown in the attached screenshot, which will free up those blocks that have since been re-written? Cheers, Rob.
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software
