Evaluate Deduplication- Free Version
Posted: Wed May 25, 2011 1:58 am
by Asanam
We are testing the effectiveness of deduplication in the following manner.
Having set up a target image on Windows 2003, we have initiated the connection to this disk on a Windows 7 client.
The disk size is 5GB. A folder with 'x' files is copied 'y' times which are in turn put into a folder which is again copied 'z' times. In all, the disk properties show 2.54 GB for 2208 files.
Having tried this with an un-deduplicated disk the sizes remain the same.
So, how does one evaluate the effectiveness of deduplication?
Regards,
Asanam
Re: Evaluate Deduplication- Free Version
Posted: Wed May 25, 2011 8:47 am
by anton (staff)
Deduplication saves space on your HOST, not on the file system layered above it (how do you think it should work in general?). So for you it is still the multiplied size of everything you copy, while the HOST uses and allocates less space to keep your real data.
How large is the .data or .spdata file StarWind maintains for your deduplication volume? The deduplication ratio should be calculated as (size of your files) / (size of .data or .spdata). In V5.7 we'll report both deduplication and compression ratios in the GUI.
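The formula above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name `dedup_ratio` and the sample sizes are assumptions made up for the example, not anything StarWind provides.

```python
def dedup_ratio(client_bytes: int, host_bytes: int) -> float:
    """Return (size of files seen by the client) / (size of .data or .spdata on the host)."""
    if host_bytes <= 0:
        raise ValueError("host_bytes must be positive")
    return client_bytes / host_bytes


# Illustrative values only: 10 GiB of client data kept in a 4 GiB host file.
ratio = dedup_ratio(10 * 1024**3, 4 * 1024**3)
# A ratio above 1.0 means the host stores less than the client sees;
# a ratio below 1.0 means the host file is larger than the client data.
print(f"ratio = {ratio:.3f}")
```

Read both sizes in bytes (not rounded GB) so that small differences are not lost.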
Re: Evaluate Deduplication- Free Version
Posted: Wed May 25, 2011 3:43 pm
by Asanam
Thanks for the prompt response. I look forward to 5.7, but in the meantime I've done another test, this time with NTBackup files.
File1 is a backup of a folder, File2 is another backup of the same content as File1 with the addition of an outlook.pst. These are both then copied onto the drive.
This is what I see:
Files               Client bytes      Host spdata       Ratio
File1               10,718,438,400    15,971,287,040    0.671
File2               12,032,265,216
File 1 and 2        22,750,703,616    33,864,122,368    0.672
File 1 and 2 x 2    45,501,407,232    66,962,128,896    0.680
Still looking for that elusive revelation of data deduplication.
Regards
Asanam
Re: Evaluate Deduplication- Free Version
Posted: Wed May 25, 2011 3:58 pm
by anton (staff)
StarWind cannot create deduplicated content larger than the non-deduplicated source. Can you provide a screenshot of the directory listing of your mapped target and of the host holding the spdata file?
Re: Evaluate Deduplication- Free Version
Posted: Wed May 25, 2011 4:24 pm
by Asanam
I certainly can supply these:
Attached:
1. Directory listing of mapped target and
2. Host holding the spdata
Regards
Asanam
Re: Evaluate Deduplication- Free Version
Posted: Wed May 25, 2011 4:40 pm
by anton (staff)
Did you re-create the target for your tests? For now all deletes are DISABLED, so the deduplicated "knowledge base" is never emptied. We'll change this (provide an option) before release, but for now you need a freshly created volume (never written to) to verify what happens with your data. FYI.
Re: Evaluate Deduplication- Free Version
Posted: Wed May 25, 2011 5:06 pm
by Asanam
This is a fresh target on another disk altogether. It's an external USB drive, wiped, reinitialized etc., with a completely new target. Oh, and it's on another server also, SBS2003R2. Everything seems smooth and fine, but the numbers certainly do not stack up.
Can you recommend a way of evaluating the dedup feature which would help in planning the sizing?
Regards
Asanam
Re: Evaluate Deduplication- Free Version
Posted: Wed May 25, 2011 5:10 pm
by anton (staff)
I don't understand how you managed to get such a result. Can you delete the whole volume set, re-create the target and provide details after each step? Something like:
1) Step one. Zero-sized target. Size of .spdata is ...
2) Step two. First directory is copied. Size of .spdata is ...
3) Step three. Second directory is copied. Size of .spdata is ...
We're adding detailed stats now, but that would not change anything for you: the GUI would only show what you currently work out by hand.
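The step-by-step measurement requested above can be tabulated with a small script. This is a minimal sketch, assuming the three rows posted earlier in this thread are cumulative measurements on one target and that an empty target's .spdata size is 0 bytes (a guess; pass the real value as `baseline_host`). The names `step_growth` and `baseline_host` are hypothetical.

```python
from typing import List, Tuple

# (label, cumulative client bytes, cumulative .spdata bytes)
Step = Tuple[str, int, int]


def step_growth(steps: List[Step], baseline_host: int = 0) -> List[Tuple[str, int, int]]:
    """For each step, return (label, client bytes added, .spdata bytes added)."""
    growth = []
    prev_client, prev_host = 0, baseline_host
    for label, client, host in steps:
        growth.append((label, client - prev_client, host - prev_host))
        prev_client, prev_host = client, host
    return growth


# Figures copied from the table earlier in this thread (assumed cumulative).
steps = [
    ("File1", 10_718_438_400, 15_971_287_040),
    ("File 1 and 2", 22_750_703_616, 33_864_122_368),
    ("File 1 and 2 x 2", 45_501_407_232, 66_962_128_896),
]

for label, d_client, d_host in step_growth(steps):
    print(f"{label}: client +{d_client:,} bytes, .spdata +{d_host:,} bytes")
```

If deduplication is working, a step that only re-copies existing data should add far fewer bytes to .spdata than it adds on the client.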
Re: Evaluate Deduplication- Free Version
Posted: Sat Apr 21, 2012 11:39 pm
by ypae
anton (staff) wrote: Did you re-create the target for your tests? For now all deletes are DISABLED, so the deduplicated "knowledge base" is never emptied. We'll change this (provide an option) before release, but for now you need a freshly created volume (never written to) to verify what happens with your data. FYI.
Hello,
I have 2 questions:
1. If I have a bunch of .VHDs with many duplicate files inside on StarWind iSCSI, would "Block Level" deduplication still work effectively?
2. When will the "empty orphaned deduplication knowledge base" option be available?
Thanks,
Young-
Re: Evaluate Deduplication- Free Version
Posted: Sun Apr 22, 2012 12:05 am
by anton (staff)
1) Yes, of course. That was the plan.
2) V5.9 (you may apply for beta right now)
Re: Evaluate Deduplication- Free Version
Posted: Tue May 08, 2012 9:51 pm
by dataanywhere
Did anyone figure this issue out?
We're seeing similar results while evaluating using the free version.
A folder copied raw is just over 14GB in size when viewing from the client. Making a second copy of that folder, the .spdata file on the host is double the size.
Configuration is "new" as we just set it all up for the first time. The host we're using for testing is a Windows 7 Pro virtual machine within VMware Workstation 8. The disk is a thin-provisioned 2TB vmdk. Within the VM host, we quick-formatted the drive as NTFS. Then we set up the deduplication virtual disk device and selected 2 TB as its size. See screenshot attached for complete deduplication settings.
The client that connects to the iSCSI target is a Windows SBS 2008 server. The iSCSI connection worked flawlessly; we formatted the disk (NTFS quick format), then copied the c:\windows folder into a subdirectory on that disk. That resulted in a .spdata file of just over 15GB. Copying that same folder onto the same disk (thus creating a "Copy of Windows" folder) results in the .spdata file growing to double the size.
We're really interested in this product.
Thanks for any input,
Geoff
Re: Evaluate Deduplication- Free Version
Posted: Tue May 08, 2012 10:40 pm
by anton (staff)
I have the impression it's because of the 64KB deduplication block size. Can you create a smaller storage and use a 4KB or 8KB block size? Support people will also take a closer look tomorrow...
Re: Evaluate Deduplication- Free Version
Posted: Thu May 10, 2012 7:43 am
by Bohdan (staff)
What is the cluster size (allocation unit size) of the NTFS "E" volume where the deduplicated virtual disk files are stored?
For a DD block size of 64K, the cluster size should also be 64K.
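Why block size and cluster size interact can be shown with a toy model of fixed-block deduplication. This is a sketch under stated assumptions, not StarWind's implementation: it assumes duplicates are detected by hashing fixed-size blocks at fixed offsets, and every name in it (`stored_bytes`, `disk_with_two_copies`, the 1 MB "disk") is hypothetical. A file system with 4 KB clusters may place a second copy of a file at any 4 KB boundary, which is usually not a 64 KB boundary.

```python
import hashlib

KB = 1024


def stored_bytes(disk: bytes, block_size: int) -> int:
    """Bytes needed if every distinct fixed-size block is stored exactly once."""
    unique = {
        hashlib.sha256(disk[i:i + block_size]).digest()
        for i in range(0, len(disk), block_size)
    }
    return len(unique) * block_size


# Deterministic 256 KB stand-in for one file's data (8192 distinct 32-byte chunks).
payload = b"".join(hashlib.sha256(i.to_bytes(4, "big")).digest() for i in range(8192))


def disk_with_two_copies(second_offset: int) -> bytes:
    """Zeroed 1 MB disk with the payload at offset 0 and again at second_offset."""
    disk = bytearray(1024 * KB)
    disk[0:len(payload)] = payload
    disk[second_offset:second_offset + len(payload)] = payload
    return bytes(disk)


aligned = disk_with_two_copies(512 * KB)           # second copy on a 64 KB boundary
shifted = disk_with_two_copies(512 * KB + 4 * KB)  # second copy on a 4 KB boundary only

for name, disk in (("64K-aligned copy", aligned), ("4K-aligned copy", shifted)):
    for bs in (64 * KB, 4 * KB):
        print(f"{name}, {bs // KB} KB dedup blocks: {stored_bytes(disk, bs) // KB} KB stored")
```

In this model the 4 KB block size finds the duplicate in both layouts, while the 64 KB block size finds it only when the second copy happens to start on a 64 KB boundary, which is what a 64K cluster size on the client file system would guarantee.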
Re: Evaluate Deduplication- Free Version
Posted: Mon Jun 04, 2012 9:19 pm
by caustix
I am experiencing this issue too, with the SPdata file being way too big, even on the initial copy of files to it (nothing is being moved around). Could it be because of a cluster size mismatch between VMGuest/iSCSIFS/spdata?
iSCSI native disk partition formatted (where spdata resides):
Bytes Per Cluster : 65536
VMware Guest:
Bytes Per Cluster : 4096
I am using version 5.8.2013
Host DDDisk chosen 64k block size.
Server 2008 R2 on Guest VM and Starwind server
I have noticed SPdata file being always larger than size of guest disk. - almost double in size since it was started. At first I thought this was just something weird with Server 2008 R2 but I can not figure it out.
Guest usage: 3321GB
SPdata size 4700GB
SPdata test formatted to 20TB but I fear host iSCSI SPData file will grow way too large if this persists.
There is no Pagefile on that guest's drive. I don't know why it is growing so big.
Re: Evaluate Deduplication- Free Version
Posted: Mon Jun 04, 2012 11:05 pm
by caustix
I think matching cluster sizes solved the issue.
I created another DD Disk (64K, made large VMware 5 datastore, spanning a few 2TB VMDK files)
I formatted that guest partition using 64K clusters and I am seeing some good results so far.