Dedup Speeds
Posted: Thu Jun 14, 2012 8:00 pm
by sloopy
What are the expected dedup speeds? Granted, right now I'm testing with a dedup image on a USB drive. My .spbitmap file is 512MB and my .spmetadata file is 2GB.
I test by copying a 20GB file. It runs at around 40MB/sec for the first 2GB of the transfer and then crawls down to about 12MB/sec from that point forward. I assume the .spmetadata file acts as a cache holding non-deduplicated data, and the disk speed then determines how fast it gets processed into the .spdata file as dedup data.
What would be best practice if I plan on working with very large files? I plan on using it to store MSSQL backups, which range anywhere from a few hundred megs up to 640GB. Do I need to specify a large cache when initially configuring the device? I thought that setting was RAM cache, but it appears to control the size of the .spmetadata file.
For the test above I selected write-back, a 50,000ms cache time, a 2GB cache size, and a 4K block size. If there is a better config for my use case, please let me know and I'll run some tests on it.
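(For reference, the copy test above was a plain file copy; a chunked copy like the following minimal sketch makes the throughput drop visible while it happens. The paths are placeholders, not the actual test setup.)

import time

SRC = r"D:\test\source_20GB.bin"            # placeholder: any large test file
DST = r"X:\dedup\copy_test.bin"             # placeholder: the StarWind dedup volume
CHUNK = 4 * 1024 * 1024                     # copy in 4 MB chunks

copied = 0
last_copied = 0
start = time.time()
last_report = start

with open(SRC, "rb") as src, open(DST, "wb") as dst:
    while True:
        buf = src.read(CHUNK)
        if not buf:
            break
        dst.write(buf)
        copied += len(buf)
        now = time.time()
        if now - last_report >= 5:          # print the rate over the last 5 seconds
            rate = (copied - last_copied) / 1024**2 / (now - last_report)
            print(f"{copied / 1024**2:.0f} MB copied, current {rate:.1f} MB/s")
            last_copied, last_report = copied, now

elapsed = time.time() - start
print(f"done: {copied / 1024**2:.0f} MB in {elapsed:.0f} s, "
      f"average {copied / 1024**2 / elapsed:.1f} MB/s")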
Re: Dedup Speeds
Posted: Thu Jun 14, 2012 10:25 pm
by anton (staff)
The upcoming engine we'll use as the core for V6 will give you ~90-95% of raw disk performance for sequential writes (uncached) and RAM performance for cached ones. There's no need to tune the current one (~50-60% of raw disk), as it's going to be replaced.
Yes, assign as much RAM write-back cache as you can. For now you cannot run dedupe devices in HA; with V6 you will be able to, so it will be safe.
A 4KB block size is optimal. We're going to lock the choices down to 4, 8 and 16KB blocks.
Intel Iometer is the proper test. Copying a file produces the kind of pure sequential write you'll hardly find anywhere in real workloads.
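(For comparison, this is roughly what a pure sequential-write test measures, as opposed to a copy, which also has to read the source. A minimal sketch only, not an Iometer replacement; the path, block size and total size are placeholders, and generating random data can itself become the limit on a fast disk.)

import os
import time

TARGET = r"X:\dedup\seqwrite_test.bin"   # placeholder: a file on the dedup volume
BLOCK = 64 * 1024                        # 64 KB sequential writes
TOTAL = 4 * 1024**3                      # write 4 GB in total

written = 0
start = time.time()
with open(TARGET, "wb", buffering=0) as f:
    while written < TOTAL:
        # fresh random data each time, otherwise a dedup target would just
        # collapse a stream of identical blocks and the number means nothing
        f.write(os.urandom(BLOCK))
        written += BLOCK
    f.flush()
    os.fsync(f.fileno())                 # don't let the local OS cache flatter the result

elapsed = time.time() - start
print(f"sequential write: {written / 1024**2 / elapsed:.1f} MB/s")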
Re: Dedup Speeds
Posted: Tue Jul 31, 2012 8:12 pm
by sloopy
I'm now running v.5.8.2013 and it's still giving slow speeds for the dedup. It's a server with 4GB RAM, and just like in my tests on the previous version, it runs okay until the first few GB are copied, then it slows down. It seems like that is the point where it hits a bottleneck: the dedup processing runs slower than the rate at which data is being thrown at it.
For us, copying a file as a sequential write is exactly the right test, because that is precisely what we'll be using it for: storing giant MSSQL backups up to 600+GB in size.
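(A back-of-the-envelope check that those numbers hang together: the 2 GB burst, 40 MB/s and 12 MB/s figures come from the earlier test; the rest is arithmetic.)

cache_gb, fast_mb_s = 2, 40       # observed burst: first ~2 GB at ~40 MB/s
total_gb, slow_mb_s = 20, 12      # remaining 18 GB at ~12 MB/s

fast_s = cache_gb * 1024 / fast_mb_s                 # ~51 s for the cached burst
slow_s = (total_gb - cache_gb) * 1024 / slow_mb_s    # ~1536 s for the rest
avg = total_gb * 1024 / (fast_s + slow_s)
print(f"whole copy: ~{fast_s + slow_s:.0f} s, average ~{avg:.1f} MB/s")
# -> ~1587 s, ~12.9 MB/s: once the file is much larger than the cache,
#    the steady-state dedup rate dominates the overall time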
Re: Dedup Speeds
Posted: Tue Jul 31, 2012 8:18 pm
by anton (staff)
The caches on both sides (client and server) are filling up, which is why you see the slowdowns. Is there any chance you could share real numbers with us?
1) Raw disk performance. Run ATTO against the drive hosting the StarWind images.
2) Run NTttcp and iperf to check and report the network throughput.
3) Copy a file and report the real speeds, both for the initial run and for the whole copy operation (a sketch of that measurement is below).
4) Wait for the upcoming dedupe engine. With it you should see 90-95% of raw disk performance (NTFS overhead is about 10%, VMFS about 15%).
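(For point 3, a compact variant of the measurement sketch in the first post that reports just the two requested numbers, the initial-run average and the whole-copy average. The paths and the 2 GB cutoff are placeholders.)

import time

SRC = r"D:\sqlbackups\full.bak"     # placeholder source file
DST = r"X:\dedup\full.bak"          # placeholder: the StarWind dedup volume
CHUNK = 4 * 1024 * 1024
INITIAL = 2 * 1024**3               # treat the first 2 GB as the "initial run"

copied = 0
initial_s = None
start = time.time()
with open(SRC, "rb") as src, open(DST, "wb") as dst:
    while True:
        buf = src.read(CHUNK)
        if not buf:
            break
        dst.write(buf)
        copied += len(buf)
        if initial_s is None and copied >= INITIAL:
            initial_s = time.time() - start
total_s = time.time() - start

if initial_s:
    print(f"initial run: {INITIAL / 1024**2 / initial_s:.1f} MB/s")
print(f"whole copy:  {copied / 1024**2 / total_s:.1f} MB/s")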