Topic: ZFS Dedup ... I'm excited!
stinky
Posts: 3309
Location: USA
In early November a Sun blog announced that deduplication for ZFS was complete. Shortly thereafter it was added to the development branch of OpenSolaris.

It just happened that I've started to set up a bunch of blade servers for LAMP stacks and various *nix utilities, so I decided to set up an iSCSI server for hosting some nice-to-have but non-critical data like CentOS yum repositories.

I connected up a spare Dell 1950 to an MD1000 ( 12 x 400GB 15k disks ) and was ready to go.

Installing OpenSolaris is a breeze via LiveCD ( which means you can test hardware compatibility at the same time ) and setting up a ZFS filesystem with dedup takes 3 simple CLI commands. Add a few more commands and you're sharing blocks inside the ZFS pool via iSCSI.
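For anyone wondering what those three commands might look like, here is a rough sketch rather than the exact commands used for this box. It is wrapped in Python so it only prints the commands and never touches a disk. The `raidz` layout, the device names, and the helper name `dedup_setup_commands` are all assumptions for illustration; `tank01` and `util01` are the names that show up later in the thread. The iSCSI export step is left out on purpose.

```python
import shlex


def dedup_setup_commands(pool, disks, filesystem):
    """Return argv lists: create a pool, enable dedup, create one filesystem.

    Dry-run helper only. The raidz layout is an assumption, not something
    stated in the thread.
    """
    return [
        ["zpool", "create", pool, "raidz", *disks],   # 1. build the pool
        ["zfs", "set", "dedup=on", pool],             # 2. turn on dedup
        ["zfs", "create", f"{pool}/{filesystem}"],    # 3. carve out a filesystem
    ]


# Hypothetical Solaris-style device names; use whatever your system reports.
disks = ["c1t0d0", "c1t1d0", "c1t2d0"]

for argv in dedup_setup_commands("tank01", disks, "util01"):
    print(shlex.join(argv))
```

Setting `dedup=on` at the pool's root dataset means child filesystems inherit it, which is why it comes before the `zfs create`.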

Copying our current yum mirror gave me a dedup factor of 1.13x. We run two yum servers for redundancy, and not surprisingly, after setting up our second yum repository our dedup factor jumped to 2.35x. Which means for a little less storage than a single yum mirror we've got two!
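To see why a second copy of the same mirror roughly doubles the factor, here is a toy model of block-level dedup. It is only a sketch: it hashes fixed-size blocks with SHA-256 and divides the number of blocks written by the number of unique blocks. The 4096-byte block size, the function name `dedup_ratio`, and the synthetic "mirror" data are assumptions for illustration, not the real repository or ZFS internals.

```python
import hashlib

BLOCK_SIZE = 4096  # illustrative block size, not a ZFS setting from this thread


def dedup_ratio(datasets, block_size=BLOCK_SIZE):
    """Blocks written divided by unique blocks, across all datasets."""
    logical_blocks = 0
    unique_hashes = set()
    for data in datasets:
        for offset in range(0, len(data), block_size):
            block = data[offset:offset + block_size]
            unique_hashes.add(hashlib.sha256(block).digest())
            logical_blocks += 1
    return logical_blocks / len(unique_hashes)


# Synthetic "mirror": 8 distinct blocks plus 1 block that repeats an earlier one.
distinct = b"".join(hashlib.sha256(str(i).encode()).digest() for i in range(1024))
mirror = distinct + distinct[:BLOCK_SIZE]

print(dedup_ratio([mirror]))          # 1.125 (a little internal duplication)
print(dedup_ratio([mirror, mirror]))  # 2.25  (second copy adds no new blocks)
```

The real numbers in the post (1.13x going to 2.35x) do not double exactly, so the two repositories presumably are not byte-for-byte identical.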

I would love to see this combined with something along the lines of backblaze pods ( http://blog.backblaze.com/2009/09/01/petabytes-on-a-budget-how-to-build-cheap-cloud-storage/ ). While not quite what I'd use for an enterprise storage solution, I think it's a brilliant way to get cheap storage for a cloud based backup solution, or even a server farm hosting web applications etc.

What I'm really keen to do is implement this for backups: go back to old-school tarballs or a solid Linux backup tool like zAmanda ( hell, even robocopy/rsync ) and then just do a monthly archive to tape. I hate the direction that VTL is going, and think this will give us a brilliant alternative.


HerbalLizard
Posts: 3460
Location: Queenstown, New Zealand
This post gave me wood this morning, cheers you have made my day
trog
AGN Admin
Posts: 28613
Location: Brisbane, Queensland

That sounds pretty awesome. If I ever get some spare time I really want to set up a ZFS-based file server at home.

My main issue atm is that I want RAID-esque redundancy in that I want to be able to know that if a drive dies, all I have to do is pull it out and replace it with another drive. I don't want to have to think about drive sizes, brands, etc, matching - I just want to be able to throw in random drives.

We talked about it a bit on IRC and couldn't figure out the answer - if using RAID-Z, can you use drives of different sizes - eg, 500GB, 500GB, 500GB, 1TB - as long as you're happy to "lose" the extra space on the 1TB drive?
Jim
Posts: 10896
Location: Brisbane, Queensland
the wikipedia article for zfs talks about the ability to swap in larger drives one at a time letting parity be restored. so, yes
it's the same thing we did with our nas in the office using raid5 and lvm, incidentally
trog
AGN Admin
Posts: 28615
Location: Brisbane, Queensland

Jim wrote: "the wikipedia article for zfs talks about the ability to swap in larger drives one at a time letting parity be restored. so, yes
it's the same thing we did with our nas in the office using raid5 and lvm, incidentally"
I know it's possible for "regular ZFS" but Dewi was saying he wasn't sure if you could do it if you were running ZFS in RAID-Z
TicMan
Posts: 5475
Location: Melbourne, Victoria
f***ing hell this is awesome, now to find something to apply it to.
Jim
Posts: 10897
Location: Brisbane, Queensland
without raid, there isn't parity or 'healing', there's only integrity checking

Capacity expansion is normally achieved by adding groups of disks as a top-level vdev: simple device, RAID-Z, RAID-Z2, RAID-Z3, or mirrored. Newly written data will dynamically start to use all available vdevs. It is also possible to expand the array by iteratively swapping each drive in the array with a bigger drive and waiting for ZFS to heal itself — the heal time will depend on amount of stored information, not the disk size. The new free space will not be available until all the disks have been swapped.
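A back-of-envelope sketch of the two cases discussed above: mixed drive sizes in one RAID-Z group, and swapping drives one at a time. It assumes the simple rule that every member of a RAID-Z group counts as the size of the smallest drive, and it ignores metadata and padding overhead, so treat the numbers as rough. The function names are made up for illustration.

```python
def raidz_usable(sizes_gb, parity=1):
    """Approximate usable space: (drives - parity) * smallest drive."""
    if len(sizes_gb) <= parity:
        raise ValueError("need more drives than parity devices")
    return (len(sizes_gb) - parity) * min(sizes_gb)


def swap_in_larger(sizes_gb, new_size_gb, parity=1):
    """Usable space after each one-at-a-time swap to a bigger drive."""
    drives = list(sizes_gb)
    history = []
    for i in range(len(drives)):
        drives[i] = new_size_gb          # replace one drive, let it resilver
        history.append(raidz_usable(drives, parity))
    return history


# trog's mix: the extra 500GB on the 1TB drive is simply unused.
print(raidz_usable([500, 500, 500, 1000]))   # 1500

# Swapping four 500GB drives for 1TB drives, one by one.
print(swap_in_larger([500, 500, 500, 500], 1000))  # [1500, 1500, 1500, 3000]
```

The last list matches the quoted text: nothing changes until the final drive has been swapped, and then the extra space shows up all at once.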


Hogfather
Posts: 4356
Location: Cairns, Queensland
Jim - I've been wondering about this with my Thecus!

Can I do as suggested above and 'heal' my array to a larger size by replacing the drives one by one?

For example, I have 4x750GB drives at the moment. If I swap in say 1.5TB drives and allow the array to heal after each swap, will I finally double my storage when the last drive is rebuilt?
XaartaX
Posts: 328
Location: Adelaide, South Australia

Pity it doesn't support variable length dedupe.
Jim
Posts: 10898
Location: Brisbane, Queensland
yeh you can do that with your thecus hogfather, at least I know you can if you chose lvm/ext3 - I'm not sure what it would let you do if you had chosen ZFS. not because ZFS can't do it, but because I'm not sure how they configure the underlying ZFS, or whether zfs-fuse has limitations in that area

stinky
Posts: 3310
Location: USA
XaartaX wrote: "Pity it doesn't support variable length dedupe."


Variable-length dedup is a performance hit. On something like DataDomain ( say goodbye to the better part of a million dollars ), which is predominantly for backups, it makes sense to do variable length, as performance on backup data isn't that important ( considering magnetic tape is the benchmark ). ZFS dedup, on the other hand, is intended for more active data, which means performance is important.
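A toy comparison of the two approaches, for anyone who hasn't seen the difference. This is not how DataDomain or ZFS actually chunk data; it is a minimal sketch where "variable length" means cutting a chunk wherever a CRC of the last few bytes hits a chosen value (content-defined chunking). The window, divisor, and block sizes and all function names are assumptions picked to keep the example small. Insert one byte at the front of a stream and every fixed-size block shifts, while the content-defined chunks line up again after the first cut.

```python
import random
import zlib


def fixed_chunks(data, size=64):
    """Split into fixed-size blocks (the last one may be short)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def content_defined_chunks(data, window=16, divisor=64):
    """Cut after any position where the CRC of the trailing window matches.

    Recomputes the CRC for every window, which is slow but easy to follow;
    it also hints at why this costs more than fixed-size blocks.
    """
    chunks = []
    start = 0
    for end in range(window, len(data) + 1):
        if zlib.crc32(data[end - window:end]) % divisor == 0:
            chunks.append(data[start:end])
            start = end
    if start < len(data):
        chunks.append(data[start:])
    return chunks


def shared(a, b):
    """Number of distinct chunks that appear in both lists."""
    return len(set(a) & set(b))


rng = random.Random(0)
original = bytes(rng.getrandbits(8) for _ in range(8192))
shifted = b"X" + original  # same stream with one byte inserted at the front

print("fixed blocks shared:   ", shared(fixed_chunks(original), fixed_chunks(shifted)))
print("variable chunks shared:", shared(content_defined_chunks(original),
                                        content_defined_chunks(shifted)),
      "of", len(set(content_defined_chunks(original))))
```

That resync behaviour is why variable length suits backup streams like tarballs, where small changes shift everything after them, and the per-byte hashing is where the extra cost comes from.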
stinky
Posts: 3338
Location: USA
Thought I'd post a follow up here. My filesystem is currently at 2.39x dedup.


NAME     SIZE   ALLOC   FREE   CAP  DEDUP  HEALTH  ALTROOT
tank01  4.72T   52.7G  4.67T    1%  2.39x  ONLINE  -



Filesystem      1K-blocks  Used  Available  Use%  Mounted on
tank01/util01        1.5T   63G       1.5T    5%  /tank01/util01
tank01/util02        1.5T   63G       1.5T    5%  /tank01/util02




You can see the results of the dedup really well in the above output from 'df'. I have quota'd 1.5T from tank01 to each of the util folders. Both have pretty much the same data set ( CentOS repos for network installs etc ). 'df' sees them as having 63G used each, but the zpool list shows only 52.7G actually used across the zpool.
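The numbers line up if you do the arithmetic: what 'df' reports as used across both filesystems, divided by what the pool has actually allocated, gives the dedup factor. A quick sketch using the figures above ( the function name is just for illustration, and the inputs are the rounded values the tools print, so the match is approximate ):

```python
def dedup_factor(logical_used, allocated):
    """Ratio of data as the filesystems see it to space the pool allocated."""
    return logical_used / allocated


logical_used = 63 + 63   # G used per 'df', util01 + util02
allocated = 52.7         # G allocated per 'zpool list'

print(f"{dedup_factor(logical_used, allocated):.2f}x")   # 2.39x
print(f"{logical_used - allocated:.1f}G saved")          # 73.3G saved
```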

Also been reading up on some other cool features, like putting the read cache ( L2ARC ) and the intent log ( ZIL ) on SSD, which lets you use SSD as a cache for your ZFS filesystem.

http://blogs.sun.com/brendan/entry/test

It can also do snapshotting ( not uncommon on filesystems these days ) and data replication ( locally or piped over ssh: http://www.markround.com/archives/38-ZFS-Replication.html ).

Not to mention ZFS's native support for NFS/iSCSI/SMB sharing.

Apart from a fancy GUI to manage this stuff, it's getting super close to being a full blown SAN/NAS solution to rival many of the large vendors.
