|
|
| Author |
|
|||||||
|
stinky
Posts: 3309
Location: USA
|
In early November a Sun blog announced that deduplication for ZFS was complete. Shortly thereafter it was added to the development branch of OpenSolaris.
It just so happened that I'd started setting up a bunch of blade servers for LAMP stacks and various *nix utilities, so I decided to set up an iSCSI server for hosting some nice-to-have but non-critical data like CentOS yum repositories. I connected a spare Dell 1950 to an MD1000 ( 12 x 400GB 15k disks ) and was ready to go. Installing OpenSolaris is a breeze via LiveCD ( which means you can test hardware compatibility at the same time ), and setting up a ZFS filesystem with dedup takes 3 simple CLI commands. Add a few more commands and you're sharing blocks inside the ZFS pool via iSCSI.
Copying our current yum mirror gave me a dedup factor of 1.13x. We run two yum servers for redundancy, and not surprisingly, after setting up our second yum repository our dedup factor jumped to 2.35x. That means that for a little less storage than a single yum mirror we've got two!
I would love to see this combined with something along the lines of Backblaze pods ( http://blog.backblaze.com/2009/09/01/petabytes-on-a-budget-how-to-build-cheap-cloud-storage/ ). While not quite what I'd use for an enterprise storage solution, I think it's a brilliant way to get cheap storage for a cloud-based backup solution, or even a server farm hosting web applications etc.
What I'm really keen to do is implement this for backups: go back to old-school tarballs or a solid Linux backup tool like zAmanda ( hell, even robocopy/rsync ) and then just do a monthly archive to tape. I hate the direction that VTL is going, and think this will give us a brilliant alternative. |
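For anyone wondering why a second copy of the mirror roughly doubles the dedup factor, here is a minimal Python sketch of fixed-size block dedup. It is a toy model, not how ZFS is implemented: the 4096-byte block size, the `fake_package` helper and the package names are all made up for illustration, and SHA-256 is used simply as a block fingerprint.

```python
import hashlib

BLOCK_SIZE = 4096  # illustrative assumption; not taken from the post


def dedup_ratio(files, block_size=BLOCK_SIZE):
    """Return logical block count divided by unique block count."""
    seen = set()
    logical = 0
    for data in files:
        for off in range(0, len(data), block_size):
            block = data[off:off + block_size]
            seen.add(hashlib.sha256(block).digest())
            logical += 1
    return logical / len(seen)


def fake_package(seed, blocks):
    """Deterministic pseudo-random payload standing in for an RPM (hypothetical)."""
    out = bytearray()
    counter = 0
    while len(out) < blocks * BLOCK_SIZE:
        out += hashlib.sha256(f"{seed}:{counter}".encode()).digest()
        counter += 1
    return bytes(out[:blocks * BLOCK_SIZE])


# A toy "mirror": three distinct packages plus one that appears twice.
mirror = [fake_package(name, 8) for name in ("httpd", "mysql", "php")]
mirror.append(mirror[0])

one_mirror = dedup_ratio(mirror)            # duplicates inside one mirror only
two_mirrors = dedup_ratio(mirror + mirror)  # second, identical mirror added

print(f"one mirror:  {one_mirror:.2f}x")
print(f"two mirrors: {two_mirrors:.2f}x")
```

In this toy data set the second mirror adds no new unique blocks, so the ratio exactly doubles. The real numbers ( 1.13x to 2.35x ) are a little better than double, which suggests the two repositories were not laid out block-for-block the same.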
|||||||
| #0 05:24am 09/12/09 |
|
|||||||
|
HerbalLizard
Posts: 3460
Location: Queenstown, New Zealand
|
This post gave me wood this morning. Cheers, you have made my day.
|
|||||||
| #1 06:33am 09/12/09 |
|
|||||||
|
trog
AGN Admin
Posts: 28613
Location: Brisbane, Queensland
|
That sounds pretty awesome. If I ever get some spare time I really want to set up a ZFS-based file server at home. My main issue atm is that I want RAID-esque redundancy: I want to know that if a drive dies, all I have to do is pull it out and replace it with another drive. I don't want to have to think about matching drive sizes, brands, etc. I just want to be able to throw in random drives. We talked about it a bit on IRC and couldn't figure out the answer: if using RAID-Z, can you use drives of different sizes ( eg, 500GB, 500GB, 500GB, 1TB ) as long as you're happy to "lose" the extra space on the 1TB drive? |
|||||||
| #2 10:12am 09/12/09 |
|
|||||||
|
Jim
Posts: 10896
Location: Brisbane, Queensland
|
the wikipedia article for zfs talks about the ability to swap in larger drives one at a time letting parity be restored. so, yes
it's the same thing we did with our nas in the office using raid5 and lvm, incidentally |
|||||||
| #3 10:43am 09/12/09 |
|
|||||||
|
trog
AGN Admin
Posts: 28615
Location: Brisbane, Queensland
|
"the wikipedia article for zfs talks about the ability to swap in larger drives one at a time letting parity be restored. so, yes"
I know it's possible for "regular ZFS" but Dewi was saying he wasn't sure if you could do it if you were running ZFS in RAID-Z |
|||||||
| #4 10:52am 09/12/09 |
|
|||||||
|
TicMan
Posts: 5475
Location: Melbourne, Victoria
|
f***ing hell this is awesome, now to find something to apply it to.
|
|||||||
| #5 11:03am 09/12/09 |
|
|||||||
|
Jim
Posts: 10897
Location: Brisbane, Queensland
|
without raid, there isn't parity or 'healing', there's only integrity checking
Capacity expansion is normally achieved by adding groups of disks as a top-level vdev: simple device, RAID-Z, RAID-Z2, RAID-Z3, or mirrored. Newly written data will dynamically start to use all available vdevs. It is also possible to expand the array by iteratively swapping each drive in the array with a bigger drive and waiting for ZFS to heal itself — the heal time will depend on amount of stored information, not the disk size. The new free space will not be available until all the disks have been swapped. |
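To put rough numbers on the mixed-size question and the swap-one-drive-at-a-time approach, here is a small Python sketch. It assumes the usual rule of thumb that a RAID-Z group treats every member as if it were the size of the smallest one, and it ignores metadata overhead; `raidz_usable` is a made-up helper, not a ZFS tool.

```python
def raidz_usable(sizes_gb, parity=1):
    """Approximate usable space of one RAID-Z group (rule of thumb only)."""
    if len(sizes_gb) <= parity:
        raise ValueError("need more disks than parity devices")
    # Every member counts as the smallest disk; 'parity' disks' worth is lost.
    return (len(sizes_gb) - parity) * min(sizes_gb)


# The mixed set from the question above: three 500GB drives and one 1TB drive.
disks = [500, 500, 500, 1000]
history = [raidz_usable(disks)]

# Swap each 500GB drive for a 1TB drive, one at a time, letting the
# array resilver between swaps.
for i, size in enumerate(disks):
    if size < 1000:
        disks[i] = 1000
        history.append(raidz_usable(disks))

print(history)
```

The usable figure stays flat until the last small drive is gone, which matches the point above that the new free space is not available until all the disks have been swapped.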
|||||||
| #6 11:13am 09/12/09 |
|
|||||||
|
Hogfather
Posts: 4356
Location: Cairns, Queensland
|
Jim - I've been wondering about this with my Thecus!
Can I do as suggested above and 'heal' my array to a larger size by replacing the drives one by one? For example, I have 4x750GB drives at the moment. If I swap in, say, 1.5TB drives and allow the array to heal after each swap, will I finally double my storage when the last drive is rebuilt? |
|||||||
| #7 11:38am 09/12/09 |
|
|||||||
|
XaartaX
Posts: 328
Location: Adelaide, South Australia
|
Pity it doesn't support variable length dedupe. |
|||||||
| #8 11:53am 09/12/09 |
|
|||||||
|
Jim
Posts: 10898
Location: Brisbane, Queensland
|
yeh you can do that with your thecus hogfather, at least I know you can if you chose lvm/ext3 - I'm not sure what it would let you do if you had chosen ZFS. not because ZFS can't do it, but because I'm not sure how they configure the underlying ZFS, or whether zfs-fuse has limitations in that area
|
|||||||
| #9 12:02pm 09/12/09 |
|
|||||||
|
stinky
Posts: 3310
Location: USA
|
"Pity it doesn't support variable length dedupe."
Variable length dedup is a performance hit. On something like DataDomain ( say goodbye to the better part of a million dollars ), which is predominantly for backups, it makes sense to do variable length, as performance on backup data isn't that important ( considering magnetic tape is the benchmark ). On ZFS, dedup is intended for more active data, which means performance is important. |
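A quick Python sketch of what fixed-length dedup gives up in exchange for that speed. This is only a toy under stated assumptions: the 4096-byte block size and both helper functions are invented for the example, and it says nothing about how ZFS or DataDomain are implemented internally.

```python
import hashlib

BLOCK = 4096  # illustrative block size (assumption)


def block_hashes(data, block=BLOCK):
    """Fingerprint every fixed-size block of 'data'."""
    return {hashlib.sha256(data[i:i + block]).digest()
            for i in range(0, len(data), block)}


def pseudo_random_bytes(n, seed="demo"):
    """Deterministic filler data, so the example is repeatable."""
    out = bytearray()
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(f"{seed}:{counter}".encode()).digest()
        counter += 1
    return bytes(out[:n])


original = pseudo_random_bytes(64 * BLOCK)
appended = original + b"tail"   # new data added at the end
shifted = b"X" + original       # one byte inserted at the front

shared_appended = len(block_hashes(original) & block_hashes(appended))
shared_shifted = len(block_hashes(original) & block_hashes(shifted))

print("blocks still shared after append:", shared_appended)
print("blocks still shared after 1-byte insert:", shared_shifted)
```

Appending leaves every existing block boundary alone, so all the old blocks still match. Inserting a single byte at the front moves every boundary, so fixed-size blocks stop matching; that is the case variable length chunking is meant to handle, at the cost of more work per write.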
|||||||
| #10 02:31am 10/12/09 |
|
|||||||
|
stinky
Posts: 3338
Location: USA
|
Thought I'd post a follow-up here. My filesystem is currently at 2.39x dedup.
You can see the results of the dedup really well in the output from 'df'. I have quota'd 1.5T from tank01 to each of the util folders. Both have pretty much the same data set ( CentOS repos for network installs etc ). 'df' sees them as having 63G used each, but 'zpool list' shows only 52.7G actually used across the zpool.
I've also been reading up on some other cool features, like using SSDs as cache devices for your ZFS pool ( L2ARC for reads, a separate ZIL log device for synchronous writes ): http://blogs.sun.com/brendan/entry/test
It can also do snapshotting ( not uncommon on filesystems these days ) and data replication ( locally or piped over ssh: http://www.markround.com/archives/38-ZFS-Replication.html ), not to mention ZFS's native support for NFS/iSCSI/SMB sharing. Apart from a fancy GUI to manage this stuff, it's getting super close to being a full-blown SAN/NAS solution to rival many of the large vendors. |
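As a sanity check, the figures quoted above line up with the reported ratio. A minimal Python sketch, using only the rounded numbers from the post and ignoring parity and metadata overhead:

```python
# Figures quoted in the post (rounded as reported by 'df' and 'zpool list').
logical_gb = [63, 63]   # what 'df' reports for each of the two util folders
allocated_gb = 52.7     # what 'zpool list' reports as used across the pool

ratio = sum(logical_gb) / allocated_gb
saved_gb = sum(logical_gb) - allocated_gb

print(f"dedup ratio: {ratio:.2f}x")
print(f"space saved: {saved_gb:.1f}G")
```

The rounded inputs give a ratio that agrees with the 2.39x the pool reports, so the two 'df' totals and the pool-wide usage are telling the same story.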
|||||||
| #11 06:24am 09/01/10 |
|
|||||||
|
| ||||||||