[Discuss] ZFS
Edward Ned Harvey
blu at nedharvey.com
Mon Oct 3 11:04:25 EDT 2011
> From: discuss-bounces+blu=nedharvey.com at blu.org [mailto:discuss-
> bounces+blu=nedharvey.com at blu.org] On Behalf Of Tom Metro
>
> > Neither ZFS nor OCFS2 can compete for raw performance with ext4...
>
> Reference?
I was going to comment on that - and then I wasn't - and now I am.
Minimally.
All of the above (and btrfs) have different architectures and very
efficiently written code, so each design will outperform the others in
certain specific ways. For example, ZFS and btrfs both do copy-on-write to
support snapshots, which means they have very efficient mechanisms for
remapping logical blocks to physical blocks. Very efficient, but nonzero.
This has two consequences. (a) If you're not doing any snapshots, the block
remapping is wasted overhead in ZFS and btrfs, so for certain types of
operations ext4 would perform better, due to less overhead. (b) ZFS and
btrfs are both able to perform write aggregation, which ext4 can't. That
means both ZFS and btrfs will outperform ext4 for small random writes
(async mode) by an order of magnitude or two, by remapping a bunch of small
random writes into a single large sequential write. (I measured 20x-40x.)
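To make the write aggregation point concrete, here is a toy sketch in
Python. It is not how ZFS or btrfs are implemented; it only models a disk
as numbered blocks and counts how often the head would have to jump. The
names count_seeks, in_place_writes and aggregated_writes are made up for
this illustration, and the model ignores caching, metadata updates and
everything else a real filesystem does.

```python
import random


def count_seeks(physical_blocks):
    """Count jumps to a block that does not directly follow the previous one."""
    seeks = 0
    prev = None
    for block in physical_blocks:
        if prev is None or block != prev + 1:
            seeks += 1
        prev = block
    return seeks


def in_place_writes(logical_blocks):
    """Overwrite in place: each logical block lives at a fixed physical block."""
    return list(logical_blocks)


def aggregated_writes(logical_blocks, next_free):
    """Copy-on-write: every write lands on the next free physical block.

    Returns the physical blocks in the order they are written, plus the
    logical-to-physical remapping that the filesystem now has to maintain.
    """
    mapping = {}
    physical = []
    for logical in logical_blocks:
        mapping[logical] = next_free
        physical.append(next_free)
        next_free += 1
    return physical, mapping


rng = random.Random(0)  # fixed seed so the run is repeatable
writes = rng.sample(range(100_000), 1000)  # 1000 small random writes

in_place_seeks = count_seeks(in_place_writes(writes))
physical, mapping = aggregated_writes(writes, next_free=100_000)
aggregated_seeks = count_seeks(physical)

print("in-place seeks:  ", in_place_seeks)
print("aggregated seeks:", aggregated_seeks)
```

In this model the in-place case seeks for nearly every write, while the
aggregated case seeks once and then streams. The mapping dictionary is the
"very efficient, but nonzero" remapping cost from point (a).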
ZFS does storage multitiering, which means you can use things like SSDs as
a dedicated log device to accelerate even the small sync mode writes (a
factor of 5x-10x), and you can use large SSDs etc. as a second-level cache
(L2ARC) to extend your cache instead of buying infinite RAM.
But in the typical use case, btrfs and ZFS have snapshotting enabled, with
automated snapshots being created and destroyed. So if you've got a large
sequential file with a bunch of random writes taking place inside it (for
example a database, or an EDA simulation), btrfs and ZFS will perform
those writes much faster than ext4. But the end result is file
fragmentation, which means ext4 could sequentially read the file back much
faster. Again, by an order of magnitude or more.
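The fragmentation side of that tradeoff can be sketched the same way. This
is again a toy model in Python under stated assumptions, not real
filesystem behavior: a file starts out as one contiguous run of blocks,
500 random blocks get rewritten, and we count how many contiguous extents
a sequential read then has to visit. count_extents and the two mapping
dictionaries are hypothetical names for this example only.

```python
import random


def count_extents(mapping, n_blocks):
    """Count contiguous physical runs seen when reading logical blocks in order."""
    extents = 0
    prev = None
    for logical in range(n_blocks):
        physical = mapping[logical]
        if prev is None or physical != prev + 1:
            extents += 1
        prev = physical
    return extents


n_blocks = 10_000

# Both files start as one contiguous run: logical block i is at physical block i.
in_place_map = {logical: logical for logical in range(n_blocks)}
cow_map = dict(in_place_map)

rng = random.Random(1)  # fixed seed so the run is repeatable
next_free = n_blocks    # first free physical block after the file

for logical in rng.sample(range(n_blocks), 500):
    # Overwrite in place: the block stays where it is, so in_place_map is unchanged.
    # Copy-on-write: the new data goes to the next free block and is remapped.
    cow_map[logical] = next_free
    next_free += 1

in_place_extents = count_extents(in_place_map, n_blocks)
cow_extents = count_extents(cow_map, n_blocks)

print("in-place extents:     ", in_place_extents)
print("copy-on-write extents:", cow_extents)
```

The in-place file is still a single extent, while the copy-on-write file
has been cut into hundreds of pieces, each one a potential seek on
spinning disks when you read the file back sequentially.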
The list of architectural differences, and resultant performance
differences, could go on for the length of a book.
It's unfair and inaccurate to make the generalization that one is better or
faster than the other. They're each better in specific cases. Know the
architectural gains and losses of each one, and use the best tool for
whatever job you're trying to do.