[Discuss] Update on Raspberry Pi4 ZFS Problems
Kent Borg
kentborg at borg.org
Sat Sep 3 16:18:01 EDT 2022
On 9/3/22 12:58, Kent Borg wrote:
> For what I am doing, the slow ports will be plenty fast, and I need to
> get this thing working. I'm going to try to plow forward.
I decided to try *one* more experiment, before installing and
configuring mail server software: I used fdisk to repartition the two
disks, a single Linux partition on each, put XFS on each, re-plugged
them into a fast USB port, mounted them, and fired up my file copying
torture test, in stereo:
Copy in /usr, then run 8 backgrounded "rsync -a"s to make copies,
do that on both disks at once. Once all of that was done, move
each gaggle of directories into a single directory, then fire up 7
more backgrounded "rsync -a"s to copy it, again on both disks at
once…
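For anyone who wants to follow along, here is a rough sketch of the prep
and the test. The device names, mount points, and the scripting are
reconstructed for illustration; they are not the literal commands I typed:

    # Assumed devices; check yours with lsblk before touching anything.
    for dev in /dev/sda /dev/sdb; do
        # One Linux partition spanning the disk, then XFS on it.
        printf 'o\nn\np\n1\n\n\nw\n' | sudo fdisk "$dev"
        sudo mkfs.xfs -f "${dev}1"
    done
    sudo mkdir -p /mnt/d1 /mnt/d2
    sudo mount /dev/sda1 /mnt/d1       # both on the fast USB 3 ports
    sudo mount /dev/sdb1 /mnt/d2

    # The torture test, run against both mounts at once:
    for m in /mnt/d1 /mnt/d2; do
        (
            rsync -a /usr/ "$m/usr.0/"               # seed copy of /usr
            for i in $(seq 1 8); do
                rsync -a "$m/usr.0/" "$m/usr.$i/" &  # 8 parallel copies
            done
            wait
            mkdir "$m/gaggle" && mv "$m"/usr.* "$m/gaggle/"
            for i in $(seq 1 7); do
                rsync -a "$m/gaggle/" "$m/gaggle.$i/" &  # 7 more copies
            done
            wait
        ) &
    done
    wait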
The amount of RAM "used" seems to be pretty stable at around 2.5GB; no,
I guess it is climbing slowly (those 42 long-lived rsync processes are
maybe leaking a little, or maybe just efficiently using RAM). Somewhat
less than 5GB for "buff/cache", which seems stable, or actually falling.
(Using less cache as the cache gradually synchronizes the processes…?
And, as they synchronize and performance details change, the various
things rsync does to optimize performance might legitimately need more
RAM. But it keeps climbing; smells like a leak to me.)
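For what it is worth, I am just eyeballing the "used" and "buff/cache"
columns that free reports, something along these lines (the 60-second
interval is arbitrary):

    # Snapshot memory use once a minute; "used" and "buff/cache" are the
    # columns I am eyeballing above.
    while true; do
        date
        free -h
        sleep 60
    done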
Watching /var/log/syslog for some time now, I see boring stuff slowly go
by...until I finally see an error!
Sep 3 12:28:01 la kernel: [ 4188.087156] NOHZ tick-stop error:
Non-RCU local softirq work is pending, handler #10!!!
And looking back in syslog I see three more of those in the last few
days, always handler #10, too. I guess doing NOHZ on every supported CPU
is hard to get right.
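The watching itself is nothing fancy, roughly this kind of thing (the
grep pattern is just my guess at what is worth flagging, not a canonical
list of failure strings):

    # Follow syslog and flag anything that looks like trouble;
    # the pattern is only a rough net, not an exhaustive one.
    tail -F /var/log/syslog | grep -i -E 'error|fail|i/o|xfs'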
But I'm not getting any I/O errors: XFS, talking to spinning WD disks,
over fast USB, on a Pi 4, seems solid as hell.
But ZFS can't do it. ZFS might be mature and production ready and
reliable as hell...but not on this hardware with this OS.
I guess it is back to XFS on top of Linux SW RAID 1 for this project.
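That setup would look roughly like this; the device names and the mount
point are placeholders, and I have not settled on exact mdadm options:

    # Mirror the two disks with md, then put XFS on the mirror.
    # /dev/sda1 and /dev/sdb1 are the single partitions described above.
    sudo mdadm --create /dev/md0 --level=1 --raid-devices=2 \
        /dev/sda1 /dev/sdb1
    sudo mkfs.xfs /dev/md0
    sudo mkdir -p /srv/mail
    sudo mount /dev/md0 /srv/mail
    # Persist the array definition so it assembles at boot:
    sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf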
My test is coming up on 300GB written to these disks, I've seen two
hourly cron messages in syslog, and still no I/O errors; time to hit
send on this message.
-kb, the Kent who is disappointed it doesn't work, and that he had to
spend so much time to get to that conclusion.