Discussion:
ZFS RAIDZ1: resilvering at <17.3M/s => abysmally slow ...
O. Hartmann
2017-12-14 14:52:27 UTC
On Thu, 14 Dec 2017 15:46:17 +0100
On Thu, 14 Dec 2017 14:09:39 +0100
I just started the rebuild/resilvering process and am watching the pool crawl at ~18
MB/s. At the moment, there is no load on the array; the host is an Ivy Bridge Xeon
with 4 cores/8 threads at 3.4 GHz and 16 GB of RAM. The HDDs are attached to an
on-board SATA II (300 MB/s max) Intel chip - this just for the record.
Recently, I switched on the "sync" attribute on most of the pools' ZFS
filesystems.
- I also use an SSD for ZIL/L2ARC caching, but it seems to have been unused recently in
FreeBSD CURRENT's ZFS - this from an observer's perspective only.
When scrubbing, I have recently also seen reduced performance on the pool, so I'm
wondering about the low throughput at the very moment when resilvering is in
progress.
If the estimate from "zpool status" is correct, then after two hours I still have to
wait another 100 hours - ~4 days? Oops ... I think something is badly
misconfigured or missing.
...
This is kind of to be expected - for whatever reason, resilvers seem
to go super slow at first and then speed up significantly. Just don't
ask me how long "at first" is - I'd give it several (more) hours.
http://open-zfs.org/wiki/Scrub/Resilver_Performance
-Dimitry
It has already started to get better ;-)

After a while, the throughput is now at 128 MB/s and the estimated time has decreased to
~8 h - that is much more acceptable than 4 days ;-)

Kind regards,
Oliver
--
O. Hartmann

I object to the use or transmission of my data for
advertising purposes or for market or opinion research (§ 28 Abs. 4 BDSG).
Adam Vande More
2017-12-14 15:28:08 UTC
If you are viewing the rate with zpool status, I don't think that is
anywhere close to a real-time rate. I don't know of a way to check that either.
--
Adam
Allan Jude
2017-12-14 15:35:13 UTC
The time estimate is a pure average over the entire length of the scrub
or resilver operation so far. The very start of the operation is quite slow,
because it involves a lot of random seeks, and the read-ahead is not
very smart (patches are in progress). And yes, after you adjust
resilver_delay, it will take time for the impact to become visible in 'zpool
status'.
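
To illustrate why an averaged estimate lags behind, here is a minimal Python sketch that compares the whole-run average with the rate over the most recent interval. The function names, the pool size, and the sample figures are hypothetical, chosen only to resemble the numbers in this thread; they are not taken from 'zpool status' output or from ZFS code.

```python
def average_rate(scanned_mb, elapsed_s):
    """Rate averaged over the whole run so far, in MB/s."""
    return scanned_mb / elapsed_s


def recent_rate(samples):
    """Rate between the last two (elapsed_s, scanned_mb) samples, in MB/s."""
    (t0, b0), (t1, b1) = samples[-2], samples[-1]
    return (b1 - b0) / (t1 - t0)


def eta_hours(total_mb, scanned_mb, rate_mb_s):
    """Hours left if the remaining data is processed at rate_mb_s."""
    return (total_mb - scanned_mb) / rate_mb_s / 3600


# Hypothetical samples: two slow hours at 18 MB/s, then one hour at 128 MB/s.
SAMPLES = [(0, 0), (7200, 129_600), (10800, 590_400)]
TOTAL_MB = 4_000_000  # assumed amount of data to resilver

if __name__ == "__main__":
    elapsed_s, scanned_mb = SAMPLES[-1]
    avg = average_rate(scanned_mb, elapsed_s)
    cur = recent_rate(SAMPLES)
    print(f"average: {avg:.1f} MB/s, ETA {eta_hours(TOTAL_MB, scanned_mb, avg):.1f} h")
    print(f"recent:  {cur:.1f} MB/s, ETA {eta_hours(TOTAL_MB, scanned_mb, cur):.1f} h")
```

With these assumed figures the averaged rate is still dragged down by the slow first two hours, so the estimate based on it is more than twice as long as one based on the most recent hour.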

At times, you are better off looking at 'gstat' to see how busy the
disks actually are. You will likely notice that the bottleneck is IOPS:
at least one disk will be at its limit for the entire duration of the
resilver, unless your resilver_delay is high enough to leave some
IOPS available.
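
As a rough back-of-the-envelope sketch of what an IOPS bottleneck means for throughput: when a disk is limited by seeks, throughput is approximately IOPS times the size of each I/O. The figure of 150 random IOPS and the I/O sizes below are illustrative assumptions for a spinning disk, not measurements from this pool, and the function name is made up for the example.

```python
def iops_bound_throughput_mib_s(iops, io_size_kib):
    """Approximate throughput in MiB/s when every I/O costs one seek."""
    return iops * io_size_kib / 1024


if __name__ == "__main__":
    ASSUMED_IOPS = 150  # illustrative figure for a spinning disk, not measured
    for io_size_kib in (16, 64, 128):  # assumed I/O sizes
        rate = iops_bound_throughput_mib_s(ASSUMED_IOPS, io_size_kib)
        print(f"{io_size_kib:>4} KiB per I/O -> {rate:.2f} MiB/s")
```

Under these assumptions a disk can be 100% busy in 'gstat' while moving only a few tens of MiB/s, far below what the SATA link could carry.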
--
Allan Jude