Discussion:
zfskern{txg_thread_enter} thread using 100% or more CPU
Steve Wills
2018-04-24 23:30:22 UTC
Hi,

Recently on multiple systems running CURRENT, I've been seeing the
system become unresponsive. Leaving top(1) running has led me to notice
that when this happens, the system is still responding to ping and top
over ssh is still working, but no new processes can start and switching
to other tasks doesn't work. In top, I do see pid 17,
[zfskern{txg_thread_enter}] monopolizing both CPU usage and disk IO. Any
ideas how to troubleshoot this? It doesn't appear to be a hardware issue.

Steve
Greg V
2018-04-25 00:15:33 UTC
Post by Steve Wills
Hi,
Recently on multiple systems running CURRENT, I've been seeing the
system become unresponsive. Leaving top(1) running has led me to
notice that when this happens, the system is still responding to ping
and top over ssh is still working, but no new processes can start and
switching to other tasks doesn't work. In top, I do see pid 17,
[zfskern{txg_thread_enter}] monopolizing both CPU usage and disk IO.
Any ideas how to troubleshoot this? It doesn't appear to be a
hardware issue.
Hi,

Do you have something writing to a gzip compressed dataset? You can use
the vfssnoop DTrace script from
https://forums.freebsd.org/threads/sharing-of-dtrace-scripts.32855/#post-181816
to see who's writing what.

I don't remember whether it was exactly txg_thread_enter, but heavy CPU
and disk usage together sounds a lot like heavily compressed writes.
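
A quick way to check the first question is to list which datasets have gzip compression set. The following is only a sketch: the function name gzip_datasets is made up for illustration, and it assumes the tab-separated "name, value" lines that zfs get prints in scripted mode (-H).

```sh
#!/bin/sh
# Sketch: list ZFS datasets whose "compression" property is a gzip variant.

# gzip_datasets (hypothetical helper) reads "name<TAB>value" lines on stdin,
# the format produced by `zfs get -H -o name,value compression`, and prints
# the names of datasets whose value is gzip or gzip-1 .. gzip-9.
gzip_datasets() {
    awk -F '\t' '$2 ~ /^gzip(-[1-9])?$/ { print $1 }'
}

# Usage on a live system (needs the zfs(8) utility):
#   zfs get -H -o name,value -t filesystem,volume compression | gzip_datasets
```

If this prints nothing, no dataset is using gzip and compressed writes are a less likely explanation.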

In my case, the Epiphany browser was downloading a large malware
database to ~/.config/epiphany/gsb-threats.db :D
Steve Wills
2018-05-03 20:27:04 UTC
I finally caught this happening while I had "lockstat sleep 1" running
in a loop. The output looks like:

https://gist.github.com/swills/a2a20c2a4296a4c596ec7f329fb945ab
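
For anyone wanting to reproduce this kind of capture, here is a minimal sketch of such a loop. It is not the exact loop used above: the function name capture_loop, the output directory, and the sample count are all assumptions for illustration.

```sh
#!/bin/sh
# Sketch: run a sampling command repeatedly, keeping each run's output in
# its own numbered file so the sample taken during a hang can be found later.

# capture_loop OUTDIR COUNT CMD [ARGS...]   (hypothetical helper)
# Runs CMD COUNT times; run N's stdout and stderr go to OUTDIR/sample-N.txt,
# preceded by a UTC timestamp line.
capture_loop() {
    outdir=$1
    count=$2
    shift 2
    mkdir -p "$outdir" || return 1
    i=1
    while [ "$i" -le "$count" ]; do
        {
            date -u '+%Y-%m-%d %H:%M:%S UTC'
            "$@"
        } > "$outdir/sample-$i.txt" 2>&1
        i=$((i + 1))
    done
}

# Usage as root on FreeBSD (lockstat(1) traces while "sleep 1" runs):
#   capture_loop /var/tmp/lockstat 3600 lockstat sleep 1
```

If the output directory lives on the affected pool, the write itself may stall during a hang, so a memory-backed filesystem might be a safer target.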

And top looks like this:

https://gist.github.com/swills/6e749313e52679224adec91d4841ad83

I also noticed that there are actually 2 threads of pid 17
[zfskern{txg_thread_enter}], which are reporting 57% and 42% of disk IO;
everything else is idle as far as IO goes. The system is not totally
unresponsive: processes that don't need IO are working, but anything
that needs IO hangs. Perhaps it's a hardware issue, but I can't find any
other evidence of it. Any ideas?

Steve
_______________________________________________
https://lists.freebsd.org/mailman/listinfo/freebsd-current