panic: mtx_lock() of spin mutex (null) @ /usr/src/sys/net/iflib.c:3716
David Wolfskill
2018-04-11 11:39:58 UTC
This was running:

FreeBSD g1-215.catwhisker.org 12.0-CURRENT FreeBSD 12.0-CURRENT #156 r332399M/332400:1200061: Wed Apr 11 04:17:45 PDT 2018 ***@g1-215.catwhisker.org:/common/S4/obj/usr/src/amd64.amd64/sys/CANARY amd64

during boot, after updating from:

FreeBSD g1-215.catwhisker.org 12.0-CURRENT FreeBSD 12.0-CURRENT #155 r332354M/332357:1200061: Tue Apr 10 04:00:41 PDT 2018 ***@g1-215.catwhisker.org:/common/S4/obj/usr/src/amd64.amd64/sys/CANARY amd64

(My build machine, which uses an re(4) NIC, did not encounter the issue.)

It appears that r332389 is implicated.

...
Unread portion of the kernel message buffer:

__curthread () at ./machine/pcpu.h:230
230 __asm("movq %%gs:%1,%0" : "=r" (td)
(kgdb) #0 __curthread () at ./machine/pcpu.h:230
#1 doadump (textdump=3) at /usr/src/sys/kern/kern_shutdown.c:361
#2 0xffffffff80433f4c in db_fncall_generic (addr=<optimized out>,
rv=<optimized out>, nargs=<optimized out>, args=<optimized out>)
at /usr/src/sys/ddb/db_command.c:609
#3 db_fncall (dummy1=<optimized out>, dummy2=<optimized out>,
dummy3=<optimized out>, dummy4=<optimized out>)
at /usr/src/sys/ddb/db_command.c:657
#4 0xffffffff80433a99 in db_command (last_cmdp=<optimized out>,
cmd_table=<optimized out>, dopager=<optimized out>)
at /usr/src/sys/ddb/db_command.c:481
#5 0xffffffff80433814 in db_command_loop ()
at /usr/src/sys/ddb/db_command.c:534
#6 0xffffffff80436a3f in db_trap (type=<optimized out>, code=<optimized out>)
at /usr/src/sys/ddb/db_main.c:250
#7 0xffffffff80b753e3 in kdb_trap (type=3, code=-61456, tf=<optimized out>)
at /usr/src/sys/kern/subr_kdb.c:697
#8 0xffffffff80f7eaa8 in trap (frame=0xfffffe00004377a0)
at /usr/src/sys/amd64/amd64/trap.c:548
#9 <signal handler called>
#10 kdb_enter (why=0xffffffff811df9d4 "panic", msg=<optimized out>)
at /usr/src/sys/kern/subr_kdb.c:479
#11 0xffffffff80b2feda in vpanic (fmt=<optimized out>, ap=0xfffffe0000437910)
at /usr/src/sys/kern/kern_shutdown.c:826
#12 0xffffffff80b2fca0 in kassert_panic (
fmt=0xffffffff811dadca "mtx_lock() of spin mutex %s @ %s:%d")
at /usr/src/sys/kern/kern_shutdown.c:723
#13 0xffffffff80b0ec93 in __mtx_lock_flags (c=0xfffff80008c85d88, opts=0,
file=0xffffffff81113c90 "/usr/src/sys/net/iflib.c", line=<optimized out>)
at /usr/src/sys/kern/kern_mutex.c:246
#14 0xffffffff80c466e1 in _task_fn_admin (context=0xfffff80008c85c00)
at /usr/src/sys/net/iflib.c:3716
#15 0xffffffff80b73849 in gtaskqueue_run_locked (queue=0xfffff80008489500)
at /usr/src/sys/kern/subr_gtaskqueue.c:331
#16 0xffffffff80b735c8 in gtaskqueue_thread_loop (arg=<optimized out>)
at /usr/src/sys/kern/subr_gtaskqueue.c:506
#17 0xffffffff80af0064 in fork_exit (
callout=0xffffffff80b73540 <gtaskqueue_thread_loop>,
arg=0xfffffe0844223008, frame=0xfffffe0000437ac0)
at /usr/src/sys/kern/kern_fork.c:1039
#18 <signal handler called>
(kgdb)

If the dump would be useful, I can put it up for access.

Peace,
david
--
David H. Wolfskill ***@catwhisker.org
Well, what did you EXPECT from Trump? He has a history of breaking promises.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.
Mark Johnston
2018-04-11 17:02:28 UTC
Post by David Wolfskill
(My build machine, which uses an re(4) NIC, did not encounter the issue.)
It appears that r332389 is implicated.
I'm seeing this too, under bhyve with e1000 emulation. Reverting r332389
fixes the problem.
Andrey V. Elsukov
2018-04-11 17:20:02 UTC
Post by Mark Johnston
Post by David Wolfskill
It appears that r332389 is implicated.
I'm seeing this too, under bhyve with e1000 emulation. Reverting r332389
fixes the problem.
I have this problem too, and reverting r332389 fixes it.

***@pci0:0:25:0: class=0x020000 card=0x20088086 chip=0x15028086 rev=0x04 hdr=0x00
--
WBR, Andrey V. Elsukov
K. Macy
2018-04-11 17:24:38 UTC
Sorry about that. It looks like my review must have been missing a line.

@@ -4702,8 +4707,8 @@ iflib_register(if_ctx_t ctx)

 	_iflib_assert(sctx);

-	CTX_LOCK_INIT(ctx, device_get_nameunit(ctx->ifc_dev));
-
+	CTX_LOCK_INIT(ctx);
+	STATE_LOCK_INIT(ctx, device_get_nameunit(ctx->ifc_dev));
 	ifp = ctx->ifc_ifp = if_gethandle(IFT_ETHER);
 	if (ifp == NULL) {
 		device_printf(dev, "can not allocate ifnet structure\n");
@@ -5430,8 +5435,8 @@ iflib_io_tqg_attach(struct grouptask *gt, void *uniq, int cpu, char *name)
 }

 void
K. Macy
2018-04-11 17:26:36 UTC
Actually, the ctx lock is still a mutex, so the CTX_LOCK_INIT change in my previous patch is not needed. Just add the STATE_LOCK_INIT line.
-M
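
For anyone wondering why a missing initializer shows up as "mtx_lock() of spin mutex (null)", here is a minimal user-space sketch. It assumes that a zero-filled lock object decodes as lock class index 0, that index 0 is the spin-mutex class, and that the name pointer is still NULL. Every name in it (model_mtx, model_mtx_init, model_mtx_lock_check, the CLASS_* constants) is a hypothetical stand-in for illustration, not the kernel API.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical model: class index 0 is ASSUMED to be the spin-mutex class. */
enum model_class { CLASS_MTX_SPIN = 0, CLASS_MTX_SLEEP = 1 };

struct model_mtx {
	const char *name;	/* NULL until the lock is initialized */
	int class_index;	/* 0 until the lock is initialized */
};

/* Stand-in for an initializer such as the STATE_LOCK_INIT() line. */
void
model_mtx_init(struct model_mtx *m, const char *name)
{
	m->name = name;
	m->class_index = CLASS_MTX_SLEEP;
}

/*
 * Stand-in for the sanity check on the sleep-mutex lock path.
 * Returns 0 if the lock may be taken; returns -1 and writes a
 * diagnostic into buf if the lock looks like a spin mutex.
 */
int
model_mtx_lock_check(const struct model_mtx *m, char *buf, size_t len)
{
	if (m->class_index == CLASS_MTX_SPIN) {
		snprintf(buf, len, "mtx_lock() of spin mutex %s",
		    m->name != NULL ? m->name : "(null)");
		return (-1);
	}
	buf[0] = '\0';
	return (0);
}
```

In this model, a struct that was only zeroed fails the check with the same wording as the panic, while one that went through model_mtx_init() passes. That is consistent with the fix being a single added initializer call.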
K. Macy
2018-04-11 17:32:42 UTC
Chalk another review fail up to shuffling patches between git and phab.
Sorry for the inconvenience.
-M