Booting UEFI ZFS is broken on arm64
Shawn Webb
2017-11-30 00:21:35 UTC
It appears that in the latest FreeBSD 12-CURRENT/arm64 snapshot,
booting UEFI GPT ZFS on my OverDrive 1000 is broken. It boots up to
this line:

Using DTB provided by EFI at 0x801fe00000.

Then all output stops. I'm in the process of building a custom install
ISO that has DEBUG turned on in the fdt code. I hope to report back
soon-ish, unless anyone else has any ideas as to what could be wrong.

Thanks,
--
Shawn Webb
Cofounder and Security Engineer
HardenedBSD

GPG Key ID: 0x6A84658F52456EEE
GPG Key Fingerprint: 2ABA B6BD EF6A F486 BE89 3D9E 6A84 658F 5245 6EEE
Warner Losh
2017-11-30 00:33:46 UTC
Post by Shawn Webb
It appears that in the latest FreeBSD 12-CURRENT/arm64 snapshot,
booting UEFI GPT ZFS on my OverDrive 1000 is broken. It boots up to
Using DTB provided by EFI at 0x801fe00000.
Which snapshot is that? Boot1 was broken until recently.

Warner
Shawn Webb
2017-11-30 00:34:58 UTC
Post by Warner Losh
Which snapshot is that? Boot1 was broken until recently.
FreeBSD-12.0-CURRENT-arm64-aarch64-20171121-r326056-memstick.img

It also happens on latest HEAD, so it would appear to still be broken.
Warner Losh
2017-11-30 00:42:52 UTC
Post by Shawn Webb
FreeBSD-12.0-CURRENT-arm64-aarch64-20171121-r326056-memstick.img
It also happens on latest HEAD, so it would appear to still be broken.
Is this boot1.efi producing the output, or loader.efi? I'm guessing the
latter, but wanted to make sure. If so, then we're past the point where
boot1.efi would have failed (besides, it was fixed before that snapshot).

Warner
Shawn Webb
2017-11-30 00:43:58 UTC
Post by Warner Losh
Is this boot1.efi producing the output, or loader.efi? I'm guessing the
latter, but wanted to make sure. If so, then we're past the point where
boot1.efi would have failed (besides, it was fixed before that snapshot).
With DEBUG turned on for stand/fdt:

Booting [/boot/kernel/kernel]...
fdt_copy(): fdt_copy va 0x01208000
fdt_setup_fdtp(): fdt_setup_fdtp()
fdt_load_dtb_addr(): fdt_load_dtb_addr(0x801fe00000)
Using DTB provided by EFI at 0x801fe00000.
Loaded the platform dtb: 0x81f56f1630.
fdt_fixup(): fdt_fixup()

^ hangs after that message
Warner Losh
2017-11-30 00:54:25 UTC
Post by Shawn Webb
Booting [/boot/kernel/kernel]...
fdt_copy(): fdt_copy va 0x01208000
fdt_setup_fdtp(): fdt_setup_fdtp()
fdt_load_dtb_addr(): fdt_load_dtb_addr(0x801fe00000)
Using DTB provided by EFI at 0x801fe00000.
Loaded the platform dtb: 0x81f56f1630.
fdt_fixup(): fdt_fixup()
^ hangs after that message
That doesn't sound like anything I've changed, but it could well be... I
think to find this breakage, you may need to bisect backwards along stand /
sys/boot until we find the spot where it broke.

Warner
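
The bisection suggested above amounts to a binary search over the revisions that touched stand / sys/boot. A minimal sketch, assuming the candidate revisions are sorted oldest to newest, the oldest one boots, the newest one hangs, and the breakage persists once introduced. The names first_bad and boots are hypothetical; in practice boots would be a manual step (build the loader at that revision and try it on the board).

```python
from typing import Callable, Sequence


def first_bad(revisions: Sequence[int], boots: Callable[[int], bool]) -> int:
    """Return the earliest revision for which boots() is False.

    Assumptions (not from the thread): revisions[0] boots,
    revisions[-1] does not, and every revision after the first bad
    one is also bad.
    """
    lo, hi = 0, len(revisions) - 1  # invariant: lo boots, hi does not
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if boots(revisions[mid]):
            lo = mid  # breakage was introduced after mid
        else:
            hi = mid  # breakage is at mid or earlier
    return revisions[hi]


if __name__ == "__main__":
    # Hypothetical revision numbers, purely for illustration.
    candidates = list(range(100, 200))
    print(first_bad(candidates, lambda rev: rev < 137))
```

With about a hundred candidate revisions this needs roughly seven test boots rather than a hundred, which matters when each test means rebuilding the loader and power-cycling the machine.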
Warner Losh
2017-12-01 21:53:53 UTC
Post by Shawn Webb
Post by Warner Losh
There's been several conversations on IRC about how others are hitting a
scheduler bug, at least on x86. hps' fix seems to do the trick for their
issues.
Date: Wed Nov 29 23:28:40 2017 +0000
The sched_add() function is not only used when the thread is initially
started, but also by the turnstiles to mark a thread as runnable for
setrunnable()->sched_wakeup()->sched_add()
In r326218 code was added to allow booting from non-zero CPU numbers
by setting the ts_cpu field inside the ULE scheduler's sched_add()
function. This had an undesired side-effect that prior sched_pin() and
sched_bind() calls got disregarded. This patch fixes the
initialization of the ts_cpu field for the ULE scheduler to only
happen once when the initial thread is constructed during system
init. Forking will then later on ensure that a valid ts_cpu value gets
copied to all children.
Reviewed by: jhb, kib
Discussed with: nwhitehorn
MFC after: 1 month
Differential revision: https://reviews.freebsd.org/D13298
Sponsored by: Mellanox Technologies
ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
is the fix.... But the bug it fixes post-dates the snapshot, so maybe this
isn't the same thing...
Definitely is not the same thing. I've so far got it printf'd to where
the uefi loader jumps into the kernel's entry point. So the loader
itself might be fine. Something in the kernel, then, is going funky.
Booting in verbose mode does not provide any additional output.
Here's the output I get (some of the output is from printf's I've added):
FreeBSD/arm64 EFI loader, Revision 1.1
EFI boot environment
Loading /boot/defaults/loader.conf
/boot/kernel/kernel text=0x7e0a78 data=0xaad80+0x443f62
syms=[0x8+0x10ec78+0x8+0x1021d4]
/boot/entropy size=0x1000
/boot/kernel/zfs.ko text=0x99070 text=0x130390 data=0x21ff8+0x9ef98
syms=[0x8+0x22c68+0x8+0x1b99b]
/boot/kernel/opensolaris.ko text=0x1330 text=0xd00 data=0x10160+0x125d0
syms=[0x8+0xff0+0x8+0x8d8]
Hit [Enter] to boot immediately, or any other key for command prompt.
Booting [/boot/kernel/kernel]...
Using DTB provided by EFI at 0x801fe00000.
fdt_copy returned. dtb_size is 9060.
bi_load finished. err: 0
dev_cleanup finished
About to call into the entry point at 0x81ee601000
You might try booting the same kernel off a small UFS partition. There's a
tiny chance that the loader didn't load it right, but more likely the
kernel is borked. Maybe DTB issues? Maybe something else... A quick test
like that would remove ZFS from the equation, even if it's just a USB
stick...

Warner
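
The scheduler commit message quoted above describes sched_add() overwriting ts_cpu on every call, so an earlier sched_bind() was lost. A toy model of that effect, assuming nothing about the real ULE code beyond what the commit message states: the Thread class and the two sched_add variants are hypothetical Python stand-ins, not kernel code.

```python
class Thread:
    """Toy stand-in for a scheduler thread record."""

    def __init__(self, cpu: int) -> None:
        # Set once when the thread is constructed (the behaviour the
        # commit message describes for the fixed code).
        self.ts_cpu = cpu


def sched_bind(td: Thread, cpu: int) -> None:
    """Record that the thread must run on `cpu`."""
    td.ts_cpu = cpu


def sched_add_buggy(td: Thread, curcpu: int) -> int:
    """Model of the regression: ts_cpu is reset on every wakeup."""
    td.ts_cpu = curcpu  # discards whatever sched_bind() stored
    return td.ts_cpu


def sched_add_fixed(td: Thread, curcpu: int) -> int:
    """Model of the fix: sched_add() leaves ts_cpu alone."""
    return td.ts_cpu


if __name__ == "__main__":
    a, b = Thread(cpu=0), Thread(cpu=0)
    sched_bind(a, 2)
    sched_bind(b, 2)
    print("buggy:", sched_add_buggy(a, curcpu=0))
    print("fixed:", sched_add_fixed(b, curcpu=0))
```

In the buggy variant a thread bound to CPU 2 and then woken from CPU 0 ends up queued for CPU 0. As noted in the thread, that regression post-dates the snapshot, so it does not explain this hang.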
Shawn Webb
2017-12-01 21:55:31 UTC
Post by Warner Losh
You might try booting the same kernel off a small UFS partition. There's a
tiny chance that the loader didn't load it right, but more likely the
kernel is borked. Maybe DTB issues? Maybe something else... A quick test
like that would remove ZFS from the equation, even if it's just a USB
stick...
UFS works fine and dandy. It's ZFS that's b0rked.

Thanks,
Warner Losh
2017-12-01 21:57:35 UTC
Post by Shawn Webb
UFS works fine and dandy. It's ZFS that's b0rked.
OK. Let me know what you find... I assume the entry point matches with
what you've loaded?

Warner
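
One way to sanity-check that the entry point matches what was loaded is to compare it against the size the loader reported. A rough sketch, assuming the text= and data= fields in the loader output are hex byte counts and ignoring alignment padding and the syms= block. The helper names and the staging base address are hypothetical; the thread does not say where the kernel was staged.

```python
import re


def loaded_size(line: str) -> int:
    """Sum the hex sizes in the text= and data= fields of one loader line."""
    total = 0
    for field in re.findall(r"(?:text|data)=(\S+)", line):
        # A field may be a single size or several joined with '+'.
        total += sum(int(part, 16) for part in field.split("+"))
    return total


def entry_in_range(entry: int, base: int, size: int) -> bool:
    """True if `entry` lies inside the half-open range [base, base + size)."""
    return base <= entry < base + size


if __name__ == "__main__":
    line = "/boot/kernel/kernel text=0x7e0a78 data=0xaad80+0x443f62"
    size = loaded_size(line)
    entry = 0x81EE601000            # from the loader output in the thread
    base = 0x81EE600000             # hypothetical staging base, NOT from the thread
    print(hex(size), entry_in_range(entry, base, size))
```

An entry point outside the loaded range would point at the loader; one inside it, as the UFS result also suggests, leaves the kernel side as the more likely culprit.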
Andrew Turner
2017-12-04 11:26:42 UTC
Post by Shawn Webb
It appears that in the latest FreeBSD 12-CURRENT/arm64 snapshot,
booting UEFI GPT ZFS on my OverDrive 1000 is broken. It boots up to
Using DTB provided by EFI at 0x801fe00000.
Then all output stops. I'm in the process of building a custom install
ISO that has DEBUG turned on in the fdt code. I hope to report back
soon-ish, unless anyone else has any ideas as to what could be wrong.
This should be fixed in r326525.

Andrew
