Discussion:
Ryzen public erratas
Konstantin Belousov
2018-06-13 10:35:35 UTC
Today I noticed that AMD has published the public errata document for Ryzen CPUs,
https://developer.amd.com/wp-content/resources/55449_1.12.pdf

Some of the issues listed there look quite relevant to the
hangs that some people still experience with these machines. I wrote
a script that applies the recommended workarounds for the errata
that I find interesting.

To run it, kldload cpuctl, then apply the latest firmware update to your
CPU, then run the following shell script. Comments indicate the errata
number for the workarounds.

Please report the results. If the script helps, I will code the kernel
change to apply the workarounds.

#!/bin/sh

# Enable workarounds for errata listed in
# https://developer.amd.com/wp-content/resources/55449_1.12.pdf

# 1057, 1109
sysctl machdep.idle_mwait=0
sysctl machdep.idle=hlt

# Set the workaround bit in the relevant MSR on every CPU.
for x in /dev/cpuctl*; do
	# 1021
	cpucontrol -m '0xc0011029|=0x2000' "$x"
	# 1033
	cpucontrol -m '0xc0011020|=0x10' "$x"
	# 1049
	cpucontrol -m '0xc0011028|=0x10' "$x"
	# 1095
	cpucontrol -m '0xc0011020|=0x200000000000000' "$x"
done
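
(Editorial sketch, not part of the original post.) Each mask in the script above has exactly one bit set, so every cpucontrol line turns on a single bit in the named MSR. The snippet below only does arithmetic on the values copied from the script and prints which bit each mask corresponds to; it does not touch any hardware. The helper name bit_index is made up for illustration, and it assumes a shell with 64-bit arithmetic (true for FreeBSD sh, dash, and bash on amd64).

```sh
#!/bin/sh
# bit_index MASK: print the position of the highest set bit in MASK.
# Hypothetical helper for illustration; needs 64-bit shell arithmetic.
bit_index() {
    m=$(($1))
    i=0
    while [ "$((m >> 1))" -ne 0 ]; do
        m=$((m >> 1))
        i=$((i + 1))
    done
    echo "$i"
}

# erratum:MSR:mask triples copied from the script in the post above.
for entry in 1021:0xc0011029:0x2000 1033:0xc0011020:0x10 \
             1049:0xc0011028:0x10 1095:0xc0011020:0x200000000000000; do
    erratum=${entry%%:*}
    rest=${entry#*:}
    msr=${rest%%:*}
    mask=${rest#*:}
    echo "erratum $erratum: MSR $msr bit $(bit_index "$mask")"
done
```

Note that errata 1033 and 1095 both live in MSR 0xc0011020, at different bit positions, so the two writes do not interfere with each other.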
Johannes Lundberg
2018-06-13 11:06:42 UTC
Hi

Thanks for the fix! I'm trying it now on my Ryzen 3 2200G which does
experience some random occasional resets.

About updating to latest firmware, is this something that's done from BIOS or
from FreeBSD? If the latter, how?
Post by Konstantin Belousov
_______________________________________________
https://lists.freebsd.org/mailman/listinfo/freebsd-current
Konstantin Belousov
2018-06-13 11:46:25 UTC
Post by Johannes Lundberg
Hi
Thanks for the fix! I'm trying it now on my Ryzen 3 2200G which does
experience some random occasional resets.
About updating to latest firmware, is this something that's done from BIOS or
from FreeBSD? If the latter, how?
From FreeBSD, install sysutils/devcpu-data then do
service microcode_update start
and of course, you must flash latest BIOS.

The microcode_update must be applied before running this script.
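
(Editorial sketch, not part of the original post.) The order of operations described in this thread can be written down as a small wrapper. It is a sketch under assumptions: the DRY_RUN variable and the run and prepare helpers are invented for illustration, the workaround script is assumed to be saved as ./fix.sh, and "onestart" is used so the service runs even without a microcode_update_enable entry in rc.conf. By default it only prints the commands; set DRY_RUN=0 and run as root to execute them.

```sh
#!/bin/sh
# Default to printing commands only; DRY_RUN=0 executes them (needs root).
# DRY_RUN, run and prepare are illustrative names, not FreeBSD tools.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

prepare() {
    run kldload cpuctl                     # provides /dev/cpuctl*
    run service microcode_update onestart  # installed by sysutils/devcpu-data
    run sh ./fix.sh                        # the workaround script (assumed filename)
}

prepare
```

Flashing the latest BIOS is a separate manual step that has to happen before any of this.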
Gary Jennejohn
2018-06-13 14:52:56 UTC
On Wed, 13 Jun 2018 14:46:25 +0300
Post by Konstantin Belousov
From FreeBSD, install sysutils/devcpu-data then do
service microcode_update start
and of course, you must flash latest BIOS.
The microcode_update must be applied before running this script.
I added before and after outputs to my version of the script and
saw that my BIOS is setting all the relevant bits at start up.

So, a BIOS update might help.
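
(Editorial sketch, not the script Gary actually ran.) One way to get before and after output is to read each MSR with cpucontrol -m before and after the OR operation. The function name apply_with_report and the CPUCONTROL override variable are assumptions made for this example; the MSR numbers and masks are the ones from the original script.

```sh
#!/bin/sh
# CPUCONTROL can be pointed at a stub to try the flow without real hardware.
CPUCONTROL=${CPUCONTROL:-cpucontrol}

# apply_with_report DEV MSR MASK: print the MSR, OR in MASK, print it again.
apply_with_report() {
    dev=$1
    msr=$2
    mask=$3
    echo "before: $($CPUCONTROL -m "$msr" "$dev")"
    $CPUCONTROL -m "$msr|=$mask" "$dev"
    echo "after:  $($CPUCONTROL -m "$msr" "$dev")"
}

for x in /dev/cpuctl*; do
    [ -e "$x" ] || continue   # cpuctl not loaded: nothing to do
    apply_with_report "$x" 0xc0011029 0x2000             # 1021
    apply_with_report "$x" 0xc0011020 0x10               # 1033
    apply_with_report "$x" 0xc0011028 0x10               # 1049
    apply_with_report "$x" 0xc0011020 0x200000000000000  # 1095
done
```

If the "before" and "after" lines are identical for every MSR, the BIOS has already set the bits and the cpucontrol part of the script is redundant on that machine.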
--
Gary Jennejohn
Eitan Adler
2018-06-13 11:16:55 UTC
Post by Konstantin Belousov
# 1057, 1109
sysctl machdep.idle_mwait=0
sysctl machdep.idle=hlt
Is this needed if machdep.idle was previously set to acpi?
--
Eitan Adler
Eitan Adler
2018-06-19 05:44:13 UTC
Post by Eitan Adler
Is this needed if it was previously machdep.idle: acpi ?
This might explain why I've never seen the lockup issues mentioned by
other people. What would cause my machine to differ from others?
--
Eitan Adler
Gary Jennejohn
2018-06-19 09:50:30 UTC
On Mon, 18 Jun 2018 22:44:13 -0700
Post by Eitan Adler
Is this needed if it was previously machdep.idle: acpi ?
This might explain why I've never seen the lockup issues mentioned by
other people. What would cause my machine to differ from others?
I had sysctl machdep.idle_mwait=1 and machdep.idle=acpi before
applying the shell script. I had multiple lockups every week,
sometimes multiple lockups per day.

With the idle settings from the script it still locks up, but
not as often.

I suspect I also need to update the CPU firmware, although I
expect that the new BIOS version I installed last week would
have done that already.
--
Gary Jennejohn
Mike Tancsa
2018-06-13 20:41:02 UTC
Post by Konstantin Belousov
To run it, kldload cpuctl, then apply the latest firmware update to your
CPU, then run the following shell script. Comments indicate the errata
number for the workarounds.
Hi,

tl;dr: The microcode changes seem to fix a hard lockup I was able to
reliably reproduce back in February.



The BIOS on my AMD box is pretty up to date. I think it has the same
microcode as what's in ports. x86info -a shows:

***@ryzenbsd11:/home/mdtancsa # x86info -a | grep -i microc
Microcode patch level: 0x8001137
***@ryzenbsd11:/home/mdtancsa #

After running the microcode update, the patch level is unchanged:

***@ryzenbsd11:/home/mdtancsa # /usr/local/etc/rc.d/microcode_update onestart
Updating CPU Microcode...
Done.
***@ryzenbsd11:/home/mdtancsa # x86info -a | grep -i microc
Microcode patch level: 0x8001137
***@ryzenbsd11:/home/mdtancsa #

However, the dmesg after the microcode update adds this line

AMD Extended Feature Extensions ID EBX=0x1007<CLZERO,IRPerf,XSaveErPtr>




CPU: AMD Ryzen 5 1600X Six-Core Processor (3593.36-MHz K8-class CPU)
  Origin="AuthenticAMD" Id=0x800f11 Family=0x17 Model=0x1 Stepping=1
  Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
  Features2=0x7ed8320b<SSE3,PCLMULQDQ,MON,SSSE3,FMA,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND>
  AMD Features=0x2e500800<SYSCALL,NX,MMX+,FFXSR,Page1GB,RDTSCP,LM>
  AMD Features2=0x35c233ff<LAHF,CMP,SVM,ExtAPIC,CR8,ABM,SSE4A,MAS,Prefetch,OSVW,SKINIT,WDT,TCE,Topology,PCXC,PNXC,DBE,PL2I,MWAITX>
  Structured Extended Features=0x209c01a9<FSGSBASE,BMI1,AVX2,SMEP,BMI2,RDSEED,ADX,SMAP,CLFLUSHOPT,SHA>
  XSAVE Features=0xf<XSAVEOPT,XSAVEC,XINUSE,XSAVES>
  SVM: NP,NRIP,VClean,AFlush,DAssist,NAsids=32768
  TSC: P-state invariant, performance statistics

I ran the script

***@ryzenbsd11:/home/mdtancsa # cat fix.sh
#!/bin/sh

# Enable workarounds for erratas listed in
# https://developer.amd.com/wp-content/resources/55449_1.12.pdf

# 1057, 1109
sysctl machdep.idle_mwait=0
sysctl machdep.idle=hlt

for x in /dev/cpuctl*; do
# 1021
cpucontrol -m '0xc0011029|=0x2000' $x
# 1033
cpucontrol -m '0xc0011020|=0x10' $x
# 1049
cpucontrol -m '0xc0011028|=0x10' $x
# 1095
cpucontrol -m '0xc0011020|=0x200000000000000' $x
echo $x
done
***@ryzenbsd11:/home/mdtancsa # sh ./fix.sh
machdep.idle_mwait: 1 -> 0
machdep.idle: acpi -> hlt
/dev/cpuctl0
/dev/cpuctl1
/dev/cpuctl10
/dev/cpuctl11
/dev/cpuctl2
/dev/cpuctl3
/dev/cpuctl4
/dev/cpuctl5
/dev/cpuctl6
/dev/cpuctl7
/dev/cpuctl8
/dev/cpuctl9
***@ryzenbsd11:/home/mdtancsa #

Using a FreeBSD stable from back in February, I was able to crash Ryzen-
and Epyc-based systems
(https://lists.freebsd.org/pipermail/freebsd-stable/2018-February/088439.html)
by generating a lot of traffic between the hypervisor and guests. The
same tests on an Intel-based box ran just fine.

For example, start 3 guests in bhyve (amd64) and run combinations of
iperf3 between them. It would not take long before the box hard locked,
i.e. blank screen, no crash dump, etc.
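
(Editorial sketch, not Mike's actual test harness.) The "combos of iperf3" load can be approximated by running a client from every guest to every other guest. The snippet below only prints the commands. The guest addresses are placeholders from the documentation range, the 600-second duration is arbitrary, and the helper name iperf_pairs is invented for this example; each guest is also assumed to be running "iperf3 -s".

```sh
#!/bin/sh
# Hypothetical guest addresses (documentation range); replace with real ones.
GUESTS="192.0.2.11 192.0.2.12 192.0.2.13"

# iperf_pairs: print one "client server" line per ordered pair of distinct guests.
iperf_pairs() {
    for src in $GUESTS; do
        for dst in $GUESTS; do
            [ "$src" = "$dst" ] || echo "$src $dst"
        done
    done
}

# Print the client command each guest would run against every other guest.
iperf_pairs | while read -r src dst; do
    echo "on $src: iperf3 -c $dst -t 600"
done
```

With three guests this yields six client sessions, which gives traffic in both directions between every pair.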

With the latest microcode update, I have been running the same sort of
tests and so far so good. I will let them run overnight to see if things
are now stable on STABLE.

---Mike
--
-------------------
Mike Tancsa, tel +1 519 651 3400 x203
Sentex Communications, ***@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada
Eric van Gyzen
2018-06-14 13:36:03 UTC
Post by Konstantin Belousov
Please report the results. If the script helps, I will code the kernel
change to apply the workarounds.
Kostik: This thread on the -stable list has a lot of positive feedback:

https://lists.freebsd.org/pipermail/freebsd-stable/2018-June/089110.html

Eric
Mike Tancsa
2018-06-14 14:24:17 UTC
Post by Eric van Gyzen
https://lists.freebsd.org/pipermail/freebsd-stable/2018-June/089110.html
I have a couple of Epyc boxes that showed the same lockup behaviour. I
will re-install FreeBSD on them and see if their microcode updates fix
this issue as well...

Should I run the same cpuctl commands on those CPUs? BTW, I am happy
to loan one out to you in the FreeBSD netperf cluster for a few weeks.

---Mike
--
-------------------
Mike Tancsa, tel +1 519 651 3400 x203
Sentex Communications, ***@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada
Konstantin Belousov
2018-06-14 15:03:54 UTC
Post by Mike Tancsa
I have a couple of Epyc boxes that showed the same lockup behaviour. I
will re-install FreeBSD on them and see if their microcode updates fix
this issue as well...
I am not sure about only microcode update. Depending on the BIOS
vendor and current BIOS, you may need all three: BIOS update, microcode
update using cpucontrol/devcpu-data, and running the script I posted.
In the best case, some of this is just redundant.
Mike Tancsa
2018-06-14 15:12:04 UTC
Post by Konstantin Belousov
I am not sure about only microcode update. Depending on the BIOS
vendor and current BIOS, you may need all three: BIOS update, microcode
update using cpucontrol/devcpu-data, and running the script I posted.
In the best case, some of this is just redundant.
Thanks, I will run the tests on the Epyc system over the next few days.
It took a little longer to crash the Epyc than the Ryzen. The Ryzen has
now been running for 20 hours. Previously, 5-10 minutes were enough to
trigger the hard lockup.

---Mike
--
-------------------
Mike Tancsa, tel +1 519 651 3400 x203
Sentex Communications, ***@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada
Mike Tancsa
2018-06-14 21:12:17 UTC
Post by Konstantin Belousov
I am not sure about only microcode update. Depending on the BIOS
vendor and current BIOS, you may need all three: BIOS update, microcode
update using cpucontrol/devcpu-data, and running the script I posted.
In the best case, some of this is just redundant.
OK, before and after show the same microcode revision:

CPU: AMD EPYC 7281 16-Core Processor (2100.06-MHz K8-class CPU)
  Origin="AuthenticAMD" Id=0x800f12 Family=0x17 Model=0x1 Stepping=2
  Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
  Features2=0x7ed8320b<SSE3,PCLMULQDQ,MON,SSSE3,FMA,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND>
  AMD Features=0x2e500800<SYSCALL,NX,MMX+,FFXSR,Page1GB,RDTSCP,LM>
  AMD Features2=0x35c233ff<LAHF,CMP,SVM,ExtAPIC,CR8,ABM,SSE4A,MAS,Prefetch,OSVW,SKINIT,WDT,TCE,Topology,PCXC,PNXC,DBE,PL2I,MWAITX>
  Structured Extended Features=0x209c01a9<FSGSBASE,BMI1,AVX2,SMEP,BMI2,RDSEED,ADX,SMAP,CLFLUSHOPT,SHA>
  XSAVE Features=0xf<XSAVEOPT,XSAVEC,XINUSE,XSAVES>
  AMD Extended Feature Extensions ID EBX=0x1007<CLZERO,IRPerf,XSaveErPtr>
  SVM: NP,NRIP,VClean,AFlush,DAssist,NAsids=32768
  TSC: P-state invariant, performance statistics

# x86info -a | grep -i micro
Microcode patch level: 0x8001227
#

I then ran the fix script. I will let the box grind away over the
weekend to see if it survives. Previously, a couple of hours would lock
it up. I am running it now. One thing I did notice is a bunch of these
showing up

Jun 14 17:11:18 r11epyc kernel: fpudna: fpcurthread == curthread


---Mike
--
-------------------
Mike Tancsa, tel +1 519 651 3400 x203
Sentex Communications, ***@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada
Oliver Pinter
2018-06-14 21:16:36 UTC
Post by Mike Tancsa
Jun 14 17:11:18 r11epyc kernel: fpudna: fpcurthread == curthread
This is a side effect of enabling eager FPU switching. It is orthogonal and
already fixed in current (the printf has been removed).
Mike Tancsa
2018-06-18 14:09:29 UTC
Post by Konstantin Belousov
Please report the results. If the script helps, I will code the kernel
change to apply the workarounds.
The hard lockups I was seeing on Ryzen and Epyc boxes are now gone with
the microcode and script below.

Not sure if it's one or some combination of the settings, but all the steps
below have made my 2 test systems stable on RELENG_11 anyway.

This was on a Ryzen 5 1600X (ASUS PRIME X370-PRO BIOS from 04/19/2018)
CPU Microcode patch level: 0x8001137

And
EPYC 7281 16-Core (Supermicro H11SSL-i BIOS 04/27/2018 )
Microcode patch level: 0x8001227



Details of the issue were discussed at

https://lists.freebsd.org/pipermail/freebsd-virtualization/2018-March/006187.html
and
https://lists.freebsd.org/pipermail/freebsd-stable/2018-January/088174.html

TL;DR: Generating traffic via iperf3 between VMs on either bhyve or
VirtualBox would make the box lock up: no crash, just a blank screen.

---Mike
Post by Konstantin Belousov
#!/bin/sh
# Enable workarounds for erratas listed in
# https://developer.amd.com/wp-content/resources/55449_1.12.pdf
# 1057, 1109
sysctl machdep.idle_mwait=0
sysctl machdep.idle=hlt
for x in /dev/cpuctl*; do
# 1021
cpucontrol -m '0xc0011029|=0x2000' $x
# 1033
cpucontrol -m '0xc0011020|=0x10' $x
# 1049
cpucontrol -m '0xc0011028|=0x10' $x
# 1095
cpucontrol -m '0xc0011020|=0x200000000000000' $x
done
--
-------------------
Mike Tancsa, tel +1 519 651 3400 x203
Sentex Communications, ***@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada