- Only adjust the ID registers when no irqchip has been created once
per VM run, instead of doing it once per vcpu, as this otherwise
triggers a pretty bad conbsistency check failure in the sysreg code.
- Make sure the per-vcpu Fine Grain Traps are computed before we load
the system registers on the HW, as we otherwise start running without
anything set until the first preemption of the vcpu.
x86:
- Fix selftests failure on AMD, checking for an optimization that was not
happening anymore.
-----BEGIN PGP SIGNATURE-----
iQFIBAABCgAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmkcpToUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroMG4AgAl68nLWOdc6iRq/JRiiTOEx/urykc
/nKqUz7YlefEbOB64F/ZADvmluIOTs85/bdutCTO/VzJ8lkimyhISygZzA/y6Gav
XgAIekZ/QhriIqqfcrMGET+ug3EnOxCAo/M8kWmBtra7EPTrejUJhKtensBd4TXv
GTcU+yxZJF7jLE84a8CuWVbHdSfyiLYP5V6cDeMtuqvZiR5cxqyzuL+KFri4jZ72
2jxovTRpxjOh4n759Oe+eqEhl6tgWBEfOVvDMOP7hkcFGlFpsIkKGL5zR0b+xFNr
jAm5AGbJMY8n8qSM2s0OS0+Md/3kiyhvL121XmV0iGRRc1Ceq3SWFB5pRg==
=5CNi
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"Arm:
- Only adjust the ID registers when no irqchip has been created once
per VM run, instead of doing it once per vcpu, as this otherwise
triggers a pretty bad conbsistency check failure in the sysreg code
- Make sure the per-vcpu Fine Grain Traps are computed before we load
the system registers on the HW, as we otherwise start running
without anything set until the first preemption of the vcpu
x86:
- Fix selftests failure on AMD, checking for an optimization that was
not happening anymore"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: SVM: Fix redundant updates of LBR MSR intercepts
KVM: arm64: VHE: Compute fgt traps before activating them
KVM: arm64: Finalize ID registers only once per VM
Don't update the LBR MSR intercept bitmaps if they're already up-to-date,
as unconditionally updating the intercepts forces KVM to recalculate the
MSR bitmaps for vmcb02 on every nested VMRUN. The redundant updates are
functionally okay; however, they neuter an optimization in Hyper-V
nested virtualization enlightenments and this manifests as a self-test
failure.
In particular, Hyper-V lets L1 mark "nested enlightenments" as clean, i.e.
tell KVM that no changes were made to the MSR bitmap since the last VMRUN.
The hyperv_svm_test KVM selftest intentionally changes the MSR bitmap
"without telling KVM about it" to verify that KVM honors the clean hint,
correctly fails because KVM notices the changed bitmap anyway:
==== Test Assertion Failure ====
x86/hyperv_svm_test.c:120: vmcb->control.exit_code == 0x081
pid=193558 tid=193558 errno=4 - Interrupted system call
1 0x0000000000411361: assert_on_unhandled_exception at processor.c:659
2 0x0000000000406186: _vcpu_run at kvm_util.c:1699
3 (inlined by) vcpu_run at kvm_util.c:1710
4 0x0000000000401f2a: main at hyperv_svm_test.c:175
5 0x000000000041d0d3: __libc_start_call_main at libc-start.o:?
6 0x000000000041f27c: __libc_start_main_impl at ??:?
7 0x00000000004021a0: _start at ??:?
vmcb->control.exit_code == SVM_EXIT_VMMCALL
Do *not* fix this by skipping svm_hv_vmcb_dirty_nested_enlightenments()
when svm_set_intercept_for_msr() performs a no-op change. changes to
the L0 MSR interception bitmap are only triggered by full CPUID updates
and MSR filter updates, both of which should be rare. Changing
svm_set_intercept_for_msr() risks hiding unintended pessimizations
like this one, and is actually more complex than this change.
Fixes: fbe5e5f030c2 ("KVM: nSVM: Always recalculate LBR MSR intercepts in svm_update_lbrv()")
Cc: stable@vger.kernel.org
Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Link: https://patch.msgid.link/20251112013017.1836863-1-yosry.ahmed@linux.dev
[Rewritten commit message based on mailing list discussion. - Paolo]
Reviewed-by: Sean Christopherson <seanjc@google.com>
Tested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
- Only adjust the ID registers when no irqchip has been created once
per VM run, instead of doing it once per vcpu, as this otherwise
triggers a pretty bad conbsistency check failure in the sysreg code.
- Make sure the per-vcpu Fine Grain Traps are computed before we load
the system registers on the HW, as we otherwise start running without
anything set until the first preemption of the vcpu.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmkXEhAACgkQI9DQutE9
ekNXYRAAs5z912VxL1RrmvVuHhDb1ggVJj+gPJS5cMBVBgZWT4A18EvT9D8pD0Wy
3gqpxWzi8HlPPip2Lh0b57H24ObpFlfJ2Le+i+A4dVARjHilM2bo9/NfoBO0EC+I
pm3MbLGw9fQyP7TbIQ7uVSrRMtyVvrQT/Z4g8GkJ/QidY32Rp6CkhJID3uHMuraG
GzuB6VOUGDk7LPvKyMvMPvQ5IctckSylcZkAr+2lmKMUYrtwKRIbnYBiHrSLcPfH
RQ7iekzDEQoFZppt96ucPiNgO22ZYA72hrbHig9+YLz7kU6/X4LlQDkm7vzgGSJm
5zUJD7+BkLhapUulVtNbl6TzKkH/uo3PsXK0F+kvJ5AMOFW+kWaUs868LL1/I4O6
ruMOPtZ8s3hjC+cOyxhrYxJ++rtoHe3Lyp8C7zqXwFVbStqdyNcr78SQyMthoyGz
UJRr9FMw7aGkuaS8JWSCWI+Dw2VFQsYgFm/5LCZ5QFpWKdGX3gi2jZvr22/8m+6a
nk1u9OmbAgqI+vhRwiXWWh4KyKUHq9cTUTWZ5ytpCrRaPMuf4Ixpv+9Ysb7SUcdu
CpYrBg/67ntb9bFwxJAb9ZwqKmjJMdTSN6SGx6pbM62eoXMTpjko9rJxLPee9ad8
yO9RSvyYj3F9adw+g0GdkKYDudqdpFV34ZTsvttuUEzCAMk3L1Q=
=MCeE
-----END PGP SIGNATURE-----
Merge tag 'kvmarm-fixes-6.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for 6.18, take #3
- Only adjust the ID registers when no irqchip has been created once
per VM run, instead of doing it once per vcpu, as this otherwise
triggers a pretty bad conbsistency check failure in the sysreg code.
- Make sure the per-vcpu Fine Grain Traps are computed before we load
the system registers on the HW, as we otherwise start running without
anything set until the first preemption of the vcpu.
The recent fix to properly initialize the tags of the huge zero folio
had an unfortunate not-so-subtle side effect: it caused the actual
*contents* of the huge zero folio to not be initialized at all when the
hardware didn't support the memory tagging.
The reason was the unfortunate semantics of tag_clear_highpage(): on
hardware that didn't do the tagging, it would silently just not do
anything at all. And since this is done only on arm64 with MTE support,
that basically meant most hardware.
It wasn't necessarily immediately obvious since the huge zero page isn't
necessarily very heavily used - or because it might already be zero
because all-zeroes is the most common pattern. But it ends up causing
random odd user space failures when you do hit it.
The unfortunate semantics have been around for a while, but became a
real bug only when we started actively using __GFP_ZEROTAGS in the
generic get_huge_zero_folio() function - before that, it had only ever
been used in code that checked that the hardware supported it.
Fix this by simply changing the semantics of tag_clear_highpage() to
return whether it actually successfully did something or not. While at
it, also make it initialize multiple pages in one go, since that's
actually what the only caller wants it to do and it simplifies the whole
logic.
Fixes: adfb6609c680 ("mm/huge_memory: initialise the tags of the huge zero folio")
Link: https://lore.kernel.org/all/20251117082023.90176-1-00107082@163.com/
Reviewed-by: David Hildenbrand (Red Hat) <david@kernel.org>
Reported-and-tested-by: David Wang <00107082@163.com>
Reported-and-tested-by: Carlos Llamas <cmllamas@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
On VHE, the Fine Grain Traps registers are written to hardware in
kvm_arch_vcpu_load()->..->__activate_traps_hfgxtr(), but the fgt array is
computed later, in kvm_vcpu_load_fgt(). This can lead to zero being written
to the FGT registers the first time a VCPU is loaded. Also, any changes to
the fgt array will be visible only after the VCPU is scheduled out, and
then back in, which is not the intended behaviour.
Fix it by computing the fgt array just before the fgt traps are written
to hardware.
Fixes: fb10ddf35c1c ("KVM: arm64: Compute per-vCPU FGTs at vcpu_load()")
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Oliver Upton <oupton@kernel.org>
Link: https://patch.msgid.link/20251112102853.47759-1-alexandru.elisei@arm.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
Owing to the ID registers being global to the VM, there is no point
in computing them more than once. However, recent changes making
use of kvm_set_vm_id_reg() outlined that we repeatedly hammer
the ID registers when we shouldn't.
Gate the ID reg update on the VM having never run.
Fixes: 50e7cce81b9b2 ("KVM: arm64: Limit clearing of ID_{AA64PFR0,PFR1}_EL1.GIC to userspace irqchip")
Fixes: 5cb57a1aff755 ("KVM: arm64: Zero ID_AA64PFR0_EL1.GIC when no GICv3 is presented to the guest")
Closes: https://lore.kernel.org/r/aRHf6x5umkTYhYJ3@finisterre.sirena.org.uk
Reported-by: Mark Brown <broonie@kernel.org>
Tested-by: Mark Brown <broonie@kernel.org>
Link: https://patch.msgid.link/20251110173010.1918424-1-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-11-11 12:24:22 +00:00
8 changed files with 34 additions and 24 deletions
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.