Signed-off-by: Carlos Maiolino <cem@kernel.org>
-----BEGIN PGP SIGNATURE-----
iJUEABMJAB0WIQSmtYVZ/MfVMGUq1GNcsMJ8RxYuYwUCaUJ+8AAKCRBcsMJ8RxYu
Yw3RAYDYFGJuIs6GpKVsvMbk8DIyN8A+/B9115w2nkvMC8zERJgMbXHp9huM03iR
OxlbrJ0Bf3SCTGyOR3wlW85INfk0pEl7o3yrKc1dZp2fjY+IMKm1qnhEvOMubrUT
7qUBvhU+rw==
=QcWi
-----END PGP SIGNATURE-----
Merge tag 'xfs-fixes-6.19-rc2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs fixes from Carlos Maiolino:
"This contains a few fixes for zoned devices support, an UAF and a
compiler warning, and some cleaning up"
* tag 'xfs-fixes-6.19-rc2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: fix the zoned RT growfs check for zone alignment
xfs: validate that zoned RT devices are zone aligned
xfs: fix XFS_ERRTAG_FORCE_ZERO_RANGE for zoned file system
xfs: fix a memory leak in xfs_buf_item_init()
xfs: fix stupid compiler warning
xfs: fix a UAF problem in xattr repair
xfs: ignore discard return value
Drops unused parameter from kunit_device_register_internal and makes
FAULT_TEST default to n when PANIC_ON_OOPS.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmlGXcYACgkQCwJExA0N
Qxyb4hAAsRXFZlfcBMlpHI5nNOkX6vfcOUlhI44LqfzBMBJMVQKvKqOL/3QZV2iE
PZ4LuYey6+0L62pY0/gKTxzNTHeYRTcjI5X9cimerDqOXc5WUa2Tb4x3npxgZMxS
WExyS/Xfr9cGsOWKzQv8c0J6IFeUlvIK4HKLJPAdwCeF3KK3ZzgPZ9tQgB40mbAg
IeVBIlC1UL9Rs7yR0JNxvQsr8zEMr2rG/r8YD8ktyVmoCxbVinNTMK+6uu+uERrG
WFb6UYIFmgq5Xh7G7Fsr/NnlKXRqhCwc4/8k7taAvIH8moMQrUAr6pDLCAVvYEL+
MxoYefTDgvZqbyemGuLQFl7VqxA+tUJrNrBos2XBgE/uTG9sbFjuZWDDaG98HoWM
cHgnrw2t8ybvDWl489PBJeMANNS0fbH6tMKLV0Z8SPOPFlLnVPctRfIgJqGfbYLj
vfBv8oQywqyi98jrBIjI4stKDikYzpCEduk1RoQFMnZQOuC2OFg18w3v5n75CTD2
7n6mYAMrOxZOOgwcKR/E5iYf3GHIkIbJUHH6aSdXUEDAxBo/oQybDQ4xl44Fntd0
Q2iPbPaM5tTf5OtS466Dx33Oz7omxXBLXGQ/Zj0Yn7x4SO0Ya4A9/g7fGFM5yBAN
C7Dn11hiwj+wAphaLrxvOJ2y9hA8BQRz+keIvL44VmYDHiHj/ik=
=Oplf
-----END PGP SIGNATURE-----
Merge tag 'linux_kselftest-kunit-fixes-6.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kunit fixes from Shuah Khan:
"Drop unused parameter from kunit_device_register_internal and make
FAULT_TEST default to n when PANIC_ON_OOPS"
* tag 'linux_kselftest-kunit-fixes-6.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
kunit: make FAULT_TEST default to n when PANIC_ON_OOPS
kunit: Drop unused parameter from kunit_device_register_internal
other series:
- Simplify checks in arch_kfence_init_pool() since force_pte_mapping()
already takes BBML2-noabort (break-before-make Level 2 with no aborts
generated) into account
- Remove unneeded SVE/SME fallback preserve/store handling in the arm64
EFI. With the recent updates, the fallback path is only taken for EFI
runtime calls from hardirq or NMI contexts. In practice, this only
happens under panic/oops/emergency_restart() and no restoring of the
user state expected. There's a corresponding lkdtm update to trigger
a BUG() or panic() from hardirq context together with a fixup not to
confuse clang/objtool about the control flow
GCS (guarded control stacks) fix: flush the GCS locking state on exec,
otherwise the new task will not be able to enable GCS (locked as
disabled).
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmlGb0YACgkQa9axLQDI
XvGNTA/+MFft7b7Seb0DlanyxMfnKmLqoPbdciSCQh5I79fklWDMCqh/B1KVUxhG
cnY45GyDEO/AC46wYvqqm6frS4sYHwiDybIJwLPGcxmPaoeNacVQkWYjiM6sU68m
RjcFe5xN1+3y+d37ZPgoClU6YNDqNcxv1jLHzQly3eO/8LLOGi6S8bEX9v8SGLWR
4PJki0YSOIJQYEehmAglPwroRgJXeHcaLboF/I/ULPZjhrDMIz2BSoNKreGfzgGI
0+myd6CeGjxcAKf2JI9zmBHiCmgGGi9gMz5BXm6yo+jGoyeMaeTi5/w4r/D4GJlp
kS2RbI+LpWEFN4yKk23YVMM7fYmI9ZCTEsRw0osHV2uqMSt2WNGadE4klG8uV6oK
h0VMKY5QB0SZVc0uT+S44zkK9iLVPbn8FgCAbzMNBE9gTgf/nT6l09VsdgNrzK90
CTwwvt9S0hiRk9VMa57WAWodb8lxp0IUE0eP4ptc/Y7EL8ibgTZcm0GVxG9OLMBD
J+RXQZ1VugeUoYncEH23KYSo89kjRJ7k6tAkz1B6bK1ULnIoQsebyNFhqITCx9t8
dc/3hqsgijiyMhZchLZtvqYh/tfSFBsKTN5cQj2qQCxTlnX7Pi1UvbnSwXeGGdki
lEWTEtkqKQdE2wryYkMlvMllCLibmr7/FClMncj0pEPQjz/FJDA=
=vAzy
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
"Two left-over updates that could not go into -rc1 due to conflicts
with other series:
- Simplify checks in arch_kfence_init_pool() since
force_pte_mapping() already takes BBML2-noabort (break-before-make
Level 2 with no aborts generated) into account
- Remove unneeded SVE/SME fallback preserve/store handling in the
arm64 EFI. With the recent updates, the fallback path is only taken
for EFI runtime calls from hardirq or NMI contexts. In practice,
this only happens under panic/oops/emergency_restart() and no
restoring of the user state expected.
There's a corresponding lkdtm update to trigger a BUG() or panic()
from hardirq context together with a fixup not to confuse
clang/objtool about the control flow
GCS (guarded control stacks) fix: flush the GCS locking state on exec,
otherwise the new task will not be able to enable GCS (locked as
disabled)"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
lkdtm/bugs: Do not confuse the clang/objtool with busy wait loop
arm64/gcs: Flush the GCS locking state on exec
arm64/efi: Remove unneeded SVE/SME fallback preserve/store handling
lkdtm/bugs: Add cases for BUG and PANIC occurring in hardirq context
arm64: mm: Simplify check in arch_kfence_init_pool()
- Add a missing "break" to fix param parsing in the rseq selftest.
- Apply runtime updates to the _current_ CPUID when userspace is setting
CPUID, e.g. as part of vCPU hotplug, to fix a false positive and to avoid
dropping the pending update.
- Disallow toggling KVM_MEM_GUEST_MEMFD on an existing memslot, as it's not
supported by KVM and leads to a use-after-free due to KVM failing to unbind
the memslot from the previously-associated guest_memfd instance.
- Harden against similar KVM_MEM_GUEST_MEMFD goofs, and prepare for supporting
flags-only changes on KVM_MEM_GUEST_MEMFD memlslots, e.g. for dirty logging.
- Set exit_code[63:32] to -1 (all 0xffs) when synthesizing a nested
SVM_EXIT_ERR (a.k.a. VMEXIT_INVALID) #VMEXIT, as VMEXIT_INVALID is defined
as -1ull (a 64-bit value).
- Update SVI when activating APICv to fix a bug where a post-activation EOI
for an in-service IRQ would effective be lost due to SVI being stale.
- Immediately refresh APICv controls (if necessary) on a nested VM-Exit
instead of deferring the update via KVM_REQ_APICV_UPDATE, as the request is
effectively ignored because KVM thinks the vCPU already has the correct
APICv settings.
-----BEGIN PGP SIGNATURE-----
iQFIBAABCgAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmlEQXEUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroOscwf/ahimZsBtFTtd6yZpZ84E1O7P8g31
aZ4p9CRcx+I0aMQKOfS98R3G1w8FtMpcX11aPWvIi4Vkx/4PVlZr+ds8cfiMV7Z+
Hdd7k/uR4eC+KcL0XoGcmukBlHYi/8lZeijUcID+Iueo7MMlyX5G/VQ1RzBZwP++
VFObgtzoAmlKrR5AaFeRMBnY+fiH6wMjPSwY7fI5pr4qNHrxB06wgtX5+C1Y3Poa
7ETMrfX2XtyfG53N+1YJMGoaXqBkFaGCateMJQNOu/QQzOZhUuBCkC8muzbCTckO
Y405j7GfpyUmhf/mdLI8BlMYrn5Mwdnvo9iJ3U5fGUN/Eh4XmPIYKy7sMg==
=Fa0R
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull x86 kvm fixes from Paolo Bonzini:
"x86 fixes. Everyone else is already in holiday mood apparently.
- Add a missing 'break' to fix param parsing in the rseq selftest
- Apply runtime updates to the _current_ CPUID when userspace is
setting CPUID, e.g. as part of vCPU hotplug, to fix a false
positive and to avoid dropping the pending update
- Disallow toggling KVM_MEM_GUEST_MEMFD on an existing memslot, as
it's not supported by KVM and leads to a use-after-free due to KVM
failing to unbind the memslot from the previously-associated
guest_memfd instance
- Harden against similar KVM_MEM_GUEST_MEMFD goofs, and prepare for
supporting flags-only changes on KVM_MEM_GUEST_MEMFD memlslots,
e.g. for dirty logging
- Set exit_code[63:32] to -1 (all 0xffs) when synthesizing a nested
SVM_EXIT_ERR (a.k.a. VMEXIT_INVALID) #VMEXIT, as VMEXIT_INVALID is
defined as -1ull (a 64-bit value)
- Update SVI when activating APICv to fix a bug where a
post-activation EOI for an in-service IRQ would effective be lost
due to SVI being stale
- Immediately refresh APICv controls (if necessary) on a nested
VM-Exit instead of deferring the update via KVM_REQ_APICV_UPDATE,
as the request is effectively ignored because KVM thinks the vCPU
already has the correct APICv settings"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: nVMX: Immediately refresh APICv controls as needed on nested VM-Exit
KVM: VMX: Update SVI during runtime APICv activation
KVM: nSVM: Set exit_code_hi to -1 when synthesizing SVM_EXIT_ERR (failed VMRUN)
KVM: nSVM: Clear exit_code_hi in VMCB when synthesizing nested VM-Exits
KVM: Harden and prepare for modifying existing guest_memfd memslots
KVM: Disallow toggling KVM_MEM_GUEST_MEMFD on an existing memslot
KVM: selftests: Add a CPUID testcase for KVM_SET_CPUID2 with runtime updates
KVM: x86: Apply runtime updates to current CPUID during KVM_SET_CPUID{,2}
KVM: selftests: Add missing "break" in rseq_test's param parsing
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCaUT1mQAKCRCAXGG7T9hj
vpIDAQCqbWQJmr7TJSKm/xKdMU4IzaLmOCe5m0UvjHiLmIeKMgEAzqLvwMhs8rjz
A/ELYnIqIhkwTm990eDT1YF/y1XYkQY=
=6qmH
-----END PGP SIGNATURE-----
Merge tag 'for-linus-6.19-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fix from Juergen Gross:
"Just a single patch fixing a sparse warning"
* tag 'for-linus-6.19-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
x86/xen: Fix sparse warning in enlighten_pv.c
-----BEGIN PGP SIGNATURE-----
iQFPBAABCAA5FiEEe7vIQRWZI0iWSE3xu+CwddJFiJoFAmlFenQbFIAAAAAABAAO
bWFudTIsMi41KzEuMTEsMiwyAAoJELvgsHXSRYiageUH/2CUIrKSYRn7uy2Pe5R0
ycPv9EFZquvbDKROfXWJ1RteTeVYUXKymdInBw/ToyxQ8rj4pASqQUpnpXxaHQj3
INv+SlS7GE95C8ynwfVVhjpOL6axgN8L56WOnY/7rFMs+HUmZmvMNo9yxfMd8sLI
3FjriPkmacj50kP8LdyOxLOybBE1Vu0BRAzo3SZpCP23VMV6DJGqct9MYLw01mQr
ENTKxWRfSQs9STukaqb7iDKVfSw/z7Fs02l79+lxAV2XQkykPScXM5/cT1JLiChQ
WIXckT+r6dRkdfRzeCpyUl1p0j1ClHw5DmtNayfFjasqX+/2dkRN1offyn75++Kb
7CY=
=pLK7
-----END PGP SIGNATURE-----
Merge tag 'slab-for-6.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab
Pull slab fix from Vlastimil Babka:
- A stable fix for a missing tag reset that can happen in
kfree_nolock() with KASAN+SLUB_TINY configs (Deepanshu Kartikey)
* tag 'slab-for-6.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
mm/slub: reset KASAN tag in defer_free() before accessing freed memory
Including:
- iommupt: Fix an oops found by syzcaller in the new generic IO-page-table code.
- AMD-Vi: Fix IO_PAGE_FAULTs in kdump kernels triggered by re-using
domain-ids from previous kernel.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmlFUUwACgkQK/BELZcB
GuNQPg//VSAjn/Ssq1tWSQrYD96GXRLcEc3NzO1JhPbLQOg/kCAvV7MaV3UgJAYd
1AvD32agtO0zRri7IgnvwkD85Zodp8C6TDMF+QU+mCBEv5BcgtENnspccq6XxNXW
CuOLEcuEJIpyCKdQ/v3xqlgA/HsQm8e9Pa1xwWixN05gie6lcZdqiqrzL79KbRRJ
INgZ8X9SVNVH4R2PQzj/+rDBij8gKRBVGfRGyrThaxuMPwBq5Eysgf4qpX/XK5Kd
CP8/duhxrd18ZhRPazbaYd0X90ea32/KYzzLtWbNZzpue6T1D+rnyeM0zU50oKT6
bIYJjol6JEhtNJjQFXnZHkx8iPaBW+UQTK2+9HUKecjsvnCtAhVE4Uzw7sdWB9V0
p2ui06/yQI8jadriWm2kExGv4iN85bm0oebC3UPDt2coCLVH6BFuLDBt+IK6M+z2
5qD1xaDggQDY1jLsnSnUNJo1PzS8hfPONHEglUZOL+A8i1fYjYNZ8Q8zsX7DiYTz
X2UjNSlpyFEI8g9X8f3wgihHcVUGEdi4RuJCl9qHE+o7Vq6OFjpvAW125rCjHjVT
Hk6pE6kEvflz4cYZ3y6nWBrVQapQ2FKVWxghBq5LPsigzQTabHtAgpeHFCPby7OH
cqpqrf3xIwCjMDHxD1mkxOfMZi7TEJGnUwUy47kBAGpsD58BMVs=
=jIcT
-----END PGP SIGNATURE-----
Merge tag 'iommu-fixes-v6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux
Pull iommu fixes from Joerg Roedel:
- iommupt: Fix an oops found by syzcaller in the new generic
IO-page-table code.
- AMD-Vi: Fix IO_PAGE_FAULTs in kdump kernels triggered by re-using
domain-ids from previous kernel.
* tag 'iommu-fixes-v6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux:
amd/iommu: Make protection domain ID functions non-static
amd/iommu: Preserve domain ids inside the kdump kernel
iommupt: Return ERR_PTR from _table_alloc()
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmlEvbAQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgphTLD/9Gx4wnirRvzo1Vz77Odhqqbipww/gjDV0G
HASzF4KczfusTei4NABfuSVA4/LS28csZOYq0ZeDmFVg+YPQcmr4DNw/eJzzi1Xh
TDmYuf8y7jAJjwWmX0fgVF2uDg3/kp3tWKfBxSFqpqHrZUkOLr8IY3y8/uJYKakm
r8K9HsS7hyPhx4xdfRQixUil5TgQxBtCgwTX+JRv/K9AMnEVHkBkJ+m4GSmjJ3n8
tr5lL8fQmIQcsSWqTvMH/m2xeLG7gr9LE33wfu2sCLvNJ4CAYFGT9aMf2F0nGbzK
Uu5XOyq1rJ8vKWJuDVfFsjOca4fyyvaKX4bqycv89Z7MVzQOXsjbs+4sHFycsfYC
P+OBmxLb2+rowNsawyzObT9wnd37xptl+tF5nXL0LL6W4PWHmUgnDuM6Vg5zpNv+
n0nO6y7I0iL9o4hgqN8kUOr5e7IjvJccJbratOA/jVXmAw1I/Gg6HDo+W7rEqsVP
0hrlE+RuJSRypbCO2pcvviahaDaCWo3gQmp9TZCBmgqgoiI0WStPAgYrG424+69Y
Hoq5HBmtN4GV3EnU0Ybuawzl2OKtQwwr9DrRpvrxdN9qm4E44CD6eTmE5fEnrTnX
Rcs5VosBAavPTbl8LFDtj2h6qPak+5VHHVZwimLNdOyCLTNf7D/kiAjmOHRBbmdt
AopIuvCipA==
=lk7H
-----END PGP SIGNATURE-----
Merge tag 'block-6.19-20251218' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull block fixes from Jens Axboe:
- ublk selftests for missing coverage
- two fixes for the block integrity code
- fix for the newly added newly added PR read keys ioctl, limiting the
memory that can be allocated
- work around for a deadlock that can occur with ublk, where partition
scanning ends up recursing back into file closure, which needs the
same mutex grabbed. Not the prettiest thing in the world, but an
acceptable work-around until we can eliminate the reliance on
disk->open_mutex for this
- fix for a race between enabling writeback throttling and new IO
submissions
- move a bit of bio flag handling code. No changes, but needed for a
patchset for a future kernel
- fix for an init time id leak failure in rnbd
- loop/zloop state check fix
* tag 'block-6.19-20251218' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
block: validate interval_exp integrity limit
block: validate pi_offset integrity limit
block: rnbd-clt: Fix leaked ID in init_dev()
ublk: fix deadlock when reading partition table
block: add allocation size check in blkdev_pr_read_keys()
Documentation: admin-guide: blockdev: replace zone_capacity with zone_capacity_mb when creating devices
zloop: use READ_ONCE() to read lo->lo_state in queue_rq path
loop: use READ_ONCE() to read lo->lo_state without locking
block: fix race between wbt_enable_default and IO submission
selftests: ublk: add user copy test cases
selftests: ublk: add support for user copy to kublk
selftests: ublk: forbid multiple data copy modes
selftests: ublk: don't share backing files between ublk servers
selftests: ublk: use auto_zc for PER_IO_DAEMON tests in stress_04
selftests: ublk: fix fio arguments in run_io_and_recover()
selftests: ublk: remove unused ios map in seq_io.bt
selftests: ublk: correct last_rw map type in seq_io.bt
selftests: ublk: fix overflow in ublk_queue_auto_zc_fallback()
block: move around bio flagging helpers
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmlEvc8QHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpuG5D/4wvdxdfrYL0cM8A18dS+Ngl5kCRsWWulV1
mWyOcPilyUQ266g879nD3YFJW/C1CbCYhvrX446agKnFMbGyQNmt5OyiXQUEDS/W
LlgCWSn3Rum8a5i3m92zvgp24cZI9DSWnuq0GnwcEvMauNsRkgowZ6hy89Mtthvy
DUCimJawGCxd++yuN2LChe356WpiT10ed9calfCEDdq4iOuYKSMNJRtUFl0jaFUG
0GlWPnxaRjuukn1So3/6XiVRtBoC5ZPCApMitG+Pu5AfIeBz4LfHyZS5VpOm1GSo
hZm7PEOHiW4Jt6xiyZH38l9gBZuqUZ3w1HDJItGVvioYWrPHV1ZwUezkNEfdlSkj
oiH4oujIFt6P+yRCbhpETvYizaS68y/EseKbyjPg5eSJE3UCquMD5jKhCdESm16l
gZIxbiT7EgT19ivFeFcT10G7sQZ/JphdkS/q9HXuR1tVWSHIm+8x90zmVo5Dli8r
Tio/uJrAEOD0fepPNFv7CwUwIie7nipIdxG/mt+bDBH8DMAii85pW+J/1jwl4OHh
b3Z8Kq9ZVi9x2bKJ6ai87DhH2+nbVGGlSvIU+fQ8hKeTIloYsXO4wycZxNheheRo
bXiS0rmxpQvErhHRY2FNTyqXd5qLYJZzdUBRf1thNTNzoDzlFLLQzMAwK26Zdvce
6p4NiDtgaQ==
=heZQ
-----END PGP SIGNATURE-----
Merge tag 'io_uring-6.19-20251218' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull io_uring fix from Jens Axboe:
"Just a single fix this week, for an issue with the calculation of the
number of segments in the ublk kbuf import path"
* tag 'io_uring-6.19-20251218' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
io_uring: fix nr_segs calculation in io_import_kbuf
The reset_history attributes are write only. Hence don't report them as
readable just to return -EOPNOTSUPP later on.
Fixes: cbc29538dbf7 ("hwmon: Add driver for LTC4282")
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
Link: https://lore.kernel.org/r/20251219-ltc4282-fix-reset-history-v1-1-8eab974c124b@analog.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
It's a requirement that DT overlays be applied at build time in order to
validate them as overlays are not validated on their own.
Add missing target for mt8395-radxa hd panel overlay.
Fixes: 4c8ff61199a7 ("arm64: dts: mediatek: mt8395-radxa-nio-12l: Add Radxa 8 HD panel")
Signed-off-by: Frank Wunderlich <frank-w@public-files.de>
Acked-by: AngeloGioacchino Del Regno <angelogiaocchino.delregno@collabora.com>
Link: https://patch.msgid.link/20251205215940.19287-1-linux@fw-web.de
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Build devicetree binaries for testing overlays and providing users
full dtb without using overlays for Bananapi R4 (pro) variants.
Signed-off-by: Frank Wunderlich <frank-w@public-files.de>
Link: https://patch.msgid.link/20251119175124.48947-3-linux@fw-web.de
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Since commit eb972eab0794 ("lkdtm/bugs: Add cases for BUG and PANIC
occurring in hardirq context"), building with clang for x86_64 results
in the following warnings:
vmlinux.o: warning: objtool: lkdtm_PANIC_IN_HARDIRQ(): unexpected end of section .text.lkdtm_PANIC_IN_HARDIRQ
vmlinux.o: warning: objtool: lkdtm_BUG_IN_HARDIRQ(): unexpected end of section .text.lkdtm_BUG_IN_HARDIRQ
caused by busy "while (wait_for_...);" loops. Add READ_ONCE() and
cpu_relax() to better indicate the intention and avoid any unwanted
compiler optimisations.
Fixes: eb972eab0794 ("lkdtm/bugs: Add cases for BUG and PANIC occurring in hardirq context")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202512190111.jxFSqxUH-lkp@intel.com/
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
So that both iommu.c and init.c can utilize them. Also define a new
function 'pdom_id_destroy()' to destroy 'pdom_ids' instead of directly
calling ida functions.
Signed-off-by: Sairaj Kodilkar <sarunkod@amd.com>
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Currently AMD IOMMU driver does not reserve domain ids programmed in the
DTE while reusing the device table inside kdump kernel. This can cause
reallocation of these domain ids for newer domains that are created by
the kdump kernel, which can lead to potential IO_PAGE_FAULTs
Hence reserve these ids inside pdom_ids.
Fixes: 38e5f33ee359 ("iommu/amd: Reuse device table for kdump")
Signed-off-by: Sairaj Kodilkar <sarunkod@amd.com>
Reported-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
- Add a missing "break" to fix param parsing in the rseq selftest.
- Apply runtime updates to the _current_ CPUID when userspace is setting
CPUID, e.g. as part of vCPU hotplug, to fix a false positive and to avoid
dropping the pending update.
- Disallow toggling KVM_MEM_GUEST_MEMFD on an existing memslot, as it's not
supported by KVM and leads to a use-after-free due to KVM failing to unbind
the memslot from the previously-associated guest_memfd instance.
- Harden against similar KVM_MEM_GUEST_MEMFD goofs, and prepare for supporting
flags-only changes on KVM_MEM_GUEST_MEMFD memlslots, e.g. for dirty logging.
- Set exit_code[63:32] to -1 (all 0xffs) when synthesizing a nested
SVM_EXIT_ERR (a.k.a. VMEXIT_INVALID) #VMEXIT, as VMEXIT_INVALID is defined
as -1ull (a 64-bit value).
- Update SVI when activating APICv to fix a bug where a post-activation EOI
for an in-service IRQ would effective be lost due to SVI being stale.
- Immediately refresh APICv controls (if necessary) on a nested VM-Exit
instead of deferring the update via KVM_REQ_APICV_UPDATE, as the request is
effectively ignored because KVM thinks the vCPU already has the correct
APICv settings.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEKTobbabEP7vbhhN9OlYIJqCjN/0FAmk5p18ACgkQOlYIJqCj
N/0YlBAAvnhGVmqVc3nhd313mo4YGk+Z1RxpO1sJAsGJu42Ir/QqMC9aPHy9ejcS
hfoIXzPFdVJEztuBUWRje9mvocQnXSAWjXFTaoqJE/LXVnh96Txhh4nvCJSQzyrn
5/0uk5ZD5dfPVyPtYk6G9w3q9kgYv3Q6O4UEU48ru0q6wcu5FmRshULfHVZnlyNa
ZALY4k8QzsdSzB0XWusmD0OQpjGRyUR79mqEzUybg4E/b0LAK9Nv6Fr5YgGq7g0N
AU1T1t+/hMN7x4/24RrxNBO+skPKmCi7nM3iVKqilQSO2fZzrNVUOkr/tGb+5EL6
iw4JHJQjp9LlzRVxP3QNZp8Bg+knMVRdbSzAkdomDRguMgpu/TGAe7TtkzfEyVel
VAQUVpDaThp0FK5wAdyMKvpOQqTGjl3KKtM2zs187v+eJJcjJQnIsTk4zW5mmvk4
Y6YOqbulSNAqVOmJj7oqrDxWgjD75PtXlPFEoOsJM0AuL/sHBo8bKT18cGBwVGmH
lJoNfkS45kofG4i0zIBwzQuvKjDIQU7ZdVXa2MLL1aqDlyu/66Hll0vCJLAMvHz5
eb65WQ6Br97e0BNuzVJJNyTGQ3Pr9DdSkTpPkOalwQ3VEyZwcKm3OsB/N0FsgP2V
ta7vZQ5b6Sn568A9LAgXGhcnQ7mA31VBoNCkLLowxIT4kxVKGUY=
=4iuO
-----END PGP SIGNATURE-----
Merge tag 'kvm-x86-fixes-6.19-rc1' of https://github.com/kvm-x86/linux into HEAD
KVM fixes for 6.19-rc1
- Add a missing "break" to fix param parsing in the rseq selftest.
- Apply runtime updates to the _current_ CPUID when userspace is setting
CPUID, e.g. as part of vCPU hotplug, to fix a false positive and to avoid
dropping the pending update.
- Disallow toggling KVM_MEM_GUEST_MEMFD on an existing memslot, as it's not
supported by KVM and leads to a use-after-free due to KVM failing to unbind
the memslot from the previously-associated guest_memfd instance.
- Harden against similar KVM_MEM_GUEST_MEMFD goofs, and prepare for supporting
flags-only changes on KVM_MEM_GUEST_MEMFD memlslots, e.g. for dirty logging.
- Set exit_code[63:32] to -1 (all 0xffs) when synthesizing a nested
SVM_EXIT_ERR (a.k.a. VMEXIT_INVALID) #VMEXIT, as VMEXIT_INVALID is defined
as -1ull (a 64-bit value).
- Update SVI when activating APICv to fix a bug where a post-activation EOI
for an in-service IRQ would effective be lost due to SVI being stale.
- Immediately refresh APICv controls (if necessary) on a nested VM-Exit
instead of deferring the update via KVM_REQ_APICV_UPDATE, as the request is
effectively ignored because KVM thinks the vCPU already has the correct
APICv settings.
msleep is not very accurate in terms of how long it actually sleeps,
whereas usleep_range is precise. Replace the timeslice sleep for
long-running workloads with the more accurate usleep_range to avoid
jitter if the sleep period is less than 20ms.
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: stable@vger.kernel.org
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patch.msgid.link/20251212182847.1683222-3-matthew.brost@intel.com
(cherry picked from commit ca415c4d4c17ad676a2c8981e1fcc432221dce79)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
When imported dma-bufs are destroyed, TTM is not fully
individualizing the dma-resv, but it *is* copying the fences that
need to be waited for before declaring idle. So in the case where
the bo->resv != bo->_resv we can still drop the preempt-fences, but
make sure we do that on bo->_resv which contains the fence-pointer
copy.
In the case where the copying fails, bo->_resv will typically not
contain any fences pointers at all, so there will be nothing to
drop. In that case, TTM would have ensured all fences that would
have been copied are signaled, including any remaining preempt
fences.
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Fixes: fa0af721bd1f ("drm/ttm: test private resv obj on release/destroy")
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <stable@vger.kernel.org> # v6.16+
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Tested-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251217093441.5073-1-thomas.hellstrom@linux.intel.com
(cherry picked from commit 425fe550fb513b567bd6d01f397d274092a9c274)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
An EU stall property value of 0 is invalid and will cause a NPD.
Reported-by: Peter Senna Tschudin <peter.senna@linux.intel.com>
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/6453
Fixes: 1537ec85ebd7 ("drm/xe/uapi: Introduce API for EU stall sampling")
Cc: stable@vger.kernel.org
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Harish Chegondi <harish.chegondi@intel.com>
Link: https://patch.msgid.link/20251212061850.1565459-4-ashutosh.dixit@intel.com
(cherry picked from commit 5bf763e908bf795da4ad538d21c1ec41f8021f76)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
An OA property value of 0 is invalid and will cause a NPD.
Reported-by: Peter Senna Tschudin <peter.senna@linux.intel.com>
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/6452
Fixes: cc4e6994d5a2 ("drm/xe/oa: Move functions up so they can be reused for config ioctl")
Cc: stable@vger.kernel.org
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Harish Chegondi <harish.chegondi@intel.com>
Link: https://patch.msgid.link/20251212061850.1565459-3-ashutosh.dixit@intel.com
(cherry picked from commit 7a100e6ddcc47c1f6ba7a19402de86ce24790621)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
The xe_sriov_vfio_migration_supported() function is type bool so
returning -EPERM means returning true. Return false instead.
Fixes: bd45d46ffc8f ("drm/xe/pf: Export helpers for VFIO")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patch.msgid.link/aTLEZ4g-FD-iMQ2V@stanley.mountain
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
(cherry picked from commit 0a2404c8f6a3a120f79c57ef8a3302c8e8bc34d9)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reports can be written out to the OA buffer using ways other than periodic
sampling. These include mmio trigger and context switches. To support these
use cases, when periodic sampling is not enabled,
OAG_OAGLBCTXCTRL_COUNTER_RESUME must be set.
Fixes: 1db9a9dc90ae ("drm/xe/oa: OA stream initialization (OAG)")
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patch.msgid.link/20251205212613.826224-4-ashutosh.dixit@intel.com
(cherry picked from commit 88d98e74adf3e20f678bb89581a5c3149fdbdeaa)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
A 10ms timeslice for long-running workloads is far too long and causes
significant jitter in benchmarks when the system is shared. Adjust the
value to 5ms for preempt-fencing VMs, as the resume step there is quite
costly as memory is moved around, and set it to zero for pagefault VMs,
since switching back to pagefault mode after dma-fence mode is
relatively fast.
Also change min_run_period_ms to 'unsiged int' type rather than 's64' as
only positive values make sense.
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: stable@vger.kernel.org
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patch.msgid.link/20251212182847.1683222-2-matthew.brost@intel.com
(cherry picked from commit 33a5abd9a68394aa67f9618b20eee65ee8702ff4)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
The OA open parameters did not validate num_syncs, allowing
userspace to pass arbitrarily large values, potentially
leading to excessive allocations.
Add check to ensure that num_syncs does not exceed DRM_XE_MAX_SYNCS,
returning -EINVAL when the limit is violated.
v2: use XE_IOCTL_DBG() and drop duplicated check. (Ashutosh)
Fixes: c8507a25cebd ("drm/xe/oa/uapi: Define and parse OA sync properties")
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251205234715.2476561-6-shuicheng.lin@intel.com
(cherry picked from commit e057b2d2b8d815df3858a87dffafa2af37e5945b)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
The exec and vm_bind ioctl allow userspace to specify an arbitrary
num_syncs value. Without bounds checking, a very large num_syncs
can force an excessively large allocation, leading to kernel warnings
from the page allocator as below.
Introduce DRM_XE_MAX_SYNCS (set to 1024) and reject any request
exceeding this limit.
"
------------[ cut here ]------------
WARNING: CPU: 0 PID: 1217 at mm/page_alloc.c:5124 __alloc_frozen_pages_noprof+0x2f8/0x2180 mm/page_alloc.c:5124
...
Call Trace:
<TASK>
alloc_pages_mpol+0xe4/0x330 mm/mempolicy.c:2416
___kmalloc_large_node+0xd8/0x110 mm/slub.c:4317
__kmalloc_large_node_noprof+0x18/0xe0 mm/slub.c:4348
__do_kmalloc_node mm/slub.c:4364 [inline]
__kmalloc_noprof+0x3d4/0x4b0 mm/slub.c:4388
kmalloc_noprof include/linux/slab.h:909 [inline]
kmalloc_array_noprof include/linux/slab.h:948 [inline]
xe_exec_ioctl+0xa47/0x1e70 drivers/gpu/drm/xe/xe_exec.c:158
drm_ioctl_kernel+0x1f1/0x3e0 drivers/gpu/drm/drm_ioctl.c:797
drm_ioctl+0x5e7/0xc50 drivers/gpu/drm/drm_ioctl.c:894
xe_drm_ioctl+0x10b/0x170 drivers/gpu/drm/xe/xe_device.c:224
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:598 [inline]
__se_sys_ioctl fs/ioctl.c:584 [inline]
__x64_sys_ioctl+0x18b/0x210 fs/ioctl.c:584
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xbb/0x380 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
...
"
v2: Add "Reported-by" and Cc stable kernels.
v3: Change XE_MAX_SYNCS from 64 to 1024. (Matt & Ashutosh)
v4: s/XE_MAX_SYNCS/DRM_XE_MAX_SYNCS/ (Matt)
v5: Do the check at the top of the exec func. (Matt)
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Reported-by: Koen Koning <koen.koning@intel.com>
Reported-by: Peter Senna Tschudin <peter.senna@linux.intel.com>
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/6450
Cc: <stable@vger.kernel.org> # v6.12+
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Michal Mrozek <michal.mrozek@intel.com>
Cc: Carl Zhang <carl.zhang@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Ivan Briano <ivan.briano@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251205234715.2476561-5-shuicheng.lin@intel.com
(cherry picked from commit b07bac9bd708ec468cd1b8a5fe70ae2ac9b0a11c)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Various code assumes that the integrity interval is at least 1 sector
and evenly divides the logical block size. Add these checks to
blk_validate_integrity_limits(). This guards against block drivers that
report invalid interval_exp values.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The PI tuple must be contained within the metadata value, so validate
that pi_offset + pi_tuple_size <= metadata_size. This guards against
block drivers that report invalid pi_offset values.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
If kstrdup() fails in init_dev(), then the newly allocated ID is lost.
Fixes: 64e8a6ece1a5 ("block/rnbd-clt: Dynamically alloc buffer for pathname & blk_symlink_name")
Signed-off-by: Thomas Fourier <fourier.thomas@gmail.com>
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
syzkaller noticed that with fault injection a failure inside
iommu_alloc_pages_node_sz() oops's in PT_FEAT_DMA_INCOHERENT because it goes
on to make NULL incoherent. Closer inspection shows the return value has
become confused, the alloc routines on the iommupt side expect ERR_PTR while
iommu_alloc_pages_node_sz() returns NULL.
Error out early to fix both issues.
Fixes: aefd967dab64 ("iommupt: Use the incoherent start/stop functions for PT_FEAT_DMA_INCOHERENT")
Fixes: dcd6a011a8d5 ("iommupt: Add map_pages op")
Fixes: cdb39d918579 ("iommupt: Add the basic structure of the iommu implementation")
Reported-by: syzbot+e06bb7478e687f235ad7@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/693a39de.050a0220.4004e.02ce.GAE@google.com/
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
When one process(such as udev) opens ublk block device (e.g., to read
the partition table via bdev_open()), a deadlock[1] can occur:
1. bdev_open() grabs disk->open_mutex
2. The process issues read I/O to ublk backend to read partition table
3. In __ublk_complete_rq(), blk_update_request() or blk_mq_end_request()
runs bio->bi_end_io() callbacks
4. If this triggers fput() on file descriptor of ublk block device, the
work may be deferred to current task's task work (see fput() implementation)
5. This eventually calls blkdev_release() from the same context
6. blkdev_release() tries to grab disk->open_mutex again
7. Deadlock: same task waiting for a mutex it already holds
The fix is to run blk_update_request() and blk_mq_end_request() with bottom
halves disabled. This forces blkdev_release() to run in kernel work-queue
context instead of current task work context, and allows ublk server to make
forward progress, and avoids the deadlock.
Fixes: 71f28f3136af ("ublk_drv: add io_uring based userspace block driver")
Link: https://github.com/ublk-org/ublksrv/issues/170 [1]
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
[axboe: rewrite comment in ublk]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
io_import_kbuf() calculates nr_segs incorrectly when iov_offset is
non-zero after iov_iter_advance(). It doesn't account for the partial
consumption of the first bvec.
The problem comes when meet the following conditions:
1. Use UBLK_F_AUTO_BUF_REG feature of ublk.
2. The kernel will help to register the buffer, into the io uring.
3. Later, the ublk server try to send IO request using the registered
buffer in the io uring, to read/write to fuse-based filesystem, with
O_DIRECT.
>From a userspace perspective, the ublk server thread is blocked in the
kernel, and will see "soft lockup" in the kernel dmesg.
When ublk registers a buffer with mixed-size bvecs like [4K]*6 + [12K]
and a request partially consumes a bvec, the next request's nr_segs
calculation uses bvec->bv_len instead of (bv_len - iov_offset).
This causes fuse_get_user_pages() to loop forever because nr_segs
indicates fewer pages than actually needed.
Specifically, the infinite loop happens at:
fuse_get_user_pages()
-> iov_iter_extract_pages()
-> iov_iter_extract_bvec_pages()
Since the nr_segs is miscalculated, the iov_iter_extract_bvec_pages
returns when finding that i->nr_segs is zero. Then
iov_iter_extract_pages returns zero. However, fuse_get_user_pages does
still not get enough data/pages, causing infinite loop.
Example:
- Bvecs: [4K, 4K, 4K, 4K, 4K, 4K, 12K, ...]
- Request 1: 32K at offset 0, uses 6*4K + 8K of the 12K bvec
- Request 2: 32K at offset 32K
- iov_offset = 8K (8K already consumed from 12K bvec)
- Bug: calculates using 12K, not (12K - 8K) = 4K
- Result: nr_segs too small, infinite loop in fuse_get_user_pages.
Fix by accounting for iov_offset when calculating the first segment's
available length.
Fixes: b419bed4f0a6 ("io_uring/rsrc: ensure segments counts are correct on kbuf buffers")
Signed-off-by: huang-jl <huang-jl@deepseek.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
blkdev_pr_read_keys() takes num_keys from userspace and uses it to
calculate the allocation size for keys_info via struct_size(). While
there is a check for SIZE_MAX (integer overflow), there is no upper
bound validation on the allocation size itself.
A malicious or buggy userspace can pass a large num_keys value that
doesn't trigger overflow but still results in an excessive allocation
attempt, causing a warning in the page allocator when the order exceeds
MAX_PAGE_ORDER.
Fix this by introducing PR_KEYS_MAX to limit the number of keys to
a sane value. This makes the SIZE_MAX check redundant, so remove it.
Also switch to kvzalloc/kvfree to handle larger allocations gracefully.
Fixes: 22a1ffea5f80 ("block: add IOC_PR_READ_KEYS ioctl")
Tested-by: syzbot+660d079d90f8a1baf54d@syzkaller.appspotmail.com
Reported-by: syzbot+660d079d90f8a1baf54d@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=660d079d90f8a1baf54d
Link: https://lore.kernel.org/all/20251212013510.3576091-1-kartikey406@gmail.com/T/ [v1]
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
MMC_SDHCI_ESDHC_IMX requires ARCH_MXC despite also being used on
ARCH_S32, which results in unmet dependencies when compiling strictly
for ARCH_S32. Resolve this by adding ARCH_S32 as an alternative to
ARCH_MXC in the driver's dependencies.
Fixes: 5c4f00627c9a ("mmc: sdhci-esdhc-imx: add NXP S32G2 support")
Cc: stable@bvger.kernel.org
Signed-off-by: Jared Kangas <jkangas@redhat.com>
Reviewed-by: Haibo Chen <haibo.chen@nxp.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
On Xilinx/AMD platforms, the CD stable bit take slightly longer than
one second(about an additional 100ms) to assert after a host
controller reset. Although no functional failure observed with the
existing one second delay but to ensure reliable initialization, increase
the CD stable timeout to 2 seconds.
Fixes: e251709aaddb ("mmc: sdhci-of-arasan: Ensure CD logic stabilization before power-up")
Cc: stable@vger.kernel.org
Signed-off-by: Sai Krishna Potthuri <sai.krishna.potthuri@amd.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
The grofs code for zoned RT subvolums already tries to check for zone
alignment, but gets it wrong by using the old instead of the new mount
structure.
Fixes: 01b71e64bb87 ("xfs: support growfs on zoned file systems")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Cc: stable@vger.kernel.org # v6.15
Signed-off-by: Carlos Maiolino <cem@kernel.org>
Garbage collection assumes all zones contain the full amount of blocks.
Mkfs already ensures this happens, but make the kernel check it as well
to avoid getting into trouble due to fuzzers or mkfs bugs.
Fixes: 2167eaabe2fa ("xfs: define the zoned on-disk format")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Cc: stable@vger.kernel.org # v6.15
Signed-off-by: Carlos Maiolino <cem@kernel.org>
Pass character "0" rather than NULL terminator to properly format
queue restoration SMI events. Currently, the NULL terminator precedes
the newline character that is intended to delineate separate events
in the SMI event buffer, which can break userspace parsers.
Signed-off-by: Brian Kocoloski <brian.kocoloski@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 6e7143e5e6e21f9d5572e0390f7089e6d53edf3c)
User-configured SCLK(GPU core clock)frequencies were not persisting
across S0ix suspend/resume cycles on smu v14 hardware.
The issue occurred because of the code resetting clock frequency
to zero during resume.
This patch addresses the problem by:
- Preserving user-configured values in driver and sets the
clock frequency across resume
- Preserved settings are sent to the hardware during resume
Signed-off-by: mythilam <mythilam@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 20ba98326f4c69e6bf8d1f42942ece485a675b27)
[why]
need to enable APG_CLOCK_ENABLE enable first
also need to wake up az from D3 before access az block
Reviewed-by: Swapnil Patel <swapnil.patel@amd.com>
Signed-off-by: Charlene Liu <Charlene.Liu@amd.com>
Signed-off-by: Chenyu Chen <chen-yu.chen@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit bf5e396957acafd46003318965500914d5f4edfa)
[Why]
Different platforms use different NBIO header files,
causing display code to use differnt offset and read
wrong accelerated status.
[How]
- Unified NBIO offset header file across platform.
- Correct scratch registers offsets to proper locations.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4667
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Signed-off-by: Chenyu Chen <chen-yu.chen@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 576e032e909c8a6bb3d907b4ef5f6abe0f644199)
Cc: stable@vger.kernel.org
If console suspend has been disabled using `no_console_suspend` also
wake up during thaw() so that some messages can be seen for debugging.
Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/4191
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 63387cbbb714d9f0d179d9d4560de1408d0906de)
My name is stamped into maintainership for a big slew of DT
bindings. Now that it is changing, switch it over to my
kernel.org mail address, which will hopefully be stable for the
rest of my life.
Signed-off-by: Linus Walleij <linusw@kernel.org>
Link: https://patch.msgid.link/20251216-maintainers-dt-v1-1-0b5ab102c9bb@kernel.org
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Commit 6ea891a6dd37 ("cpufreq: dt-platdev: Simplify with
of_machine_get_match_data()") broke several platforms which did not have
OPPv2 proprety, because it incorrectly checked for device match data
after first matching from "allowlist". Almost all of "allowlist" match
entries do not have match data and it is expected to create platform
device for them with empty data.
Fix this by first checking if platform is on the allowlist with
of_machine_device_match() and only then taking the match data. This
duplicates the number of checks (we match against the allowlist twice),
but makes the code here much smaller.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Closes: https://lore.kernel.org/all/CAMuHMdVJD4+J9QpUUs-sX0feKfuPD72CO0dcqN7shvF_UYpZ3Q@mail.gmail.com/
Reported-by: Pavel Pisa <pisa@fel.cvut.cz>
Closes: https://lore.kernel.org/all/6hnk7llbwdezh74h74fhvofbx4t4jihel5kvr6qwx2xuxxbjys@rmwbd7lkhrdz/
Fixes: 6ea891a6dd37 ("cpufreq: dt-platdev: Simplify with of_machine_get_match_data()")
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Tested-by: Pavel Pisa <pisa@fel.cvut.cz>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/20251210051718.132795-2-krzysztof.kozlowski@oss.qualcomm.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
The gate bindings have an artificial split between a "syscon" and clock
provider node. Allow "reg" properties so this split can be removed.
Reviewed-by: Chunyan Zhang <zhang.lyra@gmail.com>
Link: https://patch.msgid.link/20251029155615.1167903-1-robh@kernel.org
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Common boolean properties need to be only allowed in the binding
(":true"), because their type is already defined by core DT schema.
Simplify dma-coherent property to match common syntax.
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Link: https://patch.msgid.link/20251115122120.35315-4-krzk@kernel.org
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Commit 8a6e02d0c00e ("of: reserved_mem: Restructure how the reserved
memory regions are processed") changed the processing order of reserved
memory regions, causing elfcorehdr to overlap with dynamically allocated
reserved memory regions during kdump kernel boot.
The issue occurs because:
1. kexec-tools allocates elfcorehdr in the last crashkernel reserved
memory region and passes it to the second kernel
2. The problematic commit moved dynamic reserved memory allocation
(like bman-fbpr) to occur during fdt_scan_reserved_mem(), before
elfcorehdr reservation in fdt_reserve_elfcorehdr()
3. bman-fbpr with 16MB alignment requirement can get allocated at
addresses that overlap with the elfcorehdr location
4. When fdt_reserve_elfcorehdr() tries to reserve elfcorehdr memory,
overlap detection identifies the conflict and skips reservation
5. kdump kernel fails with "Unable to handle kernel paging request"
because elfcorehdr memory is not properly reserved
The boot log:
Before 8a6e02d0c00e:
OF: fdt: Reserving 1 KiB of memory at 0xf4fff000 for elfcorehdr
OF: reserved mem: 0xf3000000..0xf3ffffff bman-fbpr
After 8a6e02d0c00e:
OF: reserved mem: 0xf4000000..0xf4ffffff bman-fbpr
OF: fdt: elfcorehdr is overlapped
Fix this by ensuring elfcorehdr reservation occurs before dynamic
reserved memory allocation.
Fixes: 8a6e02d0c00e ("of: reserved_mem: Restructure how the reserved memory regions are processed")
Signed-off-by: Jianpeng Chang <jianpeng.chang.cn@windriver.com>
Link: https://patch.msgid.link/20251205015934.700016-1-jianpeng.chang.cn@windriver.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
We handle backlight so need that dependency.
Fixes: 7911d8cab554 ("drm/panel: visionox-rm69299: Add backlight support")
Reported-by: kernelci.org bot <bot@kernelci.org>
Signed-off-by: Guido Günther <agx@sigxcpu.org>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://patch.msgid.link/20251017-visionox-rm69299-bl-v2-1-9dfa06606754@sigxcpu.org
The new XFS_ERRTAG_FORCE_ZERO_RANGE error tag added by commit
ea9989668081 ("xfs: error tag to force zeroing on debug kernels") fails
to account for the zoned space reservation rules and this reliably fails
xfs/131 because the zeroing operation returns -EIO.
Fix this by reserving enough space to zero the entire range, which
requires a bit of (fairly ugly) reshuffling to do the error injection
early enough to affect the space reservation.
Fixes: ea9989668081 ("xfs: error tag to force zeroing on debug kernels")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
xfs_buf_item_get_format() may allocate memory for bip->bli_formats,
free the memory in the error path.
Fixes: c3d5f0c2fb85 ("xfs: complain if anyone tries to create a too-large buffer log item")
Cc: stable@vger.kernel.org
Signed-off-by: Haoxiang Li <lihaoxiang@isrc.iscas.ac.cn>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
gcc 14.2 warns about:
xfs_attr_item.c: In function ‘xfs_attr_recover_work’:
xfs_attr_item.c:785:9: warning: ‘ip’ may be used uninitialized [-Wmaybe-uninitialized]
785 | xfs_trans_ijoin(tp, ip, 0);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
xfs_attr_item.c:740:42: note: ‘ip’ was declared here
740 | struct xfs_inode *ip;
| ^~
I think this is bogus since xfs_attri_recover_work either returns a real
pointer having initialized ip or an ERR_PTR having not touched it, but
the tools are smarter than me so let's just null-init the variable
anyway.
Cc: stable@vger.kernel.org # v6.8
Fixes: e70fb328d52772 ("xfs: recreate work items when recovering intent items")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
The xchk_setup_xattr_buf function can allocate a new value buffer, which
means that any reference to ab->value before the call could become a
dangling pointer. Fix this by moving an assignment to after the buffer
setup.
Cc: stable@vger.kernel.org # v6.10
Fixes: e47dcf113ae348 ("xfs: repair extended attributes")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
__blkdev_issue_discard() always returns 0, making all error checking
in XFS discard functions dead code.
Change xfs_discard_extents() return type to void, remove error variable,
error checking, and error logging for the __blkdev_issue_discard() call
in same function.
Update xfs_trim_perag_extents() and xfs_trim_rtgroup_extents() to
ignore the xfs_discard_extents() return value and error checking
code.
Update xfs_discard_rtdev_extents() to ignore __blkdev_issue_discard()
return value and error checking code.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chaitanya Kulkarni <ckulkarnilinux@gmail.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
The sparse tool issues a warning for arch/x76/xen/enlighten_pv.c:
arch/x86/xen/enlighten_pv.c:120:9: sparse: sparse: incorrect type
in initializer (different address spaces)
expected void const [noderef] __percpu *__vpp_verify
got bool *
This is due to the percpu variable xen_in_preemptible_hcall being
exported via EXPORT_SYMBOL_GPL() instead of EXPORT_PER_CPU_SYMBOL_GPL().
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202512140856.Ic6FetG6-lkp@intel.com/
Fixes: fdfd811ddde3 ("x86/xen: allow privcmd hypercalls to be preempted")
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Message-ID: <20251215115112.15072-1-jgross@suse.com>
The "zone_capacity=%umb" option is no longer used. The effective option
is now "zone_capacity_mb=%u", so update the documentation accordingly.
Signed-off-by: Yongpeng Yang <yangyongpeng@xiaomi.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
In the queue_rq path, zlo->state is accessed without locking, and direct
access may read stale data. This patch uses READ_ONCE() to read
zlo->state and data_race() to silence code checkers, and changes all
assignments to use WRITE_ONCE().
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Yongpeng Yang <yangyongpeng@xiaomi.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
When lo->lo_mutex is not held, direct access may read stale data. This
patch uses READ_ONCE() to read lo->lo_state and data_race() to silence
code checkers, and changes all assignments to use WRITE_ONCE().
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Yongpeng Yang <yangyongpeng@xiaomi.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
As describe in the help string, the user might want to disable these
tests if they don't like to see stacktraces/BUG etc in their kernel log.
However, if they enable PANIC_ON_OOPS, these tests also crash the
machine, which it's safe to assume _almost_ nobody wants.
One might argue that _absolutely_ nobody ever wants their kernel to
crash so this should just be a hard dependency instead of a default.
However, since this is rather special code that's anyway concerned with
deliberately doing "bad" things, the normal rules don't seem to apply,
hence prefer flexibility and allow users to set up a crashing Kconfig if
they so choose.
Link: https://lore.kernel.org/r/20251207-kunit-fault-no-panic-v1-1-2ac932f26864@google.com
Signed-off-by: Brendan Jackman <jackmanb@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
The DSI host must be enabled before our prepare function can run, which
has to send its init sequence over DSI. Without enabling the host first
the panel will not probe.
Fixes: 9e15123eca79 ("drm/msm/dsi: Stop unconditionally powering up DSI hosts at modeset")
Signed-off-by: Marijn Suijten <marijn.suijten@somainline.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Reviewed-by: Martin Botka <martin.botka@somainline.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://patch.msgid.link/20251130-sony-akari-fix-panel-v1-1-1d27c60a55f5@somainline.org
If gio_device_register fails, gio_dev_put() is required to
drop the gio_dev device reference.
Fixes: e84de0c61905 ("MIPS: GIO bus support for SGI IP22/28")
Signed-off-by: Haoxiang Li <haoxiang_li2024@163.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
The recent io_remap_pfn_range() rework applied the static and inline
specifiers to the implementation of io_remap_pfn_range_pfn() on MIPS
Alchemy, mirroring the same change on other platforms. However, this
function is defined in a source file and that definition causes a
conflict with its declaration. Fix this by dropping the specifiers.
Fixes: c707a68f9468 ("mm: abstract io_remap_pfn_range() based on PFN")
Signed-off-by: Thierry Reding <treding@nvidia.com>
Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
During GT reset recovery in do_gt_restart(), xe_uc_start() was called
before xe_reg_sr_apply_mmio() restored engine-specific registers. This
created a race window where the scheduler could run jobs before hardware
state was fully restored.
This caused failures in eudebug tests (xe_exec_sip_eudebug@breakpoint-
waitsip-*) where TD_CTL register (containing TD_CTL_GLOBAL_DEBUG_ENABLE)
wasn't restored before jobs started executing. Breakpoints would fail to
trigger SIP entry because the debug enable bit wasn't set yet.
Fix by moving xe_uc_start() after all MMIO register restoration,
including engine registers and CCS mode configuration, ensuring all
hardware state is fully restored before any jobs can be scheduled.
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Jan Maslak <jan.maslak@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251210145618.169625-2-jan.maslak@intel.com
(cherry picked from commit 825aed0328588b2837636c1c5a0c48795d724617)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
There are some corner cases where flushing transient
data may take slightly longer than the 150us timeout
we currently allow. Update the driver to use a 300us
timeout instead based on the latest guidance from
the hardware team. An update to the bspec to formally
document this is expected to arrive soon.
Fixes: c01c6066e6fa ("drm/xe/device: implement transient flush")
Signed-off-by: Jagmeet Randhawa <jagmeet.randhawa@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/0201b1d6ec64d3651fcbff1ea21026efa915126a.1765487866.git.jagmeet.randhawa@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
(cherry picked from commit d69d3636f5f7a84bae7cd43473b3701ad9b7d544)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Ensure VF migration recovery work is only queued when no recovery is
already queued and teardown is not in progress.
Fixes: b47c0c07c350 ("drm/xe/vf: Teardown VF post migration worker on driver unload")
Signed-off-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Tomasz Lis <tomasz.lis@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patch.msgid.link/20251210052546.622809-5-satyanarayana.k.v.p@intel.com
(cherry picked from commit 8d8cf42b03f149dcb545b547906306f3b474565e)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Some Xe bos are allocated with extra backing-store for the CCS
metadata. It's never been the intention to share the CCS metadata
when exporting such bos as dma-buf. Don't include it in the
dma-buf sg-table.
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Link: https://patch.msgid.link/20251209204920.224374-1-thomas.hellstrom@linux.intel.com
(cherry picked from commit a4ebfb9d95d78a12512b435a698ee6886d712571)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
MEI GSC interrupt comes from i915 or xe driver. It has top half and
bottom half. Top half is called from i915/xe interrupt handler. It
should be in irq disabled context.
With RT kernel(PREEMPT_RT enabled), by default IRQ handler is in
threaded IRQ. MEI GSC top half might be in threaded IRQ context.
generic_handle_irq_safe API could be called from either IRQ or
process context, it disables local IRQ then calls MEI GSC interrupt
top half.
This change fixes B580 GPU boot issue with RT enabled.
Fixes: e02cea83d32d ("drm/xe/gsc: add Battlemage support")
Tested-by: Baoli Zhang <baoli.zhang@intel.com>
Signed-off-by: Junxiao Chang <junxiao.chang@intel.com>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251107033152.834960-1-junxiao.chang@intel.com
Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
(cherry picked from commit 3efadf028783a49ab2941294187c8b6dd86bf7da)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
If wait for ring space started just before migration, it can delay
the recovery process, by waiting without bailout path for up to 2
seconds.
Two second wait for recovery is not acceptable, and if the ring was
completely filled even without the migration temporarily stopping
execution, then such a wait will result in up to a thousand new jobs
(assuming constant flow) being added while the wait is happening.
While this will not cause data corruption, it will lead to warning
messages getting logged due to reset being scheduled on a GT under
recovery. Also several seconds of unresponsiveness, as the backlog
of jobs gets progressively executed.
Add a bailout condition, to make sure the recovery starts without
much delay. The recovery is expected to finish in about 100 ms when
under moderate stress, so the condition verification period needs to be
below that - settling at 64 ms.
The theoretical max time which the recovery can take depends on how
many requests can be emitted to engine rings and be pending execution.
While stress testing, it was possible to reach 10k pending requests
on rings when a platform with two GTs was used. This resulted in max
recovery time of 5 seconds. But in real life situations, it is very
unlikely that the amount of pending requests will ever exceed 100,
and for that the recovery time will be around 50 ms - well within our
claimed limit of 100ms.
Fixes: a4dae94aad6a ("drm/xe/vf: Wakeup in GuC backend on VF post migration recovery")
Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patch.msgid.link/20251204200820.2206168-1-tomasz.lis@intel.com
(cherry picked from commit a00e305fba02a915cf2745bf6ef3f55537e65d57)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
The newly introduced "reasons" attribute already signifies possible
reasons for throttling and makes the prefix in individual attribute
names redundant while emitting them as an array. Skip the prefix.
Fixes: 83ccde67a3f7 ("drm/xe/gt_throttle: Avoid TOCTOU when monitoring reasons")
Signed-off-by: Raag Jadav <raag.jadav@intel.com>
Reviewed-by: Sk Anirban <sk.anirban@intel.com>
Link: https://patch.msgid.link/20251203123355.571606-1-raag.jadav@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit b64a14334ef3ebbcf70d11bc67d0934bdc0e390d)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
The Xe driver fails to build when CONFIG_DRM_XE_GPUSVM is disabled
but CONFIG_DRM_GPUSVM is turned on, due to the clash of two commits:
In file included from drivers/gpu/drm/xe/xe_vm_madvise.c:8:
drivers/gpu/drm/xe/xe_svm.h: In function 'xe_svm_init':
include/linux/stddef.h:8:14: error: passing argument 5 of 'drm_gpusvm_init' makes integer from pointer without a cast [-Wint-conversion]
drivers/gpu/drm/xe/xe_svm.h:217:38: note: in expansion of macro 'NULL'
217 | NULL, NULL, 0, 0, 0, NULL, NULL, 0);
| ^~~~
In file included from drivers/gpu/drm/xe/xe_bo_types.h:11,
from drivers/gpu/drm/xe/xe_bo.h:11,
from drivers/gpu/drm/xe/xe_vm_madvise.c:11:
include/drm/drm_gpusvm.h:254:35: note: expected 'long unsigned int' but argument is of type 'void *'
254 | unsigned long mm_start, unsigned long mm_range,
| ~~~~~~~~~~~~~~^~~~~~~~
In file included from drivers/gpu/drm/xe/xe_vm_madvise.c:14:
drivers/gpu/drm/xe/xe_svm.h:216:16: error: too many arguments to function 'drm_gpusvm_init'; expected 10, have 11
216 | return drm_gpusvm_init(&vm->svm.gpusvm, "Xe SVM (simple)", &vm->xe->drm,
| ^~~~~~~~~~~~~~~
217 | NULL, NULL, 0, 0, 0, NULL, NULL, 0);
| ~
include/drm/drm_gpusvm.h:251:5: note: declared here
Adapt the caller to the new argument list by removing the extraneous
NULL argument.
Fixes: 9e9787414882 ("drm/xe/userptr: replace xe_hmm with gpusvm")
Fixes: 10aa5c806030 ("drm/gpusvm, drm/xe: Fix userptr to not allow device private pages")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patch.msgid.link/20251204094704.1030933-1-arnd@kernel.org
(cherry picked from commit 29bce9c8b41d5c378263a927acb9a9074d0e7a0e)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Do not reference the loop variable job after the loop has exited.
Instead, save the job from the last iteration of the loop.
Fixes: 3d98a7164da6 ("drm/xe/vf: Start re-emission from first unsignaled job during VF migration")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/r/202511291102.jnnKP6IB-lkp@intel.com/
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com>
Link: https://patch.msgid.link/20251203011809.968893-1-matthew.brost@intel.com
(cherry picked from commit 76ce2313709f13a6adbcaa1a43a8539c8f509f6a)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Wa_14020316580 was getting clobbered by power gating init code
later in the driver load sequence. Move the Wa so that
it applies correctly.
Fixes: 7cd05ef89c9d ("drm/xe/xe2hpm: Add initial set of workarounds")
Suggested-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Reviewed-by: Riana Tauro <riana.tauro@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/20251129052548.70766-1-vinay.belgaumkar@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
(cherry picked from commit 8b5502145351bde87f522df082b9e41356898ba3)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Ensure gt->freq is released when sysfs_create_files() fails
in xe_gt_freq_init(). Without this, the kobject would leak.
Add kobject_put() before returning the error.
Fixes: fdc81c43f0c1 ("drm/xe: use devm_add_action_or_reset() helper")
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Alex Zuo <alex.zuo@intel.com>
Reviewed-by: Xin Wang <x.wang@intel.com>
Link: https://patch.msgid.link/20251114205638.2184529-2-shuicheng.lin@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
(cherry picked from commit 251be5fb4982ebb0f5a81b62d975bd770f3ad5c2)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
When we exec a new task we forget to flush the set of locked GCS mode bits.
Since we do flush the rest of the state this means that if GCS is locked
the new task will be unable to enable GCS, it will be locked as being
disabled. Add the expected flush.
Fixes: fc84bc5378a8 ("arm64/gcs: Context switch GCS state for EL0")
Cc: <stable@vger.kernel.org> # 6.13.x
Reported-by: Yury Khrustalev <Yury.Khrustalev@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Tested-by: Yury Khrustalev <yury.khrustalev@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Since commit 7137a203b251 ("arm64/fpsimd: Permit kernel mode NEON with
IRQs off"), the only condition under which the fallback path is taken
for FP/SIMD preserve/restore across a EFI runtime call is when it is
called from hardirq or NMI context.
In practice, this only happens when the EFI pstore driver is called to
dump the kernel log buffer into a EFI variable under a panic, oops or
emergency_restart() condition, and none of these can be expected to
result in a return to user space for the task in question.
This means that the existing EFI-specific logic for preserving and
restoring SVE/SME state is pointless, and can be removed.
Instead, kill the task, so that an exceedingly unlikely inadvertent
return to user space does not proceed with a corrupted FP/SIMD state.
Also, retain the preserve and restore of the base FP/SIMD state, as that
might belong to kernel mode use of FP/SIMD. (Note that EFI runtime calls
are never invoked reentrantly, even in this case, and so any interrupted
kernel mode FP/SIMD usage will be unrelated to EFI)
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Add lkdtm cases to trigger a BUG() or panic() from hardirq context. This
is useful for testing pstore behavior being invoked from such contexts.
Reviewed-by: Kees Cook <kees@kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
TL;DR: checking force_pte_mapping() in arch_kfence_init_pool() is
sufficient
Commit ce2b3a50ad92 ("arm64: mm: Don't sleep in
split_kernel_leaf_mapping() when in atomic context") recently added
an arm64 implementation of arch_kfence_init_pool() to ensure that
the KFENCE pool is PTE-mapped. Assuming that the pool was not
initialised early, block splitting is necessary if the linear
mapping is not fully PTE-mapped, in other words if
force_pte_mapping() is false.
arch_kfence_init_pool() currently makes another check: whether
BBML2-noabort is supported, i.e. whether we are *able* to split
block mappings. This check is however unnecessary, because
force_pte_mapping() is always true if KFENCE is enabled and
BBML2-noabort is not supported. This must be the case by design,
since KFENCE requires PTE-mapped pages in all cases. We can
therefore remove that check.
The situation is different in split_kernel_leaf_mapping(), as that
function is called unconditionally regardless of the configuration.
If BBML2-noabort is not supported, it cannot do anything and bails
out. If force_pte_mapping() is true, there is nothing to do and it
also bails out, but these are independent checks.
Commit 53357f14f924 ("arm64: mm: Tidy up force_pte_mapping()")
grouped these checks into a helper, split_leaf_mapping_possible().
This isn't so helpful as only split_kernel_leaf_mapping() should
check both. Revert the parts of that commit that introduced the
helper, reintroducing the more accurate comments in
split_kernel_leaf_mapping().
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Pull in rc1 to include all changes since the merge window closed,
and grab all fixes and changes from drm/drm-next.
Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
Enable use of common SDHCI-related properties such as sdhci-caps-mask as
found in the AST2600 EVB DTS.
Cc: stable@vger.kernel.org # v6.2+
Signed-off-by: Andrew Jeffery <andrew@codeconstruct.com.au>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
The driver computes conversion intervals using the formula:
interval = (1 << (7 - rate)) * 125ms
where 'rate' is the sensor's conversion rate register value. According to
the datasheet, the power-on reset value of this register is 0x8, which
could be assigned to the register, after handling i2c general call.
Using this default value causes a result greater than the bit width of
left operand and an undefined behaviour in the calculation above, since
shifting by values larger than the bit width is undefined behaviour as
per C language standard.
Limit the maximum usable 'rate' value to 7 to prevent undefined
behaviour in calculations.
Found by Linux Verification Center (linuxtesting.org) with Svace.
Note (groeck):
This does not matter in practice unless someone overwrites the chip
configuration from outside the driver while the driver is loaded.
The conversion time register is initialized with a value of 5 (500ms)
when the driver is loaded, and the driver never writes a bad value.
Fixes: ca53e7640de7 ("hwmon: (tmp401) Convert to _info API")
Signed-off-by: Alexey Simakov <bigalex934@gmail.com>
Link: https://lore.kernel.org/r/20251211164342.6291-1-bigalex934@gmail.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
The ibmpex_high_low_store() function retrieves driver data using
dev_get_drvdata() and uses it without validation. This creates a race
condition where the sysfs callback can be invoked after the data
structure is freed, leading to use-after-free.
Fix by adding a NULL check after dev_get_drvdata(), and reordering
operations in the deletion path to prevent TOCTOU.
Reported-by: Yuhao Jiang <danisjiang@gmail.com>
Reported-by: Junrui Luo <moonafterrain@outlook.com>
Fixes: 57c7c3a0fdea ("hwmon: IBM power meter driver")
Signed-off-by: Junrui Luo <moonafterrain@outlook.com>
Link: https://lore.kernel.org/r/MEYPR01MB7886BE2F51BFE41875B74B60AFA0A@MEYPR01MB7886.ausprd01.prod.outlook.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
The fan nominal speed returned by SMM is limited to 16 bits, but the
driver allows the fan multiplier to be set via a module parameter.
Clamp the computed fan multiplier so that fan_nominal_speed *
i8k_fan_mult always fits into a signed 32-bit integer and refuse to
initialize the driver if the value is too large.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 20bdeebc88269 ("hwmon: (dell-smm) Introduce helper function for data init")
Signed-off-by: Denis Sergeev <denserg.edu@gmail.com>
Link: https://lore.kernel.org/r/20251209063706.49008-1-denserg.edu@gmail.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
When wbt_enable_default() is moved out of queue freezing in elevator_change(),
it can cause the wbt inflight counter to become negative (-1), leading to hung
tasks in the writeback path. Tasks get stuck in wbt_wait() because the counter
is in an inconsistent state.
The issue occurs because wbt_enable_default() could race with IO submission,
allowing the counter to be decremented before proper initialization. This manifests
as:
rq_wait[0]:
inflight: -1
has_waiters: True
rwb_enabled() checks the state, which can be updated exactly between wbt_wait()
(rq_qos_throttle()) and wbt_track()(rq_qos_track()), then the inflight counter
will become negative.
And results in hung task warnings like:
task:kworker/u24:39 state:D stack:0 pid:14767
Call Trace:
rq_qos_wait+0xb4/0x150
wbt_wait+0xa9/0x100
__rq_qos_throttle+0x24/0x40
blk_mq_submit_bio+0x672/0x7b0
...
Fix this by:
1. Splitting wbt_enable_default() into:
- __wbt_enable_default(): Returns true if wbt_init() should be called
- wbt_enable_default(): Wrapper for existing callers (no init)
- wbt_init_enable_default(): New function that checks and inits WBT
2. Using wbt_init_enable_default() in blk_register_queue() to ensure
proper initialization during queue registration
3. Move wbt_init() out of wbt_enable_default() which is only for enabling
disabled wbt from bfq and iocost, and wbt_init() isn't needed. Then the
original lock warning can be avoided.
4. Removing the ELEVATOR_FLAG_ENABLE_WBT_ON_EXIT flag and its handling
code since it's no longer needed
This ensures WBT is properly initialized before any IO can be submitted,
preventing the counter from going negative.
Cc: Nilay Shroff <nilay@linux.ibm.com>
Cc: Yu Kuai <yukuai@fnnas.com>
Cc: Guangwu Zhang <guazhang@redhat.com>
Fixes: 78c271344b6f ("block: move wbt_enable_default() out of queue freezing from sched ->exit()")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The ublk selftests cover every data copy mode except user copy. Add
tests for user copy based on the existing test suite:
- generic_14 ("basic recover function verification (user copy)") based
on generic_04 and generic_05
- null_03 ("basic IO test with user copy") based on null_01 and null_02
- loop_06 ("write and verify over user copy") based on loop_01 and
loop_03
- loop_07 ("mkfs & mount & umount with user copy") based on loop_02 and
loop_04
- stripe_05 ("write and verify test on user copy") based on stripe_03
- stripe_06 ("mkfs & mount & umount on user copy") based on stripe_02
and stripe_04
- stress_06 ("run IO and remove device (user copy)") based on stress_01
and stress_03
- stress_07 ("run IO and kill ublk server (user copy)") based on
stress_02 and stress_04
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The ublk selftests mock ublk server kublk supports every data copy mode
except user copy. Add support for user copy to kublk, enabled via the
--user_copy (-u) command line argument. On writes, issue pread() calls
to copy the write data into the ublk_io's buffer before dispatching the
write to the target implementation. On reads, issue pwrite() calls to
copy read data from the ublk_io's buffer before committing the request.
Copy in 2 KB chunks to provide some coverage of the offseting logic.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The kublk mock ublk server allows multiple data copy mode arguments to
be passed on the command line (--zero_copy, --get_data, and --auto_zc).
The ublk device will be created with all the requested feature flags,
however kublk will only use one of the modes to interact with request
data (arbitrarily preferring auto_zc over zero_copy over get_data). To
clarify the intent of the test, don't allow multiple data copy modes to
be specified. --zero_copy and --auto_zc are allowed together for
--auto_zc_fallback, which uses both copy modes.
Don't set UBLK_F_USER_COPY for zero_copy, as it's a separate feature.
Fix the test cases in test_stress_05 passing --get_data along with
--zero_copy or --auto_zc.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
stress_04 is missing a wait between blocks of tests, meaning multiple
ublk servers will be running in parallel using the same backing files.
Add a wait after each section to ensure each backing file is in use by a
single ublk server at a time.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
stress_04 is described as "run IO and kill ublk server(zero copy)" but
the --per_io_tasks tests cases don't use zero copy. Plus, one of the
test cases is duplicated. Add --auto_zc to these test cases and
--auto_zc_fallback to one of the duplicated ones. This matches the test
cases in stress_03.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
run_io_and_recover() invokes fio with --size="${size}", but the variable
size doesn't exist. Thus, the argument expands to --size=, which causes
fio to exit immediately with an error without issuing any I/O. Pass the
value for size as the first argument to the function.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The ios map populated by seq_io.bt is never read, so remove it.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The last_rw map is initialized with a value of 0 but later assigned the
value args.sector + args.nr_sector, which has type sector_t = u64.
bpftrace complains about the type mismatch between int64 and uint64:
trace/seq_io.bt:18:3-59: ERROR: Type mismatch for @last_rw: trying to assign value of type 'uint64' when map already contains a value of type 'int64'
@last_rw[$dev, str($2)] = (args.sector + args.nr_sector);
Cast the initial value to uint64 so bpftrace will load the program.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The functions ublk_queue_use_zc(), ublk_queue_use_auto_zc(), and
ublk_queue_auto_zc_fallback() were returning int, but performing
bitwise AND on q->flags which is __u64.
When a flag bit is set in the upper 32 bits (beyond INT_MAX), the
result of the bitwise AND operation could overflow when cast to int,
leading to incorrect boolean evaluation.
For example, if UBLKS_Q_AUTO_BUF_REG_FALLBACK is 0x8000000000000000:
- (u64)flags & 0x8000000000000000 = 0x8000000000000000
- Cast to int: undefined behavior / incorrect value
- Used in if(): may evaluate incorrectly
Fix by:
1. Changing return type from int to bool for semantic correctness
2. Using !! to explicitly convert to boolean (0 or 1)
This ensures the functions return proper boolean values regardless
of which bit position the flags occupy in the 64-bit field.
Fixes: c3a6d48f86da ("selftests: ublk: remove ublk queue self-defined flags")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We'll need bio_flagged() earlier in bio.h for later patches, move it
together with all related helpers, and mark the bio_flagged()'s bio
argument as const.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Add missing drm_gem_object_put() call when drm_gem_object_lookup()
successfully returns an object. This fixes a GEM object reference
leak that can prevent driver modules from unloading when using
prime buffers.
Fixes: 53096728b891 ("drm: Add DRM prime interface to reassign GEM handle")
Cc: <stable@vger.kernel.org> # v6.18+
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Maciej Falkowski <maciej.falkowski@linux.intel.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/r/20251212134133.475218-1-karol.wachowski@linux.intel.com
Fedora/CentOS/RHEL CI is reporting intermittent failures while running
the KUnit tests present in drm_hdmi_state_helper_test.c [1].
While the specific test causing the failure change between runs, all of
them are caused by drm_kunit_helper_enable_crtc_connector() returning
-EDEADLK. The error trace always follow this structure:
# <test name>: ASSERTION FAILED at
# drivers/gpu/drm/tests/drm_hdmi_state_helper_test.c:<line>
Expected ret == 0, but
ret == -35 (0xffffffffffffffdd)
As documented, if the drm_kunit_helper_enable_crtc_connector() function
returns -EDEADLK (-35), the entire atomic sequence must be restarted.
Handle this error code for all function calls.
Closes: https://datawarehouse.cki-project.org/issue/4039 [1]
Fixes: 6a5c0ad7e08e ("drm/tests: hdmi_state_helpers: Switch to new helper")
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: José Expósito <jose.exposito89@gmail.com>
Link: https://patch.msgid.link/20251104102258.10026-1-jose.exposito89@gmail.com
When CONFIG_SLUB_TINY is enabled, kfree_nolock() calls kasan_slab_free()
before defer_free(). On ARM64 with MTE (Memory Tagging Extension),
kasan_slab_free() poisons the memory and changes the tag from the
original (e.g., 0xf3) to a poison tag (0xfe).
When defer_free() then tries to write to the freed object to build the
deferred free list via llist_add(), the pointer still has the old tag,
causing a tag mismatch and triggering a KASAN use-after-free report:
BUG: KASAN: slab-use-after-free in defer_free+0x3c/0xbc mm/slub.c:6537
Write at addr f3f000000854f020 by task kworker/u8:6/983
Pointer tag: [f3], memory tag: [fe]
Fix this by calling kasan_reset_tag() before accessing the freed memory.
This is safe because defer_free() is part of the allocator itself and is
expected to manipulate freed memory for bookkeeping purposes.
Fixes: af92793e52c3 ("slab: Introduce kmalloc_nolock() and kfree_nolock().")
Cc: stable@vger.kernel.org
Reported-by: syzbot+7a25305a76d872abcfa1@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=7a25305a76d872abcfa1
Tested-by: syzbot+7a25305a76d872abcfa1@syzkaller.appspotmail.com
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Link: https://patch.msgid.link/20251210022024.3255826-1-kartikey406@gmail.com
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
If an APICv status updated was pended while L2 was active, immediately
refresh vmcs01's controls instead of pending KVM_REQ_APICV_UPDATE as
kvm_vcpu_update_apicv() only calls into vendor code if a change is
necessary.
E.g. if APICv is inhibited, and then activated while L2 is running:
kvm_vcpu_update_apicv()
|
-> __kvm_vcpu_update_apicv()
|
-> apic->apicv_active = true
|
-> vmx_refresh_apicv_exec_ctrl()
|
-> vmx->nested.update_vmcs01_apicv_status = true
|
-> return
Then L2 exits to L1:
__nested_vmx_vmexit()
|
-> kvm_make_request(KVM_REQ_APICV_UPDATE)
vcpu_enter_guest(): KVM_REQ_APICV_UPDATE
-> kvm_vcpu_update_apicv()
|
-> __kvm_vcpu_update_apicv()
|
-> return // because if (apic->apicv_active == activate)
Reported-by: Chao Gao <chao.gao@intel.com>
Closes: https://lore.kernel.org/all/aQ2jmnN8wUYVEawF@intel.com
Fixes: 7c69661e225c ("KVM: nVMX: Defer APICv updates while L2 is active until L1 is active")
Cc: stable@vger.kernel.org
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
[sean: write changelog]
Link: https://patch.msgid.link/20251205231913.441872-3-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
The APICv (apic->apicv_active) can be activated or deactivated at runtime,
for instance, because of APICv inhibit reasons. Intel VMX employs different
mechanisms to virtualize LAPIC based on whether APICv is active.
When APICv is activated at runtime, GUEST_INTR_STATUS is used to configure
and report the current pending IRR and ISR states. Unless a specific vector
is explicitly included in EOI_EXIT_BITMAP, its EOI will not be trapped to
KVM. Intel VMX automatically clears the corresponding ISR bit based on the
GUEST_INTR_STATUS.SVI field.
When APICv is deactivated at runtime, the VM_ENTRY_INTR_INFO_FIELD is used
to specify the next interrupt vector to invoke upon VM-entry. The
VMX IDT_VECTORING_INFO_FIELD is used to report un-invoked vectors on
VM-exit. EOIs are always trapped to KVM, so the software can manually clear
pending ISR bits.
There are scenarios where, with APICv activated at runtime, a guest-issued
EOI may not be able to clear the pending ISR bit.
Taking vector 236 as an example, here is one scenario.
1. Suppose APICv is inactive. Vector 236 is pending in the IRR.
2. To handle KVM_REQ_EVENT, KVM moves vector 236 from the IRR to the ISR,
and configures the VM_ENTRY_INTR_INFO_FIELD via vmx_inject_irq().
3. After VM-entry, vector 236 is invoked through the guest IDT. At this
point, the data in VM_ENTRY_INTR_INFO_FIELD is no longer valid. The guest
interrupt handler for vector 236 is invoked.
4. Suppose a VM exit occurs very early in the guest interrupt handler,
before the EOI is issued.
5. Nothing is reported through the IDT_VECTORING_INFO_FIELD because
vector 236 has already been invoked in the guest.
6. Now, suppose APICv is activated. Before the next VM-entry, KVM calls
kvm_vcpu_update_apicv() to activate APICv.
7. Unfortunately, GUEST_INTR_STATUS.SVI is not configured, although
vector 236 is still pending in the ISR.
8. After VM-entry, the guest finally issues the EOI for vector 236.
However, because SVI is not configured, vector 236 is not cleared.
9. ISR is stalled forever on vector 236.
Here is another scenario.
1. Suppose APICv is inactive. Vector 236 is pending in the IRR.
2. To handle KVM_REQ_EVENT, KVM moves vector 236 from the IRR to the ISR,
and configures the VM_ENTRY_INTR_INFO_FIELD via vmx_inject_irq().
3. VM-exit occurs immediately after the next VM-entry. The vector 236 is
not invoked through the guest IDT. Instead, it is saved to the
IDT_VECTORING_INFO_FIELD during the VM-exit.
4. KVM calls kvm_queue_interrupt() to re-queue the un-invoked vector 236
into vcpu->arch.interrupt. A KVM_REQ_EVENT is requested.
5. Now, suppose APICv is activated. Before the next VM-entry, KVM calls
kvm_vcpu_update_apicv() to activate APICv.
6. Although APICv is now active, KVM still uses the legacy
VM_ENTRY_INTR_INFO_FIELD to re-inject vector 236. GUEST_INTR_STATUS.SVI is
not configured.
7. After the next VM-entry, vector 236 is invoked through the guest IDT.
Finally, an EOI occurs. However, due to the lack of GUEST_INTR_STATUS.SVI
configuration, vector 236 is not cleared from the ISR.
8. ISR is stalled forever on vector 236.
Using QEMU as an example, vector 236 is stuck in ISR forever.
(qemu) info lapic 1
dumping local APIC state for CPU 1
LVT0 0x00010700 active-hi edge masked ExtINT (vec 0)
LVT1 0x00010400 active-hi edge masked NMI
LVTPC 0x00000400 active-hi edge NMI
LVTERR 0x000000fe active-hi edge Fixed (vec 254)
LVTTHMR 0x00010000 active-hi edge masked Fixed (vec 0)
LVTT 0x000400ec active-hi edge tsc-deadline Fixed (vec 236)
Timer DCR=0x0 (divide by 2) initial_count = 0 current_count = 0
SPIV 0x000001ff APIC enabled, focus=off, spurious vec 255
ICR 0x000000fd physical edge de-assert no-shorthand
ICR2 0x00000000 cpu 0 (X2APIC ID)
ESR 0x00000000
ISR 236
IRR 37(level) 236
The issue isn't applicable to AMD SVM as KVM simply writes vmcb01 directly
irrespective of whether L1 (vmcs01) or L2 (vmcb02) is active (unlike VMX,
there is no need/cost to switch between VMCBs). In addition,
APICV_INHIBIT_REASON_IRQWIN ensures AMD SVM AVIC is not activated until
the last interrupt is EOI'd.
Fix the bug by configuring Intel VMX GUEST_INTR_STATUS.SVI if APICv is
activated at runtime.
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Reviewed-by: Chao Gao <chao.gao@intel.com>
Link: https://patch.msgid.link/20251110063212.34902-1-dongli.zhang@oracle.com
[sean: call out that SVM writes vmcb01 directly, tweak comment]
Link: https://patch.msgid.link/20251205231913.441872-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Set exit_code_hi to -1u as a temporary band-aid to fix a long-standing
(effectively since KVM's inception) bug where KVM treats the exit code as
a 32-bit value, when in reality it's a 64-bit value. Per the APM, offset
0x70 is a single 64-bit value:
070h 63:0 EXITCODE
And a sane reading of the error values defined in "Table C-1. SVM Intercept
Codes" is that negative values use the full 64 bits:
–1 VMEXIT_INVALID Invalid guest state in VMCB.
–2 VMEXIT_BUSYBUSY bit was set in the VMSA
–3 VMEXIT_IDLE_REQUIREDThe sibling thread is not in an idle state
-4 VMEXIT_INVALID_PMC Invalid PMC state
And that interpretation is confirmed by testing on Milan and Turin (by
setting bits in CR0[63:32] to generate VMEXIT_INVALID on VMRUN).
Furthermore, Xen has treated exitcode as a 64-bit value since HVM support
was adding in 2006 (see Xen commit d1bd157fbc ("Big merge the HVM
full-virtualisation abstractions.")).
Cc: Jim Mattson <jmattson@google.com>
Cc: Yosry Ahmed <yosry.ahmed@linux.dev>
Cc: stable@vger.kernel.org
Reviewed-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Link: https://patch.msgid.link/20251113225621.1688428-3-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Explicitly clear exit_code_hi in the VMCB when synthesizing "normal"
nested VM-Exits, as the full exit code is a 64-bit value (spoiler alert),
and all exit codes for non-failing VMRUN use only bits 31:0.
Cc: Jim Mattson <jmattson@google.com>
Cc: Yosry Ahmed <yosry.ahmed@linux.dev>
Cc: stable@vger.kernel.org
Reviewed-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Link: https://patch.msgid.link/20251113225621.1688428-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Unbind guest_memfd memslots if KVM commits a MOVE or FLAGS_ONLY memslot
change to harden against use-after-free, and to prepare for eventually
supporting dirty logging on guest_memfd memslots, at which point
FLAGS_ONLY changes will be expected/supported.
Add two separate WARNs, once to yell if a guest_memfd memslot is moved
(which KVM is never expected to allow/support), and again if the unbind()
is triggered, to help detect uAPI goofs prior to deliberately allowing
FLAGS_ONLY changes.
Link: https://patch.msgid.link/20251202020334.1171351-3-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reject attempts to disable KVM_MEM_GUEST_MEMFD on a memslot that was
initially created with a guest_memfd binding, as KVM doesn't support
toggling KVM_MEM_GUEST_MEMFD on existing memslots. KVM prevents enabling
KVM_MEM_GUEST_MEMFD, but doesn't prevent clearing the flag.
Failure to reject the new memslot results in a use-after-free due to KVM
not unbinding from the guest_memfd instance. Unbinding on a FLAGS_ONLY
change is easy enough, and can/will be done as a hardening measure (in
anticipation of KVM supporting dirty logging on guest_memfd at some point),
but fixing the use-after-free would only address the immediate symptom.
==================================================================
BUG: KASAN: slab-use-after-free in kvm_gmem_release+0x362/0x400 [kvm]
Write of size 8 at addr ffff8881111ae908 by task repro/745
CPU: 7 UID: 1000 PID: 745 Comm: repro Not tainted 6.18.0-rc6-115d5de2eef3-next-kasan #3 NONE
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
Call Trace:
<TASK>
dump_stack_lvl+0x51/0x60
print_report+0xcb/0x5c0
kasan_report+0xb4/0xe0
kvm_gmem_release+0x362/0x400 [kvm]
__fput+0x2fa/0x9d0
task_work_run+0x12c/0x200
do_exit+0x6ae/0x2100
do_group_exit+0xa8/0x230
__x64_sys_exit_group+0x3a/0x50
x64_sys_call+0x737/0x740
do_syscall_64+0x5b/0x900
entry_SYSCALL_64_after_hwframe+0x4b/0x53
RIP: 0033:0x7f581f2eac31
</TASK>
Allocated by task 745 on cpu 6 at 9.746971s:
kasan_save_stack+0x20/0x40
kasan_save_track+0x13/0x50
__kasan_kmalloc+0x77/0x90
kvm_set_memory_region.part.0+0x652/0x1110 [kvm]
kvm_vm_ioctl+0x14b0/0x3290 [kvm]
__x64_sys_ioctl+0x129/0x1a0
do_syscall_64+0x5b/0x900
entry_SYSCALL_64_after_hwframe+0x4b/0x53
Freed by task 745 on cpu 6 at 9.747467s:
kasan_save_stack+0x20/0x40
kasan_save_track+0x13/0x50
__kasan_save_free_info+0x37/0x50
__kasan_slab_free+0x3b/0x60
kfree+0xf5/0x440
kvm_set_memslot+0x3c2/0x1160 [kvm]
kvm_set_memory_region.part.0+0x86a/0x1110 [kvm]
kvm_vm_ioctl+0x14b0/0x3290 [kvm]
__x64_sys_ioctl+0x129/0x1a0
do_syscall_64+0x5b/0x900
entry_SYSCALL_64_after_hwframe+0x4b/0x53
Reported-by: Alexander Potapenko <glider@google.com>
Fixes: a7800aa80ea4 ("KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory")
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20251202020334.1171351-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Add a CPUID testcase to verify that KVM allows KVM_SET_CPUID2 after (or in
conjunction with) runtime updates. This is a regression test for the bug
introduced by commit 93da6af3ae56 ("KVM: x86: Defer runtime updates of
dynamic CPUID bits until CPUID emulation"), where KVM would incorrectly
reject KVM_SET_CPUID due to a not handling a pending runtime update on the
current CPUID, resulting in a false mismatch between the "old" and "new"
CPUID entries.
Link: https://lore.kernel.org/all/20251128123202.68424a95@imammedo
Link: https://patch.msgid.link/20251202015049.1167490-3-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
When handling KVM_SET_CPUID{,2}, do runtime CPUID updates on the vCPU's
current CPUID (and caps) prior to swapping in the incoming CPUID state so
that KVM doesn't lose pending updates if the incoming CPUID is rejected,
and to prevent a false failure on the equality check.
Note, runtime updates are unconditionally performed on the incoming/new
CPUID (and associated caps), i.e. clearing the dirty flag won't negatively
affect the new CPUID.
Fixes: 93da6af3ae56 ("KVM: x86: Defer runtime updates of dynamic CPUID bits until CPUID emulation")
Reported-by: Igor Mammedov <imammedo@redhat.com>
Closes: https://lore.kernel.org/all/20251128123202.68424a95@imammedo
Cc: stable@vger.kernel.org
Acked-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Igor Mammedov <imammedo@redhat.com>
Link: https://patch.msgid.link/20251202015049.1167490-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
In commit 0297cdc12a87 ("KVM: selftests: Add option to rseq test to
override /dev/cpu_dma_latency"), a 'break' is missed before the option
'l' in the argument parsing loop, which leads to an unexpected core
dump in atoi_paranoid(). It tries to get the latency from non-existent
argument.
host$ ./rseq_test -u
Random seed: 0x6b8b4567
Segmentation fault (core dumped)
Add a 'break' before the option 'l' in the argument parsing loop to avoid
the unexpected core dump.
Fixes: 0297cdc12a87 ("KVM: selftests: Add option to rseq test to override /dev/cpu_dma_latency")
Cc: stable@vger.kernel.org # v6.15+
Signed-off-by: Gavin Shan <gshan@redhat.com>
Link: https://patch.msgid.link/20251124050427.1924591-1-gshan@redhat.com
[sean: describe code change in shortlog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-12-02 08:49:10 -08:00
223 changed files with 1268 additions and 524 deletions
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.