1
0
mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-01-11 17:10:13 +00:00

1412358 Commits

Author SHA1 Message Date
Jason Gunthorpe
e6a973af11 iommufd/selftest: Check for overflow in IOMMU_TEST_OP_ADD_RESERVED
syzkaller found it could overflow math in the test infrastructure and
cause a WARN_ON by corrupting the reserved interval tree. This only
effects test kernels with CONFIG_IOMMUFD_TEST.

Validate the user input length in the test ioctl.

Fixes: f4b20bb34c83 ("iommufd: Add kernel support for testing iommufd")
Link: https://patch.msgid.link/r/0-v1-cd99f6049ba5+51-iommufd_syz_add_resv_jgg@nvidia.com
Reviewed-by: Samiullah Khawaja <skhawaja@google.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Yi Liu <yi.l.liu@intel.com>
Reported-by: syzbot+57fdb0cf6a0c5d1f15a2@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/69368129.a70a0220.38f243.008f.GAE@google.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-12-16 11:53:40 -04:00
Niklas Söderlund
67549b73f1 dt-bindings: gpu: img,powervr-rogue: Document GE7800 GPU in Renesas R-Car V3U
Document Imagination Technologies PowerVR Rogue GE7800 BNVC 15.5.1.64
present in Renesas R-Car R8A779A0 V3U SoC.

Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Marek Vasut <marek.vasut+renesas@mailbox.org>
Reviewed-by: Matt Coster <matt.coster@imgtec.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://patch.msgid.link/20251106212342.2771579-2-niklas.soderlund+renesas@ragnatech.se
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
2025-12-16 07:59:35 -06:00
Krzysztof Kozlowski
0f5796dac1 cpufreq: dt-platdev: Fix creating device on OPPv1 platforms
Commit 6ea891a6dd37 ("cpufreq: dt-platdev: Simplify with
of_machine_get_match_data()") broke several platforms which did not have
OPPv2 proprety, because it incorrectly checked for device match data
after first matching from "allowlist".  Almost all of "allowlist" match
entries do not have match data and it is expected to create platform
device for them with empty data.

Fix this by first checking if platform is on the allowlist with
of_machine_device_match() and only then taking the match data.  This
duplicates the number of checks (we match against the allowlist twice),
but makes the code here much smaller.

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Closes: https://lore.kernel.org/all/CAMuHMdVJD4+J9QpUUs-sX0feKfuPD72CO0dcqN7shvF_UYpZ3Q@mail.gmail.com/
Reported-by: Pavel Pisa <pisa@fel.cvut.cz>
Closes: https://lore.kernel.org/all/6hnk7llbwdezh74h74fhvofbx4t4jihel5kvr6qwx2xuxxbjys@rmwbd7lkhrdz/
Fixes: 6ea891a6dd37 ("cpufreq: dt-platdev: Simplify with of_machine_get_match_data()")
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Tested-by: Pavel Pisa <pisa@fel.cvut.cz>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/20251210051718.132795-2-krzysztof.kozlowski@oss.qualcomm.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
2025-12-16 07:59:30 -06:00
Rob Herring (Arm)
512e156856 dt-bindings: clock: sprd,sc9860-clk: Allow "reg" for gate clocks
The gate bindings have an artificial split between a "syscon" and clock
provider node. Allow "reg" properties so this split can be removed.

Reviewed-by: Chunyan Zhang <zhang.lyra@gmail.com>
Link: https://patch.msgid.link/20251029155615.1167903-1-robh@kernel.org
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
2025-12-16 07:59:30 -06:00
Krzysztof Kozlowski
7fff398df4 dt-bindings: display/ti: Simplify dma-coherent property
Common boolean properties need to be only allowed in the binding
(":true"), because their type is already defined by core DT schema.
Simplify dma-coherent property to match common syntax.

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Link: https://patch.msgid.link/20251115122120.35315-4-krzk@kernel.org
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
2025-12-16 07:59:30 -06:00
Jianpeng Chang
3e8ade58b7 arm64: kdump: Fix elfcorehdr overlap caused by reserved memory processing reorder
Commit 8a6e02d0c00e ("of: reserved_mem: Restructure how the reserved
memory regions are processed") changed the processing order of reserved
memory regions, causing elfcorehdr to overlap with dynamically allocated
reserved memory regions during kdump kernel boot.

The issue occurs because:
1. kexec-tools allocates elfcorehdr in the last crashkernel reserved
   memory region and passes it to the second kernel
2. The problematic commit moved dynamic reserved memory allocation
   (like bman-fbpr) to occur during fdt_scan_reserved_mem(), before
   elfcorehdr reservation in fdt_reserve_elfcorehdr()
3. bman-fbpr with 16MB alignment requirement can get allocated at
   addresses that overlap with the elfcorehdr location
4. When fdt_reserve_elfcorehdr() tries to reserve elfcorehdr memory,
   overlap detection identifies the conflict and skips reservation
5. kdump kernel fails with "Unable to handle kernel paging request"
   because elfcorehdr memory is not properly reserved

The boot log:
Before 8a6e02d0c00e:
  OF: fdt: Reserving 1 KiB of memory at 0xf4fff000 for elfcorehdr
  OF: reserved mem: 0xf3000000..0xf3ffffff bman-fbpr

After 8a6e02d0c00e:
  OF: reserved mem: 0xf4000000..0xf4ffffff bman-fbpr
  OF: fdt: elfcorehdr is overlapped

Fix this by ensuring elfcorehdr reservation occurs before dynamic
reserved memory allocation.

Fixes: 8a6e02d0c00e ("of: reserved_mem: Restructure how the reserved memory regions are processed")
Signed-off-by: Jianpeng Chang <jianpeng.chang.cn@windriver.com>
Link: https://patch.msgid.link/20251205015934.700016-1-jianpeng.chang.cn@windriver.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
2025-12-16 07:59:30 -06:00
Rafael J. Wysocki
359afc8eb0 PM: runtime: Do not clear needs_force_resume with enabled runtime PM
Commit 89d9cec3b1e9 ("PM: runtime: Clear power.needs_force_resume in
pm_runtime_reinit()") added provisional clearing of power.needs_force_resume
to pm_runtime_reinit(), but it is done unconditionally which is a
mistake because pm_runtime_reinit() may race with driver probing
and removal [1].

To address this, notice that power.needs_force_resume should never
be set when runtime PM is enabled and so it only needs to be cleared
when runtime PM is disabled, and update pm_runtime_init() to only
clear that flag when runtime PM is disabled.

Fixes: 89d9cec3b1e9 ("PM: runtime: Clear power.needs_force_resume in pm_runtime_reinit()")
Reported-by: Ed Tsai <ed.tsai@mediatek.com>
Closes: https://lore.kernel.org/linux-pm/20251215122154.3180001-1-ed.tsai@mediatek.com/ [1]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: 6.17+ <stable@vger.kernel.org> # 6.17+
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Link: https://patch.msgid.link/12807571.O9o76ZdvQC@rafael.j.wysocki
2025-12-16 12:58:57 +01:00
Guido Günther
2bfca4fe1f drm/panel: visionox-rm69299: Depend on BACKLIGHT_CLASS_DEVICE
We handle backlight so need that dependency.

Fixes: 7911d8cab554 ("drm/panel: visionox-rm69299: Add backlight support")
Reported-by: kernelci.org bot <bot@kernelci.org>
Signed-off-by: Guido Günther <agx@sigxcpu.org>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://patch.msgid.link/20251017-visionox-rm69299-bl-v2-1-9dfa06606754@sigxcpu.org
2025-12-16 11:28:52 +01:00
Christoph Hellwig
8dc15b7a6e xfs: fix XFS_ERRTAG_FORCE_ZERO_RANGE for zoned file system
The new XFS_ERRTAG_FORCE_ZERO_RANGE error tag added by commit
ea9989668081 ("xfs: error tag to force zeroing on debug kernels") fails
to account for the zoned space reservation rules and this reliably fails
xfs/131 because the zeroing operation returns -EIO.

Fix this by reserving enough space to zero the entire range, which
requires a bit of (fairly ugly) reshuffling to do the error injection
early enough to affect the space reservation.

Fixes: ea9989668081 ("xfs: error tag to force zeroing on debug kernels")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-12-16 09:21:38 +01:00
Haoxiang Li
fc40459de8 xfs: fix a memory leak in xfs_buf_item_init()
xfs_buf_item_get_format() may allocate memory for bip->bli_formats,
free the memory in the error path.

Fixes: c3d5f0c2fb85 ("xfs: complain if anyone tries to create a too-large buffer log item")
Cc: stable@vger.kernel.org
Signed-off-by: Haoxiang Li <lihaoxiang@isrc.iscas.ac.cn>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-12-16 08:50:11 +01:00
Darrick J. Wong
f067250520 xfs: fix stupid compiler warning
gcc 14.2 warns about:

xfs_attr_item.c: In function ‘xfs_attr_recover_work’:
xfs_attr_item.c:785:9: warning: ‘ip’ may be used uninitialized [-Wmaybe-uninitialized]
  785 |         xfs_trans_ijoin(tp, ip, 0);
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~
xfs_attr_item.c:740:42: note: ‘ip’ was declared here
  740 |         struct xfs_inode                *ip;
      |                                          ^~

I think this is bogus since xfs_attri_recover_work either returns a real
pointer having initialized ip or an ERR_PTR having not touched it, but
the tools are smarter than me so let's just null-init the variable
anyway.

Cc: stable@vger.kernel.org # v6.8
Fixes: e70fb328d52772 ("xfs: recreate work items when recovering intent items")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-12-16 08:50:07 +01:00
Darrick J. Wong
5990fd7569 xfs: fix a UAF problem in xattr repair
The xchk_setup_xattr_buf function can allocate a new value buffer, which
means that any reference to ab->value before the call could become a
dangling pointer.  Fix this by moving an assignment to after the buffer
setup.

Cc: stable@vger.kernel.org # v6.10
Fixes: e47dcf113ae348 ("xfs: repair extended attributes")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-12-16 08:50:00 +01:00
Chaitanya Kulkarni
2145f447b7 xfs: ignore discard return value
__blkdev_issue_discard() always returns 0, making all error checking
in XFS discard functions dead code.

Change xfs_discard_extents() return type to void, remove error variable,
error checking, and error logging for the __blkdev_issue_discard() call
in same function.

Update xfs_trim_perag_extents() and xfs_trim_rtgroup_extents() to
ignore the xfs_discard_extents() return value and error checking
code.

Update xfs_discard_rtdev_extents() to ignore __blkdev_issue_discard()
return value and error checking code.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chaitanya Kulkarni <ckulkarnilinux@gmail.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-12-16 08:49:56 +01:00
Linus Torvalds
40fbbd64bb a couple of shmem rename fixes - recent regression from tree-in-dcache
series and older breakage from stable directory offsets stuff.
 
 Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCaUD1egAKCRBZ7Krx/gZQ
 693YAQDWMzqUs8bJx95frxidF4fJ658K/bZWHuG9eLDvhF2CxAEAoIWt5EbJ1dE0
 NEIg/+kdVpKpk1DH7SZTzc+Mgbe4aAE=
 =/LtU
 -----END PGP SIGNATURE-----

Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull shmem rename fixes from Al Viro:
 "A couple of shmem rename fixes - recent regression from tree-in-dcache
  series and older breakage from stable directory offsets stuff"

* tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  shmem: fix recovery on rename failures
  shmem_whiteout(): fix regression from tree-in-dcache series
2025-12-16 19:44:36 +12:00
Linus Torvalds
53ec4a79ff seven smb3 server fixes
-----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmlA0jkACgkQiiy9cAdy
 T1HXXQv/SiJ9wKH6PDJZ6MRGYJLpHoq6BHUj6Uob2x7fc9LXGTlKwFJ8NAWBN5/1
 Po6MrL28C4Lkm+KJttH/D/9FpsBEWmViMeHuiu8SahdC90TaoNi9hu4lBQvCGSOm
 D59dWuG7KCXVgu3i6zWTKf2G2OkkRwHGKQ66TRvJ317HD0mzQfamke0SBLgN1/VJ
 nKrZw7fuBLNf5x2Yxtn01idGSwROCTqSLG1i6V4wlfX4mLT9ZJAgfbzK7bhReT8U
 ph2OZqFhKMSzZJQE/6VHw2A51LFfWZPNnp4Cl4AEkIVHzhWqzipGggUs782rGQcW
 cHG/1Zawk03ap+7omuyhjgaFjQ02N1W2D+avdSKAjVpFCX+qsAf1RHw+N3+alA8g
 JNuI4O4rtrHHznqaZ2xdgaWHpKp1K+ku2gjZYwTmt0L0ewcPRzvmpWJPT9r+1yFb
 TwLGWPSVpR9jYViUF0X2cmlLYFaiKvKIFgRGn08UD4OrEQupy5p1tIJSzFnqf7E/
 9tKxoXte
 =RiDE
 -----END PGP SIGNATURE-----

Merge tag 'v6.19-rc1-ksmbd-server-fixes' of git://git.samba.org/ksmbd

Pull smb server fixes from Steve French:

 - Fix set xattr name validation

 - Fix session refcount leak

 - Minor cleanup

 - smbdirect (RDMA) fixes: improve receive completion, and connect

* tag 'v6.19-rc1-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
  ksmbd: fix buffer validation by including null terminator size in EA length
  ksmbd: Fix refcount leak when invalid session is found on session lookup
  ksmbd: remove redundant DACL check in smb_check_perm_dacl
  ksmbd: convert comma to semicolon
  smb: server: defer the initial recv completion logic to smb_direct_negotiate_recv_work()
  smb: server: initialize recv_io->cqe.done = recv_done just once
  smb: smbdirect: introduce smbdirect_socket.connect.{lock,work}
2025-12-16 19:34:09 +12:00
Linus Torvalds
115fada16b for-6.19-rc1-tag
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmlATDcACgkQxWXV+ddt
 WDvWIA/7BP1o+7bGSY/HIwnVIwNyEd1YCpUrLZAd61C7t2OEZKCO13HtM9cZSIRb
 z++k8iCkNl5Z3uZH/cQfUcCuA2sip9HSGaVrfeQ3qVnB/s3DMGb/mu1yBCKUo9/f
 nCxpsIC7BD6cTaiAzCJ6JgvCjOSieG111s0/LGjDcjlM7YQoA8p/Jan1nlvfc8It
 sTQ+ZCsuO/xHViQnfmT/XIyy5bSuSABb5LR78wp8wumngTL3ooBXGwWifrNT/egi
 E06Hhnqopg1PjBDtQtmInJ1gh1E0capQ5j1Z6TDeMYCPeUOuPpRqLVrRP3bIM4jN
 vDu5dZpM9r542Wpj/vZvs/UqmhczUmbQfjLfWdr+KORrl6RA9pkHXyFTFIsTKhGi
 vtAsmMnu5FwKSlnZU1i/EuvcF89KEPx4jKRGQWiKUPwAuBUAkVa4xhsI/mAUmwv5
 +Z+hQxPuIAdmcbLblsI0mnhCGjMTx+qUQQdhY2r7U2bOKhEds+XekABb9KBrjOdj
 k8UEQZJwwWkcPSunYsOpBYBI1SIV8UeHtp8d2xrat90+Ome7feL1VFEjV/rOKc6w
 f7hUeYZPVNQMcXdfNRkoXHK/zqKxpMF5lz9Tq3mzfF6XoseC+gSW44dWQnHPnI8X
 kCj0bpg7o2WgGuc3UWIXrVXZmSEWhn30Go6UfsGqHT7xiULQvSc=
 =lAhb
 -----END PGP SIGNATURE-----

Merge tag 'for-6.19-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:

 - fix missing btrfs_path release after printing a relocation error
   message

 - fix extent changeset leak on mmap write after failure to reserve
   metadata

 - fix fs devices list structure freeing, it could be potentially leaked
   under some circumstances

 - tree log fixes:
     - fix incremental directory logging where inodes for new dentries
       were incorrectly skipped
     - don't log conflicting inode if it's a directory moved in the
       current transaction

 - regression fixes:
     - fix incorrect btrfs_path freeing when it's auto-cleaned
     - revert commit simplifying preallocation of temporary structures
       in qgroup functions, some cases were not handled properly

* tag 'for-6.19-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: fix changeset leak on mmap write after failure to reserve metadata
  btrfs: fix memory leak of fs_devices in degraded seed device path
  btrfs: fix a potential path leak in print_data_reloc_error()
  Revert "btrfs: add ASSERTs on prealloc in qgroup functions"
  btrfs: do not skip logging new dentries when logging a new name
  btrfs: don't log conflicting inode if it's a dir moved in the current transaction
  btrfs: tests: fix double btrfs_path free in remove_extent_ref()
2025-12-16 19:28:20 +12:00
Linus Torvalds
dbf89321bf sched_ext: Fixes for v6.19-rc1
- Fix memory leak when destroying helper kthread workers during scheduler
   disable.
 
 - Fix bypass depth accounting on scx_enable() failure which could leave
   the system permanently in bypass mode.
 
 - Fix missing preemption handling when moving tasks to local DSQs via
   scx_bpf_dsq_move().
 
 - Misc fixes including NULL check for put_prev_task(), flushing stdout in
   selftests, and removing unused code.
 -----BEGIN PGP SIGNATURE-----
 
 iIQEABYKACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCaUA9aQ4cdGpAa2VybmVs
 Lm9yZwAKCRCxYfJx3gVYGXWKAP9VeREr2ceRFd3WK5vFsFzCGh1L8rRu+t352MGL
 qWXkzQD/apLFSo5RQZ0tE8qORwKM9hNhS8QkVJhlsv5VZRWMBQI=
 =ueoE
 -----END PGP SIGNATURE-----

Merge tag 'sched_ext-for-6.19-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext

Pull sched_ext fixes from Tejun Heo:

 - Fix memory leak when destroying helper kthread workers during
   scheduler disable

 - Fix bypass depth accounting on scx_enable() failure which could leave
   the system permanently in bypass mode

 - Fix missing preemption handling when moving tasks to local DSQs via
   scx_bpf_dsq_move()

 - Misc fixes including NULL check for put_prev_task(), flushing stdout
   in selftests, and removing unused code

* tag 'sched_ext-for-6.19-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext:
  sched_ext: Remove unused code in the do_pick_task_scx()
  selftests/sched_ext: flush stdout before test to avoid log spam
  sched_ext: Fix missing post-enqueue handling in move_local_task_to_local_dsq()
  sched_ext: Factor out local_dsq_post_enq() from dispatch_enqueue()
  sched_ext: Fix bypass depth leak on scx_enable() failure
  sched/ext: Avoid null ptr traversal when ->put_prev_task() is called with NULL next
  sched_ext: Fix the memleak for sch->helper objects
2025-12-16 19:24:35 +12:00
Linus Torvalds
6b63f90fa2 cgroup: Fixes for v6.19-rc1
- Fix a race condition in css_rstat_updated() where CMPXCHG without LOCK
   prefix could cause lnode corruption when the flusher runs concurrently
   on another CPU. The issue was introduced in 6.17 and causes memcg stats
   to become corrupted in production.
 -----BEGIN PGP SIGNATURE-----
 
 iIQEABYKACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCaUA8mg4cdGpAa2VybmVs
 Lm9yZwAKCRCxYfJx3gVYGXhaAQDSk/ywLzJG807gzvhC5QLcY8kYzUUFHJ4TQAFp
 08UqOAEA9yDgBHPqkp6ucjEJMG+2esE9A5NJB2LeEjG8ZZOCDQM=
 =Kcfd
 -----END PGP SIGNATURE-----

Merge tag 'cgroup-for-6.19-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

Pull cgroup fix from Tejun Heo:

 - Fix a race condition in css_rstat_updated() where CMPXCHG without
   LOCK prefix could cause lnode corruption when the flusher runs
   concurrently on another CPU. The issue was introduced in 6.17 and
   causes memcg stats to become corrupted in production.

* tag 'cgroup-for-6.19-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: rstat: use LOCK CMPXCHG in css_rstat_updated
2025-12-16 19:21:17 +12:00
Juergen Gross
e5aff444e3 x86/xen: Fix sparse warning in enlighten_pv.c
The sparse tool issues a warning for arch/x76/xen/enlighten_pv.c:

   arch/x86/xen/enlighten_pv.c:120:9: sparse: sparse: incorrect type
     in initializer (different address spaces)
     expected void const [noderef] __percpu *__vpp_verify
     got bool *

This is due to the percpu variable xen_in_preemptible_hcall being
exported via EXPORT_SYMBOL_GPL() instead of EXPORT_PER_CPU_SYMBOL_GPL().

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202512140856.Ic6FetG6-lkp@intel.com/
Fixes: fdfd811ddde3 ("x86/xen: allow privcmd hypercalls to be preempted")
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Message-ID: <20251215115112.15072-1-jgross@suse.com>
2025-12-16 07:48:40 +01:00
Al Viro
e1b4c6a583 shmem: fix recovery on rename failures
maple_tree insertions can fail if we are seriously short on memory;
simple_offset_rename() does not recover well if it runs into that.
The same goes for simple_offset_rename_exchange().

Moreover, shmem_whiteout() expects that if it succeeds, the caller will
progress to d_move(), i.e. that shmem_rename2() won't fail past the
successful call of shmem_whiteout().

Not hard to fix, fortunately - mtree_store() can't fail if the index we
are trying to store into is already present in the tree as a singleton.

For simple_offset_rename_exchange() that's enough - we just need to be
careful about the order of operations.

For simple_offset_rename() solution is to preinsert the target into the
tree for new_dir; the rest can be done without any potentially failing
operations.

That preinsertion has to be done in shmem_rename2() rather than in
simple_offset_rename() itself - otherwise we'd need to deal with the
possibility of failure after successful shmem_whiteout().

Fixes: a2e459555c5f ("shmem: stable directory offsets")
Reviewed-by: Christian Brauner <brauner@kernel.org>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-12-16 00:57:29 -05:00
Niklas Cassel
ba624ba88d ata: libata-core: Disable LPM on ST2000DM008-2FR102
According to a user report, the ST2000DM008-2FR102 has problems with LPM.

Reported-by: Emerson Pinter <e@pinter.dev>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220693
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
2025-12-16 14:23:10 +09:00
Jason Gunthorpe
b80fab2813 iommufd/selftest: Do not leak the hwpt if IOMMU_TEST_OP_MD_CHECK_MAP fails
If the input validation fails it returned without freeing the hwpt
refcount causing a leak. This triggers a WARN_ON when closing the fd:

  WARNING: drivers/iommu/iommufd/main.c:369 at iommufd_fops_release+0x385/0x430, CPU#1: repro/724

Found by szykaller.

Fixes: e93d5945ed5b ("iommufd: Change the selftest to use iommupt instead of xarray")
Link: https://patch.msgid.link/r/0-v1-c8ed57e24380+44ae-iommufd_selftest_hwpt_leak_jgg@nvidia.com
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reported-by: "Lai, Yi" <yi1.lai@linux.intel.com>
Closes: https://lore.kernel.org/r/aTJGMaqwQK0ASj0G@ly-workstation
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-12-15 20:34:41 -04:00
Jason Gunthorpe
5b244b077c iommufd/selftest: Make it clearer to gcc that the access is not out of bounds
GCC gets a bit confused and reports:

   In function '_test_cmd_get_hw_info',
       inlined from 'iommufd_ioas_get_hw_info' at iommufd.c:779:3,
       inlined from 'wrapper_iommufd_ioas_get_hw_info' at iommufd.c:752:1:
>> iommufd_utils.h:804:37: warning: array subscript 'struct iommu_test_hw_info[0]' is partly outside array bounds of 'struct iommu_test_hw_info_buffer_smaller[1]' [-Warray-bounds=]
     804 |                         assert(!info->flags);
         |                                 ~~~~^~~~~~~
   iommufd.c: In function 'wrapper_iommufd_ioas_get_hw_info':
   iommufd.c:761:11: note: object 'buffer_smaller' of size 4
     761 |         } buffer_smaller;
         |           ^~~~~~~~~~~~~~

While it is true that "struct iommu_test_hw_info[0]" is partly out of
bounds of the input pointer, it is not true that info->flags is out of
bounds. Unclear why it warns on this.

Reuse an existing properly sized stack buffer and pass a truncated length
instead to test the same thing.

Fixes: af4fde93c319 ("iommufd/selftest: Add coverage for IOMMU_GET_HW_INFO ioctl")
Link: https://patch.msgid.link/r/0-v1-63a2cffb09da+4486-iommufd_gcc_bounds_jgg@nvidia.com
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202512032344.kaAcKFIM-lkp@intel.com/
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-12-15 20:34:41 -04:00
Arnd Bergmann
69dc538a4f iommufd: Fix building without dmabuf
When DMABUF is disabled, trying to use it causes a link failure:

x86_64-linux-ld: drivers/iommu/iommufd/io_pagetable.o: in function `iopt_map_file_pages':
io_pagetable.c:(.text+0x1735): undefined reference to `dma_buf_get'
x86_64-linux-ld: io_pagetable.c:(.text+0x1775): undefined reference to `dma_buf_put'

Fixes: 44ebaa1744fd ("iommufd: Accept a DMABUF through IOMMU_IOAS_MAP_FILE")
Link: https://patch.msgid.link/r/20251204100333.1034767-1-arnd@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-12-15 20:34:41 -04:00
Mario Limonciello (AMD)
7bbf6d15e9 accel/amdxdna: Block running under a hypervisor
SVA support is required, which isn't configured by hypervisor
solutions.

Closes: https://github.com/QubesOS/qubes-issues/issues/10275
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4656
Reviewed-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://patch.msgid.link/20251213054513.87925-1-superm1@kernel.org
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2025-12-15 13:00:03 -06:00
Christoffer Sandberg
aed3716db7 Input: i8042 - add TUXEDO InfinityBook Max Gen10 AMD to i8042 quirk table
The device occasionally wakes up from suspend with missing input on the
internal keyboard and the following suspend attempt results in an instant
wake-up. The quirks fix both issues for this device.

Signed-off-by: Christoffer Sandberg <cs@tuxedo.de>
Signed-off-by: Werner Sembach <wse@tuxedocomputers.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20251124203336.64072-1-wse@tuxedocomputers.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-12-15 10:14:36 -08:00
Cryolitia PukNgae
2aaf33c6e1 Input: atkbd - skip deactivate for HONOR FMB-P's internal keyboard
After commit 9cf6e24c9fbf17e52de9fff07f12be7565ea6d61 ("Input: atkbd -
do not skip atkbd_deactivate() when skipping ATKBD_CMD_GETID"), HONOR
FMB-P, aka HONOR MagicBook Pro 14 2025's internal keyboard stops
working. Adding the atkbd_deactivate_fixup quirk fixes it.

DMI: HONOR FMB-P/FMB-P-PCB, BIOS 1.13 05/08/2025

Fixes: 9cf6e24c9fbf17e52de9fff07f12be7565ea6d61 ("Input: atkbd - do not skip atkbd_deactivate() when skipping ATKBD_CMD_GETID")
Reported-by: Mikura Kyouka <mikurakyouka@aosc.io>
Reported-by: foad.elkhattabi <foad.elkhattabi@gmail.com>
Signed-off-by: Cryolitia PukNgae <cryolitia.pukngae@linux.dev>
Reviewed-by: Hans de Goede <hansg@kernel.org>
Link: https://patch.msgid.link/20251022-honor-v1-1-ff894ed271a9@linux.dev
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-12-15 10:12:54 -08:00
Yongpeng Yang
67d85b062d Documentation: admin-guide: blockdev: replace zone_capacity with zone_capacity_mb when creating devices
The "zone_capacity=%umb" option is no longer used. The effective option
is now "zone_capacity_mb=%u", so update the documentation accordingly.

Signed-off-by: Yongpeng Yang <yangyongpeng@xiaomi.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-12-15 09:56:06 -07:00
Yongpeng Yang
4b2b03151e zloop: use READ_ONCE() to read lo->lo_state in queue_rq path
In the queue_rq path, zlo->state is accessed without locking, and direct
access may read stale data. This patch uses READ_ONCE() to read
zlo->state and data_race() to silence code checkers, and changes all
assignments to use WRITE_ONCE().

Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Yongpeng Yang <yangyongpeng@xiaomi.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-12-15 09:32:42 -07:00
Yongpeng Yang
54891a96b7 loop: use READ_ONCE() to read lo->lo_state without locking
When lo->lo_mutex is not held, direct access may read stale data. This
patch uses READ_ONCE() to read lo->lo_state and data_race() to silence
code checkers, and changes all assignments to use WRITE_ONCE().

Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Yongpeng Yang <yangyongpeng@xiaomi.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-12-15 09:32:42 -07:00
Brendan Jackman
c33b68801f kunit: make FAULT_TEST default to n when PANIC_ON_OOPS
As describe in the help string, the user might want to disable these
tests if they don't like to see stacktraces/BUG etc in their kernel log.

However, if they enable PANIC_ON_OOPS, these tests also crash the
machine, which it's safe to assume _almost_ nobody wants.

One might argue that _absolutely_ nobody ever wants their kernel to
crash so this should just be a hard dependency instead of a default.
However, since this is rather special code that's anyway concerned with
deliberately doing "bad" things, the normal rules don't seem to apply,
hence prefer flexibility and allow users to set up a crashing Kconfig if
they so choose.

Link: https://lore.kernel.org/r/20251207-kunit-fault-no-panic-v1-1-2ac932f26864@google.com
Signed-off-by: Brendan Jackman <jackmanb@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2025-12-15 09:27:19 -07:00
Uwe Kleine-König
726c93b040 kunit: Drop unused parameter from kunit_device_register_internal
The passed driver isn't used, so just drop this parameter.

Link: https://lore.kernel.org/r/20251210065839.482608-2-u.kleine-koenig@baylibre.com
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2025-12-15 09:27:19 -07:00
Marijn Suijten
2b973ca48f drm/panel: sony-td4353-jdi: Enable prepare_prev_first
The DSI host must be enabled before our prepare function can run, which
has to send its init sequence over DSI.  Without enabling the host first
the panel will not probe.

Fixes: 9e15123eca79 ("drm/msm/dsi: Stop unconditionally powering up DSI hosts at modeset")
Signed-off-by: Marijn Suijten <marijn.suijten@somainline.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Reviewed-by: Martin Botka <martin.botka@somainline.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://patch.msgid.link/20251130-sony-akari-fix-panel-v1-1-1d27c60a55f5@somainline.org
2025-12-15 08:14:20 -08:00
Zqiang
bb27226f0d sched_ext: Remove unused code in the do_pick_task_scx()
The kick_idle variable is no longer used, this commit therefore remove
it and also remove associated code in the do_pick_task_scx().

Signed-off-by: Zqiang <qiang.zhang@linux.dev>
Reviewed-by: Andrea Righi <arighi@nvidia.com>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-12-15 05:53:49 -10:00
Haoxiang Li
680ad315ca MIPS: Fix a reference leak bug in ip22_check_gio()
If gio_device_register fails, gio_dev_put() is required to
drop the gio_dev device reference.

Fixes: e84de0c61905 ("MIPS: GIO bus support for SGI IP22/28")
Signed-off-by: Haoxiang Li <haoxiang_li2024@163.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2025-12-15 16:11:14 +01:00
Thierry Reding
bd94fbe8b5 MIPS: Alchemy: Remove bogus static/inline specifiers
The recent io_remap_pfn_range() rework applied the static and inline
specifiers to the implementation of io_remap_pfn_range_pfn() on MIPS
Alchemy, mirroring the same change on other platforms. However, this
function is defined in a source file and that definition causes a
conflict with its declaration. Fix this by dropping the specifiers.

Fixes: c707a68f9468 ("mm: abstract io_remap_pfn_range() based on PFN")
Signed-off-by: Thierry Reding <treding@nvidia.com>
Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2025-12-15 16:09:46 +01:00
Florian Westphal
fec7b07955 selftests: netfilter: packetdrill: avoid failure on HZ=100 kernel
packetdrill --ip_version=ipv4 --mtu=1500 --tolerance_usecs=1000000 --non_fatal packet conntrack_syn_challenge_ack.pkt
conntrack v1.4.8 (conntrack-tools): 1 flow entries have been shown.
conntrack_syn_challenge_ack.pkt:32: error executing `conntrack -f $NFCT_IP_VERSION \
-L -p tcp --dport 8080 | grep UNREPLIED | grep -q SYN_SENT` command: non-zero status 1

Affected kernel had CONFIG_HZ=100; reset packet was still sitting in
backlog.

Reported-by: Yi Chen <yiche@redhat.com>
Fixes: a8a388c2aae4 ("selftests: netfilter: add packetdrill based conntrack tests")
Signed-off-by: Florian Westphal <fw@strlen.de>
2025-12-15 15:04:04 +01:00
Florian Westphal
7e7a817f2d netfilter: nf_tables: avoid softlockup warnings in nft_chain_validate
This reverts commit
314c82841602 ("netfilter: nf_tables: can't schedule in nft_chain_validate"):
Since commit a60a5abe19d6 ("netfilter: nf_tables: allow iter callbacks to sleep")
the iterator callback is invoked without rcu read lock held, so this
cond_resched() is now valid.

Signed-off-by: Florian Westphal <fw@strlen.de>
2025-12-15 15:04:04 +01:00
Florian Westphal
8e1a1bc4f5 netfilter: nf_tables: avoid chain re-validation if possible
Hamza Mahfooz reports cpu soft lock-ups in
nft_chain_validate():

 watchdog: BUG: soft lockup - CPU#1 stuck for 27s! [iptables-nft-re:37547]
[..]
 RIP: 0010:nft_chain_validate+0xcb/0x110 [nf_tables]
[..]
  nft_immediate_validate+0x36/0x50 [nf_tables]
  nft_chain_validate+0xc9/0x110 [nf_tables]
  nft_immediate_validate+0x36/0x50 [nf_tables]
  nft_chain_validate+0xc9/0x110 [nf_tables]
  nft_immediate_validate+0x36/0x50 [nf_tables]
  nft_chain_validate+0xc9/0x110 [nf_tables]
  nft_immediate_validate+0x36/0x50 [nf_tables]
  nft_chain_validate+0xc9/0x110 [nf_tables]
  nft_immediate_validate+0x36/0x50 [nf_tables]
  nft_chain_validate+0xc9/0x110 [nf_tables]
  nft_immediate_validate+0x36/0x50 [nf_tables]
  nft_chain_validate+0xc9/0x110 [nf_tables]
  nft_table_validate+0x6b/0xb0 [nf_tables]
  nf_tables_validate+0x8b/0xa0 [nf_tables]
  nf_tables_commit+0x1df/0x1eb0 [nf_tables]
[..]

Currently nf_tables will traverse the entire table (chain graph), starting
from the entry points (base chains), exploring all possible paths
(chain jumps).  But there are cases where we could avoid revalidation.

Consider:
1  input -> j2 -> j3
2  input -> j2 -> j3
3  input -> j1 -> j2 -> j3

Then the second rule does not need to revalidate j2, and, by extension j3,
because this was already checked during validation of the first rule.
We need to validate it only for rule 3.

This is needed because chain loop detection also ensures we do not exceed
the jump stack: Just because we know that j2 is cycle free, its last jump
might now exceed the allowed stack size.  We also need to update all
reachable chains with the new largest observed call depth.

Care has to be taken to revalidate even if the chain depth won't be an
issue: chain validation also ensures that expressions are not called from
invalid base chains.  For example, the masquerade expression can only be
called from NAT postrouting base chains.

Therefore we also need to keep record of the base chain context (type,
hooknum) and revalidate if the chain becomes reachable from a different
hook location.

Reported-by: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com>
Closes: https://lore.kernel.org/netfilter-devel/20251118221735.GA5477@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net/
Tested-by: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
2025-12-15 15:02:44 +01:00
Jan Maslak
eed5b815fa drm/xe: Restore engine registers before restarting schedulers after GT reset
During GT reset recovery in do_gt_restart(), xe_uc_start() was called
before xe_reg_sr_apply_mmio() restored engine-specific registers. This
created a race window where the scheduler could run jobs before hardware
state was fully restored.

This caused failures in eudebug tests (xe_exec_sip_eudebug@breakpoint-
waitsip-*) where TD_CTL register (containing TD_CTL_GLOBAL_DEBUG_ENABLE)
wasn't restored before jobs started executing. Breakpoints would fail to
trigger SIP entry because the debug enable bit wasn't set yet.

Fix by moving xe_uc_start() after all MMIO register restoration,
including engine registers and CCS mode configuration, ensuring all
hardware state is fully restored before any jobs can be scheduled.

Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Jan Maslak <jan.maslak@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251210145618.169625-2-jan.maslak@intel.com
(cherry picked from commit 825aed0328588b2837636c1c5a0c48795d724617)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-12-15 14:17:04 +01:00
Jagmeet Randhawa
eafb6f6209 drm/xe: Increase TDF timeout
There are some corner cases where flushing transient
data may take slightly longer than the 150us timeout
we currently allow.  Update the driver to use a 300us
timeout instead based on the latest guidance from
the hardware team. An update to the bspec to formally
document this is expected to arrive soon.

Fixes: c01c6066e6fa ("drm/xe/device: implement transient flush")
Signed-off-by: Jagmeet Randhawa <jagmeet.randhawa@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/0201b1d6ec64d3651fcbff1ea21026efa915126a.1765487866.git.jagmeet.randhawa@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
(cherry picked from commit d69d3636f5f7a84bae7cd43473b3701ad9b7d544)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-12-15 14:16:57 +01:00
Satyanarayana K V P
c770467d28 drm/xe/vf: Fix queuing of recovery work
Ensure VF migration recovery work is only queued when no recovery is
already queued and teardown is not in progress.

Fixes: b47c0c07c350 ("drm/xe/vf: Teardown VF post migration worker on driver unload")
Signed-off-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Tomasz Lis <tomasz.lis@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patch.msgid.link/20251210052546.622809-5-satyanarayana.k.v.p@intel.com
(cherry picked from commit 8d8cf42b03f149dcb545b547906306f3b474565e)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-12-15 14:16:48 +01:00
Thomas Hellström
449bcd5d45 drm/xe/bo: Don't include the CCS metadata in the dma-buf sg-table
Some Xe bos are allocated with extra backing-store for the CCS
metadata. It's never been the intention to share the CCS metadata
when exporting such bos as dma-buf. Don't include it in the
dma-buf sg-table.

Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Link: https://patch.msgid.link/20251209204920.224374-1-thomas.hellstrom@linux.intel.com
(cherry picked from commit a4ebfb9d95d78a12512b435a698ee6886d712571)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-12-15 14:16:39 +01:00
Junxiao Chang
17445af7dc drm/me/gsc: mei interrupt top half should be in irq disabled context
MEI GSC interrupt comes from i915 or xe driver. It has top half and
bottom half. Top half is called from i915/xe interrupt handler. It
should be in irq disabled context.

With RT kernel(PREEMPT_RT enabled), by default IRQ handler is in
threaded IRQ. MEI GSC top half might be in threaded IRQ context.
generic_handle_irq_safe API could be called from either IRQ or
process context, it disables local IRQ then calls MEI GSC interrupt
top half.

This change fixes B580 GPU boot issue with RT enabled.

Fixes: e02cea83d32d ("drm/xe/gsc: add Battlemage support")
Tested-by: Baoli Zhang <baoli.zhang@intel.com>
Signed-off-by: Junxiao Chang <junxiao.chang@intel.com>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251107033152.834960-1-junxiao.chang@intel.com
Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
(cherry picked from commit 3efadf028783a49ab2941294187c8b6dd86bf7da)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-12-15 14:14:47 +01:00
Tomasz Lis
61e6b711c3 drm/xe/vf: Stop waiting for ring space on VF post migration recovery
If wait for ring space started just before migration, it can delay
the recovery process, by waiting without bailout path for up to 2
seconds.

Two second wait for recovery is not acceptable, and if the ring was
completely filled even without the migration temporarily stopping
execution, then such a wait will result in up to a thousand new jobs
(assuming constant flow) being added while the wait is happening.

While this will not cause data corruption, it will lead to warning
messages getting logged due to reset being scheduled on a GT under
recovery. Also several seconds of unresponsiveness, as the backlog
of jobs gets progressively executed.

Add a bailout condition, to make sure the recovery starts without
much delay. The recovery is expected to finish in about 100 ms when
under moderate stress, so the condition verification period needs to be
below that - settling at 64 ms.

The theoretical max time which the recovery can take depends on how
many requests can be emitted to engine rings and be pending execution.
While stress testing, it was possible to reach 10k pending requests
on rings when a platform with two GTs was used. This resulted in max
recovery time of 5 seconds. But in real life situations, it is very
unlikely that the amount of pending requests will ever exceed 100,
and for that the recovery time will be around 50 ms - well within our
claimed limit of 100ms.

Fixes: a4dae94aad6a ("drm/xe/vf: Wakeup in GuC backend on VF post migration recovery")
Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patch.msgid.link/20251204200820.2206168-1-tomasz.lis@intel.com
(cherry picked from commit a00e305fba02a915cf2745bf6ef3f55537e65d57)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-12-15 14:14:37 +01:00
Raag Jadav
17d52ab2a6 drm/xe/throttle: Skip reason prefix while emitting array
The newly introduced "reasons" attribute already signifies possible
reasons for throttling and makes the prefix in individual attribute
names redundant while emitting them as an array. Skip the prefix.

Fixes: 83ccde67a3f7 ("drm/xe/gt_throttle: Avoid TOCTOU when monitoring reasons")
Signed-off-by: Raag Jadav <raag.jadav@intel.com>
Reviewed-by: Sk Anirban <sk.anirban@intel.com>
Link: https://patch.msgid.link/20251203123355.571606-1-raag.jadav@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit b64a14334ef3ebbcf70d11bc67d0934bdc0e390d)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-12-15 14:14:27 +01:00
Arnd Bergmann
9acc329581 drm/xe: fix drm_gpusvm_init() arguments
The Xe driver fails to build when CONFIG_DRM_XE_GPUSVM is disabled
but CONFIG_DRM_GPUSVM is turned on, due to the clash of two commits:

In file included from drivers/gpu/drm/xe/xe_vm_madvise.c:8:
drivers/gpu/drm/xe/xe_svm.h: In function 'xe_svm_init':
include/linux/stddef.h:8:14: error: passing argument 5 of 'drm_gpusvm_init' makes integer from pointer without a cast [-Wint-conversion]
drivers/gpu/drm/xe/xe_svm.h:217:38: note: in expansion of macro 'NULL'
  217 |                                NULL, NULL, 0, 0, 0, NULL, NULL, 0);
      |                                      ^~~~
In file included from drivers/gpu/drm/xe/xe_bo_types.h:11,
                 from drivers/gpu/drm/xe/xe_bo.h:11,
                 from drivers/gpu/drm/xe/xe_vm_madvise.c:11:
include/drm/drm_gpusvm.h:254:35: note: expected 'long unsigned int' but argument is of type 'void *'
  254 |                     unsigned long mm_start, unsigned long mm_range,
      |                     ~~~~~~~~~~~~~~^~~~~~~~
In file included from drivers/gpu/drm/xe/xe_vm_madvise.c:14:
drivers/gpu/drm/xe/xe_svm.h:216:16: error: too many arguments to function 'drm_gpusvm_init'; expected 10, have 11
  216 |         return drm_gpusvm_init(&vm->svm.gpusvm, "Xe SVM (simple)", &vm->xe->drm,
      |                ^~~~~~~~~~~~~~~
  217 |                                NULL, NULL, 0, 0, 0, NULL, NULL, 0);
      |                                                                 ~
include/drm/drm_gpusvm.h:251:5: note: declared here

Adapt the caller to the new argument list by removing the extraneous
NULL argument.

Fixes: 9e9787414882 ("drm/xe/userptr: replace xe_hmm with gpusvm")
Fixes: 10aa5c806030 ("drm/gpusvm, drm/xe: Fix userptr to not allow device private pages")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patch.msgid.link/20251204094704.1030933-1-arnd@kernel.org
(cherry picked from commit 29bce9c8b41d5c378263a927acb9a9074d0e7a0e)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-12-15 14:14:08 +01:00
Matthew Brost
224a6ac080 drm/xe: Do not reference loop variable directly
Do not reference the loop variable job after the loop has exited.
Instead, save the job from the last iteration of the loop.

Fixes: 3d98a7164da6 ("drm/xe/vf: Start re-emission from first unsignaled job during VF migration")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/r/202511291102.jnnKP6IB-lkp@intel.com/
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com>
Link: https://patch.msgid.link/20251203011809.968893-1-matthew.brost@intel.com
(cherry picked from commit 76ce2313709f13a6adbcaa1a43a8539c8f509f6a)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-12-15 14:13:58 +01:00
Vinay Belgaumkar
c88a0731ed drm/xe: Apply Wa_14020316580 in xe_gt_idle_enable_pg()
Wa_14020316580 was getting clobbered by power gating init code
later in the driver load sequence. Move the Wa so that
it applies correctly.

Fixes: 7cd05ef89c9d ("drm/xe/xe2hpm: Add initial set of workarounds")
Suggested-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Reviewed-by: Riana Tauro <riana.tauro@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/20251129052548.70766-1-vinay.belgaumkar@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
(cherry picked from commit 8b5502145351bde87f522df082b9e41356898ba3)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-12-15 14:13:48 +01:00
Shuicheng Lin
b32045d73b drm/xe: Fix freq kobject leak on sysfs_create_files failure
Ensure gt->freq is released when sysfs_create_files() fails
in xe_gt_freq_init(). Without this, the kobject would leak.
Add kobject_put() before returning the error.

Fixes: fdc81c43f0c1 ("drm/xe: use devm_add_action_or_reset() helper")
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Alex Zuo <alex.zuo@intel.com>
Reviewed-by: Xin Wang <x.wang@intel.com>
Link: https://patch.msgid.link/20251114205638.2184529-2-shuicheng.lin@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
(cherry picked from commit 251be5fb4982ebb0f5a81b62d975bd770f3ad5c2)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-12-15 14:13:41 +01:00