mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-01-12 01:20:14 +00:00

1397042 Commits

Author SHA1 Message Date
Christoph Hellwig
ec7f31b2a2 block: make bio auto-integrity deadlock safe
The current block layer automatic integrity protection allocates the
actual integrity buffer, which has three problems:

 - because it happens at the bottom of the I/O stack and doesn't use a
   mempool it can deadlock under load
 - because the data size in a bio is almost unbounded when using large
   folios it can relatively easily exceed the maximum kmalloc size
 - even when it does not exceed the maximum kmalloc size, it could
   exceed the maximum segment size of the device

Fix this by limiting the I/O size so that we can allocate at least a
2MiB integrity buffer, i.e. 128MiB for 8 byte PI and 512 byte integrity
intervals, and create a mempool as a last resort for this maximum size,
mirroring the scheme used for bvecs.  As a nice upside none of this
can fail now, so we remove the error handling and open code the
trivial addition of the bip vec.
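
The 128MiB figure follows from the buffer size: every integrity interval of data consumes one PI tuple, so the buffer size divided by the tuple size gives the number of intervals that can be covered. A minimal sketch of that arithmetic in plain C follows; the helper name is hypothetical and is not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): largest data payload whose
 * protection information fits into an integrity buffer of buf_bytes.
 * Each interval_bytes of data consumes one tuple of pi_bytes.
 */
static uint64_t max_data_bytes(uint64_t buf_bytes, uint32_t pi_bytes,
			       uint32_t interval_bytes)
{
	assert(pi_bytes > 0 && interval_bytes > 0);
	return (buf_bytes / pi_bytes) * interval_bytes;
}
```

With a 2MiB buffer, 8 byte PI and 512 byte intervals this yields 128MiB; larger intervals raise the limit proportionally.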

The new allocation helpers sit outside of bio-integrity-auto.c because
I plan to reuse them for file system based PI in the near future.

Fixes: 7ba1ba12eeef ("block: Block layer data integrity support")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Anuj Gupta <anuj20.g@samsung.com>
Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-04 12:41:50 -07:00
Christoph Hellwig
eef09f742b block: blocking mempool_alloc doesn't fail
So remove the error check for it in bio_integrity_prep.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Anuj Gupta <anuj20.g@samsung.com>
Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-04 12:40:46 -07:00
Ming Lei
3f5b1169d2 selftests: ublk: make ublk_thread thread-local variable
Refactor ublk_thread to be a thread-local variable instead of storing
it in ublk_dev:

- Remove pthread_t thread field from struct ublk_thread and move it to
  struct ublk_thread_info

- Remove struct ublk_thread array from struct ublk_dev, reducing memory
  footprint

- Define struct ublk_thread as local variable in __ublk_io_handler_fn()
  instead of accessing it from dev->threads[]

- Extract main IO handling logic into __ublk_io_handler_fn() which is
  marked as noinline

- Move CPU affinity setup to ublk_io_handler_fn() before calling
  __ublk_io_handler_fn()

- Update ublk_thread_set_sched_affinity() to take struct ublk_thread_info *
  instead of struct ublk_thread *, and use pthread_setaffinity_np()
  instead of sched_setaffinity()

- Reorder struct ublk_thread fields to group related state together

This change makes each thread's ublk_thread structure truly local to
the thread, improving cache locality and reducing memory usage.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-03 08:34:59 -07:00
Ming Lei
0123bb91f4 selftests: ublk: set CPU affinity before thread initialization
Move ublk_thread_set_sched_affinity() call before ublk_thread_init()
to ensure memory allocations during thread initialization occur on
the correct NUMA node. This leverages Linux's first-touch memory
policy for better NUMA locality.

Also convert ublk_thread_set_sched_affinity() to use
pthread_setaffinity_np() instead of sched_setaffinity(), as the
pthread API is the proper interface for setting thread affinity in
multithreaded programs.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-03 08:34:59 -07:00
Ming Lei
c28ba6b6c5 ublk: use struct_size() for allocation
Convert ublk_queue to use struct_size() for allocation.

Changes in this commit:

1. Update ublk_init_queue() to use struct_size(ubq, ios, depth)
   instead of manual size calculation (sizeof(struct ublk_queue) +
   depth * sizeof(struct ublk_io)).

This provides better type safety and makes the code more maintainable
by using the standard kernel macro for flexible array handling.

Also annotate ublk_queue.ios with __counted_by().
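
For readers outside the kernel, the idea behind struct_size() can be approximated in userspace as below. This is only a sketch: the struct names and fields are stand-ins, and the overflow checks rely on the GCC/Clang __builtin_*_overflow helpers rather than the kernel's own macros.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-ins for the driver's types; the fields are illustrative only. */
struct io_slot {
	uint64_t addr;
	uint32_t flags;
};

struct queue {
	int q_depth;
	struct io_slot ios[];	/* flexible array member, q_depth entries */
};

/*
 * Userspace approximation of struct_size(q, ios, depth): the header plus
 * depth trailing elements, saturating to SIZE_MAX on overflow so that a
 * following allocation fails instead of being undersized.
 */
static size_t queue_alloc_size(size_t depth)
{
	size_t bytes;

	if (__builtin_mul_overflow(depth, sizeof(struct io_slot), &bytes) ||
	    __builtin_add_overflow(bytes, sizeof(struct queue), &bytes))
		return SIZE_MAX;
	return bytes;
}

/* Allocate a zeroed queue with room for depth I/O slots. */
static struct queue *queue_alloc(int depth)
{
	struct queue *q;

	if (depth < 0)
		return NULL;
	q = calloc(1, queue_alloc_size((size_t)depth));
	if (q)
		q->q_depth = depth;
	return q;
}
```

The open-coded sizeof(...) + depth * sizeof(...) form silently wraps on overflow, whereas the saturating form makes the allocation fail.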

Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-03 08:34:59 -07:00
Ming Lei
529d4d6327 ublk: implement NUMA-aware memory allocation
Implement NUMA-friendly memory allocation for ublk driver to improve
performance on multi-socket systems.

This commit includes the following changes:

1. Rename __queues to queues, dropping the __ prefix since the field is
   now accessed directly throughout the codebase rather than only through
   the ublk_get_queue() helper.

2. Remove the queue_size field from struct ublk_device as it is no longer
   needed.

3. Move queue allocation and deallocation into ublk_init_queue() and
   ublk_deinit_queue() respectively, improving encapsulation. This
   simplifies ublk_init_queues() and ublk_deinit_queues() to just
   iterate and call the per-queue functions.

4. Add ublk_get_queue_numa_node() helper function to determine the
   appropriate NUMA node for a queue by finding the first CPU mapped
   to that queue via tag_set.map[HCTX_TYPE_DEFAULT].mq_map[] and
   converting it to a NUMA node using cpu_to_node(). This function is
   called internally by ublk_init_queue() to determine the allocation
   node.

5. Allocate each queue structure on its local NUMA node using
   kvzalloc_node() in ublk_init_queue().

6. Allocate the I/O command buffer on the same NUMA node using
   alloc_pages_node().

This reduces memory access latency on multi-socket NUMA systems by
ensuring each queue's data structures are local to the CPUs that
access them.
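
The node lookup described in item 4 can be sketched in userspace as follows. The two tables stand in for mq_map[] and cpu_to_node(), and the fallback value for a queue with no mapped CPU is an assumption of this sketch, not something the message states.

```c
#include <stddef.h>

#define NO_NODE (-1)	/* stand-in for the kernel's NUMA_NO_NODE */

/*
 * mq_map[cpu] holds the queue index serving that CPU; cpu_node[cpu]
 * stands in for cpu_to_node(). Both tables are assumptions made for
 * this userspace sketch.
 */
static int queue_numa_node(const unsigned int *mq_map, const int *cpu_node,
			   size_t nr_cpus, unsigned int q_id)
{
	for (size_t cpu = 0; cpu < nr_cpus; cpu++) {
		if (mq_map[cpu] == q_id)
			return cpu_node[cpu];	/* first CPU mapped to q_id */
	}
	return NO_NODE;	/* no CPU maps here: let the allocator choose */
}
```

The returned node would then be handed to a node-aware allocator such as kvzalloc_node() for the queue structure.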

Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-03 08:34:59 -07:00
Ming Lei
011af85ccd ublk: reorder tag_set initialization before queue allocation
Move ublk_add_tag_set() before ublk_init_queues() in the device
initialization path. This allows us to use the blk-mq CPU-to-queue
mapping established by the tag_set to determine the appropriate
NUMA node for each queue allocation.

The error handling paths are also reordered accordingly.

Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-03 08:34:59 -07:00
Chaitanya Kulkarni
bc49af56ee blktrace: add support for REQ_OP_WRITE_ZEROES tracing
Currently, REQ_OP_WRITE_ZEROES operations are not handled in the
blktrace infrastructure, resulting in incorrect or missing operation
labels in ftrace blktrace output. This manifests as write-zeroes
operations appearing with incorrect labels like "N" instead of a
proper "WZ" designation.

This patch adds complete support for REQ_OP_WRITE_ZEROES across the
blktrace infrastructure:

 - Add BLK_TC_WRITE_ZEROES trace category in blktrace_api.h and update
   BLK_TC_END_V2 marker accordingly
 - Map REQ_OP_WRITE_ZEROES to BLK_TC_WRITE_ZEROES in __blk_add_trace()
   to ensure proper trace event categorization
 - Update fill_rwbs() to generate "WZ" label for write-zeroes operations
   in ftrace output, making them easily identifiable
 - Add "write-zeroes" string mapping in act_to_str array for debugfs
   filter interface
 - Update blk_fill_rwbs() to handle REQ_OP_WRITE_ZEROES for block layer
   event tracing

With this fix, write-zeroes operations are now correctly traced and
displayed.
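
A simplified model of how such a label is assembled is sketched below. The enum and function are hypothetical and cover only a few operations; they are meant to show why an unhandled operation falls through to "N" and how the new case yields "WZ", not to mirror the kernel code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative operation codes; the kernel's REQ_OP_* values differ. */
enum sketch_op { OP_READ, OP_WRITE, OP_DISCARD, OP_WRITE_ZEROES, OP_OTHER };

/*
 * Build an rwbs-style label into buf (at least 4 bytes): operation
 * letters first, then 'S' for synchronous requests.
 */
static void fill_label(char *buf, enum sketch_op op, bool sync)
{
	size_t i = 0;

	switch (op) {
	case OP_READ:
		buf[i++] = 'R';
		break;
	case OP_WRITE:
		buf[i++] = 'W';
		break;
	case OP_DISCARD:
		buf[i++] = 'D';
		break;
	case OP_WRITE_ZEROES:
		buf[i++] = 'W';
		buf[i++] = 'Z';
		break;
	default:
		buf[i++] = 'N';	/* operation without a dedicated label */
		break;
	}
	if (sync)
		buf[i++] = 'S';
	buf[i] = '\0';
}
```

In this model a synchronous write-zeroes request is labelled "WZS", while an operation without its own case is labelled "NS".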

===========================================================
BEFORE THIS PATCH
===========================================================
blkdiscard -z -o 0 -l 40960 /dev/nvme0n1
   blkdiscard-3809 [030] .....  1212.253701: block_bio_queue: 259,0 NS 0 + 80 [blkdiscard]
   blkdiscard-3809 [030] .....  1212.253703: block_getrq: 259,0 NS 0 + 80 [blkdiscard]
   blkdiscard-3809 [030] .....  1212.253704: block_io_start: 259,0 NS 40960 () 0 + 80 be,0,4 [blkdiscard]
   blkdiscard-3809 [030] .....  1212.253704: block_plug: [blkdiscard]
   blkdiscard-3809 [030] .....  1212.253706: block_unplug: [blkdiscard] 1
   blkdiscard-3809 [030] .....  1212.253706: block_rq_insert: 259,0 NS 40960 () 0 + 80 be,0,4 [blkdiscard]
kworker/30:1H-566  [030] .....  1212.253726: block_rq_issue: 259,0 NS 40960 () 0 + 80 be,0,4 [kworker/30:1H]
       <idle>-0    [030] d.h1.  1212.253957: block_rq_complete: 259,0 NS () 0 + 80 be,0,4 [0]
       <idle>-0    [030] dNh1.  1212.253960: block_io_done: 259,0 NS 0 () 0 + 0 none,0,0 [swapper/30]

Trace Event Breakdown:
 Event             | Device | Op  | Sector | Sectors | Byte Size | Calculation

 block_bio_queue   | 259,0  | NS  | 0      | 80      | -         | 80 × 512 = 40,960
 block_getrq       | 259,0  | NS  | 0      | 80      | -         | 80 × 512 = 40,960
 block_io_start    | 259,0  | NS  | 0      | 80      | 40960     | Direct from trace
 block_rq_insert   | 259,0  | NS  | 0      | 80      | 40960     | Direct from trace
 block_rq_issue    | 259,0  | NS  | 0      | 80      | 40960     | Direct from trace
 block_rq_complete | 259,0  | NS  | 0      | 80      | -         | 80 × 512 = 40,960
 block_io_done     | 259,0  | NS  | 0      | 0       | 0         | Completion (no data)

  Total Bytes Transferred: Sectors: 80 Bytes: 80 × 512 = 40,960 bytes

===========================================================
AFTER THIS PATCH
===========================================================
blkdiscard -z -o 0 -l 40960 /dev/nvme0n1

   blkdiscard-2477 [020] .....   960.989131: block_bio_queue: 259,0 WZS 0 + 80 [blkdiscard]
   blkdiscard-2477 [020] .....   960.989134: block_getrq: 259,0 WZS 0 + 80 [blkdiscard]
   blkdiscard-2477 [020] .....   960.989135: block_io_start: 259,0 WZS 40960 () 0 + 80 be,0,4 [blkdiscard]
   blkdiscard-2477 [020] .....   960.989138: block_plug: [blkdiscard]
   blkdiscard-2477 [020] .....   960.989140: block_unplug: [blkdiscard] 1
   blkdiscard-2477 [020] .....   960.989141: block_rq_insert: 259,0 WZS 40960 () 0 + 80 be,0,4 [blkdiscard]
kworker/20:1H-736  [020] .....   960.989166: block_rq_issue: 259,0 WZS 40960 () 0 + 80 be,0,4 [kworker/20:1H]
       <idle>-0    [020] d.h1.   960.989476: block_rq_complete: 259,0 WZS () 0 + 80 be,0,4 [0]
       <idle>-0    [020] dNh1.   960.989482: block_io_done: 259,0 WZS 0 () 0 + 0 none,0,0 [swapper/20]

Trace Event Breakdown:
 Event             | Device | Op  | Sector | Sectors | Byte Size | Calculation

 block_bio_queue   | 259,0  | WZS | 0      | 80      | -         | 80 × 512 = 40,960
 block_getrq       | 259,0  | WZS | 0      | 80      | -         | 80 × 512 = 40,960
 block_io_start    | 259,0  | WZS | 0      | 80      | 40960     | Direct from trace
 block_rq_insert   | 259,0  | WZS | 0      | 80      | 40960     | Direct from trace
 block_rq_issue    | 259,0  | WZS | 0      | 80      | 40960     | Direct from trace
 block_rq_complete | 259,0  | WZS | 0      | 80      | -         | 80 × 512 = 40,960
 block_io_done     | 259,0  | WZS | 0      | 0       | 0         | Completion (no data)

  Total Bytes Transferred: Sectors: 80 Bytes: 80 × 512 = 40,960 bytes

Tested with ftrace blktrace on NVMe devices using blkdiscard with
the -z (write-zeroes) flag.

Signed-off-by: Chaitanya Kulkarni <ckulkarnilinux@gmail.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-03 08:30:56 -07:00
Shi Hao
77220f6d18 drbd: replace kmap() with kmap_local_page() in receiver path
Use kmap_local_page() instead of kmap() to avoid
CPU contention.

kmap() uses a global set of mapping slots that can cause contention
between multiple CPUs, while kmap_local_page() uses per-CPU slots
eliminating this contention. It also ensures non-sleeping operation
and provides better cache locality.

Convert kmap() to kmap_local_page() as it aligns with ongoing
kernel efforts to modernize kmap() usage for better multi-core
scalability.

Signed-off-by: Shi Hao <i.shihao.999@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-03 08:15:54 -07:00
Chaitanya Kulkarni
e48886b9d6 blktrace: for ftrace use correct trace format ver
The ftrace blktrace path allocates buffers and writes trace events but
was using the wrong recording function. After
commit 4d8bc7bd4f73 ("blktrace: move ftrace blk_io_tracer to blk_io_trace2"),
the ftrace interface was moved to use blk_io_trace2 format, but
__blk_add_trace() still called record_blktrace_event() which writes in
blk_io_trace (v1) format.

This causes critical data corruption:

- blk_io_trace (v1) has 32-bit 'action' field at offset 28
- blk_io_trace2 (v2) has 32-bit 'pid' at offset 28 and 64-bit 'action'
  at offset 32
- When record_blktrace_event() writes to a v2 buffer:
  * Writing pid (offset 32 in v1) corrupts the v2 action field
  * Writing action (offset 28 in v1) corrupts the v2 pid field
  * The 64-bit action is truncated to 32-bit via lower_32_bits()
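
The offset mismatch can be reproduced in userspace with two structs that follow the offsets quoted above. The struct names are hypothetical, the v2 fields after 'action' are an assumption chosen to reach the 64 byte size, and the low/high split of the misread action assumes a little-endian machine.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* v1 record as described above: 32-bit action at 28, pid at 32. */
struct trace_v1 {
	uint32_t magic, sequence;
	uint64_t time, sector;
	uint32_t bytes, action, pid, device, cpu;
	uint16_t error, pdu_len;
};

/* v2 record: pid at 28, 64-bit action at 32; tail fields are assumed. */
struct trace_v2 {
	uint32_t magic, sequence;
	uint64_t time, sector;
	uint32_t bytes, pid;
	uint64_t action;
	uint32_t device, cpu;
	uint16_t error, pdu_len;
	uint8_t pad[12];
};

/* Store a v1-format record into a v2-sized buffer, then read it as v2. */
static struct trace_v2 misread_v1_as_v2(uint32_t action, uint32_t pid)
{
	struct trace_v1 in = { .action = action, .pid = pid };
	struct trace_v2 out;

	memset(&out, 0, sizeof(out));
	memcpy(&out, &in, sizeof(in));	/* v1 bytes land at v1 offsets */
	return out;
}
```

Reading the result as v2 shows the swap: the v2 pid field holds the v1 action, and the low 32 bits of the v2 action hold the v1 pid.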

Fix by:
1. Adding version switch to select correct format (v1 vs v2)
2. Calling appropriate recording function based on version
3. Defaulting to v2 for ftrace (as intended by commit 4d8bc7bd4f73)
4. Adding WARN_ONCE for unexpected version values

Without this patch :-
linux-block (for-next) # sh reproduce_blktrace_bug.sh
              dd-14242   [033] d..1.  3903.022308: Unknown action 36a2
              dd-14242   [033] d..1.  3903.022333: Unknown action 36a2
              dd-14242   [033] d..1.  3903.022365: Unknown action 36a2
              dd-14242   [033] d..1.  3903.022366: Unknown action 36a2
              dd-14242   [033] d..1.  3903.022369: Unknown action 36a2

The action field is corrupted because:
  - ftrace allocated blk_io_trace2 buffer (64 bytes)
  - But called record_blktrace_event() (writes v1, 48 bytes)
  - Field offsets don't match, causing corruption

The hex value shown 0x30e3 is actually a PID, not an action code!

linux-block (for-next) #
linux-block (for-next) #
linux-block (for-next) # sh reproduce_blktrace_bug.sh
Trace output looks correct:

              dd-2420    [019] d..1.    59.641742: 251,0    Q  RS 0 + 8 [dd]
              dd-2420    [019] d..1.    59.641775: 251,0    G  RS 0 + 8 [dd]
              dd-2420    [019] d..1.    59.641784: 251,0    P   N [dd]
              dd-2420    [019] d..1.    59.641785: 251,0    U   N [dd] 1
              dd-2420    [019] d..1.    59.641788: 251,0    D  RS 0 + 8 [dd]

Fixes: 4d8bc7bd4f73 ("blktrace: move ftrace blk_io_tracer to blk_io_trace2")
Signed-off-by: Chaitanya Kulkarni <ckulkarnilinux@gmail.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-10-28 07:56:06 -06:00
Chaitanya Kulkarni
4a0940bdca blktrace: use debug print to report dropped events
The WARN_ON_ONCE introduced in
commit f9ee38bbf70f ("blktrace: add block trace commands for zone operations")
triggers kernel warnings when zone operations are traced with blktrace
version 1. This can spam the kernel log during normal operation with
zoned block devices when userspace is using the legacy blktrace
protocol.

Currently, the blktrace implementation drops the newly added
REQ_OP_ZONE_XXX operations when the blktrace userspace version is set
to 1.

Remove the WARN_ON_ONCE and quietly filter these events. Add a
rate-limited debug message to help diagnose potential issues without
flooding the kernel log. The debug message can be enabled via dynamic
debug when needed for troubleshooting.

This approach is more appropriate as encountering zone operations with
blktrace v1 is an expected condition that should be handled gracefully
rather than warned about, since users may be running older blktrace
userspace tools that only support version 1 of the protocol.
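
The filtering rule can be modelled as below. This is a sketch under the assumption that a version 1 consumer can only represent actions that fit in the lower 32 bits; the function name is hypothetical and the kernel's actual check may be structured differently.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical filter (names are not from the patch): a version 1
 * consumer only has room for a 32-bit action, so anything using the
 * upper half must be dropped rather than truncated.
 */
static bool can_record(unsigned int version, uint64_t action)
{
	if (version >= 2)
		return true;
	return (action >> 32) == 0;
}
```

An action such as 0x1000190001 uses bits above 32, so it is dropped for version 1 and recorded for version 2.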

With this patch :-
linux-block (for-next) # git log -1
commit c8966006a0971d2b4bf94c0426eb7e4407c6853f (HEAD -> for-next)
Author: Chaitanya Kulkarni <ckulkarnilinux@gmail.com>
Date:   Mon Oct 27 19:26:53 2025 -0700

    blktrace: use debug print to report dropped events
linux-block (for-next) # cdblktests
blktests (master) # ./check blktrace
blktrace/001 (blktrace zone management command tracing)      [passed]
    runtime  3.805s  ...  3.889s
blktests (master) # dmesg  -c
blktests (master) #  echo "file kernel/trace/blktrace.c +p" > /sys/kernel/debug/dynamic_debug/control
blktests (master) # ./check blktrace
blktrace/001 (blktrace zone management command tracing)      [passed]
    runtime  3.889s  ...  3.881s
blktests (master) # dmesg  -c
[   77.826237] blktrace: blktrace v1 cannot trace zone operation 0x1000190001
[   77.826260] blktrace: blktrace v1 cannot trace zone operation 0x1000190004
[   77.826282] blktrace: blktrace v1 cannot trace zone operation 0x1001490007
[   77.826288] blktrace: blktrace v1 cannot trace zone operation 0x1001890008
[   77.826343] blktrace: blktrace v1 cannot trace zone operation 0x1000190001
[   77.826347] blktrace: blktrace v1 cannot trace zone operation 0x1000190004
[   77.826350] blktrace: blktrace v1 cannot trace zone operation 0x1001490007
[   77.826354] blktrace: blktrace v1 cannot trace zone operation 0x1001890008
[   77.826373] blktrace: blktrace v1 cannot trace zone operation 0x1000190001
[   77.826377] blktrace: blktrace v1 cannot trace zone operation 0x1000190004
blktests (master) #  echo "file kernel/trace/blktrace.c -p" > /sys/kernel/debug/dynamic_debug/control
blktests (master) # ./check blktrace
blktrace/001 (blktrace zone management command tracing)      [passed]
    runtime  3.881s  ...  3.824s
blktests (master) # dmesg  -c
blktests (master) #

Reported-by: syzbot+153e64c0aa875d7e4c37@syzkaller.appspotmail.com
Fixes: f9ee38bbf70f ("blktrace: add block trace commands for zone operations")
Signed-off-by: Chaitanya Kulkarni <ckulkarnilinux@gmail.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-10-28 07:55:40 -06:00
Johannes Thumshirn
4ae8efb4f9 blktrace: handle BLKTRACESETUP2 ioctl
Handle the BLKTRACESETUP2 ioctl, requesting an extended version of the
blktrace protocol from user-space.

Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-10-22 11:14:06 -06:00
Johannes Thumshirn
3f6722816a blktrace: trace zone write plugging operations
Trace zone write plugging operations on block devices.

As tracing of zoned block commands needs the upper 32bit of the widened
64bit action, only add traces to blktrace if user-space has requested
version 2 of the blktrace protocol.

Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-10-22 11:14:05 -06:00
Johannes Thumshirn
1c164fcc1b blktrace: expose ZONE APPEND completions to blktrace
Expose ZONE APPEND completions as a block trace completion action to
blktrace.

As tracing of zoned block commands needs the upper 32bit of the widened
64bit action, only add traces to blktrace if user-space has requested
version 2 of the blktrace protocol.

Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-10-22 11:14:05 -06:00
Johannes Thumshirn
f9ee38bbf7 blktrace: add block trace commands for zone operations
Add block trace commands for zone operations. These commands can only be
handled with version 2 of the blktrace protocol. For version 1, warn if a
command that does not fit into the 16 bits reserved for the command in
this version is passed in.

Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-10-22 11:14:05 -06:00
Johannes Thumshirn
4d8bc7bd4f blktrace: move ftrace blk_io_tracer to blk_io_trace2
Move ftrace's blk_io_tracer to the new blk_io_trace2 infrastructure.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-10-22 11:14:05 -06:00
Johannes Thumshirn
67bfa74d81 blktrace: move trace_note to blk_io_trace2
Move trace_note() to the new blk_io_trace2 infrastructure.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-10-22 11:14:05 -06:00
Johannes Thumshirn
915bb53860 blktrace: differentiate between blk_io_trace versions
Differentiate between blk_io_trace and blk_io_trace2 when relaying to
user-space depending on which version has been requested by the blktrace
utility.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-10-22 11:14:05 -06:00
Johannes Thumshirn
c44347d606 blktrace: add definitions for struct blk_io_trace2
Add definitions for the extended version of the blktrace protocol using a
wider action type to be able to record new actions in the kernel.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-10-22 11:14:05 -06:00
Johannes Thumshirn
113cbd6282 blktrace: pass blk_user_trace2 to setup functions
Pass struct blk_user_trace_setup2 to blktrace_setup_finalize(). This
prepares for the incoming extension of the blktrace protocol with a 64bit
act_mask.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-10-22 11:14:05 -06:00
Johannes Thumshirn
0d8627cc93 blktrace: add definitions for blk_user_trace_setup2
Add definitions for a version 2 of the blk_user_trace_setup ioctl. This
new ioctl will enable a different struct layout of the binary data passed
to user-space when using a new version of the blktrace utility requesting
the new struct layout.

Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-10-22 11:14:05 -06:00
Johannes Thumshirn
42da88a724 blktrace: split do_blk_trace_setup into two functions
Split do_blk_trace_setup() into two functions to prepare for an
incoming new BLKTRACESETUP2 ioctl(2) which can receive extended
parameters from user-space.

Also move the size verification logic to the callers in preparation for
using a new internal structure later.

Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-10-22 11:14:05 -06:00
Johannes Thumshirn
370cd70a40 blktrace: change the internal action to 64bit
Change the internal use of the action in blktrace to 64bit, although for
now only the lower 32bits will be used.

With the upcoming version 2 of the blktrace user-space protocol the upper
32bit will also be utilized.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-10-22 11:14:05 -06:00
Johannes Thumshirn
70e3c62b89 blktrace: untangle if/else sequence in __blk_add_trace
Untangle the if/else sequence setting the trace action in
__blk_add_trace() and turn it into a switch statement for better
extensibility.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-10-22 11:14:05 -06:00
Johannes Thumshirn
04678e72e9 blktrace: split out relaying a blktrace event
Split out the code relaying a blktrace event to user-space using relayfs.

This enables adding a second version supporting a new version of the
protocol.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-10-22 11:14:05 -06:00
Johannes Thumshirn
472eca5383 blktrace: factor out recording a blktrace event
Factor out the recording of a blktrace event into its own function,
deduplicating the code.

This also enables recording different versions of the blktrace protocol
later on.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-10-22 11:14:05 -06:00
Johannes Thumshirn
a65988a0ad blktrace: only calculate trace length once
De-duplicate the calculation of the trace length instead of doing the
calculation twice, once for calling trace_buffer_lock_reserve() and once
for calling relay_reserve().

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-10-22 11:14:05 -06:00
Keith Busch
5c5028ee59 block: rename min_segment_size
Despite its name, the block layer is fine with segments smaller than the
"min_segment_size" limit. The value is an optimization limit indicating
the largest segment that can be used without considering boundary
limits. Smaller segments can take a fast path, so give it a name that
reflects that: max_fast_segment_size.

Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-10-22 07:39:39 -06:00
Mehdi Ben Hadj Khelifa
e5a82249d8 blk-mq: use struct_size() in kmalloc()
Change struct size calculation to use struct_size()
to align with new recommended practices[1] which quotes:
"Another common case to avoid is calculating the size of a structure with
a trailing array of others structures, as in:

header = kzalloc(sizeof(*header) + count * sizeof(*header->item),
                 GFP_KERNEL);

Instead, use the helper:

header = kzalloc(struct_size(header, item, count), GFP_KERNEL);"

Signed-off-by: Mehdi Ben Hadj Khelifa <mehdi.benhadjkhelifa@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-10-20 10:38:56 -06:00
Bart Van Assche
d60055cf52 block/mq-deadline: Switch back to a single dispatch list
Commit c807ab520fc3 ("block/mq-deadline: Add I/O priority support")
modified the behavior of request flag BLK_MQ_INSERT_AT_HEAD from
dispatching a request before other requests into dispatching a request
before other requests with the same I/O priority. This is not correct since
BLK_MQ_INSERT_AT_HEAD is used when requeuing requests and also when a flush
request is inserted.  Both types of requests should be dispatched as soon
as possible. Hence, make the mq-deadline I/O scheduler again ignore the I/O
priority for BLK_MQ_INSERT_AT_HEAD requests.

Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: Yu Kuai <yukuai@kernel.org>
Reported-by: chengkaitao <chengkaitao@kylinos.cn>
Closes: https://lore.kernel.org/linux-block/20251009155253.14611-1-pilgrimtao@gmail.com/
Fixes: c807ab520fc3 ("block/mq-deadline: Add I/O priority support")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-10-20 10:37:42 -06:00
Bart Van Assche
93a358af59 block/mq-deadline: Introduce dd_start_request()
Prepare for adding a second caller of this function. No functionality
has been changed.

Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: Yu Kuai <yukuai@kernel.org>
Cc: chengkaitao <chengkaitao@kylinos.cn>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-10-20 10:37:42 -06:00
Linus Torvalds
211ddde082 Linux 6.18-rc2 (tag: v6.18-rc2)
2025-10-19 15:19:16 -10:00
Linus Torvalds
d9043c79ba Merge tag 'sched_urgent_for_v6.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler fixes from Borislav Petkov:

 - Make sure the check for lost pelt idle time is done unconditionally
   to have correct lost idle time accounting

 - Stop the deadline server task before a CPU goes offline

* tag 'sched_urgent_for_v6.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/fair: Fix pelt lost idle time detection
  sched/deadline: Stop dl_server before CPU goes offline
2025-10-19 04:59:43 -10:00
Linus Torvalds
343b4b44a1 Merge tag 'perf_urgent_for_v6.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Borislav Petkov:

 - Make sure perf reporting works correctly in setups using
   overlayfs or FUSE

 - Move the uprobe optimization to a better location logically

* tag 'perf_urgent_for_v6.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/core: Fix MMAP2 event device with backing files
  perf/core: Fix MMAP event path names with backing files
  perf/core: Fix address filter match with backing files
  uprobe: Move arch_uprobe_optimize right after handlers execution
2025-10-19 04:54:08 -10:00
Linus Torvalds
c7864eeaa4 - Reset the why-the-system-rebooted register on AMD to avoid stale bits
remaining from previous boots
 
 - Add a missing barrier in the TLB flushing code to prevent erroneously not
   flushing a TLB generation
 
 - Make sure cpa_flush() does not overshoot when computing the end range of
   a flush region
 
 - Fix resctrl bandwidth counting on AMD systems when the amount of monitoring
   groups created exceeds the number the hardware can track
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmj0xcIACgkQEsHwGGHe
 VUq+VA/+Jlb1/m3eBGupCDmfGjT+vhoJ+twfmUDwA4yo5RzABHPOk4y+Q+M5Kmxy
 yhOP+XbT5I2W4x/LNT5oSTHJyG5QbDLTs1Hvpbqan6BHD6PA+1mv5lfv/LtNqSK4
 lDZbJeJElpmozhCD6qznLZjbooh1qP4RkpszAyUjd/Sns4TtAJbZ4IBfKCpCtSfx
 E2X6FW28jBDVzJdDjMm5UfI+7VJmLgA46XUAgEfYxwfIWeQkV28f31+xGch7WW4C
 u41fOo6AI4mzHpdKdwzn4GJdH46UfMg+E8CTcwODxvG40ttskilnfWvwnyCsPWbD
 nTU/ubzTrra8BpeuAYVhVxuBam0fRmZVcEtE59DjG1EhVrt3dEqfhMraHKxjvboW
 vlauvhzkG4lezTHLX8EqnqS7csq0ziBU8tcYCxRA4OKTGTp1y0VfSw/ra4uLY8NJ
 pkjS3KK0VNhD6zthbhtfZ0LaB3ms0eYaTZyPsLhzxwi0/Wm/LTWO+sKX9r9DPafp
 LqzdSdik5YSv2fepxjzVOh1FALvLm+At4sh4Z6NtzLVm2dDrsN31hLVEJL7FUkwi
 d6gVhlII7CNrlIuuC2EsSXymYTdpoMFdKGOoq0RxPz0StSKzkFTFt8lpfb5bdSq+
 VVU8DDFwwQ2iAW2IaogOwItH+tbY15P0kU2RlYey2UHub+Ho1vI=
 =WMx2
 -----END PGP SIGNATURE-----

Merge tag 'x86_urgent_for_v6.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Borislav Petkov:

 - Reset the why-the-system-rebooted register on AMD to avoid stale bits
   remaining from previous boots

 - Add a missing barrier in the TLB flushing code to prevent erroneously
   not flushing a TLB generation

 - Make sure cpa_flush() does not overshoot when computing the end range
   of a flush region

 - Fix resctrl bandwidth counting on AMD systems when the amount of
   monitoring groups created exceeds the number the hardware can track

* tag 'x86_urgent_for_v6.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/CPU/AMD: Prevent reset reasons from being retained across reboot
  x86/mm: Fix SMP ordering in switch_mm_irqs_off()
  x86/mm: Fix overflow in __cpa_addr()
  x86/resctrl: Fix miscount of bandwidth event when reactivating previously unavailable RMID
2025-10-19 04:41:27 -10:00
Linus Torvalds
1c64efcb08 Rust 'rustfmt' cleanup
'rustfmt', by default, formats imports in a way that is prone to
 conflicts while merging and rebasing, since in some cases it condenses
 several items into the same line.
 
 Document in our guidelines that we will handle this for the moment with
 the trailing empty comment workaround and make the tree 'rustfmt'-clean
 again.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPjU5OPd5QIZ9jqqOGXyLc2htIW0FAmjzy+0ACgkQGXyLc2ht
 IW2K9BAAmdHFQlI5kE2qFBVrk5JeTBmr/3GJ7wzPXKgQ/XdKgYILK5qx8SaqXo95
 RXnGSFzPXmVx1xds5NojVqJ65gnoyL9KK4qsxFOUVVoH2vjSWXL5DFrRDcVKFlGY
 IRlsdRBGdX8/M6gPgLAy2m2eEeNgrwEA/xcWgfvZhI7CILSfvXDNKfGvnzg0mRHp
 cssYtKTACwzE3/uJUA8lX+HAWNrb1aijEUlvfK5K9CMYwP9wFXzXy1eI2kJ4Yzvx
 aqiK1vvieAl4gbLAoCD513nSxCQaUzpuHJvELA6bVa8uJ5GtWcuCA9U8qn8zcaaG
 tOmC/kF/+5jNwc4/4wCSHhcQD+1qaXZVQjeMBYsObqFv7ixabOlxVfGk/oDbAHEI
 BAtWsqHFMhQlo/E/YdH0palVhvslecnoAhTzURwL9231Sqp6ZeQl5ESya8ArmMFB
 SFeYMHtDhA557pA4G3R88IIVS+xWklNOVtcW1+c+YnW4d+Sb77aU9E78VKYmH0mt
 5rYB0ni09rcDBNRGUSy7fscfUV5ItJd/lccwjRGasM0LzZ6KoMWE2B0Cww5vG27W
 FXQ9hTsn1q92XY3AFiqR5IBiztNLZfjFEq6HAVSQrmXgHJ53ZT/0oFRcObYAsr49
 AjoLzkHcevdj7EUdm9q1o/KVk+t+bkDYjmb3kRgHVOwdekK/Jk8=
 =44ap
 -----END PGP SIGNATURE-----

Merge tag 'rust-rustfmt' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux

Pull rustfmt fixes from Miguel Ojeda:
 "Rust 'rustfmt' cleanup

  'rustfmt', by default, formats imports in a way that is prone to
  conflicts while merging and rebasing, since in some cases it condenses
  several items into the same line.

  Document in our guidelines that we will handle this for the moment
  with the trailing empty comment workaround and make the tree
  'rustfmt'-clean again"
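
A minimal sketch of the trailing empty comment workaround described above. The `std` imports and the `word_counts` helper are hypothetical stand-ins, not code from this pull. The assumption is that the empty `//` after the last item keeps 'rustfmt' from condensing the list onto one line, so each import stays on its own line and later additions touch only one line.

```rust
// Illustrative only: `std` items stand in for kernel crate paths.
// The trailing `//` after the last import is the workaround: it is meant
// to keep the import list in vertical layout when 'rustfmt' is run.
use std::{
    collections::BTreeMap,
    fmt::Write as _, //
};

/// Counts whitespace-separated words and returns one `word=count` pair
/// per line, sorted by word (hypothetical helper, for illustration).
pub fn word_counts(text: &str) -> String {
    let mut counts: BTreeMap<&str, usize> = BTreeMap::new();
    for word in text.split_whitespace() {
        *counts.entry(word).or_insert(0) += 1;
    }

    let mut out = String::new();
    for (word, n) in &counts {
        // Writing to a `String` cannot fail, so the result is ignored.
        let _ = writeln!(out, "{word}={n}");
    }
    out
}
```

Calling `word_counts("b a b")` from a `main` function or a test exercises both imports. Without the trailing comment, 'rustfmt' may join short import lists onto a single line, which is where the merge and rebase conflicts mentioned above come from.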

* tag 'rust-rustfmt' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux:
  rust: bitmap: fix formatting
  rust: cpufreq: fix formatting
  rust: alloc: employ a trailing comment to keep vertical layout
  docs: rust: add section on imports formatting
2025-10-18 10:05:13 -10:00
Linus Torvalds
648937f64a Hi,
If possible, could you still pick this change for v6.18 [1]? The change in
 question corrects the state transitions for ARM FF-A to match the spec and
 how tpm_crb behaves on other platforms.
 
 [1] https://lore.kernel.org/linux-integrity/aPN59bwcUrieMACf@kernel.org/
 
 BR, Jarkko
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRE6pSOnaBC00OEHEIaerohdGur0gUCaPN84QAKCRAaerohdGur
 0q4SAQD0o1dG70qraZjVU+xySiz/jGb04d49A/LxKJj/LIXxPQD/W3xjulnS3S25
 rWoIn7wO6NeiGUiUPSnCEc6LDIOYYQA=
 =wbvF
 -----END PGP SIGNATURE-----

Merge tag 'tpmdd-next-v6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd

Pull tpm fix from Jarkko Sakkinen:
 "Correct the state transitions for ARM FF-A to match the spec and how
  tpm_crb behaves on other platforms"

* tag 'tpmdd-next-v6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
  tpm_crb: Add idle support for the Arm FF-A start method
2025-10-18 08:38:28 -10:00
Linus Torvalds
e67bb0da33 pci-v6.18-fixes-2
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmjyX1sUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vw+DBAAmTruFbffqGqhNwWC586ki4sNYmxf
 g7rqAnUTE23ItblsN+6YfWJvRMf7Id1fMjJ/VYIwm5T7s7FcBWUFqIoVhHc2NmHw
 dmnLFyzrpTT2fDN9H5PSFd24Wu4N0OlXBsum3L6j7Rj6wyDJrakYfqLMsH+rGOoR
 44U2uL8PxhuT5hsUiztFpF6T4i2C6poB6zx173PwAmdNeDoNz1qhRG0mexnByXhU
 fZ6iudP58dw1zg9ZQhPuaBPLWXb9uBXPz5+8j4rB2/Il1q2UL1pjhbu97WoZoPem
 e8yq8p1xQEYlNDVgrEQQE35qQnzSTZLLFyyjupvs3bJEzLQGxZl+ds7tl1JVYWMr
 2PPvDW2LkdQStPGBUuRfgGN7vuCKofy8ibjdWsXOuB7JXZXkbvztqOTcdOLSCnoZ
 jbmVhvoR73wdq0ePG8REm8gNMm+SDPLnxZY3BRTXgrCazeCEFTTCX+UHEWwRBRAj
 VJlO6b95/e5wEjyw5aHHTzD261j+BwfsZ8qIMNbC5OmVJsx46we5/enwpkGLI+TP
 cdWbLz9OKv1Y/FNfb2qy2RttiJTFLI+n30ejVwpHPBNGwkC016f3c6GrImyYL6JD
 21QIh7lA4MODGRDSn/Iqt7YZugWVnOTQJzYgJv1cq1MQSnq9PyXY34GtcyG80cOO
 2/QXi6MGgT52iJQ=
 =oXKT
 -----END PGP SIGNATURE-----

Merge tag 'pci-v6.18-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci

Pull pci fixes from Bjorn Helgaas:

 - Search for MSI Capability with correct ID to fix an MSI regression on
   platforms with Cadence IP (Hans Zhang)

 - Revert early bridge resource set up to fix resource assignment
   failures that broke at least alpha boot and Snapdragon ath12k WiFi
   (Ilpo Järvinen)

 - Implement VMD .irq_startup()/.irq_shutdown() to fix IRQ issues that
   caused boot crashes and broken devices below VMD (Inochi Amaoto)

 - Select CONFIG_SCREEN_INFO on X86 to fix black screen on boot when
   SCREEN_INFO not selected (Mario Limonciello)

* tag 'pci-v6.18-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
  PCI/VGA: Select SCREEN_INFO on X86
  PCI: vmd: Override irq_startup()/irq_shutdown() in vmd_init_dev_msi_info()
  PCI: Revert early bridge resource set up
  PCI: cadence: Search for MSI Capability with correct ID
2025-10-18 08:35:09 -10:00
Linus Torvalds
ea0bdf2b94 cxl fixes for v6.18-rc2
- Avoid missing port component registers setup due to dport enumeration
   failure
 - Add check for no entries in cxl_feature_info to address accessing
   invalid pointer.
 - Use %pa printk format to emit resource_size_t in
   validate_region_offset()
 
 CXL extended linear cache support fixes:
 - Fix setup of memory resource in cxl_acpi_set_cache_size()
 - Set range param for region_res_match_cxl_range() as const.
   (Addresses a compile warning for match_region_by_range() fix)
 - Fix match_region_by_range() to use region_res_match_cxl_range()
 - Subtract to find an hpa_alias0 in cxl_poison events to correct
   the alias math calculation.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5DAy15EJMCV1R6v9YGjFFmlTOEoFAmjyayAACgkQYGjFFmlT
 OEqfVBAAlYEr+8CreTe20L/OpufW02KF4spXmOgDaqvHq6oyFpAXzz5hsvMmrFvA
 vln8bsW0+xVGHFuhFwzex58Kc3netNR5jEO1ubsxGa9O83ab92gZM4477Qo7j4Yv
 O6CiHi5qo/Ug91KqM/h1cUq4y27aw6U5skve+DPvVIUy5P4mL+z1FXA3w5LXgauP
 vsc4rTaFsQGgD/PWwK+5K0mSXQCUZpfVYWATw21m00Do7hvsK5n26u8Wus1cnAAj
 z80h7lYZ6L9u4hGPEBJ02AgBBzuORV6PFIWCTLfA0d1VAg5AJOi4RzusErHe00t2
 TF4Dpfd9lH/weclhLV9wxcIGBmi7VfslBRfdM8baKWA4An+Kdd84I3dLHkjir+UA
 q6XfnW85Ig0TV9lKyDcmcFZ6+WoSqNya3bEQkDC3V9ZkekTLQqBmXCjEcSWhZa7U
 QokPw0AHlgtG5rmUfWOo+pj8i6w+NfP3nOrBMwaODkycQW0AQhlz3Y8t144mRTAf
 gpfre+TeY3veYtPxQzhT/RNLJCumVIOqupmq6bsQcj20tqeVnCS4iWrhMdrCykgV
 +LvBeLw8ncRMLeAMd+/wN1FiVL1wFkCiFaa+g2sDDLyjMgndH9WeQfs4zS6bMgwl
 E8IpuJYz0pL/NiagMUl9fIqRWL01xMUd5Fd0Qobkie7C9w9YLO4=
 =jGVm
 -----END PGP SIGNATURE-----

Merge tag 'cxl-fixes-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull Compute Express Link fixes from Dave Jiang:
 "A small collection of CXL fixes. In addition to some misc fixes for
  the CXL subsystem, a number of fixes for CXL extended linear cache
  support are included to make it functional again.

   - Avoid missing port component registers setup due to dport
     enumeration failure

   - Add check for no entries in cxl_feature_info to address accessing
     invalid pointer.

   - Use %pa printk format to emit resource_size_t in
     validate_region_offset()

  CXL extended linear cache support fixes:

   - Fix setup of memory resource in cxl_acpi_set_cache_size()

   - Set range param for region_res_match_cxl_range() as const
     (addresses a compile warning for match_region_by_range() fix)

   - Fix match_region_by_range() to use region_res_match_cxl_range()

   - Subtract to find an hpa_alias0 in cxl_poison events to correct the
     alias math calculation"

* tag 'cxl-fixes-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
  cxl/trace: Subtract to find an hpa_alias0 in cxl_poison events
  cxl/region: Use %pa printk format to emit resource_size_t
  cxl: Fix match_region_by_range() to use region_res_match_cxl_range()
  cxl: Set range param for region_res_match_cxl_range() as const
  cxl/acpi: Fix setup of memory resource in cxl_acpi_set_cache_size()
  cxl/features: Add check for no entries in cxl_feature_info
  cxl/port: Avoid missing port component registers setup
2025-10-18 08:22:07 -10:00
Linus Torvalds
2953fb6548 hid-for-linus-2025101701
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEL65usyKPHcrRDEicpmLzj2vtYEkFAmjyYSUACgkQpmLzj2vt
 YEnGlxAAukp9PohVJYj5XXfIZXd6nl9wQyTLfVxCaw03R4IBSwq6/SpbSFP3Khr0
 nR6wss6GdpiHGc8WA7Yo3EIZGsZtSolaOrg2pozKop24FD42S5bgq/qWeUNERhGF
 LU6+zpZbYhZ5Joo2Q57vRZ4D9UDYg7CoU1jVAkscEuvQhishLFKpG88VE9Jy/2Eu
 SGQ9Aqq/0NAC/msUHtVWn90EiLIQBS4izWO02Js6JObf8mdisH1MQscOeJ2rfJhV
 7WcdMX/LkqL7DQ9dFFo3j5NqE3SF1aJFjhMY+xivQaYeuwPSffCuWrbVq6TM74hP
 uRfIII1YUYAEyh+69KgQbQLGl0OqZbDhtEZ5p2SCn3DaMwXu+sraxfWRQ8nkG55w
 Gw74ibc3F9pUwwls9i7FUJE3CzoNBcN/FdxvWST5izni4rC2XAC7pzeSmyO3jiQO
 Zvpwa0VkaOeZ1OmVqo4YQBZoNlgBIaBPxqtoHfHxwvEvqUmivSzU+//XFGbvXgK1
 XL8hgYd7P1P6g9UVSvsgiVA6FIdR4EtpbOV3YxQpMVv8uAOCN5omeOAd9jRImN2a
 BxpKr5bcB4JhZcji/E4T4IPegA+8LUniZ6AW5KIUCCz34uDyeDSHsWB+cZiy31zc
 gGa0NqGRfhKOzqj9ExnnRDoZWyhnoHZ3yxUb41TPQfS2p5uZepM=
 =UpoT
 -----END PGP SIGNATURE-----

Merge tag 'hid-for-linus-2025101701' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid

Pull HID fixes from Jiri Kosina:

 - fix for sticky fingers handling in hid-multitouch (Benjamin
   Tissoires)

 - fix for reporting of 0 battery levels (Dmitry Torokhov)

 - build fix for hid-haptic in certain configurations (Jonathan Denose)

 - improved probe and avoiding spamming kernel log by hid-nintendo
   (Vicki Pfau)

 - fix for OOB in hid-cp2112 (Deepak Sharma)

 - interrupt handling fix for intel-thc-hid (Even Xu)

 - a couple of new device IDs and device-specific quirks

* tag 'hid-for-linus-2025101701' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
  HID: logitech-hidpp: Add HIDPP_QUIRK_RESET_HI_RES_SCROLL
  selftests/hid: add tests for missing release on the Dell Synaptics
  HID: multitouch: fix sticky fingers
  HID: multitouch: fix name of Stylus input devices
  HID: hid-input: only ignore 0 battery events for digitizers
  HID: hid-debug: Fix spelling mistake "Rechargable" -> "Rechargeable"
  HID: Kconfig: Fix build error from CONFIG_HID_HAPTIC
  HID: nintendo: Rate limit IMU compensation message
  HID: nintendo: Wait longer for initial probe
  HID: core: Add printk_ratelimited variants to hid_warn() etc
  HID: quirks: Add ALWAYS_POLL quirk for VRS R295 steering wheel
  HID: quirks: avoid Cooler Master MM712 dongle wakeup bug
  HID: cp2112: Add parameter validation to data length
  HID: intel-thc-hid: intel-quickspi: Add ARL PCI Device Id's
  HID: intel-thc-hid: Intel-quickspi: switch first interrupt from level to edge detection
  HID: intel-thc-hid: intel-quicki2c: Fix wrong type casting
2025-10-18 08:18:18 -10:00
Linus Torvalds
d303caf5ca bpf-fixes
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAmjykZUACgkQ6rmadz2v
 bTppcA/+LKOzgJVJd9q3tbPeXb+xJOqiCGPDuv3tfmglHxD+rR5n2lsoTkryL4PZ
 k5yJhA95Z5RVeV3cevl8n4HlTJipm795k+fKz1YRbe9gB49w2SqqDf5s7LYuABOm
 YqdUSCWaLyoBNd+qi9SuyOVXSg0mSbJk0bEsXsgTp/5rUt9v6cz+36BE1Pkcp3Aa
 d6y2I2MBRGCADtEmsXh7+DI0aUvgMi2zlX8veQdfAJZjsQFwbr1ZxSxHYqtS6gux
 wueEiihzipakhONACJskQbRv80NwT5/VmrAI/ZRVzIhsywQhDGdXtQVNs6700/oq
 QmIZtgAL17Y0SAyhzQsQuGhJGKdWKhf3hzDKEDPslmyJ6OCpMAs/ttPHTJ6s/MmG
 6arSwZD/GAIoWhvWYP/zxTdmjqKX13uradvaNTv55hOhCLTTnZQKRSLk1NabHl7e
 V0f7NlZaVPnLerW/90+pn5pZFSrhk0Nno9to+yXaQ9TYJlK4e6dcB9y+rButQNrr
 7bTRyQ55fQlyS+NnaD85wL41IIX4WmJ3ATdrKZIMvGMJaZxjzXRvW4AjGHJ6hbbt
 GATdtISkYqZ4AdlSf2Vj9BysZkf7tS83SyRlM6WDm3IaAS3v5E/e1Ky2Kn6gWu70
 MNcSW/O0DSqXRkcDkY/tSOlYJBZJYo6ZuWhNdQAERA3OxldBSFM=
 =p8bI
 -----END PGP SIGNATURE-----

Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Pull bpf fixes from Alexei Starovoitov:

 - Replace bpf_map_kmalloc_node() with kmalloc_nolock() to fix kmemleak
   imbalance in tracking of bpf_async_cb structures (Alexei Starovoitov)

 - Make selftests/bpf arg_parsing.c more robust to errors (Andrii
   Nakryiko)

 - Fix redefinition of 'off' as different kind of symbol when I40E
   driver is builtin (Brahmajit Das)

 - Do not disable preemption in bpf_test_run (Sahil Chandna)

 - Fix memory leak in __lookup_instance error path (Shardul Bankar)

 - Ensure test data is flushed to disk before reading it (Xing Guo)

* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  selftests/bpf: Fix redefinition of 'off' as different kind of symbol
  bpf: Do not disable preemption in bpf_test_run().
  bpf: Fix memory leak in __lookup_instance error path
  selftests: arg_parsing: Ensure data is flushed to disk before reading.
  bpf: Replace bpf_map_kmalloc_node() with kmalloc_nolock() to allocate bpf_async_cb structures.
  selftests/bpf: make arg_parsing.c more robust to crashes
  bpf: test_run: Fix ctx leak in bpf_prog_test_run_xdp error path
2025-10-18 08:00:43 -10:00
Linus Torvalds
847f242f7a Description for this pull request:
- Fix out-of-bounds in FS_IOC_SETFSLABEL.
  - Add validation for stream entry valid size to prevent infinite loop.
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEE6NzKS6Uv/XAAGHgyZwv7A1FEIQgFAmjy83gWHGxpbmtpbmpl
 b25Aa2VybmVsLm9yZwAKCRBnC/sDUUQhCLoTD/sEUM1PpRJmSBRzH+9Le/tSAjK7
 M0MSLkryRvvFKdigCCtHD5LLoNJrc+i4bP6AgTO9l0PJ5OlAwxAQ1Qh4mAOIfazk
 ahpw/WzI1B/qC28zuKepFJS3x05InACb5pwv293AwgEiS/ROfpfAJQ8NGp54cWFm
 ge7/jBU19BX1HLC7PSleNStdwSi+oEKz7twtl+7n5kvgzi3eTJWnXGhYJWDGwpYY
 n9OZQBELliGhLnzeuDuGXddpVKQ0jbH+kchQ7dJAiN3jYq7FeYEsrapFmuae2r+N
 MEjEoKg4pmuf3z3Iru+XQ30xGCOs2lKz9LfowxRcl5xTqxmH5XxqZZF/UdQITXh0
 VYLRjhSfA5Fp2kUHmPayvkLHkzILzNjSF1Vqhk2En2C99RSywcJkvaFr5cnE3v3a
 gueO8eZ+QRSXQVpKCTHS98wPcJrnzPXSTpwF9uKSHKBzNWIee3tLHRZb1nvUuHKo
 bNB2iEDwracb5eEpOoa+7L0VOS8M6sFnAveWF9yNVR9vG36Cs7eXtrFIk/xQmr2s
 FlqjPpx1xWodnE9IxrJi8qSJxAgJtrO6UG1gut7lBK6RHIhkt2B7YUQPr30V8zVx
 f3FszNERZe/D/bvgM5WHyoieMnp5sZAxq0f7pk+2pOTIqvlbojWCAGoqP2P6BN2Y
 mu9lnM3D6k/GfLNbww==
 =A0dK
 -----END PGP SIGNATURE-----

Merge tag 'exfat-for-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat

Pull exfat fixes from Namjae Jeon:

 - Fix out-of-bounds in FS_IOC_SETFSLABEL

 - Add validation for stream entry size to prevent infinite loop

* tag 'exfat-for-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat:
  exfat: fix out-of-bounds in exfat_nls_to_ucs2()
  exfat: fix improper check of dentry.stream.valid_size
2025-10-18 07:23:59 -10:00
Linus Torvalds
2d07c6c209 NFS Client Bugfixes for Linux 6.18-rc
Bugfixes:
  * Fix for FlexFiles mirror->dss allocation
  * Apply delay_retrans to async operations
  * Check if suid/sgid is cleared after a write when needed
  * Fix setting the state renewal timer for early mounts after a reboot
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEnZ5MQTpR7cLU7KEp18tUv7ClQOsFAmjyqWYACgkQ18tUv7Cl
 QOuuhBAAtW3JjqOvuJdUKqLnvEB7p9MJCbJuLvxu3IncS7LR4ApMthogQNws/kYH
 sU44KZh1lIe0G6juYTR4SmpwuJ1VbPJ7tZ+ZAssfp29mLQu3txybK9o21jZMu+jH
 qjPRH4VIbfJot77lhBykSk+JzpxQMC89JaZbC5sgATHfTstQPiHtBSQ7zwC5EY8f
 dBmh2mlhRbBsblHw6KZq5QhNR6E1XsXMWGE8GoyzQR4QAWTNOtM+QIBkUapREODG
 4HCnW/7JEUNuaAAYMovqqxv+qGC/axEEjZdVEGyiOvliSIs9RM2CCSQnDZQ619DI
 xCV3Qxvbsr22dDX5O127If/vXNisWJM7JzB65yX2y0ZlAAIoDX6XwADDtnvMlHPX
 KI+CqwGyk41Wwc9F6WPOL+NUwaqCcVumeDGBSZ5LFN4uqR1SeSfztAVqgtDU++u4
 cePZlNXwob5BlzfqF8DG5uGagxirXIiwOo6N63xSCWuU0NMwfdi7wrMLi/cfFqeg
 xCyB6mSm+BE6qSwppzSupHxwuQH0RzOSsSAQArKarEICnQfmgKAd98r6fDkRffJQ
 mrfUGMF/bvf7kDyv9kywjFw+KPWr0DNWRjFaLF9gcIxyI6Ml5HDFVzkK7WpI0dou
 N/rO/XGjNt8mVYgAG1BgFRXwSE2NRufNbQTygVUfe7coBSjth+Y=
 =p2V5
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-6.18-2' of git://git.linux-nfs.org/projects/anna/linux-nfs

Pull NFS client fixes from Anna Schumaker:

 - Fix for FlexFiles mirror->dss allocation

 - Apply delay_retrans to async operations

 - Check if suid/sgid is cleared after a write when needed

 - Fix setting the state renewal timer for early mounts after a reboot

* tag 'nfs-for-6.18-2' of git://git.linux-nfs.org/projects/anna/linux-nfs:
  NFS4: Fix state renewals missing after boot
  NFS: check if suid/sgid was cleared after a write as needed
  NFS4: Apply delay_retrans to async operations
  NFSv4/flexfiles: fix to allocate mirror->dss before use
2025-10-18 07:18:48 -10:00
Linus Torvalds
4ccb3a8000 smb client fixes and some cleanup
-----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmjyZNsACgkQiiy9cAdy
 T1E85Av+Ik0DprUR/+PZ8y52jLseUKIgGTUH3kixwGEbbLvH2RFqK9KQud9RG61V
 69QWdlDmzOOYN+P155rM/z812oQSmWB+lFtL3zpdE9NqYMldEWAwizrTjF0OCzzw
 4Aiqx7wTMJOq2UFbEo30WlbUWMx339oc5HYC3wslaZTLgb46t6sxiurmyQEcU3UN
 YsdTJezZY60bfFB5cnMInRRXgphD4zcPaTgVQmRxsWLpHZlZZ1YbZIpXYsJMNgkM
 deRmrNd9ZJsv0/OjvNpm/96p+9Jsw3+mIO5ZfrcnNYWu0LjPUpr7mptl6RVZ3kR7
 m3FZs2Fur/HcDA8ePWB70GkuFabKV1ychl3KS14LQErrmzTeaiYYaGKkF5yGQv87
 H29xQqS94CLBVoMBY+a/UihzxJ3UjpaRIF3RX7tVt9sIZ5u7ie8aDBKZDc+Eac9Y
 iu2XhueTPhfP59ebiw6aOE3jX3mUfzPx6OafbBb+HItGVV/qBavUwkHNsrB2gsEo
 PapS6UK6
 =YIGp
 -----END PGP SIGNATURE-----

Merge tag '6.18-rc1-smb-client-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull smb client fixes from Steve French:
 "smb client fixes, security and smbdirect improvements, and some minor cleanup:

   - Important OOB DFS fix

   - Fix various potential tcon refcount leaks

   - smbdirect (RDMA) fixes (following up from test event a few weeks
     ago):

      - Fixes to improve and simplify handling of memory lifetime of
        smbdirect_mr_io structures, when a connection gets disconnected

      - Make sure we really wait to reach SMBDIRECT_SOCKET_DISCONNECTED
        before destroying resources

      - Make sure the send/recv submission/completion queues are large
        enough to keep ib_post_send() from failing under pressure

   - convert cifs.ko to use the recommended crypto libraries (instead of
     crypto_shash), this also can improve performance

   - Three small cleanup patches"

* tag '6.18-rc1-smb-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: (24 commits)
  smb: client: Consolidate cmac(aes) shash allocation
  smb: client: Remove obsolete crypto_shash allocations
  smb: client: Use HMAC-MD5 library for NTLMv2
  smb: client: Use MD5 library for SMB1 signature calculation
  smb: client: Use MD5 library for M-F symlink hashing
  smb: client: Use HMAC-SHA256 library for SMB2 signature calculation
  smb: client: Use HMAC-SHA256 library for key generation
  smb: client: Use SHA-512 library for SMB3.1.1 preauth hash
  cifs: parse_dfs_referrals: prevent oob on malformed input
  smb: client: Fix refcount leak for cifs_sb_tlink
  smb: client: let smbd_destroy() wait for SMBDIRECT_SOCKET_DISCONNECTED
  smb: move some duplicate definitions to common/cifsglob.h
  smb: client: let destroy_mr_list() keep smbdirect_mr_io memory if registered
  smb: client: let destroy_mr_list() call ib_dereg_mr() before ib_dma_unmap_sg()
  smb: client: call ib_dma_unmap_sg if mr->sgt.nents is not 0
  smb: client: improve logic in smbd_deregister_mr()
  smb: client: improve logic in smbd_register_mr()
  smb: client: improve logic in allocate_mr_list()
  smb: client: let destroy_mr_list() remove locked from the list
  smb: client: let destroy_mr_list() call list_del(&mr->list)
  ...
2025-10-18 07:11:32 -10:00
Linus Torvalds
02e5f74ef0 ARM:
- Fix the handling of ZCR_EL2 in NV VMs
 
 - Pick the correct translation regime when doing a PTW on
   the back of a SEA
 
 - Prevent userspace from injecting an event into a vcpu that isn't
   initialised yet
 
 - Move timer save/restore to the sysreg handling code, fixing EL2 timer
   access in the process
 
 - Add FGT-based trapping of MDSCR_EL1 to reduce the overhead of debug
 
 - Fix trapping configuration when the host isn't GICv3
 
 - Improve the detection of HCR_EL2.E2H being RES1
 
 - Drop a spurious 'break' statement in the S1 PTW
 
 - Don't try to access SPE when owned by EL3
 
 Documentation updates:
 
 - Document the failure modes of event injection
 
 - Document that a GICv3 guest can be created on a GICv5 host
   with FEAT_GCIE_LEGACY
 
 Selftest improvements:
 
 - Add a selftest for the effective value of HCR_EL2.AMO
 
 - Address build warning in the timer selftest when building with clang
 
 - Teach irqfd selftests about non-x86 architectures
 
 - Add missing sysregs to the set_id_regs selftest
 
 - Fix vcpu allocation in the vgic_lpi_stress selftest
 
 - Correctly enable interrupts in the vgic_lpi_stress selftest
 
 x86:
 
 - Expand the KVM_PRE_FAULT_MEMORY selftest to add a regression test for the
   bug fixed by commit 3ccbf6f47098 ("KVM: x86/mmu: Return -EAGAIN if userspace
   deletes/moves memslot during prefault")
 
 - Don't try to get PMU capabilities from perf when running a CPU with hybrid
   CPUs/PMUs, as perf will rightly WARN.
 
 guest_memfd:
 
 - Rework KVM_CAP_GUEST_MEMFD_MMAP (newly introduced in 6.18) into a more
   generic KVM_CAP_GUEST_MEMFD_FLAGS
 
 - Add a guest_memfd INIT_SHARED flag and require userspace to explicitly set
   said flag to initialize memory as SHARED, irrespective of MMAP.  The
   behavior merged in 6.18 is that enabling mmap() implicitly initializes
   memory as SHARED, which would result in an ABI collision for x86 CoCo VMs
   as their memory is currently always initialized PRIVATE.
 
 - Allow mmap() on guest_memfd for x86 CoCo VMs, i.e. on VMs with private
   memory, to enable testing such setups, i.e. to hopefully flush out any
   other lurking ABI issues before 6.18 is officially released.
 
 - Add testcases to the guest_memfd selftest to cover guest_memfd without MMAP,
   and host userspace accesses to mmap()'d private memory.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCgAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmjzqVIUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroO+qQgArc7XXmoiHQfTmdqbFL+1ipzfqd/c
 SHJghONWVNKaSm0EsH72iEokmUyI8HssllaBuaGEAT/1F6YmRFwSSFgUG+N02rah
 pL5ShCG2fPVxHal9ZJ04M4DYWPPClmcE2myfQ6k9kwcMgCRK2BdSRRnKH3XfOKrY
 jAFNZVBCeODcnSvjOyxK2QFEt7J97H1AoAxOORvdqFmRqVIEQNJA/3Hx51wPfkwD
 UnCQiNaPinDMxuuwvcmlYsIrQhGaqO4de1Kx0A4ZkSQqFUcyhvB6Qa+DoApz/IBw
 qsFLqoR/1XXJ90wxutSTFzfjHM/SU6fhj57Cl9dAHI3pgnssC1iUvEt9Iw==
 =dvAj
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "ARM:

   - Fix the handling of ZCR_EL2 in NV VMs

   - Pick the correct translation regime when doing a PTW on the back of
     a SEA

   - Prevent userspace from injecting an event into a vcpu that isn't
     initialised yet

   - Move timer save/restore to the sysreg handling code, fixing EL2
     timer access in the process

   - Add FGT-based trapping of MDSCR_EL1 to reduce the overhead of debug

   - Fix trapping configuration when the host isn't GICv3

   - Improve the detection of HCR_EL2.E2H being RES1

   - Drop a spurious 'break' statement in the S1 PTW

   - Don't try to access SPE when owned by EL3

  Documentation updates:

   - Document the failure modes of event injection

   - Document that a GICv3 guest can be created on a GICv5 host with
     FEAT_GCIE_LEGACY

  Selftest improvements:

   - Add a selftest for the effective value of HCR_EL2.AMO

   - Address build warning in the timer selftest when building with
     clang

   - Teach irqfd selftests about non-x86 architectures

   - Add missing sysregs to the set_id_regs selftest

   - Fix vcpu allocation in the vgic_lpi_stress selftest

   - Correctly enable interrupts in the vgic_lpi_stress selftest

  x86:

   - Expand the KVM_PRE_FAULT_MEMORY selftest to add a regression test
     for the bug fixed by commit 3ccbf6f47098 ("KVM: x86/mmu: Return
     -EAGAIN if userspace deletes/moves memslot during prefault")

   - Don't try to get PMU capabilities from perf when running a CPU with
     hybrid CPUs/PMUs, as perf will rightly WARN.

  guest_memfd:

   - Rework KVM_CAP_GUEST_MEMFD_MMAP (newly introduced in 6.18) into a
     more generic KVM_CAP_GUEST_MEMFD_FLAGS

   - Add a guest_memfd INIT_SHARED flag and require userspace to
     explicitly set said flag to initialize memory as SHARED,
     irrespective of MMAP.

     The behavior merged in 6.18 is that enabling mmap() implicitly
     initializes memory as SHARED, which would result in an ABI
     collision for x86 CoCo VMs as their memory is currently always
     initialized PRIVATE.

   - Allow mmap() on guest_memfd for x86 CoCo VMs, i.e. on VMs with
     private memory, to enable testing such setups, i.e. to hopefully
     flush out any other lurking ABI issues before 6.18 is officially
     released.

   - Add testcases to the guest_memfd selftest to cover guest_memfd
     without MMAP, and host userspace accesses to mmap()'d private
     memory"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (46 commits)
  arm64: Revamp HCR_EL2.E2H RES1 detection
  KVM: arm64: nv: Use FGT write trap of MDSCR_EL1 when available
  KVM: arm64: Compute per-vCPU FGTs at vcpu_load()
  KVM: arm64: selftests: Fix misleading comment about virtual timer encoding
  KVM: arm64: selftests: Add an E2H=0-specific configuration to get_reg_list
  KVM: arm64: selftests: Make dependencies on VHE-specific registers explicit
  KVM: arm64: Kill leftovers of ad-hoc timer userspace access
  KVM: arm64: Fix WFxT handling of nested virt
  KVM: arm64: Move CNT*CT_EL0 userspace accessors to generic infrastructure
  KVM: arm64: Move CNT*_CVAL_EL0 userspace accessors to generic infrastructure
  KVM: arm64: Move CNT*_CTL_EL0 userspace accessors to generic infrastructure
  KVM: arm64: Add timer UAPI workaround to sysreg infrastructure
  KVM: arm64: Make timer_set_offset() generally accessible
  KVM: arm64: Replace timer context vcpu pointer with timer_id
  KVM: arm64: Introduce timer_context_to_vcpu() helper
  KVM: arm64: Hide CNTHV_*_EL2 from userspace for nVHE guests
  Documentation: KVM: Update GICv3 docs for GICv5 hosts
  KVM: arm64: gic-v3: Only set ICH_HCR traps for v2-on-v3 or v3 guests
  KVM: arm64: selftests: Actually enable IRQs in vgic_lpi_stress
  KVM: arm64: selftests: Allocate vcpus with correct size
  ...
2025-10-18 07:07:14 -10:00
Linus Torvalds
0e622c4b0e powerpc fixes for 6.18 #2
- Fix to handle NULL pointer dereference at irq domain teardown
  - Fix for handling extraction of struct xive_irq_data
  - Fix to skip parameter area allocation when fadump disabled
 
 Thanks to: Ganesh Goudar, Hari Bathini, Nam Cao, Ritesh Harjani (IBM), Sourabh Jain, Venkat Rao Bagalkote.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEqX2DNAOgU8sBX3pRpnEsdPSHZJQFAmjzbw8ACgkQpnEsdPSH
 ZJQRQQ//S+NgNrn7KloQ28W+zLauH1GZKTkAwR3D8M1wFWwkM4guyxhHVCagxO6d
 dL1Hs4dUzoCfum6qw2/E0Jc/4ul7YLv3QeiTyz6UwytrZOHeUy7WvPUtksb22+Hk
 FSdUP/hD2td8SP1V4vsJVuK58DieiwNMpXvPssnq9wmCJoepKNdc3TTFSIRWji92
 PfL9ooYwoT5flW7nvf7Tx0ZbfL1pEIbTjEufsQOINofTadZnOfSvJoq0zYf5VFFF
 8v1YDOqWfBvWJWR0eLLdS8S0tWLIf6e8h6cQkWIninvsVU1LtWhcbGiq1xD3BTFl
 RYCTkUzNjFAf0lvkjfheXylqKZhoDglfGsHQhADCL0cCtSbRCUYkz24+n7m7se3Y
 0hj1Xs3hi4ELH/i5m6VZqL3sEsXnOLOfA8fD9mhDCy+33RK5jMNdNhTjTLsSmW1T
 m0E7gF6Yt16BTggWuQJREXtEVCQbQOxhDZwuG4tw3bAU3VAuyUbkVggS9Fadb81Q
 klTGTAY8s3BxTzjVzeQByhOCtpyRgkKX3UEYH2CwGHi5NbpcSKlQx7q7JuxenkP4
 aOY9A1oCc4H/vjM2hUOtdu9wUO4hxMrGwv8Nk0dJFFGOJLS/ArdXTvcayrX2J8I9
 Rnkd+ztef8n7lkVjjl82WrRD/gIDMFXZQvsos0nHTS+pRdMX2mk=
 =YUzr
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-6.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Madhavan Srinivasan:

 - Fix to handle NULL pointer dereference at irq domain teardown

 - Fix for handling extraction of struct xive_irq_data

 - Fix to skip parameter area allocation when fadump disabled

Thanks to Ganesh Goudar, Hari Bathini, Nam Cao, Ritesh Harjani (IBM),
Sourabh Jain, and Venkat Rao Bagalkote.

* tag 'powerpc-6.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/fadump: skip parameter area allocation when fadump is disabled
  powerpc, ocxl: Fix extraction of struct xive_irq_data
  powerpc/pseries/msi: Fix NULL pointer dereference at irq domain teardown
2025-10-18 07:02:28 -10:00
Linus Torvalds
959f018f97 slab fixes for 6.18-rc2
-----BEGIN PGP SIGNATURE-----
 
 iQFPBAABCAA5FiEEe7vIQRWZI0iWSE3xu+CwddJFiJoFAmjx/QwbFIAAAAAABAAO
 bWFudTIsMi41KzEuMTEsMiwyAAoJELvgsHXSRYiaTjQH/RIp1LU+WQTEREzU/BnU
 WLvPPDq/p/xy3uYFx8KaUx7gzu0p1kjvIC/7PBVf4uw4KdfC+mg6MIuM99e9rAkk
 LIVEko58iza0t+y0gX8DqGbYItumhafjzL/OdPKEdRzPWcWNzNMQyGfo/k1gDPF4
 x9mBBuwnASLM7oCCenAmo0UpE6+Tf+gy9kYpN7QQ5+ZDk41DSbMx5wmU9SQu3I0u
 H3VYEiC57QMEo3Bdh+H0XqmvSXOew0u/pPmHLJncEM0nNiKeC3c+Rh9rLER8B7P/
 hqtkGoSIwI2yjIZq3frpHV9yr4sRKQS7/Plu7C4smo1Z0afBzBrDL0UfzNWZQmxj
 mGs=
 =jd8H
 -----END PGP SIGNATURE-----

Merge tag 'slab-for-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab

Pull slab fixes from Vlastimil Babka:

 - Fixes for two bugs that can be triggered when debugging options are
   enabled (Hao Ge, Vlastimil Babka)

* tag 'slab-for-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
  slab: reset slab->obj_ext when freeing and it is OBJEXTS_ALLOC_FAIL
  slab: fix clearing freelist in free_deferred_objects()
2025-10-18 06:59:25 -10:00
Stuart Yoder
dbfdaeb381 tpm_crb: Add idle support for the Arm FF-A start method
According to the CRB over FF-A specification [1], a TPM that implements
the ABI must comply with the TCG PTP specification. This requires support
for the Idle and Ready states.

This patch implements CRB control area requests for goIdle and
cmdReady on FF-A based TPMs.

The FF-A message used to notify the TPM of CRB updates includes a
locality parameter, which provides a hint to the TPM about which
locality modified the CRB.  This patch adds a locality parameter
to __crb_go_idle() and __crb_cmd_ready() to support this.

[1] https://developer.arm.com/documentation/den0138/latest/

Signed-off-by: Stuart Yoder <stuart.yoder@arm.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2025-10-18 14:33:22 +03:00
Paolo Bonzini
4361f5aa8b KVM x86 fixes for 6.18:
  - Expand the KVM_PRE_FAULT_MEMORY selftest to add a regression test for the
    bug fixed by commit 3ccbf6f47098 ("KVM: x86/mmu: Return -EAGAIN if userspace
    deletes/moves memslot during prefault")
 
  - Don't try to get PMU capabilities from perf when running a CPU with hybrid
    CPUs/PMUs, as perf will rightly WARN.
 
  - Rework KVM_CAP_GUEST_MEMFD_MMAP (newly introduced in 6.18) into a more
    generic KVM_CAP_GUEST_MEMFD_FLAGS
 
  - Add a guest_memfd INIT_SHARED flag and require userspace to explicitly set
    said flag to initialize memory as SHARED, irrespective of MMAP.  The
    behavior merged in 6.18 is that enabling mmap() implicitly initializes
    memory as SHARED, which would result in an ABI collision for x86 CoCo VMs
    as their memory is currently always initialized PRIVATE.
 
  - Allow mmap() on guest_memfd for x86 CoCo VMs, i.e. on VMs with private
    memory, to enable testing such setups, i.e. to hopefully flush out any
    other lurking ABI issues before 6.18 is officially released.
 
  - Add testcases to the guest_memfd selftest to cover guest_memfd without MMAP,
    and host userspace accesses to mmap()'d private memory.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKTobbabEP7vbhhN9OlYIJqCjN/0FAmjyszoACgkQOlYIJqCj
 N/03+g/9FPoIPxGL9+tJyGahdH2Mygiip8Q3tUTlGVkskp+dplf+6T51ogdqBOkS
 tvlGjccxAOVW73ijn8ox7UtGSY9B1IJ+rj/uhsBOMFQptyUJEv4iEujFKB/t5RIF
 gOTVVR6Z/mcrYJY7F21qBPCHPbz++rEXrfgyAsosz6tpS/nL6vrNdp1LZlcsLM/k
 5DuhkQHLEwJoUXO5VUsBWRa3gRdOuZ+SJGd4C1mfDwXagxe8uFYRO/vraOV9dKJx
 l9RxKPIhf42cfu9tIeJYDJyXIDgzXUEND4smVw/ito1SeakBlC9rQ15ya8ETLAEX
 tHzmj38RPOfjsycGKIRhlLzQx77b+t7FhNdse19QC3p3u9Jn8FuMyFcGZuiaP5kK
 e9Xrp/zcaCNfHB6gGKEud7h7fpV9yB8SNlCsP81if73buq3qHx0+jeVg3jCS4mkb
 zGY7CG+oKJOdTSN8JAh8nJMM4bUv5m8myr6yYGU+SsGBQsyPyqQdRWuKdQyOQIVC
 RZSWXSjKfmT5FwSs0KRI0yUjMeUYbgwrsysFuS3qX62mZpr0vLaPoiYFvuffe9gB
 W3Tt98QNPtBmJdINNncgKvbn3sp/CzUHirygaI0APZwh6QQAkL1Dp2beA1i95uVI
 8uin4zUjRbRjUpMUaEtAQIaVMTVQqgfiPAvtcDCRBBeOGaNr81M=
 =lBtj
 -----END PGP SIGNATURE-----

Merge tag 'kvm-x86-fixes-6.18-rc2' of https://github.com/kvm-x86/linux into HEAD

KVM x86 fixes for 6.18:

 - Expand the KVM_PRE_FAULT_MEMORY selftest to add a regression test for the
   bug fixed by commit 3ccbf6f47098 ("KVM: x86/mmu: Return -EAGAIN if userspace
   deletes/moves memslot during prefault")

 - Don't try to get PMU capabilities from perf when running a CPU with hybrid
   CPUs/PMUs, as perf will rightly WARN.

 - Rework KVM_CAP_GUEST_MEMFD_MMAP (newly introduced in 6.18) into a more
   generic KVM_CAP_GUEST_MEMFD_FLAGS

 - Add a guest_memfd INIT_SHARED flag and require userspace to explicitly set
   said flag to initialize memory as SHARED, irrespective of MMAP.  The
   behavior merged in 6.18 is that enabling mmap() implicitly initializes
   memory as SHARED, which would result in an ABI collision for x86 CoCo VMs
   as their memory is currently always initialized PRIVATE.

 - Allow mmap() on guest_memfd for x86 CoCo VMs, i.e. on VMs with private
   memory, to enable testing such setups, i.e. to hopefully flush out any
   other lurking ABI issues before 6.18 is officially released.

 - Add testcases to the guest_memfd selftest to cover guest_memfd without MMAP,
   and host userspace accesses to mmap()'d private memory.
2025-10-18 10:25:43 +02:00
Paolo Bonzini
5d26eaae15 KVM/arm64 fixes for 6.18, take #1
Improvements and bug fixes:
 
 - Fix the handling of ZCR_EL2 in NV VMs
   (20250926194108.84093-1-oliver.upton@linux.dev)
 
 - Pick the correct translation regime when doing a PTW on
   the back of a SEA (20250926224246.731748-1-oliver.upton@linux.dev)
 
 - Prevent userspace from injecting an event into a vcpu that isn't
   initialised yet (20250930085237.108326-1-oliver.upton@linux.dev)
 
 - Move timer save/restore to the sysreg handling code, fixing EL2 timer
   access in the process (20250929160458.3351788-1-maz@kernel.org)
 
 - Add FGT-based trapping of MDSCR_EL1 to reduce the overhead of debug
   (20250924235150.617451-1-oliver.upton@linux.dev)
 
 - Fix trapping configuration when the host isn't GICv3
   (20251007160704.1673584-1-sascha.bischoff@arm.com)
 
 - Improve the detection of HCR_EL2.E2H being RES1
   (20251009121239.29370-1-maz@kernel.org)
 
 - Drop a spurious 'break' statement in the S1 PTW
   (20250930135621.162050-1-osama.abdelkader@gmail.com)
 
 - Don't try to access SPE when owned by EL3
   (20251010174707.1684200-1-mukesh.ojha@oss.qualcomm.com)
 
 Documentation updates:
 
 - Document the failure modes of event injection
   (20250930233620.124607-1-oliver.upton@linux.dev)
 
 - Document that a GICv3 guest can be created on a GICv5 host
   with FEAT_GCIE_LEGACY (20251007154848.1640444-1-sascha.bischoff@arm.com)
 
 Selftest improvements:
 
 - Add a selftest for the effective value of HCR_EL2.AMO
   (20250926224454.734066-1-oliver.upton@linux.dev)
 
 - Address build warning in the timer selftest when building
   with clang (20250926155838.2612205-1-seanjc@google.com)
 
 - Teach irq_fd selftests about non-x86 architectures
   (20250930193301.119859-1-oliver.upton@linux.dev)
 
 - Add missing sysregs to the set_id_regs selftest
   (20251012154352.61133-1-zenghui.yu@linux.dev)
 
 - Fix vcpu allocation in the vgic_lpi_stress selftest
   (20251008154520.54801-1-zenghui.yu@linux.dev)
 
 - Correctly enable interrupts in the vgic_lpi_stress selftest
   (20251007195254.260539-1-oliver.upton@linux.dev)
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmjuPmEACgkQI9DQutE9
 ekPsvhAAnvqnMtZA/4PkW7JemIBZQYKEiMWrSnsdxj+0wDYN0wvyT1Bwrnqr/QLi
 i+4eBNCqiydxQrIg3nocmd4mtmfTZAZfwYPuYOW2Z5j2eJ1+xkkiacbCJZZTRXR1
 mzLhZ/kFWBHf37OhluMQjaCWokiBQN3H699Sxs7GXZT/KpWz3EunjtBAe2oinUUX
 so0j5NSYHiTNON8VSfNwLi9HXEMmEZVvuao7TMaN0xUQIM0Kvlic6XzbvCvSga9S
 gldsmab5JtN3N/ZSCsYTn5Cr2TctbQVxwgH2QVdPADzImbu+nxI4r4f7MCL6lNYw
 AueVHzYSFAJHieebd10ZF1tiJVar2cQ9wVc4JwDfaZv4TwCcv6SHzBaxFf26LuSN
 uVVRCwOZRoewXI2k57Sp6TU/uhzc11tSK4hZTt/tUY5AG7zraEC4c0flbjhfhngp
 FCssLsfQiPBmAOhsxBnUvYDf8WXlrhvfhZLGSmfPrO9fxL5Yd8dU6nvuAjtvjLG7
 8579QqAtllo9yxV3na5GTtdBDlloWXl1yIzrGVW21UoAd23kXFcKFmnZoGKbvw6F
 yccCCU2bZ3s4USH9h4fJJqYDfi01LQsU219G8m696zD76oeMnt8/TZvD2/atZIBR
 uF/nd71btHP+mHC4aKScWERGt0edZEJeP71G+1OubOgPG+q6+Cs=
 =DOeP
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-fixes-6.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 fixes for 6.18, take #1

Improvements and bug fixes:

- Fix the handling of ZCR_EL2 in NV VMs
  (20250926194108.84093-1-oliver.upton@linux.dev)

- Pick the correct translation regime when doing a PTW on
  the back of a SEA (20250926224246.731748-1-oliver.upton@linux.dev)

- Prevent userspace from injecting an event into a vcpu that isn't
  initialised yet (20250930085237.108326-1-oliver.upton@linux.dev)

- Move timer save/restore to the sysreg handling code, fixing EL2 timer
  access in the process (20250929160458.3351788-1-maz@kernel.org)

- Add FGT-based trapping of MDSCR_EL1 to reduce the overhead of debug
  (20250924235150.617451-1-oliver.upton@linux.dev)

- Fix trapping configuration when the host isn't GICv3
  (20251007160704.1673584-1-sascha.bischoff@arm.com)

- Improve the detection of HCR_EL2.E2H being RES1
  (20251009121239.29370-1-maz@kernel.org)

- Drop a spurious 'break' statement in the S1 PTW
  (20250930135621.162050-1-osama.abdelkader@gmail.com)

- Don't try to access SPE when owned by EL3
  (20251010174707.1684200-1-mukesh.ojha@oss.qualcomm.com)

Documentation updates:

- Document the failure modes of event injection
  (20250930233620.124607-1-oliver.upton@linux.dev)

- Document that a GICv3 guest can be created on a GICv5 host
  with FEAT_GCIE_LEGACY (20251007154848.1640444-1-sascha.bischoff@arm.com)

Selftest improvements:

- Add a selftest for the effective value of HCR_EL2.AMO
  (20250926224454.734066-1-oliver.upton@linux.dev)

- Address build warning in the timer selftest when building
  with clang (20250926155838.2612205-1-seanjc@google.com)

- Teach irq_fd selftests about non-x86 architectures
  (20250930193301.119859-1-oliver.upton@linux.dev)

- Add missing sysregs to the set_id_regs selftest
  (20251012154352.61133-1-zenghui.yu@linux.dev)

- Fix vcpu allocation in the vgic_lpi_stress selftest
  (20251008154520.54801-1-zenghui.yu@linux.dev)

- Correctly enable interrupts in the vgic_lpi_stress selftest
  (20251007195254.260539-1-oliver.upton@linux.dev)
2025-10-18 10:25:31 +02:00