1
0
mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-01-12 01:20:14 +00:00

1399827 Commits

Author SHA1 Message Date
Shuhao Fu
9aee8de970 exfat: fix refcount leak in exfat_find
Fix refcount leaks in `exfat_find` related to `exfat_get_dentry_set`.

Function `exfat_get_dentry_set` would increase the reference counter of
`es->bh` on success. Therefore, `exfat_put_dentry_set` must be called
after `exfat_get_dentry_set` to ensure refcount consistency. This patch
relocate two checks to avoid possible leaks.

Fixes: 82ebecdc74ff ("exfat: fix improper check of dentry.stream.valid_size")
Fixes: 13940cef9549 ("exfat: add a check for invalid data size")
Signed-off-by: Shuhao Fu <sfual@cse.ust.hk>
Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2025-12-03 10:00:16 +09:00
Linus Torvalds
d61f1cc5db * Enable Linear Address Space Separation (LASS)
* Change X86_FEATURE leaf 17 from an AMD leaf to Linux-defined
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEV76QKkVc4xCGURexaDWVMHDJkrAFAmkuIXAACgkQaDWVMHDJ
 krAwoRAAqqavNrthj26XJHjR3x7FVGu11/rvYXAd1U2moN/dhM2w82HMHNFvPuQY
 3iq9GDRQdc2rKL7LTkREvN4ZM/rFvkFLt6a5Yv0eCRK8KAiSJEw6Yzu/qgG7kF+0
 9clujDUskjjHU0zR5v+o1RxirrLVQ+R50sMVI5uoFx6+WJRiW1BvMG4Csw4BgbvA
 AqgrZpyq1dQ/GQOW4f0yxBPH0z84wgUbdllYzQzE0GeUlGWQSI4lqa8GFMOmE/Gr
 7gBcKmyE0M/BycwTZW7tiMnjWgNL+Y5/RroQJ7hh6R+f5WOd+SpGvlyOihbF7GER
 L3yZfeQ+EWz1aY1QMWwOSvSawIPJo8EkSn3d9/JFq5Vl9zsFh+ZoPZfZ8bEi36U0
 inO93swDcyMkkfOTh4sIgxedLgHja5GFNCGPs0yblvLulWbw7yYVzzEmEjXnclzS
 fmmifsJjGrUpegEnWdEjAQzXkWPd/hKiAvpzDE/3thBal5NkOzFrudITFvCVuk8w
 uS2MW0U8VCskNoON0jjwnvv84p0XdHJOsgPB9WnsuMMASKC1RqKAJWXh8AXvZA+I
 TfCNdSyHDTm+o1e+SMQZRbqoE/r7MmAxUQOkKnlvpJDCz58tsLzW64hRXTe7QpCt
 rry9/wODswu+oaHoDgfjAmzYde2RhCjwWLzGmqmapNIYfCCVhYs=
 =5bcW
 -----END PGP SIGNATURE-----

Merge tag 'x86_cpu_for_6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 CPU feature updates from Dave Hansen:
 "The biggest thing of note here is Linear Address Space Separation
  (LASS). It represents the first time I can think of that the
  upper=>kernel/lower=>user address space convention is actually
  recognized by the hardware on x86. It ensures that userspace can not
  even get the hardware to _start_ page walks for the kernel address
  space. This, of course, is a really nice generic side channel defense.

  This is really only a down payment on LASS support. There are still
  some details to work out in its interaction with EFI calls and
  vsyscall emulation. For now, LASS is disabled if either of those
  features is compiled in (which is almost always the case).

  There's also one straggler commit in here which converts an
  under-utilized AMD CPU feature leaf into a generic Linux-defined leaf
  so more feature can be packed in there.

  Summary:

   - Enable Linear Address Space Separation (LASS)

   - Change X86_FEATURE leaf 17 from an AMD leaf to Linux-defined"

* tag 'x86_cpu_for_6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/cpu: Enable LASS during CPU initialization
  selftests/x86: Update the negative vsyscall tests to expect a #GP
  x86/traps: Communicate a LASS violation in #GP message
  x86/kexec: Disable LASS during relocate kernel
  x86/alternatives: Disable LASS when patching kernel code
  x86/asm: Introduce inline memcpy and memset
  x86/cpu: Add an LASS dependency on SMAP
  x86/cpufeatures: Enumerate the LASS feature bits
  x86/cpufeatures: Make X86_FEATURE leaf 17 Linux-specific
2025-12-02 14:48:08 -08:00
Linus Torvalds
a7610b8465 * Fix 64-bit identifier structure member name in fred_ss
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEV76QKkVc4xCGURexaDWVMHDJkrAFAmkuHhcACgkQaDWVMHDJ
 krDXKg//UVSrG9dyLoyhXKLEF33mKm98l3AKeIQSV9yY56K71HwxNDr44di6GGY7
 Rwlwl0GPXRCW1qDC11Kbd1m6buj5CZeuCMp6nnIhAvefw+NOlApg4mhzVIdjUTg0
 EKagZPlq13g1puOPiV9T+ybFsRJxV5MPoY+fLCvGCJmTMwavfpYylXo0xkU7vgkR
 Asc7LDK/H+t7Cbe51h+XAAdJMRC/tjBH7s7MhYVZuJtzXiin2MLAw/wV9Uz+YJ5d
 fGy1N945igItyH+qG9TqbKQDi1s5kf70NPcGAL+UxF8ox8NNOVEF90VFs8YEaF/s
 sHegUtahfFmkb7YEBGVz+ctHoSgwQfwiScYdjBPosj3EzPksSOogcy2PqTxGo3wx
 kNp9jc1oHfMziOLVbVDyQg31XEEy6Vvbc/2O7UTPx8fJbivG0+7ZpisKb1XC1MFh
 fsX9mPP2aNQCXs8kIjozeHQTEGCIDWsvnlX3co3bhlIIIO8eIFp6JYoy9A+wTimo
 WOrFH8E9ktbhBQcvIOrf+9WZ9P3uJxq60h6SLtGeL6/x7c3geCajFLcJzwlIYPQU
 m1gZqMLV4/ceQeUw2wE7rExzjD56JSDtnP6FgW3ygalLbxj9mRqqbg5ArsjH9gJ+
 xA01rH6XcXNSUOzuu2yqBps3FqOfp5A4sLyUfnSbs0eTztmX3Q8=
 =CN+R
 -----END PGP SIGNATURE-----

Merge tag 'x86_entry_for_6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 entry update from Dave Hansen:
 "This one is pretty trivial: fix a badly-named FRED data structure
  member"

* tag 'x86_entry_for_6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/fred: Fix 64bit identifier in fred_ss
2025-12-02 14:24:21 -08:00
Linus Torvalds
e2aa39b368 * Make MSR-induced taint easier for users to track down
* Restrict KVM-specific exports to KVM itself
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEV76QKkVc4xCGURexaDWVMHDJkrAFAmkuHIIACgkQaDWVMHDJ
 krAxMg//RQUz6JnQnMASuN/NhrjIANRjcPJI9S0LoKcTbZ0nZ5aH6oR1VOFszLLa
 ShGcUO2RuDbCl2wPAG/lRWV8eL/4k4mZi0zNT7vEKTkX/EZn5RDV59p88zCo62KV
 835OpX8W9Hvyiichw51RoVrJxEcqgCmlUYO2fCwtk2rpntUCOVQgHMeLhhqMsZ0e
 yMQECAE75oXQ4vhAG+zO7/KmLqVbSGgqpXYw6DOZGEJF0T7tdZIgFhd25WAPgcf0
 UN8VmTX971Eq67OrUX9OojN6+SxBqQ7vc+qBtd5bDlkZsRxVyV157Zso2PCPbsm2
 FkE65eJBa9qacqvwkCPND6J7gvE/Sm8DaLVafLPKDNWTaqSo4cfKJD7P/sgN1L69
 O8QsiLfafy8ITIA8AXS90C8x/puhqk15OKW2kJFFfUkhrGdu72/AxVlo6JcM1N0u
 qkDXUNBSX9/LHkRT9AtkLch27MEFXRKxsajjx2lFoBIR2VjIijm9314cRczHGZEV
 R/pqBh21yL/ZTriNIgmEPrFOV4zDxaOsHRh8YSEFAXRe2xWvm7dZwNSPRSh7hMT+
 q0ABPuYqTZ4PDGMaAB0gNRqmR9aQKpVMY+4xmTdmqscYkgV4usZQcrQOeiKVwh7F
 KdMC5tr4yFOMMl8CaMgOK+27ZrSYI1hwtXCc/orAhOwxhg62Z40=
 =tjcN
 -----END PGP SIGNATURE-----

Merge tag 'x86_misc_for_6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull misc x86 updates from Dave Hansen:
 "The most significant are some changes to ensure that symbols exported
  for KVM are used only by KVM modules themselves, along with some
  related cleanups.

  In true x86/misc fashion, the other patch is completely unrelated and
  just enhances an existing pr_warn() to make it clear to users how they
  have tainted their kernel when something is mucking with MSRs.

  Summary:

   - Make MSR-induced taint easier for users to track down

   - Restrict KVM-specific exports to KVM itself"

* tag 'x86_misc_for_6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: Restrict KVM-induced symbol exports to KVM modules where obvious/possible
  x86/mm: Drop unnecessary export of "ptdump_walk_pgd_level_debugfs"
  x86/mtrr: Drop unnecessary export of "mtrr_state"
  x86/bugs: Drop unnecessary export of "x86_spec_ctrl_base"
  x86/msr: Add CPU_OUT_OF_SPEC taint name to "unrecognized" pr_warn(msg)
2025-12-02 14:16:42 -08:00
Linus Torvalds
54de197c9a * Allow security version (SVN) updates so enclaves can attest
to new microcode.
  * Fix kernel docs typos
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEV76QKkVc4xCGURexaDWVMHDJkrAFAmkuGsMACgkQaDWVMHDJ
 krB/AhAAlRe+7+jNXp4Xtor9Q9VpTG5Q7WL7yLUHOd+JbsVKCcCLy2i86rrXEVvo
 AN3nQqkAotd1xx9P1SDz2SL4S0PS8zM3avj8JLppBb16VEiApFqUsXYMUJrFVjne
 x/+fEsKVOXp+kfM72qhI3/sqgq/zPUanm/6GR0wlpe2+TPOmnw588GgCV9ADW8ha
 ktLWaBGPHgyzNGAO5CqSSnN+2u25vRcyKX6mfUZeeooudPPVbANxKLN1WzBP2zZo
 eO5jbyeiPNH0bC1p6H5TQOl/bfktfhl99S+ta3e7+/PduDvDR01Z3kMl5JSb74hk
 75fXUI1R+/zUyXbyJ+exxTqAkS0F66m1Tv+DU9U4WorKjgAwYFPcdvjkJh/Ju6XB
 cqvwvLUE5rH2f3hzUmffSP8FwQO65svLsB4BIWd+3sgu6ZbAV4zSVOSb51W39Q7U
 WlAMYtlM6+bZF6RaJCweDUMq407KcHFqjN6rzX7x+VguNe/FMERRsGBz6kBT2FZH
 oxMzUXTwRfwSxblvVJMjjecfplywH5viHATbApMIGdamHWis8Dmvl0DMmBi4O6CI
 /O792TOHxIqLmxkh9SE7ONOCRe1hUfK2JY4n6sgNrBH8OwylPqkl2kgE5X3Cr9c1
 1qwMGfto5t0izr/SA5wruEJtPjQPsAyUKjfP+msyBemTg7dzTFU=
 =2lh5
 -----END PGP SIGNATURE-----

Merge tag 'x86_sgx_for_6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 SGX updates from Dave HansenL
 "The main content here is adding support for the new EUPDATESVN SGX
  ISA. Before this, folks who updated microcode had to reboot before
  enclaves could attest to the new microcode. The new functionality lets
  them do this without a reboot.

  The rest are some nice, but relatively mundane comment and kernel-doc
  fixups.

  Summary:

   - Allow security version (SVN) updates so enclaves can attest to new
     microcode

   - Fix kernel docs typos"

* tag 'x86_sgx_for_6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/sgx: Fix a typo in the kernel-doc comment for enum sgx_attribute
  x86/sgx: Remove superfluous asterisk from copyright comment in asm/sgx.h
  x86/sgx: Document structs and enums with '@', not '%'
  x86/sgx: Add kernel-doc descriptions for params passed to vDSO user handler
  x86/sgx: Add a missing colon in kernel-doc markup for "struct sgx_enclave_run"
  x86/sgx: Enable automatic SVN updates for SGX enclaves
  x86/sgx: Implement ENCLS[EUPDATESVN]
  x86/sgx: Define error codes for use by ENCLS[EUPDATESVN]
  x86/cpufeatures: Add X86_FEATURE_SGX_EUPDATESVN feature flag
  x86/sgx: Introduce functions to count the sgx_(vepc_)open()
2025-12-02 14:03:05 -08:00
Linus Torvalds
c76431e3b5 - Use the proper accessors when reading CR3 as part of the page level
transitions (5-level to 4-level, the use case being kexec) so that
   only the physical address in CR3 is picked up and not flags which are
   above the physical mask shift
 
 - Clean up and unify __phys_addr_symbol() definitions
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmkt8McACgkQEsHwGGHe
 VUpxJBAAg6PaKVNOmceCCcwDb331YLHpd18eeLy7Cdr6ktcdDflo39TiKnwy/BEs
 2uENe9OrS52JL98vMhZxPVFL/3yplrMo7jfuamthSEcFuvlxe2wh7NGhxbNl2gOe
 +9BpYTbHe5wts+W+ij/srcBCzGDIYoYhh7Dbc8wB1dh/jcH2qkEnYvBTGoYtgELF
 lWt1pWsdHVnUORn9qKNI3iAX47jmkUTBqEgQHyPFcSM6s8WGtIOKib7+UtvNiMTw
 V0ZMzfsL5k4J6ifwR5PLLaMNXdwQoZeArWbCA6VYhnOEP0MBmgLxFFCCi5z6iGwv
 ph+YYWm2/kMEOdJDfDlZqjZFcw/QOfk44chGMTqf+G3rFdNrHMdTiovtvzg6vGvG
 akJK5r2JsAJu8ymuwd3Rke3F3k1SP7QfdYB1Tipu4wvt7iSOQNqIA/xcHjMprHBx
 MZ6BifOxwXhhihUr9UA0TSQM6fJfnzrKPdzDSh/h5qThSpjbH/qkNlJwNGy/Knm5
 5MTftDkDkpkmJDiOhJAOCweMBGNyFQrOH1QYuqURrB+AGo3Iq9HIJ+2fVXtUdIZy
 AMmvEROjMRgxD2hoBCVa4AF5Gm3cNiGxGn+jEitdLgqVbTi0tWSO+oPOK+uH2Zib
 77r8hNmd9hE7ikHSRGhWS+5D3mVWejsDtrs8YyCMuXN/Ft2omRU=
 =+p5z
 -----END PGP SIGNATURE-----

Merge tag 'x86_mm_for_v6.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 mm updates from Borislav Petkov:

 - Use the proper accessors when reading CR3 as part of the page level
   transitions (5-level to 4-level, the use case being kexec) so that
   only the physical address in CR3 is picked up and not flags which are
   above the physical mask shift

 - Clean up and unify __phys_addr_symbol() definitions

* tag 'x86_mm_for_v6.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  efi/libstub: Fix page table access in 5-level to 4-level paging transition
  x86/boot: Fix page table access in 5-level to 4-level paging transition
  x86/mm: Unify __phys_addr_symbol()
2025-12-02 13:32:52 -08:00
Linus Torvalds
a9a10e920e - Convert the tsx= cmdline parsing to use early_param()
- Cleanup forward declarations gunk in bugs.c
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmkt3IMACgkQEsHwGGHe
 VUqPxRAAkul5R9zZgR9XcV/ZRA6iulo4M0o38Z2COeWtlDtDXpAJ5vNarDBB1l+K
 ea9YzbfN4iECXNXOCtSrEr7SmJ2ld8xBSyzISeWx2V1AXi4A6A8GX574egtDDCaV
 vN+eTT1ki9YuJLhgi0eZ+uHkdoTLhChrBehuvlcHnEOXN1N9D/P8/QfDYG7IyCxu
 /jSnuXyQ3AA2uM3IMEAGLcAarI+cU4HgMsF2Za5Dp8SKhSbhpgpEfFkp8+p+bUO/
 nKLgUj2dqibFZd+hvhrwtrnsDL8rTJxLr0dVwg7MZeBea4GliUsaT8jBRBMQddIL
 DGjAR2niGeGM42DWvFUVZczcgdP0NI8ChpZ3zCjdICrZBGMkZe70iqcPaL0BELob
 toeBjlHIrlUnXIfZhVBYj59+E+q4yjSwvOd23FYqj1XKcgSR+j305dxo9jTAndLV
 M/f8o8au3tlccE308OQ/XgIkXFBlGLjrNlak32ze4P2nHFMTlYJBCbAFIti3qcpT
 Er7q866s7kzagphmZ/2vf0I7JuYI2le/8OnneVFz7l+SVXY1SNEpA7qRM5bA220J
 n2Orfaen3CgotX7c3yT/XP0c9Wqbd29LMXiQbOdGMMONv5O4aaWMsfRPhlCEYiTK
 VZ/FklfXrVxtSqa96zBny4MlvtKoYwYklk+McuGF5M4iDXGCthw=
 =kAho
 -----END PGP SIGNATURE-----

Merge tag 'x86_bugs_for_v6.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 CPU mitigation updates from Borislav Petkov:

 - Convert the tsx= cmdline parsing to use early_param()

 - Cleanup forward declarations gunk in bugs.c

* tag 'x86_bugs_for_v6.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/bugs: Get rid of the forward declarations
  x86/tsx: Get the tsx= command line parameter with early_param()
  x86/tsx: Make tsx_ctrl_state static
2025-12-02 13:27:09 -08:00
Linus Torvalds
cb502f0e5e - Largely cleanups along with a change to save XSS to the GHCB (Guest-Host
Communication Block) in SEV-ES guests so that the hypervisor can determine
    the guest's XSAVES buffer size properly and thus support shadow stacks in
    AMD confidential guests
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmkttt0ACgkQEsHwGGHe
 VUoaFg/8CY+UAE1VtnaWhG7hpxCqBLlVtyt3gVhIn6ZCZ5mxtFoEcZI8BnxnFbbM
 Rpd+5LsMbu4GWfq/AEx/a+IT3rKZIT1RjRrd73JZZYRNpv6/Vnmv/7OizjDbBqhU
 n3Q1rHJCaVk90oP2sbB4dr9qsYHOx624jz0CrBxUM7GnaCQwtofqa6hK6HMJDu3g
 OLjJCoaHY0ry779QUjCmMJ1BbOLsy1fGsmuQO8LcE6xWRJv5ueJPcZbH0I0g5UIF
 NExe03uxaSvrM0ZYdjHpQU590kyPwjzo0Jx8IANQDb0dyY4mIFPdnZwbRBr/OPnZ
 205c0EllHZvDZ4nKTYfeJYjXnPWmovHXJATr/BuqW+0GQZYmTbDq2IgvbbnE9gs1
 67Sy94ISuxs683hNb9U2cLI7OFHcVDGfESuHhmeJTQsVY+VazL00p6azFP1ONpsn
 N93GYK+ZFvOeFFssO/gm97jbkKyUH9PS2+TEbhijeQkZ/PYKVbObM89LDLSRrKC7
 5mEFCZIKIehKLdSoAc8yTzKRE0piyK/PR6ykzP9A2rEjrSN2HDqvsR+nDr0hQ1/V
 Uye4a8V0XiHyZsvD+AIYJnARGYdcnUCiezep81WaC55hn0sdqrhlqnycnkd+7fDE
 MZF9epXCnIAA8IF7I5jCLfwTa1b6ouMhBxpwiMpaL9DGmQghWcM=
 =GCGn
 -----END PGP SIGNATURE-----

Merge tag 'x86_sev_for_v6.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 SEV updates from Borislav Petkov:

 - Largely cleanups along with a change to save XSS to the GHCB
   (Guest-Host Communication Block) in SEV-ES guests so that the
   hypervisor can determine the guest's XSAVES buffer size properly
   and thus support shadow stacks in AMD confidential guests

* tag 'x86_sev_for_v6.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/cc: Fix enum spelling to fix kernel-doc warnings
  x86/boot: Drop unused sev_enable() fallback
  x86/coco/sev: Convert has_cpuflag() to use cpu_feature_enabled()
  x86/sev: Include XSS value in GHCB CPUID request
  x86/boot: Move boot_*msr helpers to asm/shared/msr.h
2025-12-02 13:07:53 -08:00
Linus Torvalds
d748981834 - The mandatory pile of cleanups the cat drags in every merge window
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmktqqMACgkQEsHwGGHe
 VUpBDg/8CT4uBJLqgQYoh9YYN9U/C1walUMLivHq9VJbld7a55nGCirUTBuiTRzp
 V2np/6V/XpG2YmV9Xj6E4Q6QLiOsfrWKFRzfr7OlPZ7yW5GR8aKBcEZm2Voi7j94
 qLVSraAjtfmTtU5Ym7Lp1fqswDW2Y+iU6zqIyotqy+/7qZ6yp4NwlrUwSocDsSLo
 n6Gh1Vv6fXdeMckzT1WJ/CMtx07IfQB/wqVVTO4WwBu1Pv71WO+Cv1pG2mewUVjK
 879X7+oc1icxCtD2OZxHfOEJGtC97N7cZDq6VEd4s/bjgYF75glyFtkv/DVgbQ6E
 BSkwciT5Sqb7B5N09RJWWJiNLlNQPFQdDFAuU/Sk18Vqc01jySFoWLLsiS/DeCsz
 opEnPK6uXO4m+2DI44dWpWgVj33a/ao6zej3uEPENkp9+FYaZPwip8Bdjn1FptBp
 ZGmqa7oy38MXTzV6hOctMAx/nXE05lff41Xe8fLlisNc7a6ZUwql9oA4A4yULtKI
 BMlddTZ5/N5LzOGVH5nixX+ig3wDqtEEOD1tdLb/f8Nhy71Z3Kmbf7zDuJCTaabn
 VAGaPUDZ4qfeMks1qRIVVR7rUo6PwRRNvi8Wyiauc7/fnxWyVR0z1vrqs2o3MtZJ
 WQ0/HAZZ/K54HWk049ea2b1kFjjOe6R5hKKX1Hc2nPIS3APD/Vc=
 =3uJO
 -----END PGP SIGNATURE-----

Merge tag 'x86_cleanups_for_v6.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 cleanups from Borislav Petkov:

 - The mandatory pile of cleanups the cat drags in every merge window

* tag 'x86_cleanups_for_v6.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/boot: Clean up whitespace in a20.c
  x86/mm: Delete disabled debug code
  x86/{boot,mtrr}: Remove unused function declarations
  x86/percpu: Use BIT_WORD() and BIT_MASK() macros
  x86/cpufeatures: Correct LKGS feature flag description
  x86/idtentry: Add missing '*' to kernel-doc lines
2025-12-02 12:17:47 -08:00
Linus Torvalds
2ae20d6510 - Add support for AMD's Smart Data Cache Injection feature which allows
for direct insertion of data from I/O devices into the L3 cache, thus
   bypassing DRAM and saving its bandwidth; the resctrl side of the feature
   allows the size of the L3 used for data injection to be controlled
 
 - Add Intel Clearwater Forest to the list of CPUs which support Sub-NUMA
   clustering
 
 - Other fixes and cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmktpFQACgkQEsHwGGHe
 VUop4g/9GTb/5rcFMQzeGlG3USnJOqJ+SmiAalA9lm1c933en9tqUgL/K0C0xC6h
 yraB3ICuob1YayiZkBwKIOQiei9gmfhH/CGf5vLcZMM+D6fqvlk1D+C40SuFoDFV
 DOH3H2nYoJ3vbZRtRZsD3bv/djST/OVk28g7eY8OwpZIwN5VSFULJwjK1ePPy+nL
 l65s/yrgLY0oLDBCGxtJ9gVxjCBqAoqfbbwVbcJm5hXv+2sYk8BH6de/CU+0v/vo
 K6Qu4GbmWqDKYH9thjC4ZC/DPXjtoCxGkg/l1Af5T1PiZF0ZtgEZI6i9JTR33jYJ
 7j6BpkCwPzY07MKj/Ub1RemlMfY4XMN/qssEfFmnwG+aMBtbojNAjdb00Pu9Ffn+
 TKFKiZ6WBTcYhqPQsFVruwHh8wDbJp2/x/yBfjD4qovo1HuyCln4iGDmoFcU2wTD
 UlOXW89bxOT56A3FL77ElnOg9nRltvdKduOluGtkpSkmBbzmDfoXrhG2z9zuuAui
 FB6GT2c5MRVXEC4BY30xwQBG5MArVRMyz9uYDyXf9+KHhWVdmq9K0ZAkIaUmPCvy
 BvBXpRhfxm/dKJPhtSuUPhh5A+a87gqoiu1McaFoVGyjVJIJ5gflge8+/mLj1lQz
 kG56SnLOzdtcwKcmQ5ncv5EkrTBD1Ph12u1kcd+4IZwkpgGZteE=
 =o7Dg
 -----END PGP SIGNATURE-----

Merge tag 'x86_cache_for_v6.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 resource control updates from Borislav Petkov:

 - Add support for AMD's Smart Data Cache Injection feature which allows
   for direct insertion of data from I/O devices into the L3 cache, thus
   bypassing DRAM and saving its bandwidth; the resctrl side of the
   feature allows the size of the L3 used for data injection to be
   controlled

 - Add Intel Clearwater Forest to the list of CPUs which support
   Sub-NUMA clustering

 - Other fixes and cleanups

* tag 'x86_cache_for_v6.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  fs/resctrl: Update bit_usage to reflect io_alloc
  fs/resctrl: Introduce interface to modify io_alloc capacity bitmasks
  fs/resctrl: Modify struct rdt_parse_data to pass mode and CLOSID
  fs/resctrl: Introduce interface to display io_alloc CBMs
  fs/resctrl: Add user interface to enable/disable io_alloc feature
  fs/resctrl: Introduce interface to display "io_alloc" support
  x86,fs/resctrl: Implement "io_alloc" enable/disable handlers
  x86,fs/resctrl: Detect io_alloc feature
  x86/resctrl: Add SDCIAE feature in the command line options
  x86/cpufeatures: Add support for L3 Smart Data Cache Injection Allocation Enforcement
  fs/resctrl: Consider sparse masks when initializing new group's allocation
  x86/resctrl: Support Sub-NUMA Cluster (SNC) mode on Clearwater Forest
2025-12-02 11:55:58 -08:00
Linus Torvalds
2a47c26e55 - Add microcode staging support on Intel: it moves the sole microcode
blobs loading to a non-critical path so that microcode loading
   latencies are kept at minimum. The actual "directing" the hardware to
   load microcode is the only step which is done on the critical path.
   This scheme is also opportunistic as in: on a failure, the machinery
   falls back to normal loading
 
 - Add the capability to the AMD side of the loader to select one of two
   per-family/model/stepping patches: one is pre-Entrysign and the other
   is post-Entrysign; with the goal to take care of machines which
   haven't updated their BIOS yet - something they should absolutely do
   as this is the only proper Entrysign fix
 
 - Other small cleanups and fixlets
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmktjK4ACgkQEsHwGGHe
 VUqCzg/+NTMgw/cb6zvgXviUTTL62127q4YBr0G3AoNruYbWvdt65suK1pMoRUZL
 CDtflIjDTj8ZSIreXS6tUoFIAzsZUnPApUshHCXlHbK6hYbHDjQgkZme48P+AIqC
 kuP8zcqL0Epzv/Il/d9M8LEmP/0JUoACiibI5T0xMA5Ji9yw0njiHaHCBnrwXduy
 oNsTW8KSaGSaq+zbqa+cS7T06b6SNtUpAQyNSg4Jgj9u3+uPb3a9AfD81jGxUmYl
 SoM/gsiwYjujKV/ZAldnN6tOoRSECqeYLRT/J/Bbqe4zSM5gYh7TRg7N4AcZXKuY
 BLps8IbmiS6ZF2qziicJ7+zN35kXLeuVC+T4rq+IjvkTyH+eJsuGFnGYbXxCwV8A
 nkinSLtn6x0sebem/6H77OjNMLZU0zmLgWfiUfvgnXCErb7SZfs967aG8nxs5bDX
 CnEzS7/98sSkZm0yDSjp0TuXzo1PSGS9wcv30vOR4hClx42YmTZlBUJ5QHJQ9AB0
 1PNmLptwUk9rorTemAzB3Cstm490U7BEd32Od6b+NiIyKogL7uPJKHsQ2Q/t07tw
 ubBm5nFzIhCXWz9v5q1fkvInKAXytHdIN4OnzOPw+7jHF95Vpa2o22OBWaBaCRex
 96jCa4b6pPomxPD+LxdSSMtSihUa4PQz9VrrqnYn7vulumQ/YDo=
 =rxMs
 -----END PGP SIGNATURE-----

Merge tag 'x86_microcode_for_v6.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 microcode loading updates from Borislav Petkov:

 - Add microcode staging support on Intel: it moves the sole microcode
   blobs loading to a non-critical path so that microcode loading
   latencies are kept at minimum. The actual "directing" the hardware to
   load microcode is the only step which is done on the critical path.

   This scheme is also opportunistic as in: on a failure, the machinery
   falls back to normal loading

 - Add the capability to the AMD side of the loader to select one of two
   per-family/model/stepping patches: one is pre-Entrysign and the other
   is post-Entrysign; with the goal to take care of machines which
   haven't updated their BIOS yet - something they should absolutely do
   as this is the only proper Entrysign fix

 - Other small cleanups and fixlets

* tag 'x86_microcode_for_v6.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/microcode: Mark early_parse_cmdline() as __init
  x86/microcode/AMD: Select which microcode patch to load
  x86/microcode/intel: Enable staging when available
  x86/microcode/intel: Support mailbox transfer
  x86/microcode/intel: Implement staging handler
  x86/microcode/intel: Define staging state struct
  x86/microcode/intel: Establish staging control logic
  x86/microcode: Introduce staging step to reduce late-loading time
  x86/cpu/topology: Make primary thread mask available with SMP=n
2025-12-02 11:35:49 -08:00
Linus Torvalds
a61288200e - The second part of the AMD MCA interrupts rework after the last-minute
show-stopper from the last merge window was sorted out. After this,
   the AMD MCA deferred errors, thresholding and corrected errors
   interrupt handlers use common MCA code and are tightly integrated
   into the core MCA code, thereby getting rid of considerable
   duplication. All culminating into allowing CMCI error thresholding
   storms to be detected at AMD too, using the common infrastructure
 
 - Add support for two new MCA bank bits on AMD Zen6 which denote whether
   the error address logged is a system physical address, which obviates
   the need for it to be translated before further error recovery can be
   done
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmktlV8ACgkQEsHwGGHe
 VUrfGRAAsoVknP8SPap1dFpT82+avi7knEnZ56zuwCxjXOSlXDbvsrAUFsS8Io4o
 sf60gyUnFEFLW551qXUJoSnuSjf0S63tKmnX6ebUXtxe6mVC5Y0l3VGHz8/ymbCV
 8tLFF1yx6qMEwE2WutuIIeKGdZjn4lpg2lvhtaZnzeUSBk/BQcANjPaVYKQZPx/Q
 mXqpfvJnEBxkP6gy9VlrKxkpPyR0obD2/RFcN1M5dEbk0q52KNtcwyjblYR2XmNB
 7SVmwAcRkH+7Icp14XgHZamAs9NMdAShaQ7Rov7OjEucTnot+Q5BO/3ftvFOzvGu
 GHiY4rSew6QtKv4MWIYVHGrxIm6o6Sco7EFmESEC9UDX/Ck60WAj1LY6v6jKEF0g
 nnbqxO1hoD0ygNApBXMYleut8eqiriJlXCrImlaldkG8iQqsmf11kEHagS9EVtk0
 X28/eCoyD14a90NqmY13hBf2xscU41jy+LxdYy7sisL3LC4rhGgBpE/5vd/Ynnlf
 HELeQA8/5bIOgcbVvOIFxQGC+pBwhrHxIIOF0Z6pJZzznUO2cTUepJaLgWXdne7P
 EFE30+tDfeIy/bbB6CmkPV19NW3jNlkZib28t7L9uMCShPKiaza+Qv0SgzfeEy6t
 IERhzgmPxJiJ/7fOtdCUDL8YTlisiZ9t9RbSKNbriL54JHjX+Mc=
 =TY7F
 -----END PGP SIGNATURE-----

Merge tag 'ras_core_for_v6.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 RAS updates from Borislav Petkov:

 - The second part of the AMD MCA interrupts rework after the
   last-minute show-stopper from the last merge window was sorted out.
   After this, the AMD MCA deferred errors, thresholding and corrected
   errors interrupt handlers use common MCA code and are tightly
   integrated into the core MCA code, thereby getting rid of
   considerable duplication. All culminating into allowing CMCI error
   thresholding storms to be detected at AMD too, using the common
   infrastructure

 - Add support for two new MCA bank bits on AMD Zen6 which denote
   whether the error address logged is a system physical address, which
   obviates the need for it to be translated before further error
   recovery can be done

* tag 'ras_core_for_v6.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mce: Handle AMD threshold interrupt storms
  x86/mce: Do not clear bank's poll bit in mce_poll_banks on AMD SMCA systems
  x86/mce: Add support for physical address valid bit
  x86/mce: Save and use APEI corrected threshold limit
  x86/mce/amd: Define threshold restart function for banks
  x86/mce/amd: Remove redundant reset_block()
  x86/mce/amd: Support SMCA Corrected Error Interrupt
  x86/mce/amd: Enable interrupt vectors once per-CPU on SMCA systems
  x86/mce: Unify AMD DFR handler with MCA Polling
  x86/mce: Unify AMD THR handler with MCA Polling
2025-12-02 11:04:37 -08:00
Linus Torvalds
49219bba01 - imh_edac: Add a new EDAC driver for Intel Diamond Rapids and
future incarnations of this memory controllers architecture
 
 - amd64_edac: Remove the legacy csrow sysfs interface which has been
   deprecated and unused (we assume) for at least a decade
 
 - Add the capability to fallback to BIOS-provided address translation
   functionality (ACPI PRM) which can be used on systems unsupported by
   the current AMD address translation library
 
 - The usual fixes, fixlets, cleanups and improvements all over the place
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmktdyMACgkQEsHwGGHe
 VUpXTxAAhdQxn1v1tYKya6YHxBS3T3Y3+4fec+LeKgoY1YnoFHMse3TAU+G67opR
 1xnEKHKrkX4v1FAwe7eD2G6qyz2ytqcApv4XGxmQ1WgldFWuPl/lI3ngPNMCHMog
 dqeQFRQ7MXsk0no0cjMA6NjafFpYOGGGhIzdU3wvgZawH4hG9wHLS6Urvn2SfWj6
 Pf/449qS7XoPU5G22qWPqqixRHpc9BPkJfKMIYeaWbxldePlwbh9cOMLqwsZo1QV
 v5cv/3CAIVFzRvNVIx05kDhRrwqTjIZL+u9IYHg2g9DA45GQuktYQwd1KksbVpUn
 CijhpKMoSnQHN+ZLW84XzvEH2rvroSTZl28d5suY1GHXG3ePc9HpmTVbVElFXWKZ
 dq0X2RIbMEbSxneePFHJ4ESUfNN2HbPSfh/sXN4epxcMQI0VWVhXYs5+Ek4UV1+E
 hvhCS/kuAypODzEi0cULoMcXdyKr2V1zpaAHNlZshepp/kUzY46b3cBhxKiL3Fsd
 x+IhZgow9a+iMJfMpCJhMABKEkoZRgS3gs5nWMJ6t0EvulvknG+aovGB/Q0VaIIa
 H69Fn+R2ewnEuZf1JGZDMit1y+wjGgeamk+uWTym+tCyNH1eHaSq48POribajcYF
 UtcobK4kG7hPodsbwwD4MhqtSLhuyIcXTHbI3x4+r+LLAgdAPKM=
 =NidS
 -----END PGP SIGNATURE-----

Merge tag 'edac_updates_for_v6.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras

Pull EDAC updates from Borislav Petkov:

 - imh_edac: Add a new EDAC driver for Intel Diamond Rapids and future
   incarnations of this memory controllers architecture

 - amd64_edac: Remove the legacy csrow sysfs interface which has been
   deprecated and unused (we assume) for at least a decade

 - Add the capability to fallback to BIOS-provided address translation
   functionality (ACPI PRM) which can be used on systems unsupported by
   the current AMD address translation library

 - The usual fixes, fixlets, cleanups and improvements all over the
   place

* tag 'edac_updates_for_v6.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
  RAS/AMD/ATL: Replace bitwise_xor_bits() with hweight16()
  EDAC/igen6: Fix error handling in igen6_edac driver
  EDAC/imh: Setup 'imh_test' debugfs testing node
  EDAC/{skx_comm,imh}: Detect 2-level memory configuration
  EDAC/skx_common: Extend the maximum number of DRAM chip row bits
  EDAC/{skx_common,imh}: Add EDAC driver for Intel Diamond Rapids servers
  EDAC/skx_common: Prepare for skx_set_hi_lo()
  EDAC/skx_common: Prepare for skx_get_edac_list()
  EDAC/{skx_common,skx,i10nm}: Make skx_register_mci() independent of pci_dev
  EDAC/ghes: Replace deprecated strcpy() in ghes_edac_report_mem_error()
  EDAC/ie31200: Fix error handling in ie31200_register_mci
  RAS/CEC: Replace use of system_wq with system_percpu_wq
  EDAC: Remove the legacy EDAC sysfs interface
  EDAC/amd64: Remove NUM_CONTROLLERS macro
  EDAC/amd64: Generate ctl_name string at runtime
  RAS/AMD/ATL: Require PRM support for future systems
  ACPI: PRM: Add acpi_prm_handler_available()
  RAS/AMD/ATL: Return error codes from helper functions
2025-12-02 10:45:50 -08:00
Linus Torvalds
7f8d5f70ff Tree wide cleanup of the remaining users of in_irq() which got replaced
by in_hardirq() and marked deprecated in 2020.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmkvDhUTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoW+XD/959KAIm2JpcEYUWuNBmlhEyuYWvPLw
 ZyOiraLYBNyWmfCO/Yz4Ff8VZSR9gdWQoNfvBb8uxkbSXa0UOEUhCbzWsuoTnqR5
 ObTIHCJ9QmPlRiFDvs4Sf5TGmy/4nXh6/PoH3JykNdlD3rZMTxiAz/k6QuO/S2iu
 ykA+DNtNL7jDkQHzrWa3rf597BkBN1Z+hUD8zHRt8LYKRfmLYWjCMggjPLMnuqcn
 240fnV/FubCLd9f5ZgNxHQMQCQH2qB7GYMk08YwXwCZQqIIXWqbNnhedkkNO3kWq
 Sws4TEO6yg9pgTFqkuiDU5QgYEboRY4pDT45KSkdTHHGZl2OAAl3eVIGCto72UEI
 Eyzn4k900hZ1iI/Rad5mx3D4XJZEXFgEbXhjph0odn6jVvmSj+Fmg3J67u1niO2a
 obzB+xeaIkbGNQIgJFy8+A9SSnZckvuPlXdZdUxS2S95zH7f9+vBY8HWJMuyursa
 3AJAKa82mN1i3A9FdSuMTdttQWkDmrwPKVzxvixs1mBu7kB70XaRIKsPjZj7LH6X
 CiqP9Kt5FO0hVA7K+nKTeUA5DdjB4HzYzOgMqzFUhExY3hksVsj8rQEO6B0bCp9t
 CfITA3BvU7GXxhXZHOq3dABQ21J/ZHgeuK3QdQSnOxSQOv2ElYIdKvYirJy2QdS1
 tSM3O3GXb4zWDg==
 =6LKf
 -----END PGP SIGNATURE-----

Merge tag 'core-core-2025-12-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull core irq cleanup from Thomas Gleixner:
 "Tree wide cleanup of the remaining users of in_irq() which got
  replaced by in_hardirq() and marked deprecated in 2020"

* tag 'core-core-2025-12-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  treewide: Remove in_irq()
2025-12-02 10:18:49 -08:00
Linus Torvalds
d42e504a55 Update to the time/timers core:
- Prevent a thundering herd problem when the timekeeper CPU is delayed
     and a large number of CPUs compete to acquire jiffies_lock to do the
     update. Limit it to one CPU with a separate "uncontended" atomic
     variable.
 
   - A set of improvements for the timer migration mechanism:
 
     - Support imbalanced NUMA trees correctly
 
     - Support dynamic exclusion of CPUs from the migrator duty to allow the
       cpuset/isolation mechanism to exclude them from handling timers of
       remote idle CPUs.
 
    - The usual small updates, cleanups and enhancements
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmks7doTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoaxrD/40nxx+8cEXsVbVLIkP2PQbd2Y8+7sk
 YbNu/Cb7j7Bg7R8YIs4p5GHk+7Yt/hNsW77SmbAzRPUyYYG6L3bUYlBa3yQlvIuo
 xRPbzGA+RJies9skIGHbQ8z6ig1zUASRJPcBYiuaVIAuQhCfLNc4Nii9cEWtjZ24
 +5gfRwV+vy74ArWwRkwaGejDK1tav+gd62OkFQZC8WtjQ08ozGZ6VBJNg7nYq/gH
 FYO1rH2tQ/ZyjlO/x5NF8gFcjYD8iv5PDp8oH35MPx+XTdDccf0G3QB7ug0ffVdV
 b4gA6lZTAmpsu/NHb6ByN4i/kf3wf8la/i+EaAh/Ov7NW078gunvVKVA7jStcbBl
 ZgG5SRHiKRvQF/WXLGVQAnilRDZwRuS0nmJlqfExa44v23l5o3768RwdRYwQlv8g
 X5KSRl0jlVgVtZHgNBlZtgX9+rnQSr9sB5sVGBP2a6a1WhVXQV/2kp0wjdnU0mPw
 jLCnSdsHqBlSf9V7O/na823WCnBFb7blrLBXUoSbHBnICqtVFzhE1kBXWw3S7Kqh
 CiaWM+S4WfR0HRnUlWMTS8BZ82MgiDnd7nGUXWwXBbdqWmoj/9CoU6SZRjbMBkzi
 EY1XvmoYf6eSzdxfydI1hFi0/bbb8K9umHQlrpW3HeN9uXnVc0/+TroVPLuaKUdi
 53ClqXjzE+CpJg==
 =lQKn
 -----END PGP SIGNATURE-----

Merge tag 'timers-core-2025-11-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer core updates from Thomas Gleixner:

 - Prevent a thundering herd problem when the timekeeper CPU is delayed
   and a large number of CPUs compete to acquire jiffies_lock to do the
   update. Limit it to one CPU with a separate "uncontended" atomic
   variable.

 - A set of improvements for the timer migration mechanism:

     - Support imbalanced NUMA trees correctly

     - Support dynamic exclusion of CPUs from the migrator duty to allow
       the cpuset/isolation mechanism to exclude them from handling
       timers of remote idle CPUs

 - The usual small updates, cleanups and enhancements

* tag 'timers-core-2025-11-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  timers/migration: Exclude isolated cpus from hierarchy
  cpumask: Add initialiser to use cleanup helpers
  sched/isolation: Force housekeeping if isolcpus and nohz_full don't leave any
  cgroup/cpuset: Rename update_unbound_workqueue_cpumask() to update_isolation_cpumasks()
  timers/migration: Use scoped_guard on available flag set/clear
  timers/migration: Add mask for CPUs available in the hierarchy
  timers/migration: Rename 'online' bit to 'available'
  selftests/timers/nanosleep: Add tests for return of remaining time
  selftests/timers: Clean up kernel version check in posix_timers
  time: Fix a few typos in time[r] related code comments
  time: tick-oneshot: Add missing Return and parameter descriptions to kernel-doc
  hrtimer: Store time as ktime_t in restart block
  timers/migration: Remove dead code handling idle CPU checking for remote timers
  timers/migration: Remove unused "cpu" parameter from tmigr_get_group()
  timers/migration: Assert that hotplug preparing CPU is part of stable active hierarchy
  timers/migration: Fix imbalanced NUMA trees
  timers/migration: Remove locking on group connection
  timers/migration: Convert "while" loops to use "for"
  tick/sched: Limit non-timekeeper CPUs calling jiffies update
2025-12-02 09:58:33 -08:00
Linus Torvalds
5028f42416 Updates for clocksource and clockevent drivers:
- A new driver for the Realtel system timer
 
  - Prevent the unbinding of timers when the drivers do not support that.
 
  - Expand the timer counter readout for the SPRD driver to 64 bit to allow
    IOT devices suspend times of more than 36 hours, which is the current
    limit of the 32-bi readout
 
  - The usual small cleanups, fixes and enhancements all over the place.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmksxAATHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYobIuD/9HNzi+SDKiWiwuhZEfwrk4IJY1k4uM
 yRrxQHt8yODKPq13M1eKNiXro3Tbhq6cLhECdQ6Rsf/g4Q0x+TeAl1M2CfHLMOJ5
 +VYNqAx7b63bkZIp1pJk8HJfn4e9itDKnqEgi0M20tIoG3K8fLtZfyIdiuqOsTia
 USWOdOqnPtwIOtVvMLPCmjYTh2FFHFxxcrQgoAW+1ACwOq/AkSSSAqKNcjEB7edH
 7C9IZpm6rCl+13ywMiHS5UsOYFWz1fOgSmQ1c7KPqx9PquMaJ7oZFAQgb2FF0xXJ
 S8DwTMKlwCO2Tq15XjmmCPLlvsGzZgVJkXhDsqyrDAZzOowqjHuT/HTrENLcE3K3
 /gS721vahsLWfJp229whKkT11RDgQOO2c/3cplsL2joUyrkDzW4sloYuu00gqWrJ
 mR9srdA7F3HeSACPb6rX64Rzg3m63P/zJ20h2uJt/JblIkZd+3kBTELM30GZRQbn
 z176KwiRPy0TDbN8pW1I4I1sLtG7zYhaEsASGZM9yH9uKYU1cLej1SmmbLqDs3oO
 e0+QyK+A4OzR43LiRltN4X3dJJ59uf+zru12WGjV85WxJsA4rN4/5q/S0xcpWR7b
 eQNXn/YZwppdlwxTg+n2RWSTzOFtvNm8nfnepxB2UqffOAa1Ah87AT3rPaUrCULj
 NwI9Fy4AY4IvVQ==
 =0426
 -----END PGP SIGNATURE-----

Merge tag 'timers-clocksource-2025-11-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull clocksource updates from Thomas Gleixner:
 "Updates for clocksource and clockevent drivers:

   - A new driver for the Realtel system timer

   - Prevent the unbinding of timers when the drivers do not support
     that

   - Expand the timer counter readout for the SPRD driver to 64 bit
     to allow IOT devices suspend times of more than 36 hours, which
     is the current limit of the 32-bi readout

   - The usual small cleanups, fixes and enhancements all over the
     place"

* tag 'timers-clocksource-2025-11-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  clocksource/drivers: Add Realtek system timer driver
  dt-bindings: timer: Add Realtek SYSTIMER
  clocksource/drivers/stm32-lp: Drop unused module alias
  clocksource/drivers/rda: Add sched_clock_register for RDA8810PL SoC
  clocksource/drivers/nxp-stm: Prevent driver unbind
  clocksource/drivers/nxp-pit: Prevent driver unbind
  clocksource/drivers/arm_arch_timer_mmio: Prevent driver unbind
  clocksource/drivers/nxp-stm: Fix section mismatches
  clocksource/drivers/sh_cmt: Always leave device running after probe
  clocksource/drivers/stm: Fix double deregistration on probe failure
  clocksource/drivers/ralink: Fix resource leaks in init error path
  clocksource/drivers/timer-sp804: Fix read_current_timer() issue when clock source is not registered
  clocksource/drivers/sprd: Enable register for timer counter from 32 bit to 64 bit
2025-12-02 09:54:27 -08:00
Linus Torvalds
9ce62ebbb7 Updates for [PCI] MSI related code:
- Remove one variant of PCI/MSI management as all users have been
    converted to use per device domains. That reduces the variants to two:
 
    The modern and the real archaic legacy variant, which keeps the usual
    suspects in the museum category alive.
 
  - Rework the platform MSI device ID detection mechanism in the ARM GIC
    world to address resource leaks, duplicated code and other details. This
    requires a corresponding preparatory step in the PCI/iproc driver.
 
  - Trivial core code cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmkswn0THHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoYpuD/wKT7d6I6AqnJVF/RhiJ+/d6vuX/aFW
 g6E7XAkMLKhmxunSNFfPzXsHy2a0oJroYKmDJH4C8GWGo/gXa+QvmDt2491k9rdV
 zM+CBodBu3/bXWvTW+o1fbyAvG+p2C3+iSRW/gGqzPdcY8gQiRnNOZS1j7zusMjO
 A6pz5SvLSPWQUnVl9PygJBuNX5TFHPnY3AySRpW11CvqB5/8gqGz+O6lT/Q+5hov
 GUC57hskbQd1PsYhTNRaUR4z7VMolPHqscp8DYVCWjOMP/r5quC6dlsn91yxuATU
 8D7oRiW8xkCaTJplY/rA6r/VxUthZ3EgIxzev3rGaWBdPxHcFfftf2oxyFFAf3lf
 3rEdfGBcNgApx+MCcoT5/3mf3KJfn2/bE6bZhwv94+dtbTlHguztyMD3vnGTS73i
 zPWQ5ae4M5sqc8kCNMRaBfU8yQEHEKs3gia67vStZyn5R/uUNVKRo67LBPZKVDcJ
 2511Ylnm62yG6PtdPGIFHY1i75uPpxXuS7F0BJignzM3iPvVvwLPZLDORr3/pR4q
 CmswZTA2obue6+nwz/LUacxzONsZ2Z8pzGY6rrT9sfj0Z4mk6xrfEPfjfmVoMpyk
 Dk4B8lIVYwcR7d/Sw+FIwYst8iw+L77Yn7kN8yCbh4lAOxBUUvtS5KAP6uPGe3D1
 30Q/DbBVlEvg/g==
 =VAtQ
 -----END PGP SIGNATURE-----

Merge tag 'irq-msi-2025-11-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull MSI updates from Thomas Gleixner:
 "Updates for [PCI] MSI related code:

   - Remove one variant of PCI/MSI management as all users have been
     converted to use per device domains. That reduces the variants to
     two:

     The modern and the real archaic legacy variant, which keeps the
     usual suspects in the museum category alive.

   - Rework the platform MSI device ID detection mechanism in the ARM
     GIC world to address resource leaks, duplicated code and other
     details. This requires a corresponding preparatory step in the
     PCI/iproc driver.

   - Trivial core code cleanups"

* tag 'irq-msi-2025-11-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/gic-its: Rework platform MSI deviceID detection
  PCI: iproc: Implement MSI controller node detection with of_msi_xlate()
  genirq/msi: Slightly simplify msi_domain_alloc()
  PCI/MSI: Delete pci_msi_create_irq_domain()
2025-12-02 09:35:59 -08:00
Linus Torvalds
15b87bec89 Boring updates for interrupt drivers:
- Support for a couple of new ARM64 and RISCV SoC variants and their
     magic interrupt controllers which either can reuse existing code or
     require quirks due to a botched hardware implementation.
 
   - More section mismatch fixes.
 
   - The usual cleanups and fixes all over the place.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmkswMYTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoZvSEACZdCx7vO2XX7oef7DxQ6EKFA/NQvd0
 xJGFTBlrxIucp26yUxWKkuDVdhu8WYe13zJG6+LVl9IxH3IIBa2duQ4HhIyqxuz6
 z74IDjBlOcKAHu2xLFJmBIS4vGTd6UPOg1KvSrIFd9oiuMXikphbnFgyrFGAFiSQ
 J1gP7mKZATUH08mTXK5k1pmBIbMjEHpyyTdBEJKoVgiN/MB/qsq95dy0Oxal+C13
 1cOKBaFreTMdX+77U5RucBcGaLHW4SdoaAVaqc/UXw2c2TAezbt/gPYexRpkdVaG
 2tuYTWIfCUuHbjUoOOYwI+ILnuiBMzjxlIUx3uSvcvtUVO4YuMDR4JOWVsevtfgI
 uUV+4OPq9kBI6PNqAyo16NhDdZ9rmjg3q14F9oyidQfR5gRbsZPPDmtCB/M2jbE1
 n3LlsHUJt0UYo8ZqCPrGhiw9hkGXv4wsEl10FKkyoNrQ0Y4SCUrdzGdr6vwhAAub
 yxMe1+BrFQT23R9l+qVrUZmDmpV9tlFNr6rPwtucrQX3PMWEfAeCc6a/vjY3eqJl
 sZ4pGyFEx0cwfKzHu1/SmNpnjSNdyc7niiN8HAQ7AnxzRW13fDdGQuuVGsKxHyJc
 Tke9wJsyUO4MxpSQDI+cmpsF8OeJDHuRDKMBdLFxlLPhABECdLUO0qKq9l0Ry/Ji
 uDkc3WvM14zKpw==
 =kdyt
 -----END PGP SIGNATURE-----

Merge tag 'irq-drivers-2025-11-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq driver updates from Thomas Gleixner:
 "Boring updates for interrupt drivers:

   - Support for a couple of new ARM64 and RISCV SoC variants and their
     magic interrupt controllers which either can reuse existing code or
     require quirks due to a botched hardware implementation

   - More section mismatch fixes

   - The usual cleanups and fixes all over the place"

* tag 'irq-drivers-2025-11-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (32 commits)
  irqchip/meson-gpio: Add support for Amlogic S6 S7 and S7D SoCs
  dt-bindings: interrupt-controller: Add support for Amlogic S6 S7 and S7D SoCs
  dt-bindings: interrupt-controller: aspeed,ast2700: Correct #interrupt-cells and interrupts count
  irqchip/aclint-sswi: Add Nuclei UX900 support
  dt-bindings: interrupt-controller: Add Anlogic DR1V90 ACLINT SSWI
  dt-bindings: interrupt-controller: Add Anlogic DR1V90 ACLINT MSWI
  dt-bindings: interrupt-controller: Add Anlogic DR1V90 PLIC
  irqchip/irq-bcm7038-l1: Remove unused reg_mask_status()
  irqchip/sifive-plic: Fix call to __plic_toggle() in M-Mode code path
  irqchip/sifive-plic: Add support for UltraRISC DP1000 PLIC
  irqchip/sifive-plic: Cache the interrupt enable state
  dt-bindings: interrupt-controller: Add UltraRISC DP1000 PLIC
  dt-bindings: vendor-prefixes: Add UltraRISC
  irqchip/qcom-irq-combiner: Rename driver structure
  irqchip/riscv-imsic: Inline imsic_vector_from_local_id()
  irqchip/riscv-imsic: Embed the vector array in lpriv
  irqchip/riscv-imsic: Remove redundant irq_data lookups
  irqchip/ts4800: Drop unused module alias
  irqchip/mvebu-pic: Drop unused module alias
  irqchip/meson-gpio: Drop unused module alias
  ...
2025-12-02 09:32:53 -08:00
Linus Torvalds
6863c8385c Updates for the interrupt core and treewide cleanups:
- Rework of the Per Processor Interrupt (PPI) management on ARM[64].
 
     PPI support was built under the assumption that the systems are
     homogenous so that the same CPU local device types are connected to
     them. That's unfortunately wishful thinking and created horrible
     workarounds.
 
     This rework provides affinity management for PPIs so that they can be
     individually configured in the firmware tables and mops up the related
     drivers all over the place.
 
   - Prevent CPUSET/isolation changes to arbitrarily affine interrupt
     threads to random CPUs, which ignores user or driver settings.
 
   - Plug a harmless race in the interrupt affinity proc interface, which
     allows to see a half updated mask
 
   - Adjust the priority of secondary interrupt threads on RT, so that the
     combination of primary and secondary thread emulates the hardware
     interrupt plus thread scenario. Having them at the same priority can
     cause starvation issues in some drivers.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmksv3oTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoe5+D/wNnBaX9LRajuLOF+zaYw5WZxkzp6U7
 X4AP3cLny8xynI1kM5V8M1ym3Fspk0hiqxNX2LLXrSZzBR+3O4uGCyCceBXeHKo2
 vW4auUXG4MB+2sZyudQXaBpNK4A2YBubycTUcRECjkjDkBPAWgN7J+Oz2lXUSUcH
 zlitlHNo48hnZQPAJr4PDpi5q9+rChn+8/s+K1d8NlEf9HOXC98qzyMuMq+jHdJE
 AQ6tKoHkA5lHjHAUY3AbWptoHo1Wp+p5PSqsrFr6nbKuPlhUqRNEPXX0Z8q7aUTj
 NgdkvIHJVJ0C+T40FIWCNzUYOUk4gTQXBSPvptwJSHAmf9ovp+Kg2ltVZBzyL2iI
 R0EZSQAQU8iJcRrqjcAYqI36LkmwwVT6RD1zFa98xJT/AjsMpAt/U1pEMDtkoTKe
 Lv7ZQ/hloc+4wV4xS4zEtoV/ukdUfA9aEdXsh5hNH/07tvatpKO2LgortsiI+lCK
 76vAULcGvbMr5Jr63snjICgstahunpNMRn2HmnGAjmdZf4+g+TDvZR4DI6bswtuO
 jp5G6OM30Z9zKheAr1VioV1XAKr6Y4jDKVjfFy/n1k5pDwYaSJopmZxSD35aas4e
 VqWizAzc5dAVCYRlzr6S1lrMQ2JJRg0RpIn+sMS8dhf9SK7hs5ilGSOsgX1fgVat
 1N3WXvYM8vSW+g==
 =zrA1
 -----END PGP SIGNATURE-----

Merge tag 'irq-core-2025-11-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq core updates from Thomas Gleixner:
 "Updates for the interrupt core and treewide cleanups:

   - Rework of the Per Processor Interrupt (PPI) management on ARM[64]

     PPI support was built under the assumption that the systems are
     homogenous so that the same CPU local device types are connected to
     them. That's unfortunately wishful thinking and created horrible
     workarounds.

     This rework provides affinity management for PPIs so that they can
     be individually configured in the firmware tables and mops up the
     related drivers all over the place.

   - Prevent CPUSET/isolation changes to arbitrarily affine interrupt
     threads to random CPUs, which ignores user or driver settings.

   - Plug a harmless race in the interrupt affinity proc interface,
     which allows to see a half updated mask

   - Adjust the priority of secondary interrupt threads on RT, so that
     the combination of primary and secondary thread emulates the
     hardware interrupt plus thread scenario. Having them at the same
     priority can cause starvation issues in some drivers"

* tag 'irq-core-2025-11-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (33 commits)
  genirq: Remove cpumask availability check on kthread affinity setting
  genirq: Fix interrupt threads affinity vs. cpuset isolated partitions
  genirq: Prevent early spurious wake-ups of interrupt threads
  genirq: Use raw_spinlock_irq() in irq_set_affinity_notifier()
  genirq/manage: Reduce priority of forced secondary interrupt handler
  genirq/proc: Fix race in show_irq_affinity()
  genirq: Fix percpu_devid irq affinity documentation
  perf: arm_pmu: Kill last use of per-CPU cpu_armpmu pointer
  irqdomain: Kill of_node_to_fwnode() helper
  genirq: Kill irq_{g,s}et_percpu_devid_partition()
  irqchip: Kill irq-partition-percpu
  irqchip/apple-aic: Drop support for custom PMU irq partitions
  irqchip/gic-v3: Drop support for custom PPI partitions
  coresight: trbe: Request specific affinities for per CPU interrupts
  perf: arm_spe_pmu: Request specific affinities for per CPU interrupts
  perf: arm_pmu: Request specific affinities for per CPU NMIs/interrupts
  genirq: Add request_percpu_irq_affinity() helper
  genirq: Allow per-cpu interrupt sharing for non-overlapping affinities
  genirq: Update request_percpu_nmi() to take an affinity
  genirq: Add affinity to percpu_devid interrupt requests
  ...
2025-12-02 09:14:26 -08:00
Linus Torvalds
312f5b1866 Two small updates for debugobjects:
- Allow pool refill on RT enabled kernels before the scheduler is up
       and running to prevent pool exhaustion
 
     - Correct the lockdep override to prevent false positives.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmksu/UTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoScuD/9g3OtG29VZ2uNkOhgvGKuAThxZ+Y4d
 8YPNT/X9SuSevuwBCc9zXgpc5S4Af40ndRbcsiZ38t/xrInE6+J6qPJ7BbKCTiac
 cYvNz4ibMx1qz35BPtym4RnJyZA2EHX2hVIFGCfdh4MkILI7r3OPjemX542epZAW
 8MdKu5WZJNa8KYIvUE1UZdDtH1imxU7jdYBkr1ockN66+HMjRKHxcPwrhTCFJeCT
 N4DHOQ+hf9NzipHpRppDmqwkzQCOyKrOojXht00rG92QXIzmZRepH93cCFi/nW1d
 8aUjHU6myNQa65VkFDM2I2bpzCzlK7HpBU3iNXEkXPLZ8bVrYMP9koK+SXIa+Gj0
 icXdJwBe9uOKQOaG6MRSO2hn3fHO0m+PjZGtQFg7EqFCaY0J+8tv9k3WttDDpfMg
 hjXjyJ0U9T+/YUuSDBLdPczIJZr8eGh960SF0OTshHGGVOCGJt4dlvoC0NtUxN8x
 WQ/he9K/Cyz7U6yr1aNO6hAfqX/+6c0ZhD3OONuC9xgxHUkjPdlEe1ntLbdfn92z
 VygbJaguvRdzkAeaAlXNNU5WTNvm3ZeLPqDnnUHUlDW1f7hF0KwCrfZUW0PqdB76
 +94ptMeIlCv53zIEKamHuALGp7WtGddGGzaZLH8rUnPxfiff+JiMhXtV0ioMuUNG
 jpdlyBMXK+s0PA==
 =dUo2
 -----END PGP SIGNATURE-----

Merge tag 'core-debugobjects-2025-11-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull debugobjects update from Thomas Gleixner:
 "Two small updates for debugobjects:

   - Allow pool refill on RT enabled kernels before the scheduler is up
     and running to prevent pool exhaustion

   - Correct the lockdep override to prevent false positives"

* tag 'core-debugobjects-2025-11-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  debugobjects: Use LD_WAIT_CONFIG instead of LD_WAIT_SLEEP
  debugobjects: Allow to refill the pool before SYSTEM_SCHEDULING
2025-12-02 09:07:48 -08:00
Linus Torvalds
2b09f480f0 A large overhaul of the restartable sequences and CID management:
The recent enablement of RSEQ in glibc resulted in regressions which are
   caused by the related overhead. It turned out that the decision to invoke
   the exit to user work was not really a decision. More or less each
   context switch caused that. There is a long list of small issues which
   sums up nicely and results in a 3-4% regression in I/O benchmarks.
 
   The other detail which caused issues due to extra work in context switch
   and task migration is the CID (memory context ID) management. It also
   requires to use a task work to consolidate the CID space, which is
   executed in the context of an arbitrary task and results in sporadic
   uncontrolled exit latencies.
 
   The rewrite addresses this by:
 
   - Removing deprecated and long unsupported functionality
 
   - Moving the related data into dedicated data structures which are
     optimized for fast path processing.
 
   - Caching values so actual decisions can be made
 
   - Replacing the current implementation with a optimized inlined variant.
 
   - Separating fast and slow path for architectures which use the generic
     entry code, so that only fault and error handling goes into the
     TIF_NOTIFY_RESUME handler.
 
   - Rewriting the CID management so that it becomes mostly invisible in the
     context switch path. That moves the work of switching modes into the
     fork/exit path, which is a reasonable tradeoff. That work is only
     required when a process creates more threads than the cpuset it is
     allowed to run on or when enough threads exit after that. An artificial
     thread pool benchmarks which triggers this did not degrade, it actually
     improved significantly.
 
     The main effect in migration heavy scenarios is that runqueue lock held
     time and therefore contention goes down significantly.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmksaRYTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoencEADA5he8PAFPmSRRPo6+2G5mHzWe8kIU
 5ZViQStWFNAA0qqy8VXryWiJ6qqrO6la9o7K4YOXASUtlkVjquRp1DF7PabqGwuy
 zshbRCXNlT51J8uqanN8VrGVjlf+bMdHDbGoI1SLkUTxG8b+kDD5PXUQE1ARelPP
 Slbg9u+EMrxj6D5MDTPbuW6TqryJEkPtiNScyOz43emp9ww9+WVxenOcRqU4D+Th
 mjWmrGIzkroSf4XReMoD/wg9TPTpUjXnNCwl2viY9JvBpkMfYtU4tJAGK3aNFOWy
 zsAN0O9CaFGrUEFne7qUmtwhNLdtnjx5HN5pe7yZd1EhdTuQKq4jPiiQnwwm8w72
 c0o6m45FNPmPoSyfaZWCkLjbTEUXonT9JF61iN35JVxim8gBDDJjHFKnLxDmLrH3
 X0eESE48ReY2EneDV6Y8RJRo6oG14Fccvc39aTf/2Rw3trpmtt2agvConQzupQIg
 DzANw4jhUUzFRrHrMHACNsqKFXh9ratue/S9DM3xxTpGO/bKdeK7jGIgzNf8O34M
 J0O6Hvk5jMdcWlIJTx21GoGzoSkkXnR49g/71aCcp+MwdY4x9zFz5SWi8LWQRmkx
 xRo6tY27Bma8/SEwMJjPpAUXDTpq6v+j3cPisybL1yGsyt9lh+p8LX7VUtwcoEqe
 6ZelC5Kgw/+/kg==
 =n5KT
 -----END PGP SIGNATURE-----

Merge tag 'core-rseq-2025-11-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull rseq updates from Thomas Gleixner:
 "A large overhaul of the restartable sequences and CID management:

  The recent enablement of RSEQ in glibc resulted in regressions which
  are caused by the related overhead. It turned out that the decision to
  invoke the exit to user work was not really a decision. More or less
  each context switch caused that. There is a long list of small issues
  which sums up nicely and results in a 3-4% regression in I/O
  benchmarks.

  The other detail which caused issues due to extra work in context
  switch and task migration is the CID (memory context ID) management.
  It also requires to use a task work to consolidate the CID space,
  which is executed in the context of an arbitrary task and results in
  sporadic uncontrolled exit latencies.

  The rewrite addresses this by:

   - Removing deprecated and long unsupported functionality

   - Moving the related data into dedicated data structures which are
     optimized for fast path processing.

   - Caching values so actual decisions can be made

   - Replacing the current implementation with a optimized inlined
     variant.

   - Separating fast and slow path for architectures which use the
     generic entry code, so that only fault and error handling goes into
     the TIF_NOTIFY_RESUME handler.

   - Rewriting the CID management so that it becomes mostly invisible in
     the context switch path. That moves the work of switching modes
     into the fork/exit path, which is a reasonable tradeoff. That work
     is only required when a process creates more threads than the
     cpuset it is allowed to run on or when enough threads exit after
     that. An artificial thread pool benchmarks which triggers this did
     not degrade, it actually improved significantly.

     The main effect in migration heavy scenarios is that runqueue lock
     held time and therefore contention goes down significantly"

* tag 'core-rseq-2025-11-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (54 commits)
  sched/mmcid: Switch over to the new mechanism
  sched/mmcid: Implement deferred mode change
  irqwork: Move data struct to a types header
  sched/mmcid: Provide CID ownership mode fixup functions
  sched/mmcid: Provide new scheduler CID mechanism
  sched/mmcid: Introduce per task/CPU ownership infrastructure
  sched/mmcid: Serialize sched_mm_cid_fork()/exit() with a mutex
  sched/mmcid: Provide precomputed maximal value
  sched/mmcid: Move initialization out of line
  signal: Move MMCID exit out of sighand lock
  sched/mmcid: Convert mm CID mask to a bitmap
  cpumask: Cache num_possible_cpus()
  sched/mmcid: Use cpumask_weighted_or()
  cpumask: Introduce cpumask_weighted_or()
  sched/mmcid: Prevent pointless work in mm_update_cpus_allowed()
  sched/mmcid: Move scheduler code out of global header
  sched: Fixup whitespace damage
  sched/mmcid: Cacheline align MM CID storage
  sched/mmcid: Use proper data structures
  sched/mmcid: Revert the complex CID management
  ...
2025-12-02 08:48:53 -08:00
Linus Torvalds
1dce50698a Scoped user mode access and related changes:
- Implement the missing u64 user access function on ARM when
    CONFIG_CPU_SPECTRE=n. This makes it possible to access a 64bit value in
    generic code with [unsafe_]get_user(). All other architectures and ARM
    variants provide the relevant accessors already.
 
  - Ensure that ASM GOTO jump label usage in the user mode access helpers
    always goes through a local C scope label indirection inside the
    helpers. This is required because compilers are not supporting that a
    ASM GOTO target leaves a auto cleanup scope. GCC silently fails to emit
    the cleanup invocation and CLANG fails the build.
 
    This provides generic wrapper macros and the conversion of affected
    architecture code to use them.
 
  - Scoped user mode access with auto cleanup
 
    Access to user mode memory can be required in hot code paths, but if it
    has to be done with user controlled pointers, the access is shielded
    with a speculation barrier, so that the CPU cannot speculate around the
    address range check. Those speculation barriers impact performance quite
    significantly. This can be avoided by "masking" the provided pointer so
    it is guaranteed to be in the valid user memory access range and
    otherwise to point to a guaranteed unpopulated address space. This has
    to be done without branches so it creates an address dependency for the
    access, which the CPU cannot speculate ahead.
 
    This results in repeating and error prone programming patterns:
 
      	    if (can_do_masked_user_access())
                     from = masked_user_read_access_begin((from));
             else if (!user_read_access_begin(from, sizeof(*from)))
                     return -EFAULT;
             unsafe_get_user(val, from, Efault);
             user_read_access_end();
             return 0;
       Efault:
             user_read_access_end();
             return -EFAULT;
 
     which can be replaced with scopes and automatic cleanup:
 
             scoped_user_read_access(from, Efault)
                     unsafe_get_user(val, from, Efault);
             return 0;
        Efault:
             return -EFAULT;
 
  - Convert code which implements the above pattern over to
    scope_user.*.access(). This also corrects a couple of imbalanced
    masked_*_begin() instances which are harmless on most architectures, but
    prevent PowerPC from implementing the masking optimization.
 
  - Add a missing speculation barrier in copy_from_user_iter()
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmksRfITHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoVhBEACEySjWcyCrD1e0ZFMFAOJZFI2BShav
 reotzCzmHYQdpVukDRxc64BgM2vN4yB04xnyMhi2o4hSTiIJhz1NzbKggsQJhVoA
 psYz+xEI161HuLZnUBUBuF9RRko/HVsbGqO2JFCuOKor4GCycvjVgupR3EIN9h5T
 HZEWGIgaTmN7MBj0QRrJgJkaaSTnPKOwWaNMV/F9pfk27zuB7vuV8WM9P3FaJYG+
 JGa9td7VGaBpWavxgMJqfdvXWBCVDDfZ1dunWx8tPTnLxKZZZD6HlfQXhZTr2n1e
 rtJpGgfVBx5Uqxn4RrhS0I7QeK1b9rrt3IU7EkFoaa3Z8LU5B7cHlm7KyicyoHhy
 SzFFUszssznT/0OhA5fmgPRlqI295HynW2p1L4Xy9hC0EZ2vXJPG5rO6X3x6QwSR
 asjRB7x/6JzWQUzE7/nhXd9KcB66wvQxhnjp7GqulF74aPBCtIdXXDD68YEDYkbi
 dPC3NRBr0ePbsGVGWbYvYIPWcvo1u814C2io1zKwmVbiN6lCYURgQK861vfAZUP8
 oP5D2a6ENgezDKoJo6eJ82inuDu64qZy7OOkU/aO3cbOuWGVyY9CjYD11x85Nr0k
 UNabSOfvcmhmobtYUiAgLLrjX1grQUG3F74ZQTw513mwgMObuDAAoS11GPjY6HL6
 b99WUJRv8jP66A==
 =6no0
 -----END PGP SIGNATURE-----

Merge tag 'core-uaccess-2025-11-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scoped user access updates from Thomas Gleixner:
 "Scoped user mode access and related changes:

   - Implement the missing u64 user access function on ARM when
     CONFIG_CPU_SPECTRE=n.

     This makes it possible to access a 64bit value in generic code with
     [unsafe_]get_user(). All other architectures and ARM variants
     provide the relevant accessors already.

   - Ensure that ASM GOTO jump label usage in the user mode access
     helpers always goes through a local C scope label indirection
     inside the helpers.

     This is required because compilers are not supporting that a ASM
     GOTO target leaves a auto cleanup scope. GCC silently fails to emit
     the cleanup invocation and CLANG fails the build.

     [ Editor's note: gcc-16 will have fixed the code generation issue
       in commit f68fe3ddda4 ("eh: Invoke cleanups/destructors in asm
       goto jumps [PR122835]"). But we obviously have to deal with clang
       and older versions of gcc, so.. - Linus ]

     This provides generic wrapper macros and the conversion of affected
     architecture code to use them.

   - Scoped user mode access with auto cleanup

     Access to user mode memory can be required in hot code paths, but
     if it has to be done with user controlled pointers, the access is
     shielded with a speculation barrier, so that the CPU cannot
     speculate around the address range check. Those speculation
     barriers impact performance quite significantly.

     This cost can be avoided by "masking" the provided pointer so it is
     guaranteed to be in the valid user memory access range and
     otherwise to point to a guaranteed unpopulated address space. This
     has to be done without branches so it creates an address dependency
     for the access, which the CPU cannot speculate ahead.

     This results in repeating and error prone programming patterns:

       	    if (can_do_masked_user_access())
                      from = masked_user_read_access_begin((from));
              else if (!user_read_access_begin(from, sizeof(*from)))
                      return -EFAULT;
              unsafe_get_user(val, from, Efault);
              user_read_access_end();
              return 0;
        Efault:
              user_read_access_end();
              return -EFAULT;

      which can be replaced with scopes and automatic cleanup:

              scoped_user_read_access(from, Efault)
                      unsafe_get_user(val, from, Efault);
              return 0;
         Efault:
              return -EFAULT;

   - Convert code which implements the above pattern over to
     scope_user.*.access(). This also corrects a couple of imbalanced
     masked_*_begin() instances which are harmless on most
     architectures, but prevent PowerPC from implementing the masking
     optimization.

   - Add a missing speculation barrier in copy_from_user_iter()"

* tag 'core-uaccess-2025-11-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  lib/strn*,uaccess: Use masked_user_{read/write}_access_begin when required
  scm: Convert put_cmsg() to scoped user access
  iov_iter: Add missing speculation barrier to copy_from_user_iter()
  iov_iter: Convert copy_from_user_iter() to masked user access
  select: Convert to scoped user access
  x86/futex: Convert to scoped user access
  futex: Convert to get/put_user_inline()
  uaccess: Provide put/get_user_inline()
  uaccess: Provide scoped user access regions
  arm64: uaccess: Use unsafe wrappers for ASM GOTO
  s390/uaccess: Use unsafe wrappers for ASM GOTO
  riscv/uaccess: Use unsafe wrappers for ASM GOTO
  powerpc/uaccess: Use unsafe wrappers for ASM GOTO
  x86/uaccess: Use unsafe wrappers for ASM GOTO
  uaccess: Provide ASM GOTO safe wrappers for unsafe_*_user()
  ARM: uaccess: Implement missing __get_user_asm_dword()
2025-12-02 08:01:39 -08:00
Linus Torvalds
4a26e7032d Core kernel bug handling infrastructure changes for v6.19:
- Improve WARN(), which has vararg printf like arguments,
     to work with the x86 #UD based WARN-optimizing infrastructure
     by hiding the format in the bug_table and replacing this
     first argument with the address of the bug-table entry,
     while making the actual function that's called a UD1 instruction.
     (Peter Zijlstra)
 
   - Introduce the CONFIG_DEBUG_BUGVERBOSE_DETAILED Kconfig switch
     (Ingo Molnar, s390 support by Heiko Carstens)
 
 Fixes and cleanups:
 
   - bugs/s390: Remove private WARN_ON() implementation (Heiko Carstens)
 
   - <asm/bugs.h>: Make i386 use GENERIC_BUG_RELATIVE_POINTERS
     (Peter Zijlstra)
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmktffcRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1gLCQ//ZMHTpv6UF+xY4Kq3rdVFJWRR7Prs2hDn
 4V/cFHu7Xb2lRwpJDBNWnbiEkLbGR5qlWD+0CpAMK4mGuL9VjnoeflEzXxOPP9W1
 7SWw1HDc6NkjHM3BNnvkPQbzWF4RL4ZGs4Lbb1nv7pjoTSdBMXNrD5RL+iNmfHHd
 QfOG9ZiRyD5A/b07bfyjIXNaeC/Hot+FeVXTMfD7/vCfc2ywhL2Sm5G/igY/shTY
 Una7q8sbFDD/bFFtWSR2eFQeHQQfT6c/Pu39ZcAIdTlLk1FtZ+A2wcRtwYv/FdPV
 6KDOAxZK7fLMHoCdNTswsg+LbazhABOb/V1J7TaHq2tF/PeyN+B+ucxVY2KFcxQJ
 V5eG5crMrUIL22QO/UaT3dPWxGbJYkNlAWl416tAKdgXA52W4OsPd+k4DGjeP569
 KogAy3SY0D/80v1QN2HcFEJMvr7W2SukxtErqdtA5OKt/ZevB49lGqZoBecqASDO
 QjI1K0yLKnb+erbMIpCpNj+o/Fr1JQgWbYVpwipL2GON4an6vrowimGTsUG7qqxN
 Hwb7IaTNnWQ/4iyCkVV44q6Ln1gyP2hz5Lnzo7QJINUdzp98UjJLAEK4NRG44T4L
 p0t5NMxnvREpAv35rwy03xhmITYeQMOqzN9JEBBvegQyld5ghbThxI7g9BDzcXyL
 Bd0mF7WOV9M=
 =bMUL
 -----END PGP SIGNATURE-----

Merge tag 'core-bugs-2025-12-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull bug handling infrastructure updates from Ingo Molnar:
 "Core updates:

   - Improve WARN(), which has vararg printf like arguments, to work
     with the x86 #UD based WARN-optimizing infrastructure by hiding the
     format in the bug_table and replacing this first argument with the
     address of the bug-table entry, while making the actual function
     that's called a UD1 instruction (Peter Zijlstra)

   - Introduce the CONFIG_DEBUG_BUGVERBOSE_DETAILED Kconfig switch (Ingo
     Molnar, s390 support by Heiko Carstens)

  Fixes and cleanups:

   - bugs/s390: Remove private WARN_ON() implementation (Heiko Carstens)

   - <asm/bugs.h>: Make i386 use GENERIC_BUG_RELATIVE_POINTERS (Peter
     Zijlstra)"

* tag 'core-bugs-2025-12-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits)
  x86/bugs: Make i386 use GENERIC_BUG_RELATIVE_POINTERS
  x86/bug: Fix BUG_FORMAT vs KASLR
  x86_64/bug: Inline the UD1
  x86/bug: Implement WARN_ONCE()
  x86_64/bug: Implement __WARN_printf()
  x86/bug: Use BUG_FORMAT for DEBUG_BUGVERBOSE_DETAILED
  x86/bug: Add BUG_FORMAT basics
  bug: Allow architectures to provide __WARN_printf()
  bug: Implement WARN_ON() using __WARN_FLAGS()
  bug: Add report_bug_entry()
  bug: Add BUG_FORMAT_ARGS infrastructure
  bug: Clean up CONFIG_GENERIC_BUG_RELATIVE_POINTERS
  bug: Add BUG_FORMAT infrastructure
  x86: Rework __bug_table helpers
  bugs/s390: Remove private WARN_ON() implementation
  bugs/core: Reorganize fields in the first line of WARNING output, add ->comm[] output
  bugs/sh: Concatenate 'cond_str' with '__FILE__' in __WARN_FLAGS(), to extend WARN_ON/BUG_ON output
  bugs/parisc: Concatenate 'cond_str' with '__FILE__' in __WARN_FLAGS(), to extend WARN_ON/BUG_ON output
  bugs/riscv: Concatenate 'cond_str' with '__FILE__' in __BUG_FLAGS(), to extend WARN_ON/BUG_ON output
  bugs/riscv: Pass in 'cond_str' to __BUG_FLAGS()
  ...
2025-12-01 21:33:01 -08:00
Linus Torvalds
dcd8637edb Core x86 changes for v6.19:
- x86/alternatives: Drop unnecessary test after call to
    alt_replace_call() (Juergen Gross)
 
  - x86/dumpstack: Prevent KASAN false positive warnings in
    __show_regs() (Tengda Wu)
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmktetMRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1jLBRAAluu/0ACAzh1A5gw6Mr9uQ45KFxCt4HZa
 gjIyfDNqsdb6fqJ3yFAzw36gYViIzEGZo26S0mGWXqLVmMHus2HqvHTxYCPA7mov
 YfMxMUKsyTIaDoGHOYuhHwwdJL+KzXuKQri5/5MVbxNPPYedk4qV4weTTDPz/pkV
 biYOw6S4GDsOC5O1RT1AX+NBcIZkQzqoTSnkCXjDSCxXpVxYg7CQ6besMtkCCArZ
 QMl5ZCh6cEhhyOXIVL1Huz9pxxWpzxXM0oqNzhBjCjijYN5HGxL5kUv9tWHatP8h
 Q5SFnhyhPu/GtzZUw32O5r7sC93HweJeUtvaE0ekTSMK8A/oIwBup3Z4pRGpY/A2
 B16VAYSEYzoJEUmptPkHTbOouKMR45P02M0RM5rkojeh75LRivW6T0Zd1+LkEXJr
 yVwWfr+DoXaTtyK1GYdQNbwEWjpIxUwcBtVZN2h3Ajo28yPwB6Eh0mHUpl1Yol83
 Ex5XL9lsCljIkmDTup+6ai9fyVpsRUbLANnBZ76DiRRw44vUto/Dj8pkIOysZ/8/
 K/zVxou2KFtIDB/oFOIc1AIEjenq6IXqLhyKoerm2c0cALH67KUsFmHTH8/6hbUV
 WLWtROB9tLizBBE7Xo4uu1LkmIMWVZ8nHbX/QYQSQBv2Nhw29a681JaHjVgRYBIJ
 K97PRgZS6dM=
 =i6oY
 -----END PGP SIGNATURE-----

Merge tag 'x86-core-2025-12-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull core x86 updates from Ingo Molnar:

 - x86/alternatives: Drop unnecessary test after call to
   alt_replace_call() (Juergen Gross)

 - x86/dumpstack: Prevent KASAN false positive warnings in
   __show_regs() (Tengda Wu)

* tag 'x86-core-2025-12-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/dumpstack: Prevent KASAN false positive warnings in __show_regs()
  x86/alternative: Drop not needed test after call of alt_replace_call()
2025-12-01 21:31:02 -08:00
Linus Torvalds
e7d81c1ed6 A single fix for an ancient prototype in the math-emu code, by Arnd Bergmann.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmktecYRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1gu0Q//S8JrbuW+o5XecSPgK+TEk8RhLRUYZPNK
 pObhFOZbWN8XPpVLGfHDLKTahyWNki8zfuTgdL+jwiCMvCp8u/vrBUr2bRYahDuI
 I4V6pau7zqSvDFG8bqpYbYyo8h0cGXm/f74wquSiaO7B6kEATDAQKUzCYOg1aTHR
 +hDYjgJssjBJg5e/F7gwri/V008aNCAJhOAlD75Yf6O209sMt8PjwaLQ9m+5zDPX
 9WD9wXAlV5sDcz7ukcsggKPc0fXKaqqjEu/mFXOHCe5qFbXeUQ+CE+ExGVT6oBo+
 3u/m4uH+TZ3LpP1ebWtGD22jZbd4xjKj5XyJ7KEh8zpu9xRaF4h0J5koLHRKvo9H
 KZ9j+tSQ7N6jH+fW0cj2NWA9onicSDgu5t0m5354Ui0TSjAqSFcErfMJJL+gEjHr
 zUHqIBvcMZqreg2KYYeS0VAslbD1Wr0f4GWBlJ26h71DZf5orHVYCaktWfCYCd72
 6HW34USKbzXK/X4ygl3VIvBOxSXeeOz8ce0nBJTZpwJsmMbQsmVFqyvqa8TLf131
 xntjS4mPkhIrnh8JpwwhoYC9dN58OtYbQ2cmkiXLwkqn60u3bkQATir0Zb1As0yz
 lZ7sdtxJ+OBImctySTM+mz1OjO8pNMmb37ZNcujuuSHaPS22cHqFg4TVpQWmDG3C
 ebH3gi+irqs=
 =hfpM
 -----END PGP SIGNATURE-----

Merge tag 'x86-build-2025-12-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 math-emu fix from Ingo Molnar:
 "A single fix for an ancient prototype in the math-emu code, by Arnd
  Bergmann"

* tag 'x86-build-2025-12-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/math-emu: Fix div_Xsig() prototype
2025-12-01 21:28:23 -08:00
Linus Torvalds
de2f75d55e x86/apic changes for v6.19:
- x86/apic: Fix the frequency in apic=verbose log output (Julian Stecklina)
 
   - Simplify mp_irqdomain_alloc() slightly (Christophe JAILLET)
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmktePURHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1iIsQ//ZGocltNkF7GrYytJbLTactwvj/w1C9ly
 ejEh0enSVibyvc+ADkEwFBPIXqQDo9KssYlwTZmgI3SSv1nohMch7CWPQ87Rnajo
 NxzWueS4q5Yp/IRVAiwXME7cYmJ2MtGxRk+gyq8RSFoV2gVOmPkFlWX0DZzDv5Oi
 Z0OPM2sTlw+hrFKd0DoG5PySaYhVooYfKynXCFFwn8KrRNYuEfQwA4wZFesn2+u2
 xwKaAKricpePG2/th3DdcBRUhkVZliVIUitbBUW5Ya9LsDeYkaq5+vLmywWEQC1q
 WCY8J3ptlH02E9xVoNA0M3lPhx5XPCcYQvSkgIobDE4d41e8yAj7Rgx1l5opNK6b
 KRMcsL9eBRMDbFSY0x6YA17XTz0yCN49BtiCx8hTGhPhBNyWICw5n8t1k5kTH9XQ
 Aj3oGSdtBstJIZjM4biPC5sP8HPXNn/QCHwMXBZEWG5IbyjbEFWGQvd3w2cNTuBd
 nMxaovs7eo+Q9ftbVlVJFhIgq9wEUUG9rTm7KLdqjglolhn7WzC/T7iVw+o8X9OX
 3IZRYimbfoP90vgqROVQiRjPxMj7t2F8Y11Bf4jIizTeSSCZQhrXQjZdUX1JNY3F
 T2Yhk8AXBaUEgf7kZlwzy+jco0Sd60MWHD18k1Ouot/RONgAbmkTn2BIXD/CkkkP
 UjiPoILd1kg=
 =RTrX
 -----END PGP SIGNATURE-----

Merge tag 'x86-apic-2025-12-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 apic updates from Ingo Molnar:

 - x86/apic: Fix the frequency in apic=verbose log output (Julian
   Stecklina)

 - Simplify mp_irqdomain_alloc() slightly (Christophe JAILLET)

* tag 'x86-apic-2025-12-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/apic: Fix frequency in apic=verbose log output
  x86/ioapic: Simplify mp_irqdomain_alloc() slightly
2025-12-01 21:26:35 -08:00
Linus Torvalds
6d2c10e889 Scheduler changes for v6.19:
Scalability and load-balancing improvements:
 
   - Enable scheduler feature NEXT_BUDDY (Mel Gorman)
 
   - Reimplement NEXT_BUDDY to align with EEVDF goals (Mel Gorman)
 
   - Skip sched_balance_running cmpxchg when balance is not due (Tim Chen)
 
   - Implement generic code for architecture specific sched domain
     NUMA distances (Tim Chen)
 
   - Optimize the NUMA distances of the sched-domains builds of Intel
     Granite Rapids (GNR) and Clearwater Forest (CWF) platforms
     (Tim Chen)
 
   - Implement proportional newidle balance: a randomized algorithm
     that runs newidle balancing proportional to its success rate.
     (Peter Zijlstra)
 
 Scheduler infrastructure changes:
 
   - Implement the 'sched_change' scoped_guard() pattern for
     the entire scheduler (Peter Zijlstra)
 
   - More broadly utilize the sched_change guard (Peter Zijlstra)
 
   - Add support to pick functions to take runqueue-flags (Joel Fernandes)
 
   - Provide and use set_need_resched_current() (Peter Zijlstra)
 
 Fair scheduling enhancements:
 
   - Forfeit vruntime on yield (Fernand Sieber)
   - Only update stats for allowed CPUs when looking for dst group (Adam Li)
 
 CPU-core scheduling enhancements:
 
   - Optimize core cookie matching check (Fernand Sieber)
 
 Deadline scheduler fixes:
 
   - Only set free_cpus for online runqueues (Doug Berger)
   - Fix dl_server time accounting (Peter Zijlstra)
   - Fix dl_server stop condition (Peter Zijlstra)
 
 Proxy scheduling fixes:
 
   - Yield the donor task (Fernand Sieber)
 
 Fixes and cleanups:
 
   - Fix do_set_cpus_allowed() locking (Peter Zijlstra)
   - Fix migrate_disable_switch() locking (Peter Zijlstra)
   - Remove double update_rq_clock() in __set_cpus_allowed_ptr_locked() (Hao Jia)
   - Increase sched_tick_remote timeout (Phil Auld)
   - sched/deadline: Use cpumask_weight_and() in dl_bw_cpus() (Shrikanth Hegde)
   - sched/deadline: Clean up select_task_rq_dl() (Shrikanth Hegde)
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmktd+MRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1hRFw//TNK2tN9D/U3rh66RU/evkv4ZoEfdkqwl
 L3MDs9evupkggELFBjImd+lxDTjKnvq6aSb9ryJEbmivJMVUAcCF1fBILcs5m51N
 pCIapavjsENMdCPRzC7xDE3t5GSAQ35NIWkLbr5d4bpEIp8YmchOKoikWNMT09A1
 ifHJ+BMWiX5AY3r0bcvsi8XRzo/bSw1OA+gTh0BUyD9VBbIrqjvYR+W6nJXw6FKB
 rHp+VlaidFIbfsi25KU7Ixn52K4Cqm0AlapMIlDZABPZpKFAkhDCRSb6CGigZl4T
 rNSSmnkmF6GFVVIKnNfWiW2OsojnS9mm1tAiM8huu+KumUqFXhfzlfoNPF9oMphR
 TCeyK66br55/WK86Of5zowxQLwj0/nLHCT9HbV4Bre7kakTUJjtFKBqxBcdoENjt
 bx/tLoFeS4EeMSULAT0vZvE064XT0gHnRU0UXSBwRTuNcjMsN6RQVFWfkBHRx90g
 HLoK2AkvdqUFeiHWXiIx9wu6CpWBPPmiGL3+7RyXJ93z2EccmCRP5jmzYFxNoaD/
 uKnz9gYETNyVIhU44pSJcImpxr2m9SKfqetouhaYrPi9SRhhOBP+ZSzNCUPMQeIN
 ZVpzYinMhZo4m0c8W2hitlGLmj3hIgt7vesyiFlqMJrS3Y5PNnPsN608PI27IGLa
 GSbXtohZZIM=
 =BZY4
 -----END PGP SIGNATURE-----

Merge tag 'sched-core-2025-12-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler updates from Ingo Molnar:
 "Scalability and load-balancing improvements:

   - Enable scheduler feature NEXT_BUDDY (Mel Gorman)

   - Reimplement NEXT_BUDDY to align with EEVDF goals (Mel Gorman)

   - Skip sched_balance_running cmpxchg when balance is not due (Tim
     Chen)

   - Implement generic code for architecture specific sched domain NUMA
     distances (Tim Chen)

   - Optimize the NUMA distances of the sched-domains builds of Intel
     Granite Rapids (GNR) and Clearwater Forest (CWF) platforms (Tim
     Chen)

   - Implement proportional newidle balance: a randomized algorithm that
     runs newidle balancing proportional to its success rate. (Peter
     Zijlstra)

  Scheduler infrastructure changes:

   - Implement the 'sched_change' scoped_guard() pattern for the entire
     scheduler (Peter Zijlstra)

   - More broadly utilize the sched_change guard (Peter Zijlstra)

   - Add support to pick functions to take runqueue-flags (Joel
     Fernandes)

   - Provide and use set_need_resched_current() (Peter Zijlstra)

  Fair scheduling enhancements:

   - Forfeit vruntime on yield (Fernand Sieber)

   - Only update stats for allowed CPUs when looking for dst group (Adam
     Li)

  CPU-core scheduling enhancements:

   - Optimize core cookie matching check (Fernand Sieber)

  Deadline scheduler fixes:

   - Only set free_cpus for online runqueues (Doug Berger)

   - Fix dl_server time accounting (Peter Zijlstra)

   - Fix dl_server stop condition (Peter Zijlstra)

  Proxy scheduling fixes:

   - Yield the donor task (Fernand Sieber)

  Fixes and cleanups:

   - Fix do_set_cpus_allowed() locking (Peter Zijlstra)

   - Fix migrate_disable_switch() locking (Peter Zijlstra)

   - Remove double update_rq_clock() in __set_cpus_allowed_ptr_locked()
     (Hao Jia)

   - Increase sched_tick_remote timeout (Phil Auld)

   - sched/deadline: Use cpumask_weight_and() in dl_bw_cpus() (Shrikanth
     Hegde)

   - sched/deadline: Clean up select_task_rq_dl() (Shrikanth Hegde)"

* tag 'sched-core-2025-12-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (44 commits)
  sched: Provide and use set_need_resched_current()
  sched/fair: Proportional newidle balance
  sched/fair: Small cleanup to update_newidle_cost()
  sched/fair: Small cleanup to sched_balance_newidle()
  sched/fair: Revert max_newidle_lb_cost bump
  sched/fair: Reimplement NEXT_BUDDY to align with EEVDF goals
  sched/fair: Enable scheduler feature NEXT_BUDDY
  sched: Increase sched_tick_remote timeout
  sched/fair: Have SD_SERIALIZE affect newidle balancing
  sched/fair: Skip sched_balance_running cmpxchg when balance is not due
  sched/deadline: Minor cleanup in select_task_rq_dl()
  sched/deadline: Use cpumask_weight_and() in dl_bw_cpus
  sched/deadline: Document dl_server
  sched/deadline: Fix dl_server stop condition
  sched/deadline: Fix dl_server time accounting
  sched/core: Remove double update_rq_clock() in __set_cpus_allowed_ptr_locked()
  sched/eevdf: Fix min_vruntime vs avg_vruntime
  sched/core: Add comment explaining force-idle vruntime snapshots
  sched/core: Optimize core cookie matching check
  sched/proxy: Yield the donor task
  ...
2025-12-01 21:04:45 -08:00
Linus Torvalds
6c26fbe8c9 Performance events changes for v6.19:
Callchain support:
 
  - Add support for deferred user-space stack unwinding for
    perf, enabled on x86. (Peter Zijlstra, Steven Rostedt)
 
  - unwind_user/x86: Enable frame pointer unwinding on x86
    (Josh Poimboeuf)
 
 x86 PMU support and infrastructure:
 
  - x86/insn: Simplify for_each_insn_prefix() (Peter Zijlstra)
 
  - x86/insn,uprobes,alternative: Unify insn_is_nop()
    (Peter Zijlstra)
 
 Intel PMU driver:
 
  - Large series to prepare for and implement architectural PEBS
    support for Intel platforms such as Clearwater Forest (CWF)
    and Panther Lake (PTL). (Dapeng Mi, Kan Liang)
 
  - Check dynamic constraints (Kan Liang)
 
  - Optimize PEBS extended config (Peter Zijlstra)
 
  - cstates: Remove PC3 support from LunarLake (Zhang Rui)
 
  - cstates: Add Pantherlake support (Zhang Rui)
 
  - cstates: Clearwater Forest support (Zide Chen)
 
 AMD PMU driver:
 
  - x86/amd: Check event before enable to avoid GPF (George Kennedy)
 
 Fixes and cleanups:
 
  - task_work: Fix NMI race condition (Peter Zijlstra)
 
  - perf/x86: Fix NULL event access and potential PEBS record loss
    (Dapeng Mi)
 
  - Misc other fixes and cleanups.
    (Dapeng Mi, Ingo Molnar, Peter Zijlstra)
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmktcU0RHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1gNKw//ThLmbkoGJ0/yLOdEcW8rA/7HB43Oz6j9
 k0Vs7zwDBMRFP4zQg2XeF5SH7CWS9p/nI3eMhorgmH77oJCvXJxVtD5991zmlZhf
 eafOar5ZMVaoMz+tK8WWiENZyuN0bt0mumZmz9svXR3KV1S/q18XZ8bCas0itwnq
 D0T3Gqi/Z39gJIy7bHNgLoFY2zvI9b2EJNDKlzHk3NJ7UamA4GuMHN0cM2dIzKGK
 2L+wXOe2BH9YYzYrz/cdKq7sBMjOvFsCQ/5jh23A2Yu6JI4nJbw0WmexZRK1OWCp
 GAdMjBuqIShibLRxK746WRO9iut49uTsah4iSG80hXzhpwf7VaegOarost1nLaqm
 zweIOr3iwJRf273r6IqRuaporVHpQYMj2w2H63z36sQtGtkKHNyxZ50b6bqpwwjU
 LikLEJ9Bmh3mlvlXsOx2wX6dTb1fUk+cy2ezCDKUHqOLjqy4dM8V+jYhuRO4yxXz
 mj9aHZKgyuREt8yo/3nLqAzF5Okj9cXp7H6F1hCKWuCoAhNXkrvYcvbg8h6aRxOX
 2vGhMYjpElkl/DG6OWCSwuqCt9nVEC/dazW9fKQjh4S0CFOVopaMGSkGcS/xUPub
 92J4XMDEJX4RJ6dfspeQr97+1fETXEIWNv4WbKnDjqJlAucU1gnOTprVnAYUjcWw
 74320FjGN1E=
 =/8GE
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-2025-12-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull performance events updates from Ingo Molnar:
 "Callchain support:

   - Add support for deferred user-space stack unwinding for perf,
     enabled on x86. (Peter Zijlstra, Steven Rostedt)

   - unwind_user/x86: Enable frame pointer unwinding on x86 (Josh
     Poimboeuf)

  x86 PMU support and infrastructure:

   - x86/insn: Simplify for_each_insn_prefix() (Peter Zijlstra)

   - x86/insn,uprobes,alternative: Unify insn_is_nop() (Peter Zijlstra)

  Intel PMU driver:

   - Large series to prepare for and implement architectural PEBS
     support for Intel platforms such as Clearwater Forest (CWF) and
     Panther Lake (PTL). (Dapeng Mi, Kan Liang)

   - Check dynamic constraints (Kan Liang)

   - Optimize PEBS extended config (Peter Zijlstra)

   - cstates:
      - Remove PC3 support from LunarLake (Zhang Rui)
      - Add Pantherlake support (Zhang Rui)
      - Clearwater Forest support (Zide Chen)

  AMD PMU driver:

   - x86/amd: Check event before enable to avoid GPF (George Kennedy)

  Fixes and cleanups:

   - task_work: Fix NMI race condition (Peter Zijlstra)

   - perf/x86: Fix NULL event access and potential PEBS record loss
     (Dapeng Mi)

   - Misc other fixes and cleanups (Dapeng Mi, Ingo Molnar, Peter
     Zijlstra)"

* tag 'perf-core-2025-12-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (38 commits)
  perf/x86/intel: Fix and clean up intel_pmu_drain_arch_pebs() type use
  perf/x86/intel: Optimize PEBS extended config
  perf/x86/intel: Check PEBS dyn_constraints
  perf/x86/intel: Add a check for dynamic constraints
  perf/x86/intel: Add counter group support for arch-PEBS
  perf/x86/intel: Setup PEBS data configuration and enable legacy groups
  perf/x86/intel: Update dyn_constraint base on PEBS event precise level
  perf/x86/intel: Allocate arch-PEBS buffer and initialize PEBS_BASE MSR
  perf/x86/intel: Process arch-PEBS records or record fragments
  perf/x86/intel/ds: Factor out PEBS group processing code to functions
  perf/x86/intel/ds: Factor out PEBS record processing code to functions
  perf/x86/intel: Initialize architectural PEBS
  perf/x86/intel: Correct large PEBS flag check
  perf/x86/intel: Replace x86_pmu.drain_pebs calling with static call
  perf/x86: Fix NULL event access and potential PEBS record loss
  perf/x86: Remove redundant is_x86_event() prototype
  entry,unwind/deferred: Fix unwind_reset_info() placement
  unwind_user/x86: Fix arch=um build
  perf: Support deferred user unwind
  unwind_user/x86: Teach FP unwind about start of function
  ...
2025-12-01 20:42:01 -08:00
Linus Torvalds
63e6995005 objtool updates for v6.19:
- klp-build livepatch module generation (Josh Poimboeuf)
 
    Introduce new objtool features and a klp-build
    script to generate livepatch modules using a
    source .patch as input.
 
    This builds on concepts from the longstanding out-of-tree
    kpatch project which began in 2012 and has been used for
    many years to generate livepatch modules for production kernels.
    However, this is a complete rewrite which incorporates
    hard-earned lessons from 12+ years of maintaining kpatch.
 
    Key improvements compared to kpatch-build:
 
     - Integrated with objtool: Leverages objtool's existing control-flow
       graph analysis to help detect changed functions.
 
     - Works on vmlinux.o: Supports late-linked objects, making it
       compatible with LTO, IBT, and similar.
 
     - Simplified code base: ~3k fewer lines of code.
 
     - Upstream: No more out-of-tree #ifdef hacks, far less cruft.
 
     - Cleaner internals: Vastly simplified logic for symbol/section/reloc
       inclusion and special section extraction.
 
     - Robust __LINE__ macro handling: Avoids false positive binary diffs
       caused by the __LINE__ macro by introducing a fix-patch-lines script
       which injects #line directives into the source .patch to preserve
       the original line numbers at compile time.
 
  - Disassemble code with libopcodes instead of running objdump
    (Alexandre Chartre)
 
  - Disassemble support (-d option to objtool) by Alexandre Chartre,
    which supports the decoding of various Linux kernel code generation
    specials such as alternatives:
 
       17ef:  sched_balance_find_dst_group+0x62f                 mov    0x34(%r9),%edx
       17f3:  sched_balance_find_dst_group+0x633               | <alternative.17f3>             | X86_FEATURE_POPCNT
       17f3:  sched_balance_find_dst_group+0x633               | call   0x17f8 <__sw_hweight64> | popcnt %rdi,%rax
       17f8:  sched_balance_find_dst_group+0x638                 cmp    %eax,%edx
 
    ... jump table alternatives:
 
       1895:  sched_use_asym_prio+0x5                            test   $0x8,%ch
       1898:  sched_use_asym_prio+0x8                            je     0x18a9 <sched_use_asym_prio+0x19>
       189a:  sched_use_asym_prio+0xa                          | <jump_table.189a>                        | JUMP
       189a:  sched_use_asym_prio+0xa                          | jmp    0x18ae <sched_use_asym_prio+0x1e> | nop2
       189c:  sched_use_asym_prio+0xc                            mov    $0x1,%eax
       18a1:  sched_use_asym_prio+0x11                           and    $0x80,%ecx
 
    ... exception table alternatives:
 
     native_read_msr:
       5b80:  native_read_msr+0x0                                                     mov    %edi,%ecx
       5b82:  native_read_msr+0x2                                                   | <ex_table.5b82> | EXCEPTION
       5b82:  native_read_msr+0x2                                                   | rdmsr           | resume at 0x5b84 <native_read_msr+0x4>
       5b84:  native_read_msr+0x4                                                     shl    $0x20,%rdx
 
    .... x86 feature flag decoding (also see the X86_FEATURE_POPCNT
         example in sched_balance_find_dst_group() above):
 
       2faaf:  start_thread_common.constprop.0+0x1f                                    jne    0x2fba4 <start_thread_common.constprop.0+0x114>
       2fab5:  start_thread_common.constprop.0+0x25                                  | <alternative.2fab5>                  | X86_FEATURE_ALWAYS                                  | X86_BUG_NULL_SEG
       2fab5:  start_thread_common.constprop.0+0x25                                  | jmp    0x2faba <.altinstr_aux+0x2f4> | jmp    0x4b0 <start_thread_common.constprop.0+0x3f> | nop5
       2faba:  start_thread_common.constprop.0+0x2a                                    mov    $0x2b,%eax
 
    ... NOP sequence shortening:
 
       1048e2:  snapshot_write_finalize+0xc2                                            je     0x104917 <snapshot_write_finalize+0xf7>
       1048e4:  snapshot_write_finalize+0xc4                                            nop6
       1048ea:  snapshot_write_finalize+0xca                                            nop11
       1048f5:  snapshot_write_finalize+0xd5                                            nop11
       104900:  snapshot_write_finalize+0xe0                                            mov    %rax,%rcx
       104903:  snapshot_write_finalize+0xe3                                            mov    0x10(%rdx),%rax
 
    ... and much more.
 
  - Function validation tracing support (Alexandre Chartre)
 
  - Various -ffunction-sections fixes (Josh Poimboeuf)
 
  - Clang AutoFDO (Automated Feedback-Directed Optimizations) support (Josh Poimboeuf)
 
  - Misc fixes and cleanups (Borislav Petkov, Chen Ni,
    Dylan Hatch, Ingo Molnar, John Wang, Josh Poimboeuf,
    Pankaj Raghav, Peter Zijlstra, Thorsten Blum)
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmktavcRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1j3IhAAvc9tRV8SJcohim6DrkPGxCN/S80uzt5S
 q8v1x5tBzMYmUxftfpoLsPCri6Ww0jprNuhnbRCvWAzXFuW79HWBNdVkEO7V/cym
 OsCKQv3r0mWv5UXP3o8VM5K3tnU61wOAIx3yZCz5XKWeOg6NPXBJCSGWYpLuA7z0
 1wUWAXuHgmj4RHMlHu5x0FZnSqGU3/TkUDGAqdxrY+myhdwm0Ul+dSwWGQdQjCgA
 59Y/gDsWWEe5BVL56suwKZ1e+8UFnpbncbWkjELD6euJpYpDSNMOW/S6PYqOOz5M
 rjMv06XIX5ma7QQbF5fMG/sXW64tZtc090UocDnx7hpDq9mLEyNNkXsqRQlmd8Wt
 wG19IaeWo8aG9DTQkiv8OhtmssPKZHJsVjRUvXGnjktvxnsYSomgOT1lNme38dJD
 X9jHgZCFMdPsQmG0dp00Y0oejfTChqIDef7qSpYwT96R7l9VQQF7K7AxfJwSeLGO
 3hClZ0Gz/u9NiJTUUWTxUmR+YEy+1xIeaQSDq6t4JRtNJaMZlcevfVW+F2Lm04XH
 9eSeF7bJS2XKrlLHVdPgWCGZOmee+ghdQ7svsyEGpzdzaAZ7UveTucHJ9CvW3Fft
 Dcrl8rxX2NiD2PLz03HCHR/JVUDc3W3Exrer1TD8PD4LcZhFoBEGQbZ/gFlkBTxb
 TOcemtJT03U=
 =yPrS
 -----END PGP SIGNATURE-----

Merge tag 'objtool-core-2025-12-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull objtool updates from Ingo Molnar:

 - klp-build livepatch module generation (Josh Poimboeuf)

   Introduce new objtool features and a klp-build script to generate
   livepatch modules using a source .patch as input.

   This builds on concepts from the longstanding out-of-tree kpatch
   project which began in 2012 and has been used for many years to
   generate livepatch modules for production kernels. However, this is a
   complete rewrite which incorporates hard-earned lessons from 12+
   years of maintaining kpatch.

   Key improvements compared to kpatch-build:

    - Integrated with objtool: Leverages objtool's existing control-flow
      graph analysis to help detect changed functions.

    - Works on vmlinux.o: Supports late-linked objects, making it
      compatible with LTO, IBT, and similar.

    - Simplified code base: ~3k fewer lines of code.

    - Upstream: No more out-of-tree #ifdef hacks, far less cruft.

    - Cleaner internals: Vastly simplified logic for
      symbol/section/reloc inclusion and special section extraction.

    - Robust __LINE__ macro handling: Avoids false positive binary diffs
      caused by the __LINE__ macro by introducing a fix-patch-lines
      script which injects #line directives into the source .patch to
      preserve the original line numbers at compile time.

 - Disassemble code with libopcodes instead of running objdump
   (Alexandre Chartre)

 - Disassemble support (-d option to objtool) by Alexandre Chartre,
   which supports the decoding of various Linux kernel code generation
   specials such as alternatives:

      17ef:  sched_balance_find_dst_group+0x62f                 mov    0x34(%r9),%edx
      17f3:  sched_balance_find_dst_group+0x633               | <alternative.17f3>             | X86_FEATURE_POPCNT
      17f3:  sched_balance_find_dst_group+0x633               | call   0x17f8 <__sw_hweight64> | popcnt %rdi,%rax
      17f8:  sched_balance_find_dst_group+0x638                 cmp    %eax,%edx

   ... jump table alternatives:

      1895:  sched_use_asym_prio+0x5                            test   $0x8,%ch
      1898:  sched_use_asym_prio+0x8                            je     0x18a9 <sched_use_asym_prio+0x19>
      189a:  sched_use_asym_prio+0xa                          | <jump_table.189a>                        | JUMP
      189a:  sched_use_asym_prio+0xa                          | jmp    0x18ae <sched_use_asym_prio+0x1e> | nop2
      189c:  sched_use_asym_prio+0xc                            mov    $0x1,%eax
      18a1:  sched_use_asym_prio+0x11                           and    $0x80,%ecx

   ... exception table alternatives:

    native_read_msr:
      5b80:  native_read_msr+0x0                                                     mov    %edi,%ecx
      5b82:  native_read_msr+0x2                                                   | <ex_table.5b82> | EXCEPTION
      5b82:  native_read_msr+0x2                                                   | rdmsr           | resume at 0x5b84 <native_read_msr+0x4>
      5b84:  native_read_msr+0x4                                                     shl    $0x20,%rdx

   .... x86 feature flag decoding (also see the X86_FEATURE_POPCNT
        example in sched_balance_find_dst_group() above):

      2faaf:  start_thread_common.constprop.0+0x1f                                    jne    0x2fba4 <start_thread_common.constprop.0+0x114>
      2fab5:  start_thread_common.constprop.0+0x25                                  | <alternative.2fab5>                  | X86_FEATURE_ALWAYS                                  | X86_BUG_NULL_SEG
      2fab5:  start_thread_common.constprop.0+0x25                                  | jmp    0x2faba <.altinstr_aux+0x2f4> | jmp    0x4b0 <start_thread_common.constprop.0+0x3f> | nop5
      2faba:  start_thread_common.constprop.0+0x2a                                    mov    $0x2b,%eax

   ... NOP sequence shortening:

      1048e2:  snapshot_write_finalize+0xc2                                            je     0x104917 <snapshot_write_finalize+0xf7>
      1048e4:  snapshot_write_finalize+0xc4                                            nop6
      1048ea:  snapshot_write_finalize+0xca                                            nop11
      1048f5:  snapshot_write_finalize+0xd5                                            nop11
      104900:  snapshot_write_finalize+0xe0                                            mov    %rax,%rcx
      104903:  snapshot_write_finalize+0xe3                                            mov    0x10(%rdx),%rax

   ... and much more.

 - Function validation tracing support (Alexandre Chartre)

 - Various -ffunction-sections fixes (Josh Poimboeuf)

 - Clang AutoFDO (Automated Feedback-Directed Optimizations) support
   (Josh Poimboeuf)

 - Misc fixes and cleanups (Borislav Petkov, Chen Ni, Dylan Hatch, Ingo
   Molnar, John Wang, Josh Poimboeuf, Pankaj Raghav, Peter Zijlstra,
   Thorsten Blum)

* tag 'objtool-core-2025-12-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (129 commits)
  objtool: Fix segfault on unknown alternatives
  objtool: Build with disassembly can fail when including bdf.h
  objtool: Trim trailing NOPs in alternative
  objtool: Add wide output for disassembly
  objtool: Compact output for alternatives with one instruction
  objtool: Improve naming of group alternatives
  objtool: Add Function to get the name of a CPU feature
  objtool: Provide access to feature and flags of group alternatives
  objtool: Fix address references in alternatives
  objtool: Disassemble jump table alternatives
  objtool: Disassemble exception table alternatives
  objtool: Print addresses with alternative instructions
  objtool: Disassemble group alternatives
  objtool: Print headers for alternatives
  objtool: Preserve alternatives order
  objtool: Add the --disas=<function-pattern> action
  objtool: Do not validate IBT for .return_sites and .call_sites
  objtool: Improve tracing of alternative instructions
  objtool: Add functions to better name alternatives
  objtool: Identify the different types of alternatives
  ...
2025-12-01 20:18:59 -08:00
Linus Torvalds
b53440f8e5 Locking updates for v6.19:
Mutexes:
 
  - Redo __mutex_init() to reduce generated code size
    (Sebastian Andrzej Siewior)
 
 Seqlocks:
 
  - Introduce scoped_seqlock_read() (Peter Zijlstra)
 
  - Change thread_group_cputime() to use scoped_seqlock_read()
    (Oleg Nesterov)
 
  - Change do_task_stat() to use scoped_seqlock_read()
    (Oleg Nesterov)
 
  - Change do_io_accounting() to use scoped_seqlock_read()
    (Oleg Nesterov)
 
  - Fix the incorrect documentation of read_seqbegin_or_lock() /
    need_seqretry() (Oleg Nesterov)
 
  - Allow KASAN to fail optimizing (Peter Zijlstra)
 
 Local lock updates:
 
  - Fix all kernel-doc warnings (Randy Dunlap)
 
  - Add the <linux/local_lock*.h> headers to MAINTAINERS
    (Sebastian Andrzej Siewior)
 
  - Reduce the risk of shadowing via s/l/__l/ and s/tl/__tl/
    (Vincent Mailhol)
 
 Lock debugging:
 
  - spinlock/debug: Fix data-race in do_raw_write_lock
    (Alexander Sverdlin)
 
 Atomic primitives infrastructure:
 
  - atomic: Skip alignment check for try_cmpxchg() old arg
    (Arnd Bergmann)
 
 Rust runtime integration:
 
  - sync: atomic: Enable generated Atomic<T> usage (Boqun Feng)
 
  - sync: atomic: Implement Debug for Atomic<Debug> (Boqun Feng)
 
  - debugfs: Remove Rust native atomics and replace them with
    Linux versions (Boqun Feng)
 
  - debugfs: Implement Reader for Mutex<T> only when T is Unpin
    (Boqun Feng)
 
  - lock: guard: Add T: Unpin bound to DerefMut (Daniel Almeida)
 
  - lock: Pin the inner data (Daniel Almeida)
 
  - lock: Add a Pin<&mut T> accessor (Daniel Almeida)
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmktVmgRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1hDqw//e9BTs9Yx6yLxZVRDagS7KrDKy3V3OMg1
 9h+tXzCosGeNz7XwVzt590wsACJvX7/QtqgTFQy/GPYBW56WeVOSYSpA6I4s43G1
 sz+EAc4BYPBWEwBQxCfPDwHI/M0MD0Im6mjpATc6Uv1Ct0+KS8iFXRnGUALe4bis
 8aKV8ZHo81Wnu6v1B8GroExHolL/AMORYfEYHABpWEe+GpwxqQcRdZjc/eiEUzOg
 umwMC9bNc5FAiPlku9Mh6pcBPjkMd9bGjVEIG8deJhm/aD8f/b0mgaxyaKgoHx8J
 ptauY3kLYBLuV793U37SXuQVw6o2LGHCYvN1fX+g1D0YTIuqIq9Pz7ObZs7w4xDd
 6iIK4QYP4Gjkvn0ZA275eI3ItcBEjJ2FD3ZDbkXNj+O4vEOrmG/OX4h2H5WGq/AU
 zr9YfmkRn0InPeHeLU1UM3NdbKgwc/Bd6MubSwX4v7G7ws4CGDtlvA2d3xg5q8Ls
 MQoAV+9QtiZ9prQjtd8nukgmh/+okPmCcnuEVXhSOZHpPjqXXnyUCTPyKXVkltdF
 1u4oUHiQY7Jydfn0wZgEV4nASDeV2gz5BwKoSAuKvYc5HGhXnXxvzyJyHJy3dL8R
 afGGQ3XfQhA0hUKoMiQFUk7p7dvjdAiHxN1EcvxxJqWVsaE/Gpik1GOm+FRn7Oqs
 UMvspgGrbI4=
 =KPgY
 -----END PGP SIGNATURE-----

Merge tag 'locking-core-2025-12-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking updates from Ingo Molnar:
 "Mutexes:

   - Redo __mutex_init() to reduce generated code size (Sebastian
     Andrzej Siewior)

  Seqlocks:

   - Introduce scoped_seqlock_read() (Peter Zijlstra)

   - Change thread_group_cputime() to use scoped_seqlock_read() (Oleg
     Nesterov)

   - Change do_task_stat() to use scoped_seqlock_read() (Oleg Nesterov)

   - Change do_io_accounting() to use scoped_seqlock_read() (Oleg
     Nesterov)

   - Fix the incorrect documentation of read_seqbegin_or_lock() /
     need_seqretry() (Oleg Nesterov)

   - Allow KASAN to fail optimizing (Peter Zijlstra)

  Local lock updates:

   - Fix all kernel-doc warnings (Randy Dunlap)

   - Add the <linux/local_lock*.h> headers to MAINTAINERS (Sebastian
     Andrzej Siewior)

   - Reduce the risk of shadowing via s/l/__l/ and s/tl/__tl/ (Vincent
     Mailhol)

  Lock debugging:

   - spinlock/debug: Fix data-race in do_raw_write_lock (Alexander
     Sverdlin)

  Atomic primitives infrastructure:

   - atomic: Skip alignment check for try_cmpxchg() old arg (Arnd
     Bergmann)

  Rust runtime integration:

   - sync: atomic: Enable generated Atomic<T> usage (Boqun Feng)

   - sync: atomic: Implement Debug for Atomic<Debug> (Boqun Feng)

   - debugfs: Remove Rust native atomics and replace them with Linux
     versions (Boqun Feng)

   - debugfs: Implement Reader for Mutex<T> only when T is Unpin (Boqun
     Feng)

   - lock: guard: Add T: Unpin bound to DerefMut (Daniel Almeida)

   - lock: Pin the inner data (Daniel Almeida)

   - lock: Add a Pin<&mut T> accessor (Daniel Almeida)"

* tag 'locking-core-2025-12-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/local_lock: Fix all kernel-doc warnings
  locking/local_lock: s/l/__l/ and s/tl/__tl/ to reduce the risk of shadowing
  locking/local_lock: Add the <linux/local_lock*.h> headers to MAINTAINERS
  locking/mutex: Redo __mutex_init() to reduce generated code size
  rust: debugfs: Replace the usage of Rust native atomics
  rust: sync: atomic: Implement Debug for Atomic<Debug>
  rust: sync: atomic: Make Atomic*Ops pub(crate)
  seqlock: Allow KASAN to fail optimizing
  rust: debugfs: Implement Reader for Mutex<T> only when T is Unpin
  seqlock: Change do_io_accounting() to use scoped_seqlock_read()
  seqlock: Change do_task_stat() to use scoped_seqlock_read()
  seqlock: Change thread_group_cputime() to use scoped_seqlock_read()
  seqlock: Introduce scoped_seqlock_read()
  documentation: seqlock: fix the wrong documentation of read_seqbegin_or_lock/need_seqretry
  atomic: Skip alignment check for try_cmpxchg() old arg
  rust: lock: Add a Pin<&mut T> accessor
  rust: lock: Pin the inner data
  rust: lock: guard: Add T: Unpin bound to DerefMut
  locking/spinlock/debug: Fix data-race in do_raw_write_lock
2025-12-01 19:50:58 -08:00
Linus Torvalds
1b5dd29869 vfs-6.19-rc1.fd_prepare.fs
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaSmOZwAKCRCRxhvAZXjc
 op0AAP4oNVJkFyvgKoPos5K2EGFB8M7merGhpYtsOoeg8UK6OwD/UySQErHsXQDR
 sUDDa5uFOhfrkcfM8REtAN4wF8p5qAc=
 =QgFD
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.19-rc1.fd_prepare.fs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull fd prepare updates from Christian Brauner:
 "This adds the FD_ADD() and FD_PREPARE() primitive. They simplify the
  common pattern of get_unused_fd_flags() + create file + fd_install()
  that is used extensively throughout the kernel and currently requires
  cumbersome cleanup paths.

  FD_ADD() - For simple cases where a file is installed immediately:

      fd = FD_ADD(O_CLOEXEC, vfio_device_open_file(device));
      if (fd < 0)
          vfio_device_put_registration(device);
      return fd;

  FD_PREPARE() - For cases requiring access to the fd or file, or
  additional work before publishing:

      FD_PREPARE(fdf, O_CLOEXEC, sync_file->file);
      if (fdf.err) {
          fput(sync_file->file);
          return fdf.err;
      }

      data.fence = fd_prepare_fd(fdf);
      if (copy_to_user((void __user *)arg, &data, sizeof(data)))
          return -EFAULT;

      return fd_publish(fdf);

  The primitives are centered around struct fd_prepare. FD_PREPARE()
  encapsulates all allocation and cleanup logic and must be followed by
  a call to fd_publish() which associates the fd with the file and
  installs it into the caller's fdtable. If fd_publish() isn't called,
  both are deallocated automatically. FD_ADD() is a shorthand that does
  fd_publish() immediately and never exposes the struct to the caller.

  I've implemented this in a way that it's compatible with the cleanup
  infrastructure while also being usable separately. IOW, it's centered
  around struct fd_prepare which is aliased to class_fd_prepare_t and so
  we can make use of all the basica guard infrastructure"

* tag 'vfs-6.19-rc1.fd_prepare.fs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (42 commits)
  io_uring: convert io_create_mock_file() to FD_PREPARE()
  file: convert replace_fd() to FD_PREPARE()
  vfio: convert vfio_group_ioctl_get_device_fd() to FD_ADD()
  tty: convert ptm_open_peer() to FD_ADD()
  ntsync: convert ntsync_obj_get_fd() to FD_PREPARE()
  media: convert media_request_alloc() to FD_PREPARE()
  hv: convert mshv_ioctl_create_partition() to FD_ADD()
  gpio: convert linehandle_create() to FD_PREPARE()
  pseries: port papr_rtas_setup_file_interface() to FD_ADD()
  pseries: convert papr_platform_dump_create_handle() to FD_ADD()
  spufs: convert spufs_gang_open() to FD_PREPARE()
  papr-hvpipe: convert papr_hvpipe_dev_create_handle() to FD_PREPARE()
  spufs: convert spufs_context_open() to FD_PREPARE()
  net/socket: convert __sys_accept4_file() to FD_ADD()
  net/socket: convert sock_map_fd() to FD_ADD()
  net/kcm: convert kcm_ioctl() to FD_PREPARE()
  net/handshake: convert handshake_nl_accept_doit() to FD_PREPARE()
  secretmem: convert memfd_secret() to FD_ADD()
  memfd: convert memfd_create() to FD_ADD()
  bpf: convert bpf_token_create() to FD_PREPARE()
  ...
2025-12-01 17:32:07 -08:00
Linus Torvalds
ffbf700df2 vfs-6.19-rc1.autofs
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaSmOZwAKCRCRxhvAZXjc
 ov48AP4qgtSo78euYDtsxkgU1IKRow1Hc3L/rql6uIP7dFtQ8AEAjZjCMK3vDSZy
 DUqStlgPZ3/GWyzdnDKOoAuwvn56gQQ=
 =KElr
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.19-rc1.autofs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull autofs update from Christian Brauner:
 "Prevent futile mount triggers in private mount namespaces.

  Fix a problematic loop in autofs when a mount namespace contains
  autofs mounts that are propagation private and there is no
  namespace-specific automount daemon to handle possible automounting.

  Previously, attempted path resolution would loop until MAXSYMLINKS was
  reached before failing, causing significant noise in the log.

 The fix adds a check in autofs ->d_automount() so that the VFS can
 immediately return EPERM in this case. Since the mount is propagation
 private, EPERM is the most appropriate error code"

* tag 'vfs-6.19-rc1.autofs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  autofs: dont trigger mount if it cant succeed
2025-12-01 16:38:21 -08:00
Linus Torvalds
d0deeb803c vfs-6.19-rc1.ovl
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaSmOZwAKCRCRxhvAZXjc
 othbAP97FNSOMUMAXTUxocE7vMgq3B/MG54e22ZrYhnZeP8NsgEA3zo4GpPCeM0p
 e8EjiLz0wUlveF68MZUg52eXT5/nTAE=
 =tbY/
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.19-rc1.ovl' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull overlayfs cred guard conversion from Christian Brauner:
 "This converts all of overlayfs to use credential guards, eliminating
  manual credential management throughout the filesystem.

  Credential guard conversion:

   - Convert all of overlayfs to use credential guards, replacing the
     manual ovl_override_creds()/ovl_revert_creds() pattern with scoped
     guards.

     This makes credential handling visually explicit and eliminates a
     class of potential bugs from mismatched override/revert calls.

     (1) Basic credential guard (with_ovl_creds)
     (2) Creator credential guard (ovl_override_creator_creds):

         Introduced a specialized guard for file creation operations
         that handles the two-phase credential override (mounter
         credentials, then fs{g,u}id override). The new pattern is much
         clearer:

         with_ovl_creds(dentry->d_sb) {
                 scoped_class(prepare_creds_ovl, cred, dentry, inode, mode) {
                         if (IS_ERR(cred))
                                 return PTR_ERR(cred);
                         /* creation operations */
                 }
         }

     (3) Copy-up credential guard (ovl_cu_creds):

         Introduced a specialized guard for copy-up operations,
         simplifying the previous struct ovl_cu_creds helper and
         associated functions.

         Ported ovl_copy_up_workdir() and ovl_copy_up_tmpfile() to this
         pattern.

  Cleanups:

   - Remove ovl_revert_creds() after all callers converted to guards

   - Remove struct ovl_cu_creds and associated functions

   - Drop ovl_setup_cred_for_create() after conversion

   - Refactor ovl_fill_super(), ovl_lookup(), ovl_iterate(),
     ovl_rename() for cleaner credential guard scope

   - Introduce struct ovl_renamedata to simplify rename handling

   - Don't override credentials for ovl_check_whiteouts() (unnecessary)

   - Remove unneeded semicolon"

* tag 'vfs-6.19-rc1.ovl' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (54 commits)
  ovl: remove unneeded semicolon
  ovl: remove struct ovl_cu_creds and associated functions
  ovl: port ovl_copy_up_tmpfile() to cred guard
  ovl: mark *_cu_creds() as unused temporarily
  ovl: port ovl_copy_up_workdir() to cred guard
  ovl: add copy up credential guard
  ovl: drop ovl_setup_cred_for_create()
  ovl: port ovl_create_or_link() to new ovl_override_creator_creds cleanup guard
  ovl: mark ovl_setup_cred_for_create() as unused temporarily
  ovl: reflow ovl_create_or_link()
  ovl: port ovl_create_tmpfile() to new ovl_override_creator_creds cleanup guard
  ovl: add ovl_override_creator_creds cred guard
  ovl: remove ovl_revert_creds()
  ovl: port ovl_fill_super() to cred guard
  ovl: refactor ovl_fill_super()
  ovl: port ovl_lower_positive() to cred guard
  ovl: port ovl_lookup() to cred guard
  ovl: refactor ovl_lookup()
  ovl: port ovl_copyfile() to cred guard
  ovl: port ovl_rename() to cred guard
  ...
2025-12-01 16:31:21 -08:00
Linus Torvalds
a8058f8442 vfs-6.19-rc1.directory.locking
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaSmOZwAKCRCRxhvAZXjc
 op9tAQCJ//STOkvYHfqgsdRD+cW9MRg/gPzfVZgnV1FTyf8sMgEA0IsY5zCZB9eh
 9FdD0E57P8PlWRwWZ+LktnWBzRAUqwI=
 =MOVR
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.19-rc1.directory.locking' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull directory locking updates from Christian Brauner:
 "This contains the work to add centralized APIs for directory locking
  operations.

  This series is part of a larger effort to change directory operation
  locking to allow multiple concurrent operations in a directory. The
  ultimate goal is to lock the target dentry(s) rather than the whole
  parent directory.

  To help with changing the locking protocol, this series centralizes
  locking and lookup in new helper functions. The helpers establish a
  pattern where it is the dentry that is being locked and unlocked
  (currently the lock is held on dentry->d_parent->d_inode, but that can
  change in the future).

  This also changes vfs_mkdir() to unlock the parent on failure, as well
  as dput()ing the dentry. This allows end_creating() to only require
  the target dentry (which may be IS_ERR() after vfs_mkdir()), not the
  parent"

* tag 'vfs-6.19-rc1.directory.locking' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  nfsd: fix end_creating() conversion
  VFS: introduce end_creating_keep()
  VFS: change vfs_mkdir() to unlock on failure.
  ecryptfs: use new start_creating/start_removing APIs
  Add start_renaming_two_dentries()
  VFS/ovl/smb: introduce start_renaming_dentry()
  VFS/nfsd/ovl: introduce start_renaming() and end_renaming()
  VFS: add start_creating_killable() and start_removing_killable()
  VFS: introduce start_removing_dentry()
  smb/server: use end_removing_noperm for for target of smb2_create_link()
  VFS: introduce start_creating_noperm() and start_removing_noperm()
  VFS/nfsd/cachefiles/ovl: introduce start_removing() and end_removing()
  VFS/nfsd/cachefiles/ovl: add start_creating() and end_creating()
  VFS: tidy up do_unlinkat()
  VFS: introduce start_dirop() and end_dirop()
  debugfs: rename end_creating() to debugfs_end_creating()
2025-12-01 16:13:46 -08:00
Linus Torvalds
db74a7d02a vfs-6.19-rc1.directory.delegations
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaSmOZgAKCRCRxhvAZXjc
 ooiEAPwNZfkqiSs6G1B2EmjFpMrA2BDqskaOsnN2sywra0sNewD9EQxJwlYXUn+z
 nNUIAvmegJGg2OiU2UaNGwxMR3lR3w8=
 =YELr
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.19-rc1.directory.delegations' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull directory delegations update from Christian Brauner:
 "This contains the work for recall-only directory delegations for
  knfsd.

  Add support for simple, recallable-only directory delegations. This
  was decided at the fall NFS Bakeathon where the NFS client and server
  maintainers discussed how to merge directory delegation support.

  The approach starts with recallable-only delegations for several reasons:

   1. RFC8881 has gaps that are being addressed in RFC8881bis. In
      particular, it requires directory position information for
      CB_NOTIFY callbacks, which is difficult to implement properly
      under Linux. The spec is being extended to allow that information
      to be omitted.

   2. Client-side support for CB_NOTIFY still lags. The client side
      involves heuristics about when to request a delegation.

   3. Early indication shows simple, recallable-only delegations can
      help performance. Anna Schumaker mentioned seeing a multi-minute
      speedup in xfstests runs with them enabled.

  With these changes, userspace can also request a read lease on a
  directory that will be recalled on conflicting accesses. This may be
  useful for applications like Samba. Users can disable leases
  altogether via the fs.leases-enable sysctl if needed.

  VFS changes:

   - Dedicated Type for Delegations

     Introduce struct delegated_inode to track inodes that may have
     delegations that need to be broken. This replaces the previous
     approach of passing raw inode pointers through the delegation
     breaking code paths, providing better type safety and clearer
     semantics for the delegation machinery.

   - Break parent directory delegations in open(..., O_CREAT) codepath

   - Allow mkdir to wait for delegation break on parent

   - Allow rmdir to wait for delegation break on parent

   - Add try_break_deleg calls for parents to vfs_link(), vfs_rename(),
     and vfs_unlink()

   - Make vfs_create(), vfs_mknod(), and vfs_symlink() break delegations
     on parent directory

   - Clean up argument list for vfs_create()

   - Expose delegation support to userland

  Filelock changes:

   - Make lease_alloc() take a flags argument

   - Rework the __break_lease API to use flags

   - Add struct delegated_inode

   - Push the S_ISREG check down to ->setlease handlers

   - Lift the ban on directory leases in generic_setlease

  NFSD changes:

   - Allow filecache to hold S_IFDIR files

   - Allow DELEGRETURN on directories

   - Wire up GET_DIR_DELEGATION handling

  Fixes:

   - Fix kernel-doc warnings in __fcntl_getlease

   - Add needed headers for new struct delegation definition"

* tag 'vfs-6.19-rc1.directory.delegations' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  vfs: add needed headers for new struct delegation definition
  filelock: __fcntl_getlease: fix kernel-doc warnings
  vfs: expose delegation support to userland
  nfsd: wire up GET_DIR_DELEGATION handling
  nfsd: allow DELEGRETURN on directories
  nfsd: allow filecache to hold S_IFDIR files
  filelock: lift the ban on directory leases in generic_setlease
  vfs: make vfs_symlink break delegations on parent dir
  vfs: make vfs_mknod break delegations on parent directory
  vfs: make vfs_create break delegations on parent directory
  vfs: clean up argument list for vfs_create()
  vfs: break parent dir delegations in open(..., O_CREAT) codepath
  vfs: allow rmdir to wait for delegation break on parent
  vfs: allow mkdir to wait for delegation break on parent
  vfs: add try_break_deleg calls for parents to vfs_{link,rename,unlink}
  filelock: push the S_ISREG check down to ->setlease handlers
  filelock: add struct delegated_inode
  filelock: rework the __break_lease API to use flags
  filelock: make lease_alloc() take a flags argument
2025-12-01 15:34:41 -08:00
Linus Torvalds
4664fb427c vfs-6.19-rc1.minix
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaSmOZgAKCRCRxhvAZXjc
 olEcAP4qG313oT/tm4W3nC4g2k8S//KqET97B80pSX0K3DvQEwD+LSCf1Th3RnsV
 EAMHczmCtRlbcFPqYOFVAMS8VxOyVg0=
 =7Ca4
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.19-rc1.minix' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull minix fixes from Christian Brauner:
 "Fix two syzbot corruption bugs in the minix filesystem.

  Syzbot fuzzes filesystems by trying to mount and manipulate
  deliberately corrupted images. This should not lead to BUG_ONs and
  WARN_ONs for easy to detect corruptions.

   - Add error handling to minix filesystem for inode corruption
     detection, enabling the filesystem to report such corruptions
     cleanly.

   - Fix a drop_nlink warning in minix_rmdir() triggered by corrupted
     directory link counts.

   - Fix a drop_nlink warning in minix_rename() triggered by corrupted
     inode link counts"

* tag 'vfs-6.19-rc1.minix' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  Fix a drop_nlink warning in minix_rename
  Fix a drop_nlink warning in minix_rmdir
  Add error handling to minix filesystem for inode corruption detection
2025-12-01 15:22:40 -08:00
Linus Torvalds
978d337c2e vfs-6.19-rc1.guards
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaSmOZgAKCRCRxhvAZXjc
 opxBAQCjNjr0yTSoaGRM0CJXg79Of3DLIlBdB7TygibTN16WhwEA+VKWoHL5eRjg
 PZlwZD4Ei2ymeQYxi+6owTF8G806tAs=
 =m/Bt
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.19-rc1.guards' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull superblock lock guard updates from Christian Brauner:
 "This starts the work of introducing guards for superblock related
  locks.

  Introduce super_write_guard for scoped superblock write protection.

  This provides a guard-based alternative to the manual sb_start_write()
  and sb_end_write() pattern, allowing the compiler to automatically
  handle the cleanup"

* tag 'vfs-6.19-rc1.guards' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  xfs: use super write guard in xfs_file_ioctl()
  open: use super write guard in do_ftruncate()
  btrfs: use super write guard in relocating_repair_kthread()
  ext4: use super write guard in write_mmp_block()
  btrfs: use super write guard in sb_start_write()
  btrfs: use super write guard btrfs_run_defrag_inode()
  btrfs: use super write guard in btrfs_reclaim_bgs_work()
  fs: add super_write_guard
2025-12-01 14:39:03 -08:00
Linus Torvalds
afdf0fb340 vfs-6.19-rc1.fs_header
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaSmOZgAKCRCRxhvAZXjc
 oq2EAQD09y/qVU81E7Qg7Cn4n5/3WTlnQjx0aSvhb4p6dFUcFwD+K9uVJNP8x8tA
 xTaPt59nZbEX9BIAwtLChSPa4CZsnwM=
 =XrvE
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.19-rc1.fs_header' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull fs header updates from Christian Brauner:
 "This contains initial work to start splitting up fs.h.

  Begin the long-overdue work of splitting up the monolithic fs.h
  header. The header has grown to over 3000 lines and includes types and
  functions for many different subsystems, making it difficult to
  navigate and causing excessive compilation dependencies.

  This series introduces new focused headers for superblock-related
  code:

   - Rename fs_types.h to fs_dirent.h to better reflect its actual
     content (directory entry types)

   - Add fs/super_types.h containing superblock type definitions

   - Add fs/super.h containing superblock function declarations

  This is the first step in a longer effort to modularize the VFS
  headers.

  Cleanups:

   - Inode Field Layout Optimization (Mateusz Guzik)

     Move inode fields used during fast path lookup closer together to
     improve cache locality during path resolution.

   - current_umask() Optimization (Mateusz Guzik)

     Inline current_umask() and move it to fs_struct.h. This improves
     performance by avoiding function call overhead for this
     frequently-used function, and places it in a more appropriate
     header since it operates on fs_struct"

* tag 'vfs-6.19-rc1.fs_header' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  fs: move inode fields used during fast path lookup closer together
  fs: inline current_umask() and move it to fs_struct.h
  fs: add fs/super.h header
  fs: add fs/super_types.h header
  fs: rename fs_types.h to fs_dirent.h
2025-12-01 14:18:01 -08:00
Linus Torvalds
1d18101a64 kernel-6.19-rc1.cred
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaSmOZQAKCRCRxhvAZXjc
 orJLAP9UD+dX6cicJDkzFZowDakmoIQkR5ZSDwChSlmvLcmquwEAlSq4svVd9Bdl
 7kOFUk71DqhVHrPAwO7ap0BxehokEAA=
 =Cli6
 -----END PGP SIGNATURE-----

Merge tag 'kernel-6.19-rc1.cred' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull cred guard updates from Christian Brauner:
 "This contains substantial credential infrastructure improvements
  adding guard-based credential management that simplifies code and
  eliminates manual reference counting in many subsystems.

  Features:

   - Kernel Credential Guards

     Add with_kernel_creds() and scoped_with_kernel_creds() guards that
     allow using the kernel credentials without allocating and copying
     them. This was requested by Linus after seeing repeated
     prepare_kernel_creds() calls that duplicate the kernel credentials
     only to drop them again later.

     The new guards completely avoid the allocation and never expose the
     temporary variable to hold the kernel credentials anywhere in
     callers.

   - Generic Credential Guards

     Add scoped_with_creds() guards for the common override_creds() and
     revert_creds() pattern. This builds on earlier work that made
     override_creds()/revert_creds() completely reference count free.

   - Prepare Credential Guards

     Add prepare credential guards for the more complex pattern of
     preparing a new set of credentials and overriding the current
     credentials with them:
      - prepare_creds()
      - modify new creds
      - override_creds()
      - revert_creds()
      - put_cred()

  Cleanups:

   - Make init_cred static since it should not be directly accessed

   - Add kernel_cred() helper to properly access the kernel credentials

   - Fix scoped_class() macro that was introduced two cycles ago

   - coredump: split out do_coredump() from vfs_coredump() for cleaner
     credential handling

   - coredump: move revert_cred() before coredump_cleanup()

   - coredump: mark struct mm_struct as const

   - coredump: pass struct linux_binfmt as const

   - sev-dev: use guard for path"

* tag 'kernel-6.19-rc1.cred' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (36 commits)
  trace: use override credential guard
  trace: use prepare credential guard
  coredump: use override credential guard
  coredump: use prepare credential guard
  coredump: split out do_coredump() from vfs_coredump()
  coredump: mark struct mm_struct as const
  coredump: pass struct linux_binfmt as const
  coredump: move revert_cred() before coredump_cleanup()
  sev-dev: use override credential guards
  sev-dev: use prepare credential guard
  sev-dev: use guard for path
  cred: add prepare credential guard
  net/dns_resolver: use credential guards in dns_query()
  cgroup: use credential guards in cgroup_attach_permissions()
  act: use credential guards in acct_write_process()
  smb: use credential guards in cifs_get_spnego_key()
  nfs: use credential guards in nfs_idmap_get_key()
  nfs: use credential guards in nfs_local_call_write()
  nfs: use credential guards in nfs_local_call_read()
  erofs: use credential guards
  ...
2025-12-01 13:45:41 -08:00
Linus Torvalds
f2e74ecfba vfs-6.19-rc1.folio
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaSmOZQAKCRCRxhvAZXjc
 onGBAQDtqeO0jZzS7q9UxlJ84Wj/H9w+9INpO4jMxtWK4svhUAEAghG4qVxRvkE2
 Qh+wrpTPIC7OCQ78k8psDRmkj9cn8QA=
 =FCVN
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.19-rc1.folio' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull folio updates from Christian Brauner:
 "Add a new folio_next_pos() helper function that returns the file
  position of the first byte after the current folio. This is a common
  operation in filesystems when needing to know the end of the current
  folio.

  The helper is lifted from btrfs which already had its own version, and
  is now used across multiple filesystems and subsystems:
   - btrfs
   - buffer
   - ext4
   - f2fs
   - gfs2
   - iomap
   - netfs
   - xfs
   - mm

  This fixes a long-standing bug in ocfs2 on 32-bit systems with files
  larger than 2GiB. Presumably this is not a common configuration, but
  the fix is backported anyway. The other filesystems did not have bugs,
  they were just mildly inefficient.

  This also introduce uoff_t as the unsigned version of loff_t. A recent
  commit inadvertently changed a comparison from being unsigned (on
  64-bit systems) to being signed (which it had always been on 32-bit
  systems), leading to sporadic fstests failures.

  Generally file sizes are restricted to being a signed integer, but in
  places where -1 is passed to indicate "up to the end of the file", it
  is convenient to have an unsigned type to ensure comparisons are
  always unsigned regardless of architecture"

* tag 'vfs-6.19-rc1.folio' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  fs: Add uoff_t
  mm: Use folio_next_pos()
  xfs: Use folio_next_pos()
  netfs: Use folio_next_pos()
  iomap: Use folio_next_pos()
  gfs2: Use folio_next_pos()
  f2fs: Use folio_next_pos()
  ext4: Use folio_next_pos()
  buffer: Use folio_next_pos()
  btrfs: Use folio_next_pos()
  filemap: Add folio_next_pos()
2025-12-01 10:26:38 -08:00
Linus Torvalds
212c4053a1 vfs-6.19-rc1.coredump
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaSmOZQAKCRCRxhvAZXjc
 oji0AQC5jl35xh04fJKB343InVAxtRFp8mSkJJ9Bx6x7xA7a+QEAiBMxYilUgYIW
 bZMcI5LU+gNO/1y076QkVt84jTUQLww=
 =WIBZ
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.19-rc1.coredump' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull pidfd and coredump updates from Christian Brauner:
 "Features:

   - Expose coredump signal via pidfd

     Expose the signal that caused the coredump through the pidfd
     interface. The recent changes to rework coredump handling to rely
     on unix sockets are in the process of being used in systemd. The
     previous systemd coredump container interface requires the coredump
     file descriptor and basic information including the signal number
     to be sent to the container. This means the signal number needs to
     be available before sending the coredump to the container.

   - Add supported_mask field to pidfd

     Add a new supported_mask field to struct pidfd_info that indicates
     which information fields are supported by the running kernel. This
     allows userspace to detect feature availability without relying on
     error codes or kernel version checks.

  Cleanups:

   - Drop struct pidfs_exit_info and prepare to drop exit_info pointer,
     simplifying the internal publication mechanism for exit and
     coredump information retrievable via the pidfd ioctl

   - Use guard() for task_lock in pidfs

   - Reduce wait_pidfd lock scope

   - Add missing PIDFD_INFO_SIZE_VER1 constant

   - Add missing BUILD_BUG_ON() assert on struct pidfd_info

  Fixes:

   - Fix PIDFD_INFO_COREDUMP handling

  Selftests:

   - Split out coredump socket tests and common helpers into separate
     files for better organization

   - Fix userspace coredump client detection issues

   - Handle edge-triggered epoll correctly

   - Ignore ENOSPC errors in tests

   - Add debug logging to coredump socket tests, socket protocol tests,
     and test helpers

   - Add tests for PIDFD_INFO_COREDUMP_SIGNAL

   - Add tests for supported_mask field

   - Update pidfd header for selftests"

* tag 'vfs-6.19-rc1.coredump' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (23 commits)
  pidfs: reduce wait_pidfd lock scope
  selftests/coredump: add second PIDFD_INFO_COREDUMP_SIGNAL test
  selftests/coredump: add first PIDFD_INFO_COREDUMP_SIGNAL test
  selftests/coredump: ignore ENOSPC errors
  selftests/coredump: add debug logging to coredump socket protocol tests
  selftests/coredump: add debug logging to coredump socket tests
  selftests/coredump: add debug logging to test helpers
  selftests/coredump: handle edge-triggered epoll correctly
  selftests/coredump: fix userspace coredump client detection
  selftests/coredump: fix userspace client detection
  selftests/coredump: split out coredump socket tests
  selftests/coredump: split out common helpers
  selftests/pidfd: add second supported_mask test
  selftests/pidfd: add first supported_mask test
  selftests/pidfd: update pidfd header
  pidfs: expose coredump signal
  pidfs: drop struct pidfs_exit_info
  pidfs: prepare to drop exit_info pointer
  pidfd: add a new supported_mask field
  pidfs: add missing BUILD_BUG_ON() assert on struct pidfd_info
  ...
2025-12-01 10:17:39 -08:00
Linus Torvalds
415d34b92c namespace-6.19-rc1
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaSmOZQAKCRCRxhvAZXjc
 ooKwAP4kR5kMjHlthf8jHmmCjVU3nQFO9hUZsIQL9gFJLOIQMAD+LLoTaq1WJufl
 oSgZpREXZVmI1TK61eR6EZMB1YikGAo=
 =TExi
 -----END PGP SIGNATURE-----

Merge tag 'namespace-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull namespace updates from Christian Brauner:
 "This contains substantial namespace infrastructure changes including a new
  system call, active reference counting, and extensive header cleanups.
  The branch depends on the shared kbuild branch for -fms-extensions support.

  Features:

   - listns() system call

     Add a new listns() system call that allows userspace to iterate
     through namespaces in the system. This provides a programmatic
     interface to discover and inspect namespaces, addressing
     longstanding limitations:

     Currently, there is no direct way for userspace to enumerate
     namespaces. Applications must resort to scanning /proc/*/ns/ across
     all processes, which is:
      - Inefficient - requires iterating over all processes
      - Incomplete - misses namespaces not attached to any running
        process but kept alive by file descriptors, bind mounts, or
        parent references
      - Permission-heavy - requires access to /proc for many processes
      - No ordering or ownership information
      - No filtering per namespace type

     The listns() system call solves these problems:

       ssize_t listns(const struct ns_id_req *req, u64 *ns_ids,
                      size_t nr_ns_ids, unsigned int flags);

       struct ns_id_req {
             __u32 size;
             __u32 spare;
             __u64 ns_id;
             struct /* listns */ {
                     __u32 ns_type;
                     __u32 spare2;
                     __u64 user_ns_id;
             };
       };

     Features include:
      - Pagination support for large namespace sets
      - Filtering by namespace type (MNT_NS, NET_NS, USER_NS, etc.)
      - Filtering by owning user namespace
      - Permission checks respecting namespace isolation

   - Active Reference Counting

     Introduce an active reference count that tracks namespace
     visibility to userspace. A namespace is visible in the following
     cases:
      - The namespace is in use by a task
      - The namespace is persisted through a VFS object (namespace file
        descriptor or bind-mount)
      - The namespace is a hierarchical type and is the parent of child
        namespaces

     The active reference count does not regulate lifetime (that's still
     done by the normal reference count) - it only regulates visibility
     to namespace file handles and listns().

     This prevents resurrection of namespaces that are pinned only for
     internal kernel reasons (e.g., user namespaces held by
     file->f_cred, lazy TLB references on idle CPUs, etc.) which should
     not be accessible via (1)-(3).

   - Unified Namespace Tree

     Introduce a unified tree structure for all namespaces with:
      - Fixed IDs assigned to initial namespaces
      - Lookup based solely on inode number
      - Maintained list of owned namespaces per user namespace
      - Simplified rbtree comparison helpers

   Cleanups

    - Header Reorganization:
      - Move namespace types into separate header (ns_common_types.h)
      - Decouple nstree from ns_common header
      - Move nstree types into separate header
      - Switch to new ns_tree_{node,root} structures with helper functions
      - Use guards for ns_tree_lock

   - Initial Namespace Reference Count Optimization
      - Make all reference counts on initial namespaces a nop to avoid
        pointless cacheline ping-pong for namespaces that can never go
        away
      - Drop custom reference count initialization for initial namespaces
      - Add NS_COMMON_INIT() macro and use it for all namespaces
      - pid: rely on common reference count behavior

   - Miscellaneous Cleanups
      - Rename exit_task_namespaces() to exit_nsproxy_namespaces()
      - Rename is_initial_namespace() and make argument const
      - Use boolean to indicate anonymous mount namespace
      - Simplify owner list iteration in nstree
      - nsfs: raise SB_I_NODEV, SB_I_NOEXEC, and DCACHE_DONTCACHE explicitly
      - nsfs: use inode_just_drop()
      - pidfs: raise DCACHE_DONTCACHE explicitly
      - pidfs: simplify PIDFD_GET__NAMESPACE ioctls
      - libfs: allow to specify s_d_flags
      - cgroup: add cgroup namespace to tree after owner is set
      - nsproxy: fix free_nsproxy() and simplify create_new_namespaces()

  Fixes:

   - setns(pidfd, ...) race condition

     Fix a subtle race when using pidfds with setns(). When the target
     task exits after prepare_nsset() but before commit_nsset(), the
     namespace's active reference count might have been dropped. If
     setns() then installs the namespaces, it would bump the active
     reference count from zero without taking the required reference on
     the owner namespace, leading to underflow when later decremented.

     The fix resurrects the ownership chain if necessary - if the caller
     succeeded in grabbing passive references, the setns() should
     succeed even if the target task exits or gets reaped.

   - Return EFAULT on put_user() error instead of success

   - Make sure references are dropped outside of RCU lock (some
     namespaces like mount namespace sleep when putting the last
     reference)

   - Don't skip active reference count initialization for network
     namespace

   - Add asserts for active refcount underflow

   - Add asserts for initial namespace reference counts (both passive
     and active)

   - ipc: enable is_ns_init_id() assertions

   - Fix kernel-doc comments for internal nstree functions

   - Selftests
      - 15 active reference count tests
      - 9 listns() functionality tests
      - 7 listns() permission tests
      - 12 inactive namespace resurrection tests
      - 3 threaded active reference count tests
      - commit_creds() active reference tests
      - Pagination and stress tests
      - EFAULT handling test
      - nsid tests fixes"

* tag 'namespace-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (103 commits)
  pidfs: simplify PIDFD_GET_<type>_NAMESPACE ioctls
  nstree: fix kernel-doc comments for internal functions
  nsproxy: fix free_nsproxy() and simplify create_new_namespaces()
  selftests/namespaces: fix nsid tests
  ns: drop custom reference count initialization for initial namespaces
  pid: rely on common reference count behavior
  ns: add asserts for initial namespace active reference counts
  ns: add asserts for initial namespace reference counts
  ns: make all reference counts on initial namespace a nop
  ipc: enable is_ns_init_id() assertions
  fs: use boolean to indicate anonymous mount namespace
  ns: rename is_initial_namespace()
  ns: make is_initial_namespace() argument const
  nstree: use guards for ns_tree_lock
  nstree: simplify owner list iteration
  nstree: switch to new structures
  nstree: add helper to operate on struct ns_tree_{node,root}
  nstree: move nstree types into separate header
  nstree: decouple from ns_common header
  ns: move namespace types into separate header
  ...
2025-12-01 09:47:41 -08:00
Linus Torvalds
ebaeabfa5a vfs-6.19-rc1.writeback
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaSmOZQAKCRCRxhvAZXjc
 or4UAP9FbpFsZd0DpsYnKuv7kFepl291PuR0x2dKmseJ/wcf8AEAzI8FR5wd/fey
 25ZNdExoUojAOj5wVn+jUep3u54jBws=
 =/toi
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.19-rc1.writeback' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull writeback updates from Christian Brauner:
 "Features:

   - Allow file systems to increase the minimum writeback chunk size.

     The relatively low minimal writeback size of 4MiB means that
     written back inodes on rotational media are switched a lot. Besides
     introducing additional seeks, this also can lead to extreme file
     fragmentation on zoned devices when a lot of files are cached
     relative to the available writeback bandwidth.

     This adds a superblock field that allows the file system to
     override the default size, and sets it to the zone size for zoned
     XFS.

   - Add logging for slow writeback when it exceeds
     sysctl_hung_task_timeout_secs. This helps identify tasks waiting
     for a long time and pinpoint potential issues. Recording the
     starting jiffies is also useful when debugging a crashed vmcore.

   - Wake up waiting tasks when finishing the writeback of a chunk

  Cleanups:

   - filemap_* writeback interface cleanups.

     Adding filemap_fdatawrite_wbc ended up being a mistake, as all but
     the original btrfs caller should be using better high level
     interfaces instead.

     This series removes all these low-level interfaces, switches btrfs
     to a more specific interface, and cleans up other too low-level
     interfaces. With this the writeback_control that is passed to the
     writeback code is only initialized in three places.

   - Remove __filemap_fdatawrite, __filemap_fdatawrite_range, and
     filemap_fdatawrite_wbc

   - Add filemap_flush_nr helper for btrfs

   - Push struct writeback_control into start_delalloc_inodes in btrfs

   - Rename filemap_fdatawrite_range_kick to filemap_flush_range

   - Stop opencoding filemap_fdatawrite_range in 9p, ocfs2, and mm

   - Make wbc_to_tag() inline and use it in fs"

* tag 'vfs-6.19-rc1.writeback' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  fs: Make wbc_to_tag() inline and use it in fs.
  xfs: set s_min_writeback_pages for zoned file systems
  writeback: allow the file system to override MIN_WRITEBACK_PAGES
  writeback: cleanup writeback_chunk_size
  mm: rename filemap_fdatawrite_range_kick to filemap_flush_range
  mm: remove __filemap_fdatawrite_range
  mm: remove filemap_fdatawrite_wbc
  mm: remove __filemap_fdatawrite
  mm,btrfs: add a filemap_flush_nr helper
  btrfs: push struct writeback_control into start_delalloc_inodes
  btrfs: use the local tmp_inode variable in start_delalloc_inodes
  ocfs2: don't opencode filemap_fdatawrite_range in ocfs2_journal_submit_inode_data_buffers
  9p: don't opencode filemap_fdatawrite_range in v9fs_mmap_vm_close
  mm: don't opencode filemap_fdatawrite_range in filemap_invalidate_inode
  writeback: Add logging for slow writeback (exceeds sysctl_hung_task_timeout_secs)
  writeback: Wake up waiting tasks when finishing the writeback of a chunk.
2025-12-01 09:20:51 -08:00
Linus Torvalds
9368f0f941 vfs-6.19-rc1.inode
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaSmOZAAKCRCRxhvAZXjc
 omMSAP9GLhavxyWQ24Q+49CNWWRQWDY1wTOiUK2BwtIvZ0YEcAD8D1dAiMckL5pC
 RwEAVA5p+y+qi+bZP0KXCBxQddoTIQM=
 =zo/J
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.19-rc1.inode' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs inode updates from Christian Brauner:
 "Features:

   - Hide inode->i_state behind accessors. Open-coded accesses prevent
     asserting they are done correctly. One obvious aspect is locking,
     but significantly more can be checked. For example it can be
     detected when the code is clearing flags which are already missing,
     or is setting flags when it is illegal (e.g., I_FREEING when
     ->i_count > 0)

   - Provide accessors for ->i_state, converts all filesystems using
     coccinelle and manual conversions (btrfs, ceph, smb, f2fs, gfs2,
     overlayfs, nilfs2, xfs), and makes plain ->i_state access fail to
     compile

   - Rework I_NEW handling to operate without fences, simplifying the
     code after the accessor infrastructure is in place

  Cleanups:

   - Move wait_on_inode() from writeback.h to fs.h

   - Spell out fenced ->i_state accesses with explicit smp_wmb/smp_rmb
     for clarity

   - Cosmetic fixes to LRU handling

   - Push list presence check into inode_io_list_del()

   - Touch up predicts in __d_lookup_rcu()

   - ocfs2: retire ocfs2_drop_inode() and I_WILL_FREE usage

   - Assert on ->i_count in iput_final()

   - Assert ->i_lock held in __iget()

  Fixes:

   - Add missing fences to I_NEW handling"

* tag 'vfs-6.19-rc1.inode' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (22 commits)
  dcache: touch up predicts in __d_lookup_rcu()
  fs: push list presence check into inode_io_list_del()
  fs: cosmetic fixes to lru handling
  fs: rework I_NEW handling to operate without fences
  fs: make plain ->i_state access fail to compile
  xfs: use the new ->i_state accessors
  nilfs2: use the new ->i_state accessors
  overlayfs: use the new ->i_state accessors
  gfs2: use the new ->i_state accessors
  f2fs: use the new ->i_state accessors
  smb: use the new ->i_state accessors
  ceph: use the new ->i_state accessors
  btrfs: use the new ->i_state accessors
  Manual conversion to use ->i_state accessors of all places not covered by coccinelle
  Coccinelle-based conversion to use ->i_state accessors
  fs: provide accessors for ->i_state
  fs: spell out fenced ->i_state accesses with explicit smp_wmb/smp_rmb
  fs: move wait_on_inode() from writeback.h to fs.h
  fs: add missing fences to I_NEW handling
  ocfs2: retire ocfs2_drop_inode() and I_WILL_FREE usage
  ...
2025-12-01 09:02:34 -08:00
Linus Torvalds
b04b2e7a61 vfs-6.19-rc1.misc
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaSmOZAAKCRCRxhvAZXjc
 onGCAQDEHKNEuZMhkyd3K5YsJtMzZlW/uXp4+Wddeob+5yQp0wEA09xN4CJNMwhP
 J6Kjaa80hWfrFacqSvyMUwQHHw6mngs=
 =5Mom
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.19-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull misc vfs updates from Christian Brauner:
 "Features:

   - Cheaper MAY_EXEC handling for path lookup. This elides MAY_WRITE
     permission checks during path lookup and adds the
     IOP_FASTPERM_MAY_EXEC flag so filesystems like btrfs can avoid
     expensive permission work.

   - Hide dentry_cache behind runtime const machinery.

   - Add German Maglione as virtiofs co-maintainer.

  Cleanups:

   - Tidy up and inline step_into() and walk_component() for improved
     code generation.

   - Re-enable IOCB_NOWAIT writes to files. This refactors file
     timestamp update logic, fixing a layering bypass in btrfs when
     updating timestamps on device files and improving FMODE_NOCMTIME
     handling in VFS now that nfsd started using it.

   - Path lookup optimizations extracting slowpaths into dedicated
     routines and adding branch prediction hints for mntput_no_expire(),
     fd_install(), lookup_slow(), and various other hot paths.

   - Enable clang's -fms-extensions flag, requiring a JFS rename to
     avoid conflicts.

   - Remove spurious exports in fs/file_attr.c.

   - Stop duplicating union pipe_index declaration. This depends on the
     shared kbuild branch that brings in -fms-extensions support which
     is merged into this branch.

   - Use MD5 library instead of crypto_shash in ecryptfs.

   - Use largest_zero_folio() in iomap_dio_zero().

   - Replace simple_strtol/strtoul with kstrtoint/kstrtouint in init and
     initrd code.

   - Various typo fixes.

  Fixes:

   - Fix emergency sync for btrfs. Btrfs requires an explicit sync_fs()
     call with wait == 1 to commit super blocks. The emergency sync path
     never passed this, leaving btrfs data uncommitted during emergency
     sync.

   - Use local kmap in watch_queue's post_one_notification().

   - Add hint prints in sb_set_blocksize() for LBS dependency on THP"

* tag 'vfs-6.19-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (35 commits)
  MAINTAINERS: add German Maglione as virtiofs co-maintainer
  fs: inline step_into() and walk_component()
  fs: tidy up step_into() & friends before inlining
  orangefs: use inode_update_timestamps directly
  btrfs: fix the comment on btrfs_update_time
  btrfs: use vfs_utimes to update file timestamps
  fs: export vfs_utimes
  fs: lift the FMODE_NOCMTIME check into file_update_time_flags
  fs: refactor file timestamp update logic
  include/linux/fs.h: trivial fix: regualr -> regular
  fs/splice.c: trivial fix: pipes -> pipe's
  fs: mark lookup_slow() as noinline
  fs: add predicts based on nd->depth
  fs: move mntput_no_expire() slowpath into a dedicated routine
  fs: remove spurious exports in fs/file_attr.c
  watch_queue: Use local kmap in post_one_notification()
  fs: touch up predicts in path lookup
  fs: move fd_install() slowpath into a dedicated routine and provide commentary
  fs: hide dentry_cache behind runtime const machinery
  fs: touch predicts in do_dentry_open()
  ...
2025-12-01 08:44:26 -08:00
Linus Torvalds
1885cdbfbb vfs-6.19-rc1.iomap
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaSmOZAAKCRCRxhvAZXjc
 ooCXAQCwzX2GS/55QHV6JXBBoNxguuSQ5dCj91ZmTfHzij0xNAEAhKEBw7iMGX72
 c2/x+xYf+Pc6mAfxdus5RLMggqBFPAk=
 =jInB
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.19-rc1.iomap' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull iomap updates from Christian Brauner:
 "FUSE iomap Support for Buffered Reads:

    This adds iomap support for FUSE buffered reads and readahead. This
    enables granular uptodate tracking with large folios so only
    non-uptodate portions need to be read. Also fixes a race condition
    with large folios + writeback cache that could cause data corruption
    on partial writes followed by reads.

     - Refactored iomap read/readahead bio logic into helpers
     - Added caller-provided callbacks for read operations
     - Moved buffered IO bio logic into new file
     - FUSE now uses iomap for read_folio and readahead

  Zero Range Folio Batch Support:

    Add folio batch support for iomap_zero_range() to handle dirty
    folios over unwritten mappings. Fix raciness issues where dirty data
    could be lost during zero range operations.

     - filemap_get_folios_tag_range() helper for dirty folio lookup
     - Optional zero range dirty folio processing
     - XFS fills dirty folios on zero range of unwritten mappings
     - Removed old partial EOF zeroing optimization

  DIO Write Completions from Interrupt Context:

    Restore pre-iomap behavior where pure overwrite completions run
    inline rather than being deferred to workqueue. Reduces context
    switches for high-performance workloads like ScyllaDB.

     - Removed unused IOCB_DIO_CALLER_COMP code
     - Error completions always run in user context (fixes zonefs)
     - Reworked REQ_FUA selection logic
     - Inverted IOMAP_DIO_INLINE_COMP to IOMAP_DIO_OFFLOAD_COMP

  Buffered IO Cleanups:

    Some performance and code clarity improvements:

     - Replace manual bitmap scanning with find_next_bit()
     - Simplify read skip logic for writes
     - Optimize pending async writeback accounting
     - Better variable naming
     - Documentation for iomap_finish_folio_write() requirements

  Misaligned Vectors for Zoned XFS:

    Enables sub-block aligned vectors in XFS always-COW mode for zoned
    devices via new IOMAP_DIO_FSBLOCK_ALIGNED flag.

  Bug Fixes:

     - Allocate s_dio_done_wq for async reads (fixes syzbot report after
       error completion changes)
     - Fix iomap_read_end() for already uptodate folios (regression fix)"

* tag 'vfs-6.19-rc1.iomap' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (40 commits)
  iomap: allocate s_dio_done_wq for async reads as well
  iomap: fix iomap_read_end() for already uptodate folios
  iomap: invert the polarity of IOMAP_DIO_INLINE_COMP
  iomap: support write completions from interrupt context
  iomap: rework REQ_FUA selection
  iomap: always run error completions in user context
  fs, iomap: remove IOCB_DIO_CALLER_COMP
  iomap: use find_next_bit() for uptodate bitmap scanning
  iomap: use find_next_bit() for dirty bitmap scanning
  iomap: simplify when reads can be skipped for writes
  iomap: simplify ->read_folio_range() error handling for reads
  iomap: optimize pending async writeback accounting
  docs: document iomap writeback's iomap_finish_folio_write() requirement
  iomap: account for unaligned end offsets when truncating read range
  iomap: rename bytes_pending/bytes_accounted to bytes_submitted/bytes_not_submitted
  xfs: support sub-block aligned vectors in always COW mode
  iomap: add IOMAP_DIO_FSBLOCK_ALIGNED flag
  xfs: error tag to force zeroing on debug kernels
  iomap: remove old partial eof zeroing optimization
  xfs: fill dirty folios on zero range of unwritten mappings
  ...
2025-12-01 08:14:00 -08:00
Borislav Petkov (AMD)
e2349c5811 Merge remote-tracking branches 'ras/edac-amd-atl', 'ras/edac-drivers' and 'ras/edac-misc' into edac-updates
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
2025-12-01 12:06:08 +01:00
Ingo Molnar
6ec33db1aa objtool: Fix segfault on unknown alternatives
So 'objtool --link -d vmlinux.o' gets surprised by this endbr64+endbr64 pattern
in ___bpf_prog_run():

	___bpf_prog_run:
	1e7680:  ___bpf_prog_run+0x0                                                     push   %r12
	1e7682:  ___bpf_prog_run+0x2                                                     mov    %rdi,%r12
	1e7685:  ___bpf_prog_run+0x5                                                     push   %rbp
	1e7686:  ___bpf_prog_run+0x6                                                     xor    %ebp,%ebp
	1e7688:  ___bpf_prog_run+0x8                                                     push   %rbx
	1e7689:  ___bpf_prog_run+0x9                                                     mov    %rsi,%rbx
	1e768c:  ___bpf_prog_run+0xc                                                     movzbl (%rbx),%esi
	1e768f:  ___bpf_prog_run+0xf                                                     movzbl %sil,%edx
	1e7693:  ___bpf_prog_run+0x13                                                    mov    %esi,%eax
	1e7695:  ___bpf_prog_run+0x15                                                    mov    0x0(,%rdx,8),%rdx
	1e769d:  ___bpf_prog_run+0x1d                                                    jmp    0x1e76a2 <__x86_indirect_thunk_rdx>
	1e76a2:  ___bpf_prog_run+0x22                                                    endbr64
	1e76a6:  ___bpf_prog_run+0x26                                                    endbr64
	1e76aa:  ___bpf_prog_run+0x2a                                                    mov    0x4(%rbx),%edx

And crashes due to blindly dereferencing alt->insn->alt_group.

Bail out on NULL ->alt_group, which produces this warning and continues
with the disassembly, instead of a segfault:

  .git/O/vmlinux.o: warning: objtool: <alternative.1e769d>: failed to disassemble alternative

Cc: Alexandre Chartre <alexandre.chartre@oracle.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-12-01 10:42:27 +01:00
Randy Dunlap
43decb6b62 locking/local_lock: Fix all kernel-doc warnings
Modify kernel-doc comments in local_lock.h to prevent warnings:

  Warning: include/linux/local_lock.h:9 function parameter 'lock' not described in 'local_lock_init'
  Warning: include/linux/local_lock.h:56 function parameter 'lock' not described in 'local_trylock_init'
  Warning: include/linux/local_lock.h:56 expecting prototype for local_lock_init(). Prototype was for local_trylock_init() instead

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://patch.msgid.link/20251128065925.917917-1-rdunlap@infradead.org
2025-12-01 06:56:16 +01:00
Vincent Mailhol
719e357fc0 locking/local_lock: s/l/__l/ and s/tl/__tl/ to reduce the risk of shadowing
The Linux kernel coding style advises to avoid common variable
names in function-like macros to reduce the risk of namespace
collisions.

Throughout local_lock_internal.h, several macros use the rather common
variable names 'l' and 'tl'. This already resulted in an actual
collision: the __local_lock_acquire() function like macro is currently
shadowing the parameter 'l' of the:

  class_##_name##_t class_##_name##_constructor(_type *l)

function factory from <linux/cleanup.h>.

Rename the variable 'l' to '__l' and the variable 'tl' to '__tl'
throughout the file to fix the current namespace collision and
to prevent future ones.

[ bigeasy: Rebase, update all l and tl instances in macros ]

Signed-off-by: Vincent Mailhol <mailhol@kernel.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Waiman Long <longman@redhat.com>
Link: https://patch.msgid.link/20251127144140.215722-3-bigeasy@linutronix.de
2025-12-01 06:56:16 +01:00