1
0
mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-01-11 17:10:13 +00:00

Compare commits

...

125 Commits

Author SHA1 Message Date
Linus Torvalds
5164715690 Crypto library fixes for v6.19-rc2
- Fix a performance issue with the scoped_ksimd() macro (new in 6.19)
    where it unnecessarily initialized the entire fpsimd state.
 
  - Add a missing gitignore entry for a generated file added in 6.18.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCaURPFxQcZWJpZ2dlcnNA
 a2VybmVsLm9yZwAKCRDzXCl4vpKOK5mJAQDGJf9ee26tDWeIEmd8YLv0tsptv8nB
 e0nV+HnkAujAkQD/XGQttmBq9vBElRd1xeiQEgzYS1i+VRZO4Wa9wPEQWwg=
 =KPuD
 -----END PGP SIGNATURE-----

Merge tag 'libcrypto-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux

Pull crypto library fixes from Eric Biggers:

 - Fix a performance issue with the scoped_ksimd() macro (new in 6.19)
   where it unnecessarily initialized the entire fpsimd state.

 - Add a missing gitignore entry for a generated file added in 6.18.

* tag 'libcrypto-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
  lib/crypto: riscv: Add poly1305-core.S to .gitignore
  arm64/simd: Avoid pointless clearing of FP/SIMD buffer
2025-12-19 08:39:48 +12:00
Linus Torvalds
5caa3808bc ACPI fixes for 6.19-rc2
Add a missing PCC check for guaranteed_perf in the ACPI CPPC
 library and fix a static local variable access race condition
 in acpi_pcc_address_space_setup() (Pengjie Zhang).
 -----BEGIN PGP SIGNATURE-----
 
 iQFGBAABCAAwFiEEcM8Aw/RY0dgsiRUR7l+9nS/U47UFAmlEXAMSHHJqd0Byand5
 c29ja2kubmV0AAoJEO5fvZ0v1OO1mIIIAKfHlyXaIwE1cG+9PJZ5OrJjWva2pfmP
 itbphuEtHbgnvJqTgrajc3MKK080YvdyM6iwS99JzhWARLPbPvp3foS0Zl3Lh1dh
 uDUF4u7qpi/8J2chxFJUsPopeLHGKBfj6vxw9HWQGVM7QiQMm4IdHMuOZ2EoFfqc
 KeNNg1Ni/33UoDM5xilCnVZe1LkTBFCUakdtaJ2lp/2I0o7hjjHnv3NmEndGCFTb
 yAGd8O9IY/xyZluph5RmjLsoXlepY0LEb+XSDc0oIvrEge9mhdd6uIreRTdphJz8
 IuaauOVqkrlEQ1AtkOL6h2JFg9wZgxuL3uX/1fUf3HWbafiCur5Zn8k=
 =l3oz
 -----END PGP SIGNATURE-----

Merge tag 'acpi-6.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI fixes from Rafael Wysocki:
 "These add a missing PCC check for guaranteed_perf in the ACPI CPPC
  library and fix a static local variable access race condition in
  acpi_pcc_address_space_setup() (Pengjie Zhang)"

* tag 'acpi-6.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: PCC: Fix race condition by removing static qualifier
  ACPI: CPPC: Fix missing PCC check for guaranteed_perf
2025-12-19 08:37:08 +12:00
Linus Torvalds
eb23a1198d Power management fixes for 6.19-rc2
- Fix CPU hotplug locking deadlock reported by lockdep after a recent
    update of the Intel RAPL power capping driver (Srinivas Pandruvada)
 
  - Fix sscanf() error return value handling in the power capping core
    and a race condition in register_control_type() (Sumeet Pawnikar)
 
  - Fix a concurrent bit field update issue in the runtime PM core code
    by only updating the bit field in question when runtime PM is
    disabled (Rafael Wysocki)
 -----BEGIN PGP SIGNATURE-----
 
 iQFGBAABCAAwFiEEcM8Aw/RY0dgsiRUR7l+9nS/U47UFAmlEWLsSHHJqd0Byand5
 c29ja2kubmV0AAoJEO5fvZ0v1OO1mloH/3ioUyzreSj90UTRA7B5HP/965pNWYOE
 io3nNcYvVt00cvIsCai2KfhGxC5rcan1jGAl0ZeGOFlKzK1u9DsVl8SBpOsRr39a
 QB7fuOPICwTJNuksPsHcvU7sBOMIPR7oNC9sWl41oa0IdKJDkJfHRSnNu9xO8B6L
 B8fBGPuHYMgKMOVuax1WgzMFMCMxbjARE1p+SU9b+/Ojjdp+lg6oRzGQsUoy0JEy
 9tnVLzMdroin+ae1UcXLLpJU6NsPBxeg1uV+tmMb9Hm8fZl8+WGDzF7/3mZV1a+k
 xDXrWpFtmtUYXYu6hK6rs4sY+49N4UI4sP2P0SKMedtQ1qmnMEil/wk=
 =AGwI
 -----END PGP SIGNATURE-----

Merge tag 'pm-6.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
 "These fix three issues in the power capping code including one recent
  regression and a runtime PM framework regression introduced during the
  6.17 development cycle:

   - Fix CPU hotplug locking deadlock reported by lockdep after a recent
     update of the Intel RAPL power capping driver (Srinivas Pandruvada)

   - Fix sscanf() error return value handling in the power capping core
     and a race condition in register_control_type() (Sumeet Pawnikar)

   - Fix a concurrent bit field update issue in the runtime PM core code
     by only updating the bit field in question when runtime PM is
     disabled (Rafael Wysocki)"

* tag 'pm-6.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  powercap: intel_rapl: Fix possible recursive lock warning
  PM: runtime: Do not clear needs_force_resume with enabled runtime PM
  powercap: fix sscanf() error return value handling
  powercap: fix race condition in register_control_type()
2025-12-19 08:28:02 +12:00
Linus Torvalds
14e0e8d0fc Thermal control updates for 6.19-rc2
- Set a feature flag in the int340x thermal driver to enable the
    power slider interface for Wildcat Lake processors (Srinivas
    Pandruvada)
 
  - Fix typo and indentation in comments in the thermal core (Thorsten
    Blum)
 -----BEGIN PGP SIGNATURE-----
 
 iQFGBAABCAAwFiEEcM8Aw/RY0dgsiRUR7l+9nS/U47UFAmlEW84SHHJqd0Byand5
 c29ja2kubmV0AAoJEO5fvZ0v1OO1dtoH/1RYu0QSZDMud/0UOX4Vipj8Gw9yIFaE
 W8yTaVSlnZ/S0T4Q7MyPvSaqPYs4l0sm3c796afTL12JiW9SqzOsGWPbJzZiDMQ6
 4zID4qZ/db4UN9WEmDAD6TGA9qZhtsXVNN01TQxYiJhK/KJ/62EYVEoI4o7rgto7
 jYsI/Ulfvdx7nwx2mmvdBARPstOvnxkj+2M7ezWYKjvx/oJpEdjZLeWe+zyCwkhj
 NAkOLKPWpBKLDH+wZPEeYkowOShVTSUFXPw/CEg+wcO5iUOwN+zb7gKNpyw1M1gc
 v/5EtTUdnHsuX7Q1O+Q8omTeGNCp14w12wc4hoEdsuGTpfQ0p9ofBo4=
 =8F9W
 -----END PGP SIGNATURE-----

Merge tag 'thermal-6.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull thermal control fixes from Rafael Wysocki:
 "These enable a new hardware feature in the int340x thermal driver and
  fix up comments in the thermal core code:

   - Set a feature flag in the int340x thermal driver to enable the
     power slider interface for Wildcat Lake processors (Srinivas
     Pandruvada)

   - Fix typo and indentation in comments in the thermal core (Thorsten
     Blum)"

* tag 'thermal-6.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  thermal: core: Fix typo and indentation in comments
  thermal: intel: int340x: Enable power slider interface for Wildcat Lake
2025-12-19 08:23:23 +12:00
Linus Torvalds
cf26839d7f iommufd 6.19 first rc pull request
Few minor fixes, other than the randconfig fix this is only relevant to
 test code, not releases.
 
 - Randconfig failure if CONFIG_DMA_SHARED_BUFFER is not set
 
 - Remove gcc warning in kselftest
 
 - Fix a refcount leak on an error path in the selftest support code
 
 - Fix missing overflow checks in the selftest support code
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRRRCHOFoQz/8F5bUaFwuHvBreFYQUCaURMcAAKCRCFwuHvBreF
 YUUIAP4u6RUth7SkqxnoM8KOqsV3YaT721AFc+lH/NA4D+XX3gD/VRHwORq8RiCQ
 APzvUXP/+j+FBy2b0NINfGOdr+cfLAI=
 =0fWH
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd

Pull iommufd fixes from Jason Gunthorpe:
 "A few minor fixes, other than the randconfig fix this is only relevant
  to test code, not releases:

   - Randconfig failure if CONFIG_DMA_SHARED_BUFFER is not set

   - Remove gcc warning in kselftest

   - Fix a refcount leak on an error path in the selftest support code

   - Fix missing overflow checks in the selftest support code"

* tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd:
  iommufd/selftest: Check for overflow in IOMMU_TEST_OP_ADD_RESERVED
  iommufd/selftest: Do not leak the hwpt if IOMMU_TEST_OP_MD_CHECK_MAP fails
  iommufd/selftest: Make it clearer to gcc that the access is not out of bounds
  iommufd: Fix building without dmabuf
2025-12-19 08:11:53 +12:00
Linus Torvalds
7b8e9264f5 Including fixes from netfilter and CAN.
Current release - regressions:
 
   - netfilter: nf_conncount: fix leaked ct in error paths
 
   - sched: act_mirred: fix loop detection
 
   - sctp: fix potential deadlock in sctp_clone_sock()
 
   - can: fix build dependency
 
   - eth: mlx5e: do not update BQL of old txqs during channel reconfiguration
 
 Previous releases - regressions:
 
   - sched: ets: always remove class from active list before deleting it
 
   - inet: frags: flush pending skbs in fqdir_pre_exit()
 
   - netfilter:  nf_nat: remove bogus direction check
 
   - mptcp:
     - schedule rtx timer only after pushing data
     - avoid deadlock on fallback while reinjecting
 
   - can: gs_usb: fix error handling
 
   - eth: mlx5e:
     - avoid unregistering PSP twice
     - fix double unregister of HCA_PORTS component
 
   - eth: bnxt_en: fix XDP_TX path
 
   - eth: mlxsw: fix use-after-free when updating multicast route stats
 
 Previous releases - always broken:
 
   - ethtool: avoid overflowing userspace buffer on stats query
 
   - openvswitch: fix middle attribute validation in push_nsh() action
 
   - eth: mlx5: fw_tracer, validate format string parameters
 
   - eth: mlxsw: spectrum_router: fix neighbour use-after-free
 
   - eth: ipvlan: ignore PACKET_LOOPBACK in handle_mode_l2()
 
 Misc:
 
   - Jozsef Kadlecsik retires from maintaining netfilter
 
   - tools: ynl: fix build on systems with old kernel headers
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCgAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmlEPS0SHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOkHLEP/3TnVI9qLivOGsZ48bn5UUcIaIlX0wiw
 i5gwpUhbtW3zcLSphO0Nh/CmProme6dFhaMOOkk48bKAdIxRpOMbCC20bfYDLyd0
 ZJTUQqheKpuI23vOnhfs2TTOkcqz6cM7txUq681taQ8Nfo8Dbf0fgOO2HT5ljRD+
 JXlxrdvicZDtve68sSdnsAj5M15EPQGKTPMyPkymBmCKnCMQK9SQKaeTEtaW6NIO
 yM0KqDSNoulDc/6LYMhx8DTtyE7yTiTxwe2NixjdyYljXsk0KiRDirCzxnZuSgh8
 oh+cFLdFDq4mwdYEKjgC5c3ifQyLEpZvwzlY5MKoobsVnT5SbigeSK53l5rEkO3V
 sM84lITHfqTJvlid0AF/ixEc6iWwV7nGRBHh2FXNbfoIKt45eF77jPi9YFsq6Z95
 vlCzYIbY0f2L1y3mPvZbzGQbh2Z12b5kyK8QA1j7SK+zNzxgXbf4+ZtYxHg7O3Ne
 gecmIpTKXMWodaZyfsRQPjR/F6UIlMqsgl9Ci9bfUw+XwL8x+7bJxQAQz+yVjLla
 ng14BItiYKBavcPZBjYlhGKqD1fzGhVZqQecrCkF0VTbMusRd9RcwytU1NG4QGDx
 V5aL28ht85KtMednEWOBkrg+PeXnNyZHzLAf2Xtx3UkaGgiDC8G4IxUv3orlOLD1
 sFPfZnSiGCof
 =6pla
 -----END PGP SIGNATURE-----

Merge tag 'net-6.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from netfilter and CAN.

  Current release - regressions:

   - netfilter: nf_conncount: fix leaked ct in error paths

   - sched: act_mirred: fix loop detection

   - sctp: fix potential deadlock in sctp_clone_sock()

   - can: fix build dependency

   - eth: mlx5e: do not update BQL of old txqs during channel
     reconfiguration

  Previous releases - regressions:

   - sched: ets: always remove class from active list before deleting it

   - inet: frags: flush pending skbs in fqdir_pre_exit()

   - netfilter: nf_nat: remove bogus direction check

   - mptcp:
      - schedule rtx timer only after pushing data
      - avoid deadlock on fallback while reinjecting

   - can: gs_usb: fix error handling

   - eth:
      - mlx5e:
         - avoid unregistering PSP twice
         - fix double unregister of HCA_PORTS component
      - bnxt_en: fix XDP_TX path
      - mlxsw: fix use-after-free when updating multicast route stats

  Previous releases - always broken:

   - ethtool: avoid overflowing userspace buffer on stats query

   - openvswitch: fix middle attribute validation in push_nsh() action

   - eth:
      - mlx5: fw_tracer, validate format string parameters
      - mlxsw: spectrum_router: fix neighbour use-after-free
      - ipvlan: ignore PACKET_LOOPBACK in handle_mode_l2()

  Misc:

   - Jozsef Kadlecsik retires from maintaining netfilter

   - tools: ynl: fix build on systems with old kernel headers"

* tag 'net-6.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (83 commits)
  net: hns3: add VLAN id validation before using
  net: hns3: using the num_tqps to check whether tqp_index is out of range when vf get ring info from mbx
  net: hns3: using the num_tqps in the vf driver to apply for resources
  net: enetc: do not transmit redirected XDP frames when the link is down
  selftests/tc-testing: Test case exercising potential mirred redirect deadlock
  net/sched: act_mirred: fix loop detection
  sctp: Clear inet_opt in sctp_v6_copy_ip_options().
  sctp: Fetch inet6_sk() after setting ->pinet6 in sctp_clone_sock().
  net/handshake: duplicate handshake cancellations leak socket
  net/mlx5e: Don't include PSP in the hard MTU calculations
  net/mlx5e: Do not update BQL of old txqs during channel reconfiguration
  net/mlx5e: Trigger neighbor resolution for unresolved destinations
  net/mlx5e: Use ip6_dst_lookup instead of ipv6_dst_lookup_flow for MAC init
  net/mlx5: Serialize firmware reset with devlink
  net/mlx5: fw_tracer, Handle escaped percent properly
  net/mlx5: fw_tracer, Validate format string parameters
  net/mlx5: Drain firmware reset in shutdown callback
  net/mlx5: fw reset, clear reset requested on drain_fw_reset
  net: dsa: mxl-gsw1xx: manually clear RANEG bit
  net: dsa: mxl-gsw1xx: fix .shutdown driver operation
  ...
2025-12-19 07:55:35 +12:00
Linus Torvalds
a91e1138b7 3 smb3 client fixes
-----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmlEL34ACgkQiiy9cAdy
 T1G7MAv/aPWShb8uwNhTy6OuZzR6sDxcx5eTS0L7zPY2YqWs9Pf7SRbYsW38uNhU
 v7biZqwHOjMBRlNPKCahU9gs+thliZMYrD+dkEL5FIMrD1O2kUq1Ulx3FxuzwFXh
 lCePdiRkJVNS2N/jsDjHapl+wdURV4SwFZ9k4McbUxwPJ9peifYdlQfTrKh4nz+v
 gF6S8B2ElSbFTQ6ejXpBd4aFPY/xZyvs0F9mW84Cty0I8cjoXMYaoStbS3llarPd
 FuSBB5za6Wx85O7iOPMt7qcRgZ8eSdtdWf4fVle2nbQZWW7XFptJ7V+En+or2g2e
 +d4qq5tyHgp3RCCEk1c7R/ndZp9xjbKuExPzoor3HQ/9qpnMem57HVeuiDPO0b+D
 pbH/mX4jcow+jU9+urs5BxzDoWOkZ3Hm5rzSGr407kX2YCw2KP6Oxy3sM00zh+kH
 aGG5oZzgXUBLWMEQkJi2b6DKQ6OF0zLA8rlow2dkaZHjmHZABf1eCqsgTZuui7Nq
 Lo8iUhHX
 =Oj0h
 -----END PGP SIGNATURE-----

Merge tag 'v6.19-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull smb client fixes from Steve French:

 - important fix for reconnect problem

 - minor cleanup

* tag 'v6.19-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: update internal module version number
  smb: move some SMB1 definitions into common/smb1pdu.h
  smb: align durable reconnect v2 context to 8 byte boundary
2025-12-19 07:50:20 +12:00
Linus Torvalds
9a903e6d96 \n
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAmlEKB4ACgkQnJ2qBz9k
 QNkY9gf6Av2Dz1zJiPdICxLBWxFYIWmw+tqzV9ZjpKkSV8K0jJ2wqfoqbh2LZ8AN
 Lh0uUMw8wvxQYtnEcvrKHVwd6zjng2GtzIi8nO6IxeBOQTwyuxxGvj6YfKxD9ffg
 AgpJ1oPmJz6/UiBeRGX/IobXkh3ZHHbP8M094RLjoHUekbzz0bIMTBpkXXZK04Bs
 iysFptvASPQ14D/bXou5HwP/egET+VprCgyGfQzsyQELK+Cijt9P07aVk7mdMyv2
 E45atP97TjtgJE018WMKL6LpO8j2mma7a2K/CosL9MslucuLfL8+QX+i2ZVhyuNo
 akchA3L1ugAfkxUDRVMrbim/rDBAGA==
 =tktL
 -----END PGP SIGNATURE-----

Merge tag 'fsnotify_for_v6.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

Pull fsnotify fixes from Jan Kara:
 "Two fsnotify fixes.

  The fix from Ahelenia makes sure we generate event when modifying
  inode flags, the fix from Amir disables sending of events from device
  inodes to their parent directory as it could concievably create a
  usable side channel attack in case of some devices and so far we
  aren't aware of anybody depending on the functionality"

* tag 'fsnotify_for_v6.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  fs: send fsnotify_xattr()/IN_ATTRIB from vfs_fileattr_set()/chattr(1)
  fsnotify: do not generate ACCESS/MODIFY events on child for special files
2025-12-19 07:41:17 +12:00
Rafael J. Wysocki
277141a897 Merge branch 'pm-powercap'
Merge power capping fixes for 6.19-rc2:

 - Fix CPU hotplug locking deadlock reported by lockdep after a recent
   update of the Intel RAPL power capping driver (Srinivas Pandruvada)

 - Fix sscanf() error return value handling in the power capping core
   and a race condition in register_control_type() (Sumeet Pawnikar)

* pm-powercap:
  powercap: intel_rapl: Fix possible recursive lock warning
  powercap: fix sscanf() error return value handling
  powercap: fix race condition in register_control_type()
2025-12-18 20:31:10 +01:00
Paolo Abeni
21a88f5d9c linux-can-fixes-for-6.19-20251218
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEEn/sM2K9nqF/8FWzzDHRl3/mQkZwFAmlDyJcTHG1rbEBwZW5n
 dXRyb25peC5kZQAKCRAMdGXf+ZCRnOZdB/9o/qoE3SiQkgNbVce4iDQqm/4jZFEf
 Wz4KjpucPxw00z25VbINeM6G4Tjfztmw6T6m1GarwEzW0HTNGUirjXJr60Bxz3OF
 DgjPfc/w5PU5f/I84RBdSnnFEGKt1mAd3WrJjBEGbLZYxx3EHh+PxJXDM2qAmxVT
 9Ud2U6X9oEdor78IFrO8Z/rt2YMPBZJ0JmldQYkvxmBmUiDMgcY/E9vNHejUwHTc
 ysE7c6MJ7tGalm6jXlV5kpWycsNbY52TbOD242jTJnWVHr2GKOu8k8GymkXSZnQP
 NG/h20elyIsD5N0OaxQODorM8UiEueovO24TqVqwl7hxTw8IFK+AUXC5
 =cXcU
 -----END PGP SIGNATURE-----

Merge tag 'linux-can-fixes-for-6.19-20251218' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2025-12-18

this is a pull request of 3 patches for net/main.

Tetsuo Handa contributes 2 patches to fix race windows in the j1939
protocol to properly handle disappearing network devices.

The last patch is by me, it fixes a build dependency with the CAN
drivers, that got introduced while fixing a dependency between the CAN
protocol and CAN device code.

linux-can-fixes-for-6.19-20251218

* tag 'linux-can-fixes-for-6.19-20251218' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
  can: fix build dependency
  can: j1939: make j1939_sk_bind() fail if device is no longer registered
  can: j1939: make j1939_session_activate() fail if device is no longer registered
====================

Link: https://patch.msgid.link/20251218123132.664533-1-mkl@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-18 17:23:07 +01:00
Paolo Abeni
373a34addc Merge branch 'there-are-some-bugfix-for-the-hns3-ethernet-driver'
Jijie Shao says:

====================
There are some bugfix for the HNS3 ethernet driver
====================

Link: https://patch.msgid.link/20251211023737.2327018-1-shaojijie@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-18 16:58:32 +01:00
Jian Shen
6ef935e659 net: hns3: add VLAN id validation before using
Currently, the VLAN id may be used without validation when
receive a VLAN configuration mailbox from VF. The length of
vlan_del_fail_bmap is BITS_TO_LONGS(VLAN_N_VID). It may cause
out-of-bounds memory access once the VLAN id is bigger than
or equal to VLAN_N_VID.

Therefore, VLAN id needs to be checked to ensure it is within
the range of VLAN_N_VID.

Fixes: fe4144d47eef ("net: hns3: sync VLAN filter entries when kill VLAN ID failed")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20251211023737.2327018-4-shaojijie@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-18 16:58:28 +01:00
Jian Shen
d180c11aa8 net: hns3: using the num_tqps to check whether tqp_index is out of range when vf get ring info from mbx
Currently, rss_size = num_tqps / tc_num. If tc_num is 1, then num_tqps
equals rss_size. However, if the tc_num is greater than 1, then rss_size
will be less than num_tqps, causing the tqp_index check for subsequent TCs
using rss_size to always fail.

This patch uses the num_tqps to check whether tqp_index is out of range,
instead of rss_size.

Fixes: 326334aad024 ("net: hns3: add a check for tqp_index in hclge_get_ring_chain_from_mbx()")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20251211023737.2327018-3-shaojijie@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-18 16:58:28 +01:00
Jian Shen
c2a1626974 net: hns3: using the num_tqps in the vf driver to apply for resources
Currently, hdev->htqp is allocated using hdev->num_tqps, and kinfo->tqp
is allocated using kinfo->num_tqps. However, kinfo->num_tqps is set to
min(new_tqps, hdev->num_tqps);  Therefore, kinfo->num_tqps may be smaller
than hdev->num_tqps, which causes some hdev->htqp[i] to remain
uninitialized in hclgevf_knic_setup().

Thus, this patch allocates hdev->htqp and kinfo->tqp using hdev->num_tqps,
ensuring that the lengths of hdev->htqp and kinfo->tqp are consistent
and that all elements are properly initialized.

Fixes: e2cb1dec9779 ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20251211023737.2327018-2-shaojijie@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-18 16:58:28 +01:00
Wei Fang
2939203ffe net: enetc: do not transmit redirected XDP frames when the link is down
In the current implementation, the enetc_xdp_xmit() always transmits
redirected XDP frames even if the link is down, but the frames cannot
be transmitted from TX BD rings when the link is down, so the frames
are still kept in the TX BD rings. If the XDP program is uninstalled,
users will see the following warning logs.

fsl_enetc 0000:00:00.0 eno0: timeout for tx ring #6 clear

More worse, the TX BD ring cannot work properly anymore, because the
HW PIR and CIR are not equal after the re-initialization of the TX
BD ring. At this point, the BDs between CIR and PIR are invalid,
which will cause a hardware malfunction.

Another reason is that there is internal context in the ring prefetch
logic that will retain the state from the first incarnation of the ring
and continue prefetching from the stale location when we re-initialize
the ring. The internal context is only reset by an FLR. That is to say,
for LS1028A ENETC, software cannot set the HW CIR and PIR when
initializing the TX BD ring.

It does not make sense to transmit redirected XDP frames when the link is
down. Add a link status check to prevent transmission in this condition.
This fixes part of the issue, but more complex cases remain. For example,
the TX BD ring may still contain unsent frames when the link goes down.
Those situations require additional patches, which will build on this
one.

Fixes: 9d2b68cc108d ("net: enetc: add support for XDP_REDIRECT")
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Hariprasad Kelam <hkelam@marvell.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20251211020919.121113-1-wei.fang@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-18 16:45:13 +01:00
Victor Nogueira
5cba412d6a selftests/tc-testing: Test case exercising potential mirred redirect deadlock
Add a test case that reproduces deadlock scenario where the user has
a drr qdisc attached to root and has a mirred action that redirects to
self on egress

Signed-off-by: Victor Nogueira <victor@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://patch.msgid.link/20251210162255.1057663-2-jhs@mojatatu.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-18 16:42:18 +01:00
Jamal Hadi Salim
1d856251a0 net/sched: act_mirred: fix loop detection
Fix a loop scenario of ethx:egress->ethx:egress

Example setup to reproduce:
tc qdisc add dev ethx root handle 1: drr
tc filter add dev ethx parent 1: protocol ip prio 1 matchall \
         action mirred egress redirect dev ethx

Now ping out of ethx and you get a deadlock:

[  116.892898][  T307] ============================================
[  116.893182][  T307] WARNING: possible recursive locking detected
[  116.893418][  T307] 6.18.0-rc6-01205-ge05021a829b8-dirty #204 Not tainted
[  116.893682][  T307] --------------------------------------------
[  116.893926][  T307] ping/307 is trying to acquire lock:
[  116.894133][  T307] ffff88800c122908 (&sch->root_lock_key){+...}-{3:3}, at: __dev_queue_xmit+0x2210/0x3b50
[  116.894517][  T307]
[  116.894517][  T307] but task is already holding lock:
[  116.894836][  T307] ffff88800c122908 (&sch->root_lock_key){+...}-{3:3}, at: __dev_queue_xmit+0x2210/0x3b50
[  116.895252][  T307]
[  116.895252][  T307] other info that might help us debug this:
[  116.895608][  T307]  Possible unsafe locking scenario:
[  116.895608][  T307]
[  116.895901][  T307]        CPU0
[  116.896057][  T307]        ----
[  116.896200][  T307]   lock(&sch->root_lock_key);
[  116.896392][  T307]   lock(&sch->root_lock_key);
[  116.896605][  T307]
[  116.896605][  T307]  *** DEADLOCK ***
[  116.896605][  T307]
[  116.896864][  T307]  May be due to missing lock nesting notation
[  116.896864][  T307]
[  116.897123][  T307] 6 locks held by ping/307:
[  116.897302][  T307]  #0: ffff88800b4b0250 (sk_lock-AF_INET){+.+.}-{0:0}, at: raw_sendmsg+0xb20/0x2cf0
[  116.897808][  T307]  #1: ffffffff88c839c0 (rcu_read_lock){....}-{1:3}, at: ip_output+0xa9/0x600
[  116.898138][  T307]  #2: ffffffff88c839c0 (rcu_read_lock){....}-{1:3}, at: ip_finish_output2+0x2c6/0x1ee0
[  116.898459][  T307]  #3: ffffffff88c83960 (rcu_read_lock_bh){....}-{1:3}, at: __dev_queue_xmit+0x200/0x3b50
[  116.898782][  T307]  #4: ffff88800c122908 (&sch->root_lock_key){+...}-{3:3}, at: __dev_queue_xmit+0x2210/0x3b50
[  116.899132][  T307]  #5: ffffffff88c83960 (rcu_read_lock_bh){....}-{1:3}, at: __dev_queue_xmit+0x200/0x3b50
[  116.899442][  T307]
[  116.899442][  T307] stack backtrace:
[  116.899667][  T307] CPU: 2 UID: 0 PID: 307 Comm: ping Not tainted 6.18.0-rc6-01205-ge05021a829b8-dirty #204 PREEMPT(voluntary)
[  116.899672][  T307] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[  116.899675][  T307] Call Trace:
[  116.899678][  T307]  <TASK>
[  116.899680][  T307]  dump_stack_lvl+0x6f/0xb0
[  116.899688][  T307]  print_deadlock_bug.cold+0xc0/0xdc
[  116.899695][  T307]  __lock_acquire+0x11f7/0x1be0
[  116.899704][  T307]  lock_acquire+0x162/0x300
[  116.899707][  T307]  ? __dev_queue_xmit+0x2210/0x3b50
[  116.899713][  T307]  ? srso_alias_return_thunk+0x5/0xfbef5
[  116.899717][  T307]  ? stack_trace_save+0x93/0xd0
[  116.899723][  T307]  _raw_spin_lock+0x30/0x40
[  116.899728][  T307]  ? __dev_queue_xmit+0x2210/0x3b50
[  116.899731][  T307]  __dev_queue_xmit+0x2210/0x3b50

Fixes: 178ca30889a1 ("Revert "net/sched: Fix mirred deadlock on device recursion"")
Tested-by: Victor Nogueira <victor@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://patch.msgid.link/20251210162255.1057663-1-jhs@mojatatu.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-18 16:42:18 +01:00
Paolo Abeni
cdc3074c00 Merge branch 'sctp-fix-two-issues-in-sctp_clone_sock'
Kuniyuki Iwashima says:

====================
sctp: Fix two issues in sctp_clone_sock().

syzbot reported two issues in sctp_clone_sock().

This series fixes the issues.

v1: https://lore.kernel.org/netdev/20251208133728.157648-1-kuniyu@google.com/
====================

Link: https://patch.msgid.link/20251210081206.1141086-1-kuniyu@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-18 16:18:02 +01:00
Kuniyuki Iwashima
d7ff61e6f3 sctp: Clear inet_opt in sctp_v6_copy_ip_options().
syzbot reported the splat below. [0]

Since the cited commit, the child socket inherits all fields
of its parent socket unless explicitly cleared.

syzbot set IP_OPTIONS to AF_INET6 socket and created a child
socket inheriting inet_sk(sk)->inet_opt.

sctp_v6_copy_ip_options() only clones np->opt, and leaving
inet_opt results in double-free.

Let's clear inet_opt in sctp_v6_copy_ip_options().

[0]:
BUG: KASAN: double-free in inet_sock_destruct+0x538/0x740 net/ipv4/af_inet.c:159
Free of addr ffff8880304b6d40 by task ksoftirqd/0/15

CPU: 0 UID: 0 PID: 15 Comm: ksoftirqd/0 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/02/2025
Call Trace:
 <TASK>
 dump_stack_lvl+0x189/0x250 lib/dump_stack.c:120
 print_address_description mm/kasan/report.c:378 [inline]
 print_report+0xca/0x240 mm/kasan/report.c:482
 kasan_report_invalid_free+0xea/0x110 mm/kasan/report.c:557
 check_slab_allocation+0xe1/0x130 include/linux/page-flags.h:-1
 kasan_slab_pre_free include/linux/kasan.h:198 [inline]
 slab_free_hook mm/slub.c:2484 [inline]
 slab_free mm/slub.c:6630 [inline]
 kfree+0x148/0x6d0 mm/slub.c:6837
 inet_sock_destruct+0x538/0x740 net/ipv4/af_inet.c:159
 __sk_destruct+0x89/0x660 net/core/sock.c:2350
 sock_put include/net/sock.h:1991 [inline]
 sctp_endpoint_destroy_rcu+0xa1/0xf0 net/sctp/endpointola.c:197
 rcu_do_batch kernel/rcu/tree.c:2605 [inline]
 rcu_core+0xcab/0x1770 kernel/rcu/tree.c:2861
 handle_softirqs+0x286/0x870 kernel/softirq.c:622
 run_ksoftirqd+0x9b/0x100 kernel/softirq.c:1063
 smpboot_thread_fn+0x542/0xa60 kernel/smpboot.c:160
 kthread+0x711/0x8a0 kernel/kthread.c:463
 ret_from_fork+0x4bc/0x870 arch/x86/kernel/process.c:158
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
 </TASK>

Allocated by task 6003:
 kasan_save_stack mm/kasan/common.c:56 [inline]
 kasan_save_track+0x3e/0x80 mm/kasan/common.c:77
 poison_kmalloc_redzone mm/kasan/common.c:400 [inline]
 __kasan_kmalloc+0x93/0xb0 mm/kasan/common.c:417
 kasan_kmalloc include/linux/kasan.h:262 [inline]
 __do_kmalloc_node mm/slub.c:5642 [inline]
 __kmalloc_noprof+0x411/0x7f0 mm/slub.c:5654
 kmalloc_noprof include/linux/slab.h:961 [inline]
 kzalloc_noprof include/linux/slab.h:1094 [inline]
 ip_options_get+0x51/0x4c0 net/ipv4/ip_options.c:517
 do_ip_setsockopt+0x1d9b/0x2d00 net/ipv4/ip_sockglue.c:1087
 ip_setsockopt+0x66/0x110 net/ipv4/ip_sockglue.c:1417
 do_sock_setsockopt+0x17c/0x1b0 net/socket.c:2360
 __sys_setsockopt net/socket.c:2385 [inline]
 __do_sys_setsockopt net/socket.c:2391 [inline]
 __se_sys_setsockopt net/socket.c:2388 [inline]
 __x64_sys_setsockopt+0x13f/0x1b0 net/socket.c:2388
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xfa/0xfa0 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

Freed by task 15:
 kasan_save_stack mm/kasan/common.c:56 [inline]
 kasan_save_track+0x3e/0x80 mm/kasan/common.c:77
 __kasan_save_free_info+0x46/0x50 mm/kasan/generic.c:587
 kasan_save_free_info mm/kasan/kasan.h:406 [inline]
 poison_slab_object mm/kasan/common.c:252 [inline]
 __kasan_slab_free+0x5c/0x80 mm/kasan/common.c:284
 kasan_slab_free include/linux/kasan.h:234 [inline]
 slab_free_hook mm/slub.c:2539 [inline]
 slab_free mm/slub.c:6630 [inline]
 kfree+0x19a/0x6d0 mm/slub.c:6837
 inet_sock_destruct+0x538/0x740 net/ipv4/af_inet.c:159
 __sk_destruct+0x89/0x660 net/core/sock.c:2350
 sock_put include/net/sock.h:1991 [inline]
 sctp_endpoint_destroy_rcu+0xa1/0xf0 net/sctp/endpointola.c:197
 rcu_do_batch kernel/rcu/tree.c:2605 [inline]
 rcu_core+0xcab/0x1770 kernel/rcu/tree.c:2861
 handle_softirqs+0x286/0x870 kernel/softirq.c:622
 run_ksoftirqd+0x9b/0x100 kernel/softirq.c:1063
 smpboot_thread_fn+0x542/0xa60 kernel/smpboot.c:160
 kthread+0x711/0x8a0 kernel/kthread.c:463
 ret_from_fork+0x4bc/0x870 arch/x86/kernel/process.c:158
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245

Fixes: 16942cf4d3e31 ("sctp: Use sk_clone() in sctp_accept().")
Reported-by: syzbot+ec33a1a006ed5abe7309@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/6936d112.a70a0220.38f243.00a8.GAE@google.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20251210081206.1141086-3-kuniyu@google.com
Acked-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-18 16:18:00 +01:00
Kuniyuki Iwashima
b98f06f9a5 sctp: Fetch inet6_sk() after setting ->pinet6 in sctp_clone_sock().
syzbot reported the lockdep splat below. [0]

sctp_clone_sock() sets the child socket's ipv6_mc_list to NULL,
but somehow sock_release() in an error path finally acquires
lock_sock() in ipv6_sock_mc_close().

The root cause is that sctp_clone_sock() fetches inet6_sk(newsk)
before setting newinet->pinet6, meaning that the parent's
ipv6_mc_list was actually cleared.

Also, sctp_v6_copy_ip_options() uses inet6_sk() but is called
before newinet->pinet6 is set.

Let's use inet6_sk() only after setting newinet->pinet6.

[0]:
WARNING: possible recursive locking detected
syzkaller #0 Not tainted

syz.0.17/5996 is trying to acquire lock:
ffff888031af4c60 (sk_lock-AF_INET6){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1700 [inline]
ffff888031af4c60 (sk_lock-AF_INET6){+.+.}-{0:0}, at: ipv6_sock_mc_close+0xd3/0x140 net/ipv6/mcast.c:348

but task is already holding lock:
ffff888031af4320 (sk_lock-AF_INET6){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1700 [inline]
ffff888031af4320 (sk_lock-AF_INET6){+.+.}-{0:0}, at: sctp_getsockopt+0x135/0xb60 net/sctp/socket.c:8131

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(sk_lock-AF_INET6);
  lock(sk_lock-AF_INET6);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

1 lock held by syz.0.17/5996:
 #0: ffff888031af4320 (sk_lock-AF_INET6){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1700 [inline]
 #0: ffff888031af4320 (sk_lock-AF_INET6){+.+.}-{0:0}, at: sctp_getsockopt+0x135/0xb60 net/sctp/socket.c:8131

stack backtrace:
CPU: 0 UID: 0 PID: 5996 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025
Call Trace:
 <TASK>
 dump_stack_lvl+0x189/0x250 lib/dump_stack.c:120
 print_deadlock_bug+0x279/0x290 kernel/locking/lockdep.c:3041
 check_deadlock kernel/locking/lockdep.c:3093 [inline]
 validate_chain kernel/locking/lockdep.c:3895 [inline]
 __lock_acquire+0x2540/0x2cf0 kernel/locking/lockdep.c:5237
 lock_acquire+0x117/0x340 kernel/locking/lockdep.c:5868
 lock_sock_nested+0x48/0x100 net/core/sock.c:3780
 lock_sock include/net/sock.h:1700 [inline]
 ipv6_sock_mc_close+0xd3/0x140 net/ipv6/mcast.c:348
 inet6_release+0x47/0x70 net/ipv6/af_inet6.c:482
 __sock_release net/socket.c:653 [inline]
 sock_release+0x85/0x150 net/socket.c:681
 sctp_getsockopt_peeloff_common+0x56b/0x770 net/sctp/socket.c:5732
 sctp_getsockopt_peeloff_flags+0x13b/0x230 net/sctp/socket.c:5801
 sctp_getsockopt+0x3ab/0xb60 net/sctp/socket.c:8151
 do_sock_getsockopt+0x2b4/0x3d0 net/socket.c:2399
 __sys_getsockopt net/socket.c:2428 [inline]
 __do_sys_getsockopt net/socket.c:2435 [inline]
 __se_sys_getsockopt net/socket.c:2432 [inline]
 __x64_sys_getsockopt+0x1a5/0x250 net/socket.c:2432
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xfa/0xf80 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f8f8c38f749
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffcfdade018 EFLAGS: 00000246 ORIG_RAX: 0000000000000037
RAX: ffffffffffffffda RBX: 00007f8f8c5e5fa0 RCX: 00007f8f8c38f749
RDX: 000000000000007a RSI: 0000000000000084 RDI: 0000000000000003
RBP: 00007f8f8c413f91 R08: 0000200000000040 R09: 0000000000000000
R10: 0000200000000340 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f8f8c5e5fa0 R14: 00007f8f8c5e5fa0 R15: 0000000000000005
 </TASK>

Fixes: 16942cf4d3e31 ("sctp: Use sk_clone() in sctp_accept().")
Reported-by: syzbot+c59e6bb54e7620495725@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/6936d112.a70a0220.38f243.00a7.GAE@google.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20251210081206.1141086-2-kuniyu@google.com
Acked-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-18 16:17:59 +01:00
Scott Mayhew
15564bd67e net/handshake: duplicate handshake cancellations leak socket
When a handshake request is cancelled it is removed from the
handshake_net->hn_requests list, but it is still present in the
handshake_rhashtbl until it is destroyed.

If a second cancellation request arrives for the same handshake request,
then remove_pending() will return false... and assuming
HANDSHAKE_F_REQ_COMPLETED isn't set in req->hr_flags, we'll continue
processing through the out_true label, where we put another reference on
the sock and a refcount underflow occurs.

This can happen for example if a handshake times out - particularly if
the SUNRPC client sends the AUTH_TLS probe to the server but doesn't
follow it up with the ClientHello due to a problem with tlshd.  When the
timeout is hit on the server, the server will send a FIN, which triggers
a cancellation request via xs_reset_transport().  When the timeout is
hit on the client, another cancellation request happens via
xs_tls_handshake_sync().

Add a test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED) in the pending cancel
path so duplicate cancels can be detected.

Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests")
Suggested-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://patch.msgid.link/20251209193015.3032058-1-smayhew@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-18 16:01:36 +01:00
Paolo Abeni
3e82accd3e netfilter pull request nf-25-12-16
-----BEGIN PGP SIGNATURE-----
 
 iQJBBAABCAArFiEEgKkgxbID4Gn1hq6fcJGo2a1f9gAFAmlBqhYNHGZ3QHN0cmxl
 bi5kZQAKCRBwkajZrV/2AAL3EACpOsJRrAyYtltuXzaPyIeXMSFXqyl8r1fl/t35
 BXPX9ZuZLInkmPvbSvBzZjoN7PXhByu31Y2f4/u4qoylwTFs+Cs0WLQF3QVRTV0q
 zo1hTOxm2Lt6pB3GdLLV0W2gz14E9xDAwW2BvjgDVg/RhqlOb+Hn5QTdyJVnFiov
 je1n5mWT2nGtL/p6KwtJdVja23Hx3egqSv5qLYE2sEwTEVBt6wupqCRX8mc8/Gpx
 rir7tmZGHtFbgwgJ+tShfqcBzSfc+uFHvq4tyzoWE+y0REUBApvwrxTFMjb6yA+q
 baGKDcJFugtfXBVMdg3HM1NDKO4n/Rl7qJzdz2OYyqHIh8jHDyYakXtDu6U2quoX
 W2hhrPMpYVYy6TCocFDRIq6xEiORqBMZecnL8dm+LkHVtxhB4vituweJd3hXCyUu
 Zj3dlW3idI3NGzpNh/wBMgUE7ghHfbgLJvPT+sCypYUH59UL5WgvP5uiZrR+rNRF
 gujC1jLIxwDR0Yz6X3/fO7xscNOnx8zu2llX+8y5PeQCntMpUybdeKcHbx3k206D
 fYQV9QSk2AN0XeID26afkzer0Ibk7vC0Q6AD2lRqIPeQ6yGGS5VfFEutCl64M4Mp
 fyBvxp7336W3YOKwF7vxH1BjHgzqfdVyw97vXA/2oPs85+4tWzqNex+X5HMzruM/
 hrI1iA==
 =8bX+
 -----END PGP SIGNATURE-----

Merge tag 'nf-25-12-16' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf

Florian Westphal says:

====================
netfilter: updates for net

The following patchset contains Netfilter fixes for *net*:

1)  Jozsef Kadlecsik is retiring.  Fortunately Jozsef will still keep an
    eye on ipset patches.

2)  remove a bogus direction check from nat core, this caused spurious
    flakes in the 'reverse clash' selftest, from myself.

3) nf_tables doesn't need to do chain validation on register store,
   from Pablo Neira Ayuso.

4) nf_tables shouldn't revisit chains during ruleset (graph) validation
   if possible.  Both 3 and 4 were slated for -next initially but there
   are now two independent reports of people hitting soft lockup errors
   during ruleset validation, so it makes no sense anymore to route
   this via -next given this is -stable material. From myself.

5) call cond_resched() in a more frequently visited place during nf_tables
   chain validation, this wasn't possible earlier due to rcu read lock,
   but nowadays its not held anymore during set walks.

6) Don't fail conntrack packetdrill test with HZ=100 kernels.

netfilter pull request nf-25-12-16

* tag 'nf-25-12-16' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  selftests: netfilter: packetdrill: avoid failure on HZ=100 kernel
  netfilter: nf_tables: avoid softlockup warnings in nft_chain_validate
  netfilter: nf_tables: avoid chain re-validation if possible
  netfilter: nf_tables: remove redundant chain validation on register store
  netfilter: nf_nat: remove bogus direction check
  MAINTAINERS: Remove Jozsef Kadlecsik from MAINTAINERS file
====================

Link: https://patch.msgid.link/20251216190904.14507-1-fw@strlen.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-18 13:55:01 +01:00
Paolo Abeni
78a47532ab Merge branch 'mlx5-misc-fixes-2025-12-09'
Tariq Toukan says:

====================
mlx5 misc fixes 2025-12-09

This patchset provides misc bug fixes from the team to the mlx5 core and
Eth drivers.
====================

Link: https://patch.msgid.link/1765284977-1363052-1-git-send-email-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-18 13:39:32 +01:00
Cosmin Ratiu
4198a14c8c net/mlx5e: Don't include PSP in the hard MTU calculations
Commit [1] added the 40 bytes required by the PSP header+trailer and the
UDP header to MLX5E_ETH_HARD_MTU, which limits the device-wide max
software MTU that could be set. This is not okay, because most packets
are not PSP packets and it doesn't make sense to always reserve space
for headers which won't get added in most cases.

As it turns out, for TCP connections, PSP overhead is already taken into
account in the TCP MSS calculations via inet_csk(sk)->icsk_ext_hdr_len.
This was added in commit [2]. This means that the extra space reserved
in the hard MTU for mlx5 ends up unused and wasted.

Remove the unnecessary 40 byte reservation from hard MTU.

[1] commit e5a1861a298e ("net/mlx5e: Implement PSP Tx data path")
[2] commit e97269257fe4 ("net: psp: update the TCP MSS to reflect PSP
packet overhead")

Fixes: e5a1861a298e ("net/mlx5e: Implement PSP Tx data path")
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1765284977-1363052-10-git-send-email-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-18 13:39:30 +01:00
Tariq Toukan
c8591decd9 net/mlx5e: Do not update BQL of old txqs during channel reconfiguration
During channel reconfiguration (e.g., ethtool private flags changes),
the driver can trigger a kernel BUG_ON in dql_completed() with the error
"kernel BUG at lib/dynamic_queue_limits.c:99".

The issue occurs in the following sequence:

During mlx5e_safe_switch_params(), old channels are deactivated via
mlx5e_deactivate_txqsq(). New channels are created and activated, taking
ownership of the netdev_queues and their BQL state.

When old channels are closed via mlx5e_close_txqsq(), there may be
pending TX descriptors (sq->cc != sq->pc) that were in-flight during the
deactivation.

mlx5e_free_txqsq_descs() frees these pending descriptors and attempts to
complete them via netdev_tx_completed_queue().

However, the BQL state (dql->num_queued and dql->num_completed) have
been reset in mlx5e_activate_txqsq and belong to the new queue owner,
leading to dql->num_queued - dql->num_completed < nbytes.

This triggers BUG_ON(count > num_queued - num_completed) in
dql_completed().

Fixes: 3b88a535a8e1 ("net/mlx5e: Defer channels closure to reduce interface down time")
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: William Tu <witu@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Link: https://patch.msgid.link/1765284977-1363052-9-git-send-email-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-18 13:39:29 +01:00
Jianbo Liu
9ab89bde13 net/mlx5e: Trigger neighbor resolution for unresolved destinations
When initializing the MAC addresses for an outbound IPsec packet offload
rule in mlx5e_ipsec_init_macs, the call to dst_neigh_lookup is used to
find the next-hop neighbor (typically the gateway in tunnel mode).
This call might create a new neighbor entry if one doesn't already
exist. This newly created entry starts in the INCOMPLETE state, as the
kernel hasn't yet sent an ARP or NDISC probe to resolve the MAC
address. In this case, neigh_ha_snapshot will correctly return an
all-zero MAC address.

IPsec packet offload requires the actual next-hop MAC address to
program the rule correctly. If the neighbor state is INCOMPLETE when
the rule is created, the hardware rule is programmed with an all-zero
destination MAC address. Packets sent using this rule will be
subsequently dropped by the receiving network infrastructure or host.

This patch adds a check specifically for the outbound offload path. If
neigh_ha_snapshot returns an all-zero MAC address, it proactively
calls neigh_event_send(n, NULL). This ensures the kernel immediately
sends the initial ARP or NDISC probe if one isn't already pending,
accelerating the resolution process. This helps prevent the hardware
rule from being programmed with an invalid MAC address and avoids
packet drops due to unresolved neighbors.

Fixes: 71670f766b8f ("net/mlx5e: Support routed networks during IPsec MACs initialization")
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1765284977-1363052-8-git-send-email-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-18 13:39:29 +01:00
Jianbo Liu
e35d7da8dd net/mlx5e: Use ip6_dst_lookup instead of ipv6_dst_lookup_flow for MAC init
Replace ipv6_stub->ipv6_dst_lookup_flow() with ip6_dst_lookup() in
mlx5e_ipsec_init_macs() since IPsec transformations are not needed
during Security Association setup - only basic routing information is
required for nexthop MAC address resolution.

This resolves an issue where XfrmOutNoStates error counter would be
incremented when xfrm policy is configured before xfrm state, as the
IPsec-aware routing function would attempt policy checks during SA
initialization.

Fixes: 71670f766b8f ("net/mlx5e: Support routed networks during IPsec MACs initialization")
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1765284977-1363052-7-git-send-email-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-18 13:39:29 +01:00
Shay Drory
367e501f8b net/mlx5: Serialize firmware reset with devlink
The firmware reset mechanism can be triggered by asynchronous events,
which may race with other devlink operations like devlink reload or
devlink dev eswitch set, potentially leading to inconsistent states.

This patch addresses the race by using the devl_lock to serialize the
firmware reset against other devlink operations. When a reset is
requested, the driver attempts to acquire the lock. If successful, it
sets a flag to block devlink reload or eswitch changes, ACKs the reset
to firmware and then releases the lock. If the lock is already held by
another operation, the driver NACKs the firmware reset request,
indicating that the reset cannot proceed.

Firmware reset does not keep the devl_lock and instead uses an internal
firmware reset bit. This is because firmware resets can be triggered by
asynchronous events, and processed in different threads. It is illegal
and unsafe to acquire a lock in one thread and attempt to release it in
another, as lock ownership is intrinsically thread-specific.

This change ensures that firmware resets and other devlink operations
are mutually exclusive during the critical reset request phase,
preventing race conditions.

Fixes: 38b9f903f22b ("net/mlx5: Handle sync reset request event")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mateusz Berezecki <mberezecki@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1765284977-1363052-6-git-send-email-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-18 13:39:29 +01:00
Shay Drory
c0289f67f7 net/mlx5: fw_tracer, Handle escaped percent properly
The firmware tracer's format string validation and parameter counting
did not properly handle escaped percent signs (%%). This caused
fw_tracer to count more parameters when trace format strings contained
literal percent characters.

To fix it, allow %% to pass string validation and skip %% sequences when
counting parameters since they represent literal percent signs rather
than format specifiers.

Fixes: 70dd6fdb8987 ("net/mlx5: FW tracer, parse traces and kernel tracing support")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reported-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Closes: https://lore.kernel.org/netdev/hanz6rzrb2bqbplryjrakvkbmv4y5jlmtthnvi3thg5slqvelp@t3s3erottr6s/
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1765284977-1363052-5-git-send-email-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-18 13:39:29 +01:00
Shay Drory
b35966042d net/mlx5: fw_tracer, Validate format string parameters
Add validation for format string parameters in the firmware tracer to
prevent potential security vulnerabilities and crashes from malformed
format strings received from firmware.

The firmware tracer receives format strings from the device firmware and
uses them to format trace messages. Without proper validation, bad
firmware could provide format strings with invalid format specifiers
(e.g., %s, %p, %n) that could lead to crashes, or other undefined
behavior.

Add mlx5_tracer_validate_params() to validate that all format specifiers
in trace strings are limited to safe integer/hex formats (%x, %d, %i,
%u, %llx, %lx, etc.). Reject strings containing other format types that
could be used to access arbitrary memory or cause crashes.
Invalid format strings are added to the trace output for visibility with
"BAD_FORMAT: " prefix.

Fixes: 70dd6fdb8987 ("net/mlx5: FW tracer, parse traces and kernel tracing support")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reported-by: Breno Leitao <leitao@debian.org>
Closes: https://lore.kernel.org/netdev/hanz6rzrb2bqbplryjrakvkbmv4y5jlmtthnvi3thg5slqvelp@t3s3erottr6s/
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1765284977-1363052-4-git-send-email-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-18 13:39:29 +01:00
Moshe Shemesh
5846a365fc net/mlx5: Drain firmware reset in shutdown callback
Invoke drain_fw_reset() in the shutdown callback to ensure all
firmware reset handling is completed before shutdown proceeds.

Fixes: 16d42d313350 ("net/mlx5: Drain fw_reset when removing device")
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Shay Drori <shayd@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1765284977-1363052-3-git-send-email-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-18 13:39:29 +01:00
Moshe Shemesh
89a898d63f net/mlx5: fw reset, clear reset requested on drain_fw_reset
drain_fw_reset() waits for ongoing firmware reset events and blocks new
event handling, but does not clear the reset requested flag, and may
keep sync reset polling.

To fix it, call mlx5_sync_reset_clear_reset_requested() to clear the
flag, stop sync reset polling, and resume health polling, ensuring
health issues are still detected after the firmware reset drain.

Fixes: 16d42d313350 ("net/mlx5: Drain fw_reset when removing device")
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Shay Drori <shayd@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1765284977-1363052-2-git-send-email-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-18 13:39:29 +01:00
Paolo Abeni
71e6b15d21 Merge branch 'net-dsa-lantiq-a-bunch-of-fixes'
Daniel Golle says:

====================
net: dsa: lantiq: a bunch of fixes

This series is the continuation and result of comments received for a fix
for the SGMII restart-an bit not actually being self-clearing, which was
reported by by Rasmus Villemoes.

A closer investigation and testing the .remove and the .shutdown paths
of the mxl-gsw1xx.c and lantiq_gswip.c drivers has revealed a couple of
existing problems, which are also addressed in this series.
====================

Link: https://patch.msgid.link/cover.1765241054.git.daniel@makrotopia.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-18 12:53:24 +01:00
Daniel Golle
7b103aaf0d net: dsa: mxl-gsw1xx: manually clear RANEG bit
Despite being documented as self-clearing, the RANEG bit sometimes
remains set, preventing auto-negotiation from happening.

Manually clear the RANEG bit after 10ms as advised by MaxLinear.
In order to not hold RTNL during the 10ms of waiting schedule
delayed work to take care of clearing the bit asynchronously, which
is similar to the self-clearing behavior.

Fixes: 22335939ec90 ("net: dsa: add driver for MaxLinear GSW1xx switch family")
Reported-by: Rasmus Villemoes <ravi@prevas.dk>
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Link: https://patch.msgid.link/76745fceb5a3f53088110fb7a96acf88434088ca.1765241054.git.daniel@makrotopia.org
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-18 12:53:21 +01:00
Daniel Golle
651b253b80 net: dsa: mxl-gsw1xx: fix .shutdown driver operation
The .shutdown operation should call dsa_switch_shutdown() just like
it is done also by the sibling lantiq_gswip driver. Not doing that
results in shutdown or reboot hanging and waiting for the CPU port
becoming free, which introduces a longer delay and a WARNING before
shutdown or reboot in case the driver is built-into the kernel.
Fix this by calling dsa_switch_shutdown() in the driver's shutdown
operation, harmonizing it with what is done in the lantiq_gswip
driver. As a side-effect this now allows to remove the previously
exported gswip_disable_switch() function which no longer got any
users.

Fixes: 22335939ec907 ("net: dsa: add driver for MaxLinear GSW1xx switch family")
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Link: https://patch.msgid.link/77ed91a5206e5dbf5d3e83d7e364ebfda90d31fd.1765241054.git.daniel@makrotopia.org
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-18 12:53:21 +01:00
Daniel Golle
8e4c0f08f6 net: dsa: mxl-gsw1xx: fix order in .remove operation
The driver's .remove operation was calling gswip_disable_switch() which
clears the GSWIP_MDIO_GLOB_ENABLE bit before calling
dsa_unregister_switch() and thereby violating a Golden Rule of driver
development to always unpublish userspace interfaces before disabling
hardware, as pointed out by Russell King.

Fix this by relying in GSWIP_MDIO_GLOB_ENABLE being cleared by the
.teardown operation introduced by the previous commit
("net: dsa: lantiq_gswip: fix teardown order").

Fixes: 22335939ec907 ("net: dsa: add driver for MaxLinear GSW1xx switch family")
Suggested-by: "Russell King (Oracle)" <linux@armlinux.org.uk>
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Link: https://patch.msgid.link/63f882eeb910cf24503c35a443b541cc54a930f2.1765241054.git.daniel@makrotopia.org
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-18 12:53:21 +01:00
Daniel Golle
377d66fa86 net: dsa: lantiq_gswip: fix order in .remove operation
Russell King pointed out that disabling the switch by clearing
GSWIP_MDIO_GLOB_ENABLE before calling dsa_unregister_switch() is
problematic, as it violates a Golden Rule of driver development to
always first unpublish userspace interfaces and then disable the
hardware.

Fix this, and also simplify the probe() function, by introducing a
dsa_switch_ops teardown() operation which takes care of clearing the
GSWIP_MDIO_GLOB_ENABLE bit.

Fixes: 14fceff4771e5 ("net: dsa: Add Lantiq / Intel DSA driver for vrx200")
Suggested-by: "Russell King (Oracle)" <linux@armlinux.org.uk>
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Link: https://patch.msgid.link/4ebd72a29edc1e4059b9666a26a0bb5d906a829a.1765241054.git.daniel@makrotopia.org
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-18 12:53:21 +01:00
Gal Pressman
7b07be1ff1 ethtool: Avoid overflowing userspace buffer on stats query
The ethtool -S command operates across three ioctl calls:
ETHTOOL_GSSET_INFO for the size, ETHTOOL_GSTRINGS for the names, and
ETHTOOL_GSTATS for the values.

If the number of stats changes between these calls (e.g., due to device
reconfiguration), userspace's buffer allocation will be incorrect,
potentially leading to buffer overflow.

Drivers are generally expected to maintain stable stat counts, but some
drivers (e.g., mlx5, bnx2x, bna, ksz884x) use dynamic counters, making
this scenario possible.

Some drivers try to handle this internally:
- bnad_get_ethtool_stats() returns early in case stats.n_stats is not
  equal to the driver's stats count.
- micrel/ksz884x also makes sure not to write anything beyond
  stats.n_stats and overflow the buffer.

However, both use stats.n_stats which is already assigned with the value
returned from get_sset_count(), hence won't solve the issue described
here.

Change ethtool_get_strings(), ethtool_get_stats(),
ethtool_get_phy_stats() to not return anything in case of a mismatch
between userspace's size and get_sset_size(), to prevent buffer
overflow.
The returned n_stats value will be equal to zero, to reflect that
nothing has been returned.

This could result in one of two cases when using upstream ethtool,
depending on when the size change is detected:
1. When detected in ethtool_get_strings():
    # ethtool -S eth2
    no stats available

2. When detected in get stats, all stats will be reported as zero.

Both cases are presumably transient, and a subsequent ethtool call
should succeed.

Other than the overflow avoidance, these two cases are very evident (no
output/cleared stats), which is arguably better than presenting
incorrect/shifted stats.
I also considered returning an error instead of a "silent" response, but
that seems more destructive towards userspace apps.

Notes:
- This patch does not claim to fix the inherent race, it only makes sure
  that we do not overflow the userspace buffer, and makes for a more
  predictable behavior.

- RTNL lock is held during each ioctl, the race window exists between
  the separate ioctl calls when the lock is released.

- Userspace ethtool always fills stats.n_stats, but it is likely that
  these stats ioctls are implemented in other userspace applications
  which might not fill it. The added code checks that it's not zero,
  to prevent any regressions.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Link: https://patch.msgid.link/20251208121901.3203692-1-gal@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-18 12:24:25 +01:00
Marc Kleine-Budde
5a5aff6338 can: fix build dependency
Arnd Bergmann's patch [1] fixed the build dependency problem introduced by
bugfix commit cb2dc6d2869a ("can: Kconfig: select CAN driver infrastructure
by default"). This ended up as commit 6abd4577bccc ("can: fix build
dependency"), but I broke Arnd's fix by removing a dependency that we
thought was superfluous.

[1] https://lore.kernel.org/all/20251204100015.1033688-1-arnd@kernel.org/

Meanwhile the problem was also found by intel's kernel test robot,
complaining about undefined symbols:

| ERROR: modpost: "m_can_class_unregister" [drivers/net/can/m_can/m_can_platform.ko] undefined!
| ERROR: modpost: "m_can_class_free_dev" [drivers/net/can/m_can/m_can_platform.ko] undefined!
| ERROR: modpost: "m_can_class_allocate_dev" [drivers/net/can/m_can/m_can_platform.ko] undefined!
| ERROR: modpost: "m_can_class_get_clocks" [drivers/net/can/m_can/m_can_platform.ko] undefined!
| ERROR: modpost: "m_can_class_register" [drivers/net/can/m_can/m_can_platform.ko] undefined!

To fix this problem, add the missing dependency again.

Cc: Vincent Mailhol <mailhol@kernel.org>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202512132253.vO9WFDJK-lkp@intel.com/
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202512180808.fTAUQ2XN-lkp@intel.com/
Reported-by: Arnd Bergmann <arnd@arndb.de>
Closes: https://lore.kernel.org/all/7427949a-ea7d-4854-9fe4-e01db7d878c7@app.fastmail.com/
Fixes: 6abd4577bccc ("can: fix build dependency")
Fixes: cb2dc6d2869a ("can: Kconfig: select CAN driver infrastructure by default")
Acked-by: Vincent Mailhol <mailhol@kernel.org>
Link: https://patch.msgid.link/20251217-can-fix-dependency-v1-1-fd2d4f2a2bf5@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-12-18 09:04:47 +01:00
Srinivas Pandruvada
dcd0b625fe powercap: intel_rapl: Fix possible recursive lock warning
With the RAPL PMU addition, there is a recursive locking when CPU online
callback function calls rapl_package_add_pmu(). Here cpu_hotplug_lock
is already acquired by cpuhp_thread_fun() and rapl_package_add_pmu()
tries to acquire again.

<4>[ 8.197433] ============================================
<4>[ 8.197437] WARNING: possible recursive locking detected
<4>[ 8.197440] 6.19.0-rc1-lgci-xe-xe-4242-05b7c58b3367dca84+ #1 Not tainted
<4>[ 8.197444] --------------------------------------------
<4>[ 8.197447] cpuhp/0/20 is trying to acquire lock:
<4>[ 8.197450] ffffffff83487870 (cpu_hotplug_lock){++++}-{0:0}, at:
rapl_package_add_pmu+0x37/0x370 [intel_rapl_common]
<4>[ 8.197463]
but task is already holding lock:
<4>[ 8.197466] ffffffff83487870 (cpu_hotplug_lock){++++}-{0:0}, at:
cpuhp_thread_fun+0x6d/0x290
<4>[ 8.197477]
other info that might help us debug this:
<4>[ 8.197480] Possible unsafe locking scenario:

<4>[ 8.197483] CPU0
<4>[ 8.197485] ----
<4>[ 8.197487] lock(cpu_hotplug_lock);
<4>[ 8.197490] lock(cpu_hotplug_lock);
<4>[ 8.197493]
*** DEADLOCK ***
..
..
<4>[ 8.197542] __lock_acquire+0x146e/0x2790
<4>[ 8.197548] lock_acquire+0xc4/0x2c0
<4>[ 8.197550] ? rapl_package_add_pmu+0x37/0x370 [intel_rapl_common]
<4>[ 8.197556] cpus_read_lock+0x41/0x110
<4>[ 8.197558] ? rapl_package_add_pmu+0x37/0x370 [intel_rapl_common]
<4>[ 8.197561] rapl_package_add_pmu+0x37/0x370 [intel_rapl_common]
<4>[ 8.197565] rapl_cpu_online+0x85/0x87 [intel_rapl_msr]
<4>[ 8.197568] ? __pfx_rapl_cpu_online+0x10/0x10 [intel_rapl_msr]
<4>[ 8.197570] cpuhp_invoke_callback+0x41f/0x6c0
<4>[ 8.197573] ? cpuhp_thread_fun+0x6d/0x290
<4>[ 8.197575] cpuhp_thread_fun+0x1e2/0x290
<4>[ 8.197578] ? smpboot_thread_fn+0x26/0x290
<4>[ 8.197581] smpboot_thread_fn+0x12f/0x290
<4>[ 8.197584] ? __pfx_smpboot_thread_fn+0x10/0x10
<4>[ 8.197586] kthread+0x11f/0x250
<4>[ 8.197589] ? __pfx_kthread+0x10/0x10
<4>[ 8.197592] ret_from_fork+0x344/0x3a0
<4>[ 8.197595] ? __pfx_kthread+0x10/0x10
<4>[ 8.197597] ret_from_fork_asm+0x1a/0x30
<4>[ 8.197604] </TASK>

Fix this issue in the same way as rapl powercap package domain is added
from the same CPU online callback by introducing another interface which
doesn't call cpus_read_lock(). Add rapl_package_add_pmu_locked() and
rapl_package_remove_pmu_locked() which don't call cpus_read_lock().

Fixes: 748d6ba43afd ("powercap: intel_rapl: Enable MSR-based RAPL PMU support")
Reported-by: Borah, Chaitanya Kumar <chaitanya.kumar.borah@intel.com>
Closes: https://lore.kernel.org/linux-pm/5427ede1-57a0-43d1-99f3-8ca4b0643e82@intel.com/T/#u
Tested-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Tested-by: RavitejaX Veesam <ravitejax.veesam@intel.com>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/20251217153455.3560176-1-srinivas.pandruvada@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-12-17 17:24:28 +01:00
Tetsuo Handa
46cea215dc can: j1939: make j1939_sk_bind() fail if device is no longer registered
There is a theoretical race window in j1939_sk_netdev_event_unregister()
where two j1939_sk_bind() calls jump in between read_unlock_bh() and
lock_sock().

The assumption jsk->priv == priv can fail if the first j1939_sk_bind()
call once made jsk->priv == NULL due to failed j1939_local_ecu_get() call
and the second j1939_sk_bind() call again made jsk->priv != NULL due to
successful j1939_local_ecu_get() call.

Since the socket lock is held by both j1939_sk_netdev_event_unregister()
and j1939_sk_bind(), checking ndev->reg_state with the socket lock held can
reliably make the second j1939_sk_bind() call fail (and close this race
window).

Fixes: 7fcbe5b2c6a4 ("can: j1939: implement NETDEV_UNREGISTER notification handler")
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://patch.msgid.link/5732921e-247e-4957-a364-da74bd7031d7@I-love.SAKURA.ne.jp
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-12-17 10:47:33 +01:00
Tetsuo Handa
5d5602236f can: j1939: make j1939_session_activate() fail if device is no longer registered
syzbot is still reporting

  unregister_netdevice: waiting for vcan0 to become free. Usage count = 2

even after commit 93a27b5891b8 ("can: j1939: add missing calls in
NETDEV_UNREGISTER notification handler") was added. A debug printk() patch
found that j1939_session_activate() can succeed even after
j1939_cancel_active_session() from j1939_netdev_notify(NETDEV_UNREGISTER)
has completed.

Since j1939_cancel_active_session() is processed with the session list lock
held, checking ndev->reg_state in j1939_session_activate() with the session
list lock held can reliably close the race window.

Reported-by: syzbot <syzbot+881d65229ca4f9ae8c84@syzkaller.appspotmail.com>
Closes: https://syzkaller.appspot.com/bug?extid=881d65229ca4f9ae8c84
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://patch.msgid.link/b9653191-d479-4c8b-8536-1326d028db5c@I-love.SAKURA.ne.jp
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-12-17 10:46:39 +01:00
Steve French
d8a4af8f3d cifs: update internal module version number
to 2.58

Signed-off-by: Steve French <stfrench@microsoft.com>
2025-12-16 17:43:02 -06:00
ZhangGuoDong
94d5b8dbc5 smb: move some SMB1 definitions into common/smb1pdu.h
These definitions are only used by SMB1, so move them into the new
common/smb1pdu.h.

KSMBD only implements SMB_COM_NEGOTIATE, see MS-SMB2 3.3.5.2.

Co-developed-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Signed-off-by: ZhangGuoDong <zhangguodong@kylinos.cn>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-12-16 17:43:01 -06:00
Bharath SM
05f5e355cf smb: align durable reconnect v2 context to 8 byte boundary
Add a 4-byte Pad to create_durable_handle_reconnect_v2 so the DH2C
create context is 8 byte aligned.
This avoids malformed CREATE contexts on reconnect.
Recent change removed this Padding, adding it back.

Fixes: 81a45de432c6 ("smb: move create_durable_handle_reconnect_v2 to common/smb2pdu.h")

Signed-off-by: Bharath SM <bharathsm@microsoft.com>
Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-12-16 17:42:49 -06:00
Jason Gunthorpe
e6a973af11 iommufd/selftest: Check for overflow in IOMMU_TEST_OP_ADD_RESERVED
syzkaller found it could overflow math in the test infrastructure and
cause a WARN_ON by corrupting the reserved interval tree. This only
effects test kernels with CONFIG_IOMMUFD_TEST.

Validate the user input length in the test ioctl.

Fixes: f4b20bb34c83 ("iommufd: Add kernel support for testing iommufd")
Link: https://patch.msgid.link/r/0-v1-cd99f6049ba5+51-iommufd_syz_add_resv_jgg@nvidia.com
Reviewed-by: Samiullah Khawaja <skhawaja@google.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Yi Liu <yi.l.liu@intel.com>
Reported-by: syzbot+57fdb0cf6a0c5d1f15a2@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/69368129.a70a0220.38f243.008f.GAE@google.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-12-16 11:53:40 -04:00
Rafael J. Wysocki
359afc8eb0 PM: runtime: Do not clear needs_force_resume with enabled runtime PM
Commit 89d9cec3b1e9 ("PM: runtime: Clear power.needs_force_resume in
pm_runtime_reinit()") added provisional clearing of power.needs_force_resume
to pm_runtime_reinit(), but it is done unconditionally which is a
mistake because pm_runtime_reinit() may race with driver probing
and removal [1].

To address this, notice that power.needs_force_resume should never
be set when runtime PM is enabled and so it only needs to be cleared
when runtime PM is disabled, and update pm_runtime_init() to only
clear that flag when runtime PM is disabled.

Fixes: 89d9cec3b1e9 ("PM: runtime: Clear power.needs_force_resume in pm_runtime_reinit()")
Reported-by: Ed Tsai <ed.tsai@mediatek.com>
Closes: https://lore.kernel.org/linux-pm/20251215122154.3180001-1-ed.tsai@mediatek.com/ [1]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: 6.17+ <stable@vger.kernel.org> # 6.17+
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Link: https://patch.msgid.link/12807571.O9o76ZdvQC@rafael.j.wysocki
2025-12-16 12:58:57 +01:00
Jason Gunthorpe
b80fab2813 iommufd/selftest: Do not leak the hwpt if IOMMU_TEST_OP_MD_CHECK_MAP fails
If the input validation fails it returned without freeing the hwpt
refcount causing a leak. This triggers a WARN_ON when closing the fd:

  WARNING: drivers/iommu/iommufd/main.c:369 at iommufd_fops_release+0x385/0x430, CPU#1: repro/724

Found by szykaller.

Fixes: e93d5945ed5b ("iommufd: Change the selftest to use iommupt instead of xarray")
Link: https://patch.msgid.link/r/0-v1-c8ed57e24380+44ae-iommufd_selftest_hwpt_leak_jgg@nvidia.com
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reported-by: "Lai, Yi" <yi1.lai@linux.intel.com>
Closes: https://lore.kernel.org/r/aTJGMaqwQK0ASj0G@ly-workstation
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-12-15 20:34:41 -04:00
Jason Gunthorpe
5b244b077c iommufd/selftest: Make it clearer to gcc that the access is not out of bounds
GCC gets a bit confused and reports:

   In function '_test_cmd_get_hw_info',
       inlined from 'iommufd_ioas_get_hw_info' at iommufd.c:779:3,
       inlined from 'wrapper_iommufd_ioas_get_hw_info' at iommufd.c:752:1:
>> iommufd_utils.h:804:37: warning: array subscript 'struct iommu_test_hw_info[0]' is partly outside array bounds of 'struct iommu_test_hw_info_buffer_smaller[1]' [-Warray-bounds=]
     804 |                         assert(!info->flags);
         |                                 ~~~~^~~~~~~
   iommufd.c: In function 'wrapper_iommufd_ioas_get_hw_info':
   iommufd.c:761:11: note: object 'buffer_smaller' of size 4
     761 |         } buffer_smaller;
         |           ^~~~~~~~~~~~~~

While it is true that "struct iommu_test_hw_info[0]" is partly out of
bounds of the input pointer, it is not true that info->flags is out of
bounds. Unclear why it warns on this.

Reuse an existing properly sized stack buffer and pass a truncated length
instead to test the same thing.

Fixes: af4fde93c319 ("iommufd/selftest: Add coverage for IOMMU_GET_HW_INFO ioctl")
Link: https://patch.msgid.link/r/0-v1-63a2cffb09da+4486-iommufd_gcc_bounds_jgg@nvidia.com
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202512032344.kaAcKFIM-lkp@intel.com/
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-12-15 20:34:41 -04:00
Arnd Bergmann
69dc538a4f iommufd: Fix building without dmabuf
When DMABUF is disabled, trying to use it causes a link failure:

x86_64-linux-ld: drivers/iommu/iommufd/io_pagetable.o: in function `iopt_map_file_pages':
io_pagetable.c:(.text+0x1735): undefined reference to `dma_buf_get'
x86_64-linux-ld: io_pagetable.c:(.text+0x1775): undefined reference to `dma_buf_put'

Fixes: 44ebaa1744fd ("iommufd: Accept a DMABUF through IOMMU_IOAS_MAP_FILE")
Link: https://patch.msgid.link/r/20251204100333.1034767-1-arnd@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-12-15 20:34:41 -04:00
Florian Westphal
fec7b07955 selftests: netfilter: packetdrill: avoid failure on HZ=100 kernel
packetdrill --ip_version=ipv4 --mtu=1500 --tolerance_usecs=1000000 --non_fatal packet conntrack_syn_challenge_ack.pkt
conntrack v1.4.8 (conntrack-tools): 1 flow entries have been shown.
conntrack_syn_challenge_ack.pkt:32: error executing `conntrack -f $NFCT_IP_VERSION \
-L -p tcp --dport 8080 | grep UNREPLIED | grep -q SYN_SENT` command: non-zero status 1

Affected kernel had CONFIG_HZ=100; reset packet was still sitting in
backlog.

Reported-by: Yi Chen <yiche@redhat.com>
Fixes: a8a388c2aae4 ("selftests: netfilter: add packetdrill based conntrack tests")
Signed-off-by: Florian Westphal <fw@strlen.de>
2025-12-15 15:04:04 +01:00
Florian Westphal
7e7a817f2d netfilter: nf_tables: avoid softlockup warnings in nft_chain_validate
This reverts commit
314c82841602 ("netfilter: nf_tables: can't schedule in nft_chain_validate"):
Since commit a60a5abe19d6 ("netfilter: nf_tables: allow iter callbacks to sleep")
the iterator callback is invoked without rcu read lock held, so this
cond_resched() is now valid.

Signed-off-by: Florian Westphal <fw@strlen.de>
2025-12-15 15:04:04 +01:00
Florian Westphal
8e1a1bc4f5 netfilter: nf_tables: avoid chain re-validation if possible
Hamza Mahfooz reports cpu soft lock-ups in
nft_chain_validate():

 watchdog: BUG: soft lockup - CPU#1 stuck for 27s! [iptables-nft-re:37547]
[..]
 RIP: 0010:nft_chain_validate+0xcb/0x110 [nf_tables]
[..]
  nft_immediate_validate+0x36/0x50 [nf_tables]
  nft_chain_validate+0xc9/0x110 [nf_tables]
  nft_immediate_validate+0x36/0x50 [nf_tables]
  nft_chain_validate+0xc9/0x110 [nf_tables]
  nft_immediate_validate+0x36/0x50 [nf_tables]
  nft_chain_validate+0xc9/0x110 [nf_tables]
  nft_immediate_validate+0x36/0x50 [nf_tables]
  nft_chain_validate+0xc9/0x110 [nf_tables]
  nft_immediate_validate+0x36/0x50 [nf_tables]
  nft_chain_validate+0xc9/0x110 [nf_tables]
  nft_immediate_validate+0x36/0x50 [nf_tables]
  nft_chain_validate+0xc9/0x110 [nf_tables]
  nft_table_validate+0x6b/0xb0 [nf_tables]
  nf_tables_validate+0x8b/0xa0 [nf_tables]
  nf_tables_commit+0x1df/0x1eb0 [nf_tables]
[..]

Currently nf_tables will traverse the entire table (chain graph), starting
from the entry points (base chains), exploring all possible paths
(chain jumps).  But there are cases where we could avoid revalidation.

Consider:
1  input -> j2 -> j3
2  input -> j2 -> j3
3  input -> j1 -> j2 -> j3

Then the second rule does not need to revalidate j2, and, by extension j3,
because this was already checked during validation of the first rule.
We need to validate it only for rule 3.

This is needed because chain loop detection also ensures we do not exceed
the jump stack: Just because we know that j2 is cycle free, its last jump
might now exceed the allowed stack size.  We also need to update all
reachable chains with the new largest observed call depth.

Care has to be taken to revalidate even if the chain depth won't be an
issue: chain validation also ensures that expressions are not called from
invalid base chains.  For example, the masquerade expression can only be
called from NAT postrouting base chains.

Therefore we also need to keep record of the base chain context (type,
hooknum) and revalidate if the chain becomes reachable from a different
hook location.

Reported-by: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com>
Closes: https://lore.kernel.org/netfilter-devel/20251118221735.GA5477@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net/
Tested-by: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
2025-12-15 15:02:44 +01:00
Pengjie Zhang
f103fa127c ACPI: PCC: Fix race condition by removing static qualifier
Local variable 'ret' in acpi_pcc_address_space_setup() is currently
declared as 'static'. This can lead to race conditions in a
multithreaded environment.

Remove the 'static' qualifier to ensure that 'ret' will be allocated
directly on the stack as a local variable.

Fixes: a10b1c99e2dc ("ACPI: PCC: Setup PCC Opregion handler only if platform interrupt is available")
Signed-off-by: Pengjie Zhang <zhangpengjie2@huawei.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: lihuisong@huawei.com
Cc: 6.2+ <stable@vger.kernel.org> # 6.2+
[ rjw: Changelog edits ]
Link: https://patch.msgid.link/20251210132634.2050033-1-zhangpengjie2@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-12-15 13:00:33 +01:00
Pengjie Zhang
6ea3a44cef ACPI: CPPC: Fix missing PCC check for guaranteed_perf
The current implementation overlooks the 'guaranteed_perf'
register in this check.

If the Guaranteed Performance register is located in the PCC
subspace, the function currently attempts to read it without
acquiring the lock and without sending the CMD_READ doorbell
to the firmware. This can result in reading stale data.

Fixes: 29523f095397 ("ACPI / CPPC: Add support for guaranteed performance")
Signed-off-by: Pengjie Zhang <zhangpengjie2@huawei.com>
Cc: 4.20+ <stable@vger.kernel.org> # 4.20+
[ rjw: Subject and changelog edits ]
Link: https://patch.msgid.link/20251210132227.1988380-1-zhangpengjie2@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-12-15 12:56:46 +01:00
Thorsten Blum
d113735421 thermal: core: Fix typo and indentation in comments
s/tmperature/temperature/ and adjust the indentation of the @ops
parameter description to improve readability.

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Link: https://patch.msgid.link/20251206174245.116391-2-thorsten.blum@linux.dev
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-12-15 12:47:39 +01:00
Srinivas Pandruvada
450f9cde66 thermal: intel: int340x: Enable power slider interface for Wildcat Lake
Set the PROC_THERMAL_FEATURE_SOC_POWER_SLIDER feature flag in
proc_thermal_pci_ids[] for Wildcat Lake to enable power slider interface.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/20251205230007.2218533-1-srinivas.pandruvada@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-12-15 12:44:59 +01:00
Sumeet Pawnikar
efc4c35b74 powercap: fix sscanf() error return value handling
Fix inconsistent error handling for sscanf() return value check.

Implicit boolean conversion is used instead of explicit return
value checks. The code checks if (!sscanf(...)) which is incorrect
because:
 1. sscanf returns the number of successfully parsed items
 2. On success, it returns 1 (one item passed)
 3. On failure, it returns 0 or EOF
 4. The check 'if (!sscanf(...))' is wrong because it treats
    success (1) as failure

All occurrences of sscanf() now uses explicit return value check.
With this behavior it returns '-EINVAL' when parsing fails (returns
0 or EOF), and continues when parsing succeeds (returns 1).

Signed-off-by: Sumeet Pawnikar <sumeet4linux@gmail.com>
[ rjw: Subject and changelog edits ]
Link: https://patch.msgid.link/20251207151549.202452-1-sumeet4linux@gmail.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-12-15 12:33:59 +01:00
Sumeet Pawnikar
7bda1910c4 powercap: fix race condition in register_control_type()
The device becomes visible to userspace via device_register()
even before it fully initialized by idr_init(). If userspace
or another thread tries to register a zone immediately after
device_register(), the control_type_valid() will fail because
the control_type is not yet in the list. The IDR is not yet
initialized, so this race condition causes zone registration
failure.

Move idr_init() and list addition before device_register()
fix the race condition.

Signed-off-by: Sumeet Pawnikar <sumeet4linux@gmail.com>
[ rjw: Subject adjustment, empty line added ]
Link: https://patch.msgid.link/20251205190216.5032-1-sumeet4linux@gmail.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-12-15 12:29:11 +01:00
Ahelenia Ziemiańska
6f7c877cc3 fs: send fsnotify_xattr()/IN_ATTRIB from vfs_fileattr_set()/chattr(1)
Currently it seems impossible to observe these changes to the file's
attributes. It's useful to be able to do this to see when the file
becomes immutable, for example, so emit IN_ATTRIB via fsnotify_xattr(),
like when changing other inode attributes.

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Link: https://patch.msgid.link/iyvn6qjotpu6cei5jdtsoibfcp6l6rgvn47cwgaucgtucpfy2s@tarta.nabijaczleweli.xyz
Signed-off-by: Jan Kara <jack@suse.cz>
2025-12-15 10:19:52 +01:00
Amir Goldstein
635bc4def0 fsnotify: do not generate ACCESS/MODIFY events on child for special files
inotify/fanotify do not allow users with no read access to a file to
subscribe to events (e.g. IN_ACCESS/IN_MODIFY), but they do allow the
same user to subscribe for watching events on children when the user
has access to the parent directory (e.g. /dev).

Users with no read access to a file but with read access to its parent
directory can still stat the file and see if it was accessed/modified
via atime/mtime change.

The same is not true for special files (e.g. /dev/null). Users will not
generally observe atime/mtime changes when other users read/write to
special files, only when someone sets atime/mtime via utimensat().

Align fsnotify events with this stat behavior and do not generate
ACCESS/MODIFY events to parent watchers on read/write of special files.
The events are still generated to parent watchers on utimensat(). This
closes some side-channels that could be possibly used for information
exfiltration [1].

[1] https://snee.la/pdf/pubs/file-notification-attacks.pdf

Reported-by: Sudheendra Raghav Neela <sneela@tugraz.at>
CC: stable@vger.kernel.org
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2025-12-15 10:18:46 +01:00
Charles Mirabile
5a0b188250 lib/crypto: riscv: Add poly1305-core.S to .gitignore
poly1305-core.S is an auto-generated file, so it should be ignored.

Fixes: bef9c7559869 ("lib/crypto: riscv/poly1305: Import OpenSSL/CRYPTOGAMS implementation")
Cc: stable@vger.kernel.org
Signed-off-by: Charles Mirabile <cmirabil@redhat.com>
Link: https://lore.kernel.org/r/20251212184717.133701-1-cmirabil@redhat.com
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-12-14 10:18:22 -08:00
Ard Biesheuvel
c4b502d60a arm64/simd: Avoid pointless clearing of FP/SIMD buffer
The buffer provided to kernel_neon_begin() is only used if the task is
scheduled out while the FP/SIMD is in use by the kernel, or when such a
section is interrupted by a softirq that also uses the FP/SIMD.

IOW, this happens rarely, and even if it happened often, there is still
no reason for this buffer to be cleared beforehand, which happens
unconditionally, due to the use of a compound literal expression.

So define that buffer variable explicitly, and mark it as
__uninitialized so that it will not get cleared, even when
-ftrivial-auto-var-init is in effect.

This requires some preprocessor gymnastics, due to the fact that the
variable must be defined throughout the entire guarded scope, and the
expression

  ({ struct user_fpsimd_state __uninitialized st; &st; })

is problematic in that regard, even though the compilers seem to
permit it. So instead, repeat the 'for ()' trick that is also used in
the implementation of the guarded scope helpers.

Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Fixes: 4fa617cc6851 ("arm64/fpsimd: Allocate kernel mode FP/SIMD buffers on the stack")
Link: https://lore.kernel.org/r/20251209054848.998878-2-ardb@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-12-14 10:18:22 -08:00
Pablo Neira Ayuso
a67fd55f6a netfilter: nf_tables: remove redundant chain validation on register store
This validation predates the introduction of the state machine that
determines when to enter slow path validation for error reporting.

Currently, table validation is perform when:

- new rule contains expressions that need validation.
- new set element with jump/goto verdict.

Validation on register store skips most checks with no basechains, still
this walks the graph searching for loops and ensuring expressions are
called from the right hook. Remove this.

Fixes: a654de8fdc18 ("netfilter: nf_tables: fix chain dependency validation")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
2025-12-11 13:08:43 +01:00
Florian Westphal
5ec8ca26fe netfilter: nf_nat: remove bogus direction check
Jakub reports spurious failures of the 'conntrack_reverse_clash.sh'
selftest.  A bogus test makes nat core resort to port rewrite even
though there is no need for this.

When the test is made, nf_nat_used_tuple() would already have caused us
to return if no other CPU had added a colliding entry.
Moreover, nf_nat_used_tuple() would have ignored the colliding entry if
their origin tuples had been the same.

All that is left to check is if the colliding entry in the hash table
is subject to NAT, and, if its not, if our entry matches in the reverse
direction, e.g. hash table has

addr1:1234 -> addr2:80, and we want to commit
addr2:80   -> addr1:1234.

Because we already checked that neither the new nor the committed entry is
subject to NAT we only have to check origin vs. reply tuple:
for non-nat entries, the reply tuple is always the inverted original.

Just in case there are more problems extend the error reporting
in the selftest while at it and dump conntrack table/stats on error.

Reported-by: Jakub Kicinski <kuba@kernel.org>
Closes: https://lore.kernel.org/netdev/20251206175135.4a56591b@kernel.org/
Fixes: d8f84a9bc7c4 ("netfilter: nf_nat: don't try nat source port reallocation for reverse dir clash")
Signed-off-by: Florian Westphal <fw@strlen.de>
2025-12-11 13:08:37 +01:00
Jozsef Kadlecsik
99c6931fe1 MAINTAINERS: Remove Jozsef Kadlecsik from MAINTAINERS file
I'm retiring from maintaining netfilter. I'll still keep an
eye on ipset and respond to anything related to it.

Thank you!

Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
2025-12-11 12:40:46 +01:00
Dan Carpenter
885bebac99 nfc: pn533: Fix error code in pn533_acr122_poweron_rdr()
Set the error code if "transferred != sizeof(cmd)" instead of
returning success.

Fixes: dbafc28955fa ("NFC: pn533: don't send USB data off of the stack")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://patch.msgid.link/aTfIJ9tZPmeUF4W1@stanley.mountain
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-11 01:40:00 -08:00
Victor Nogueira
5914428e0e selftests/tc-testing: Create tests to exercise ets classes active list misplacements
Add a test case for a bug fixed by Jamal [1] and for scenario where an
ets drr class is inserted into the active list twice.

- Try to delete ets drr class' qdisc while still keeping it in the
  active list
- Try to add ets class to the active list twice

[1] https://lore.kernel.org/netdev/20251128151919.576920-1-jhs@mojatatu.com/

Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Victor Nogueira <victor@mojatatu.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/20251208190125.1868423-2-victor@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-11 01:36:40 -08:00
Victor Nogueira
b1e125ae42 net/sched: ets: Remove drr class from the active list if it changes to strict
Whenever a user issues an ets qdisc change command, transforming a
drr class into a strict one, the ets code isn't checking whether that
class was in the active list and removing it. This means that, if a
user changes a strict class (which was in the active list) back to a drr
one, that class will be added twice to the active list [1].

Doing so with the following commands:

tc qdisc add dev lo root handle 1: ets bands 2 strict 1
tc qdisc add dev lo parent 1:2 handle 20: \
    tbf rate 8bit burst 100b latency 1s
tc filter add dev lo parent 1: basic classid 1:2
ping -c1 -W0.01 -s 56 127.0.0.1
tc qdisc change dev lo root handle 1: ets bands 2 strict 2
tc qdisc change dev lo root handle 1: ets bands 2 strict 1
ping -c1 -W0.01 -s 56 127.0.0.1

Will trigger the following splat with list debug turned on:

[   59.279014][  T365] ------------[ cut here ]------------
[   59.279452][  T365] list_add double add: new=ffff88801d60e350, prev=ffff88801d60e350, next=ffff88801d60e2c0.
[   59.280153][  T365] WARNING: CPU: 3 PID: 365 at lib/list_debug.c:35 __list_add_valid_or_report+0x17f/0x220
[   59.280860][  T365] Modules linked in:
[   59.281165][  T365] CPU: 3 UID: 0 PID: 365 Comm: tc Not tainted 6.18.0-rc7-00105-g7e9f13163c13-dirty #239 PREEMPT(voluntary)
[   59.281977][  T365] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[   59.282391][  T365] RIP: 0010:__list_add_valid_or_report+0x17f/0x220
[   59.282842][  T365] Code: 89 c6 e8 d4 b7 0d ff 90 0f 0b 90 90 31 c0 e9 31 ff ff ff 90 48 c7 c7 e0 a0 22 9f 48 89 f2 48 89 c1 4c 89 c6 e8 b2 b7 0d ff 90 <0f> 0b 90 90 31 c0 e9 0f ff ff ff 48 89 f7 48 89 44 24 10 4c 89 44
...
[   59.288812][  T365] Call Trace:
[   59.289056][  T365]  <TASK>
[   59.289224][  T365]  ? srso_alias_return_thunk+0x5/0xfbef5
[   59.289546][  T365]  ets_qdisc_change+0xd2b/0x1e80
[   59.289891][  T365]  ? __lock_acquire+0x7e7/0x1be0
[   59.290223][  T365]  ? __pfx_ets_qdisc_change+0x10/0x10
[   59.290546][  T365]  ? srso_alias_return_thunk+0x5/0xfbef5
[   59.290898][  T365]  ? __mutex_trylock_common+0xda/0x240
[   59.291228][  T365]  ? __pfx___mutex_trylock_common+0x10/0x10
[   59.291655][  T365]  ? srso_alias_return_thunk+0x5/0xfbef5
[   59.291993][  T365]  ? srso_alias_return_thunk+0x5/0xfbef5
[   59.292313][  T365]  ? trace_contention_end+0xc8/0x110
[   59.292656][  T365]  ? srso_alias_return_thunk+0x5/0xfbef5
[   59.293022][  T365]  ? srso_alias_return_thunk+0x5/0xfbef5
[   59.293351][  T365]  tc_modify_qdisc+0x63a/0x1cf0

Fix this by always checking and removing an ets class from the active list
when changing it to strict.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/tree/net/sched/sch_ets.c?id=ce052b9402e461a9aded599f5b47e76bc727f7de#n663

Fixes: cd9b50adc6bb9 ("net/sched: ets: fix crash when flipping from 'strict' to 'quantum'")
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Victor Nogueira <victor@mojatatu.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/20251208190125.1868423-1-victor@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-11 01:36:40 -08:00
Junrui Luo
8a11ff0948 caif: fix integer underflow in cffrml_receive()
The cffrml_receive() function extracts a length field from the packet
header and, when FCS is disabled, subtracts 2 from this length without
validating that len >= 2.

If an attacker sends a malicious packet with a length field of 0 or 1
to an interface with FCS disabled, the subtraction causes an integer
underflow.

This can lead to memory exhaustion and kernel instability, potential
information disclosure if padding contains uninitialized kernel memory.

Fix this by validating that len >= 2 before performing the subtraction.

Reported-by: Yuhao Jiang <danisjiang@gmail.com>
Reported-by: Junrui Luo <moonafterrain@outlook.com>
Fixes: b482cd2053e3 ("net-caif: add CAIF core protocol stack")
Signed-off-by: Junrui Luo <moonafterrain@outlook.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/SYBPR01MB7881511122BAFEA8212A1608AFA6A@SYBPR01MB7881.ausprd01.prod.outlook.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-11 01:35:41 -08:00
Marcus Hughes
71cfa7c893 net: sfp: extend Potron XGSPON quirk to cover additional EEPROM variant
Some Potron SFP+ XGSPON ONU sticks are shipped with different EEPROM
vendor ID and vendor name strings, but are otherwise functionally
identical to the existing "Potron SFP+ XGSPON ONU Stick" handled by
sfp_quirk_potron().

These modules, including units distributed under the "Better Internet"
branding, use the same UART pin assignment and require the same
TX_FAULT/LOS behaviour and boot delay. Re-use the existing Potron
quirk for this EEPROM variant.

Signed-off-by: Marcus Hughes <marcus.hughes@betterinternet.ltd>
Link: https://patch.msgid.link/20251207210355.333451-1-marcus.hughes@betterinternet.ltd
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-11 01:25:14 -08:00
Jakub Kicinski
95f3013e88 linux-can-fixes-for-6.19-20251210
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEEn/sM2K9nqF/8FWzzDHRl3/mQkZwFAmk5L74THG1rbEBwZW5n
 dXRyb25peC5kZQAKCRAMdGXf+ZCRnOSNCACjywpibubStsI/mEHXfsP8Iavu7U/w
 YZXnfsoTaR/TTYIKmuGo0KVEGrErzV4zBI/kaSWlXweZCepAASJ31AUykVWEoaYx
 x0hYagsrICSMEE+/amWMEnKNU8FGNHB6lxlkAdyKZWyhBgT55C35SAmavBEVjw/i
 stQyDAunBKfT2AEtL1CedxM7c1rTMxoK81t1igJCeB4i5khDjC4iQuOTLunbAVES
 XJcKa+1VOWt+/VLpm+6UhYHzb8nM9NuifbdMmb2d9rnDoTG40YSw0/5/X48TCty9
 hNwwkH3xDhP052Qq0aFlXde4U7rM1sftTvGrP6EeZRuu/creETOCn/Y7
 =69I5
 -----END PGP SIGNATURE-----

Merge tag 'linux-can-fixes-for-6.19-20251210' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2025-12-10

Arnd Bergmann's patch fixes a build dependency with the CAN protocols
and drivers introduced in the current development cycle.

The last patch is by me and fixes the error handling cleanup in the
gs_usb driver.

* tag 'linux-can-fixes-for-6.19-20251210' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
  can: gs_usb: gs_can_open(): fix error handling
  can: fix build dependency
====================

Link: https://patch.msgid.link/20251210083448.2116869-1-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-11 01:02:30 -08:00
Jakub Kicinski
898ae76a70 netfilter pull request nf-25-12-10
-----BEGIN PGP SIGNATURE-----
 
 iQJBBAABCAArFiEEgKkgxbID4Gn1hq6fcJGo2a1f9gAFAmk5UkYNHGZ3QHN0cmxl
 bi5kZQAKCRBwkajZrV/2ABGKEAChqKDT9NB2rbp1uxyj26TE8KtWB90FioA/yZcV
 6H9lmTp5CXYRsYvkmRDM+xihYHSJpwNptxMVQJ/QonL+LPe8TwiP+THBBwknqrmY
 JY5rgcEbhAbYBnmHJ9T1soK2FHsisUoyNjrA/6rvQQpkoi9+2j/WPrDT765RdXmM
 I1fiF3qvKP1rYwpibOqXRozrlPVmBDB/ffibM7rNGR+K2N8B3vZPRbhQrc6B60ao
 gj6LLdTWZJ4wduPOVo03CzDqzu8hpL1HGwGPYqP1ol3TFlSy8+ByEM+92Nj4BEfm
 qX+3WDqxYc0U01pjz+TiXfI8nC1o5hzZiDWZ39eBp0zvxBoAyNCjc5ZKLE++EjvI
 6hdDcJCcaOipFDv5GI/UmPadYjXf33YWsPK/9mXdChagXvaXtEB/b2KwbHUO4wMT
 eJEtA9NmuvRAk6tGC9L2T7hurre/Dt8Jx35uFz0nJowqNfLxvFmAKeEB6v3yQKGv
 Jb+LD3cEY3a4q7o+Vq3AVUZTqQtifW5JYwIailWHWXtxbnEtADQdWFa7NZD1Nm12
 U2HFPN0z697WaDByNuK33Svc4CetE5oMXl+0WwE6Qa9hNCSnAkpyZKL3bTwJqEtQ
 Ij+4IXoQndgTLbr84koJYcwlTGBDu9li7ii59xh1B1K+BLxJmu/RULxlIR4x010v
 ygvezw==
 =LRTM
 -----END PGP SIGNATURE-----

Merge tag 'nf-25-12-10' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf

Florian Westphal says:

====================
netfilter: updates for net

1) Fix refcount leaks in nf_conncount, from Fernando Fernandez Mancera.
   This addresses a recent regression that came in the last -next
   pull request.

2) Fix a null dereference in route error handling in IPVS, from Slavin
   Liu.  This is an ancient issue dating back to 5.1 days.

3) Always set ifindex in route tuple in the flowtable output path, from
   Lorenzo Bianconi.  This bug came in with the recent output path refactoring.

4) Prefer 'exit $ksft_xfail' over 'exit $ksft_skip' when we fail to
   trigger a nat race condition to exercise the clash resolution path in
   selftest infra, $ksft_skip should be reserved for missing tooling,
   From myself.

* tag 'nf-25-12-10' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  selftests: netfilter: prefer xfail in case race wasn't triggered
  netfilter: always set route tuple out ifindex
  ipvs: fix ipv4 null-ptr-deref in route error path
  netfilter: nf_conncount: fix leaked ct in error paths
====================

Link: https://patch.msgid.link/20251210110754.22620-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-11 00:58:38 -08:00
Jakub Kicinski
237c1e152b Merge branch 'selftests-forwarding-vxlan_bridge_1q_mc_ul-fix-flakiness'
Petr Machata says:

====================
selftests: forwarding: vxlan_bridge_1q_mc_ul: Fix flakiness

The net/forwarding/vxlan_bridge_1q_mc_ul selftest runs an overlay traffic,
forwarded over a multicast-routed VXLAN underlay. In order to determine
whether packets reach their intended destination, it uses a TC match. For
convenience, it uses a flower match, which however does not allow matching
on the encapsulated packet. So various service traffic ends up being
indistinguishable from the test packets, and ends up confusing the test. To
alleviate the problem, the test uses sleep to allow the necessary service
traffic to run and clear the channel, before running the test traffic. This
worked for a while, but lately we have nevertheless seen flakiness of the
test in the CI.

In this patchset, first generalize tc_rule_stats_get() to support u32 in
patch #1, then in patch #2 convert the test to use u32 to allow parsing
deeper into the packet, and in #3 drop the now-unnecessary sleep.
====================

Link: https://patch.msgid.link/cover.1765289566.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-11 00:53:19 -08:00
Petr Machata
514520b34b selftests: forwarding: vxlan_bridge_1q_mc_ul: Drop useless sleeping
After fixing traffic matching in the previous patch, the test does not need
to use the sleep anymore. So drop vx_wait() altogether, migrate all callers
of vx{10,20}_create_wait() to the corresponding _create(), and drop the now
unused _create_wait() helpers.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/eabfe4fa12ae788cf3b8c5c876a989de81dfc3d3.1765289566.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-11 00:53:16 -08:00
Petr Machata
0c8b9a68b3 selftests: forwarding: vxlan_bridge_1q_mc_ul: Fix flakiness
This test runs an overlay traffic, forwarded over a multicast-routed VXLAN
underlay. In order to determine whether packets reach their intended
destination, it uses a TC match. For convenience, it uses a flower match,
which however does not allow matching on the encapsulated packet. So
various service traffic ends up being indistinguishable from the test
packets, and ends up confusing the test. To alleviate the problem, the test
uses sleep to allow the necessary service traffic to run and clear the
channel, before running the test traffic. This worked for a while, but
lately we have nevertheless seen flakiness of the test in the CI.

Fix the issue by using u32 to match the encapsulated packet as well. The
confusing packets seem to always be IPv6 multicast listener reports.
Realistically they could be ARP or other ICMP6 traffic as well. Therefore
look for ethertype IPv4 in the IPv4 traffic test, and for IPv6 / UDP
combination in the IPv6 traffic test.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/6438cb1613a2a667d3ff64089eb5994778f247af.1765289566.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-11 00:53:15 -08:00
Petr Machata
0842e34849 selftests: net: lib: tc_rule_stats_get(): Don't hard-code array index
Flower is commonly used to match on packets in many bash-based selftests.
A dump of a flower filter including statistics looks something like this:

 [
   {
     "protocol": "all",
     "pref": 49152,
     "kind": "flower",
     "chain": 0
   },
   {
     ...
     "options": {
       ...
       "actions": [
         {
	   ...
           "stats": {
             "bytes": 0,
             "packets": 0,
             "drops": 0,
             "overlimits": 0,
             "requeues": 0,
             "backlog": 0,
             "qlen": 0
           }
         }
       ]
     }
   }
 ]

The JQ query in the helper function tc_rule_stats_get() assumes this form
and looks for the second element of the array.

However, a dump of a u32 filter looks like this:

 [
   {
     "protocol": "all",
     "pref": 49151,
     "kind": "u32",
     "chain": 0
   },
   {
     "protocol": "all",
     "pref": 49151,
     "kind": "u32",
     "chain": 0,
     "options": {
       "fh": "800:",
       "ht_divisor": 1
     }
   },
   {
     ...
     "options": {
       ...
       "actions": [
         {
	   ...
           "stats": {
             "bytes": 0,
             "packets": 0,
             "drops": 0,
             "overlimits": 0,
             "requeues": 0,
             "backlog": 0,
             "qlen": 0
           }
         }
       ]
     }
   },
 ]

There's an extra element which the JQ query ends up choosing.

Instead of hard-coding a particular index, look for the entry on which a
selector .options.actions yields anything.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/12982a44471c834511a0ee6c1e8f57e3a5307105.1765289566.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-11 00:53:15 -08:00
Florian Westphal
b8a81b0ce5 selftests: netfilter: prefer xfail in case race wasn't triggered
Jakub says: "We try to reserve SKIP for tests skipped because tool is
missing in env, something isn't built into the kernel etc."

use xfail, we can't force the race condition to appear at will
so its expected that the test 'fails' occasionally.

Fixes: 78a588363587 ("selftests: netfilter: add conntrack clash resolution test case")
Reported-by: Jakub Kicinski <kuba@kernel.org>
Closes: https://lore.kernel.org/netdev/20251206175647.5c32f419@kernel.org/
Signed-off-by: Florian Westphal <fw@strlen.de>
2025-12-10 11:55:59 +01:00
Lorenzo Bianconi
2bdc536c9d netfilter: always set route tuple out ifindex
Always set nf_flow_route tuple out ifindex even if the indev is not one
of the flowtable configured devices since otherwise the outdev lookup in
nf_flow_offload_ip_hook() or nf_flow_offload_ipv6_hook() for
FLOW_OFFLOAD_XMIT_NEIGH flowtable entries will fail.
The above issue occurs in the following configuration since IP6IP6
tunnel does not support flowtable acceleration yet:

$ip addr show
5: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 00:11:22:33:22:55 brd ff:ff:ff:ff:ff:ff link-netns ns1
    inet6 2001:db8:1::2/64 scope global nodad
       valid_lft forever preferred_lft forever
    inet6 fe80::211:22ff:fe33:2255/64 scope link tentative proto kernel_ll
       valid_lft forever preferred_lft forever
6: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 00:22:22:33:22:55 brd ff:ff:ff:ff:ff:ff link-netns ns3
    inet6 2001:db8:2::1/64 scope global nodad
       valid_lft forever preferred_lft forever
    inet6 fe80::222:22ff:fe33:2255/64 scope link tentative proto kernel_ll
       valid_lft forever preferred_lft forever
7: tun0@NONE: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1452 qdisc noqueue state UNKNOWN group default qlen 1000
    link/tunnel6 2001:db8:2::1 peer 2001:db8:2::2 permaddr a85:e732:2c37::
    inet6 2002:db8:1::1/64 scope global nodad
       valid_lft forever preferred_lft forever
    inet6 fe80::885:e7ff:fe32:2c37/64 scope link proto kernel_ll
       valid_lft forever preferred_lft forever

$ip -6 route show
2001:db8:1::/64 dev eth0 proto kernel metric 256 pref medium
2001:db8:2::/64 dev eth1 proto kernel metric 256 pref medium
2002:db8:1::/64 dev tun0 proto kernel metric 256 pref medium
default via 2002:db8:1::2 dev tun0 metric 1024 pref medium

$nft list ruleset
table inet filter {
        flowtable ft {
                hook ingress priority filter
                devices = { eth0, eth1 }
        }

        chain forward {
                type filter hook forward priority filter; policy accept;
                meta l4proto { tcp, udp } flow add @ft
        }
}

Fixes: b5964aac51e0 ("netfilter: flowtable: consolidate xmit path")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
2025-12-10 11:55:58 +01:00
Slavin Liu
ad891bb3d0 ipvs: fix ipv4 null-ptr-deref in route error path
The IPv4 code path in __ip_vs_get_out_rt() calls dst_link_failure()
without ensuring skb->dev is set, leading to a NULL pointer dereference
in fib_compute_spec_dst() when ipv4_link_failure() attempts to send
ICMP destination unreachable messages.

The issue emerged after commit ed0de45a1008 ("ipv4: recompile ip options
in ipv4_link_failure") started calling __ip_options_compile() from
ipv4_link_failure(). This code path eventually calls fib_compute_spec_dst()
which dereferences skb->dev. An attempt was made to fix the NULL skb->dev
dereference in commit 0113d9c9d1cc ("ipv4: fix null-deref in
ipv4_link_failure"), but it only addressed the immediate dev_net(skb->dev)
dereference by using a fallback device. The fix was incomplete because
fib_compute_spec_dst() later in the call chain still accesses skb->dev
directly, which remains NULL when IPVS calls dst_link_failure().

The crash occurs when:
1. IPVS processes a packet in NAT mode with a misconfigured destination
2. Route lookup fails in __ip_vs_get_out_rt() before establishing a route
3. The error path calls dst_link_failure(skb) with skb->dev == NULL
4. ipv4_link_failure() → ipv4_send_dest_unreach() →
   __ip_options_compile() → fib_compute_spec_dst()
5. fib_compute_spec_dst() dereferences NULL skb->dev

Apply the same fix used for IPv6 in commit 326bf17ea5d4 ("ipvs: fix
ipv6 route unreach panic"): set skb->dev from skb_dst(skb)->dev before
calling dst_link_failure().

KASAN: null-ptr-deref in range [0x0000000000000328-0x000000000000032f]
CPU: 1 PID: 12732 Comm: syz.1.3469 Not tainted 6.6.114 #2
RIP: 0010:__in_dev_get_rcu include/linux/inetdevice.h:233
RIP: 0010:fib_compute_spec_dst+0x17a/0x9f0 net/ipv4/fib_frontend.c:285
Call Trace:
  <TASK>
  spec_dst_fill net/ipv4/ip_options.c:232
  spec_dst_fill net/ipv4/ip_options.c:229
  __ip_options_compile+0x13a1/0x17d0 net/ipv4/ip_options.c:330
  ipv4_send_dest_unreach net/ipv4/route.c:1252
  ipv4_link_failure+0x702/0xb80 net/ipv4/route.c:1265
  dst_link_failure include/net/dst.h:437
  __ip_vs_get_out_rt+0x15fd/0x19e0 net/netfilter/ipvs/ip_vs_xmit.c:412
  ip_vs_nat_xmit+0x1d8/0xc80 net/netfilter/ipvs/ip_vs_xmit.c:764

Fixes: ed0de45a1008 ("ipv4: recompile ip options in ipv4_link_failure")
Signed-off-by: Slavin Liu <slavin452@gmail.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Florian Westphal <fw@strlen.de>
2025-12-10 11:55:58 +01:00
Fernando Fernandez Mancera
2e2a720766 netfilter: nf_conncount: fix leaked ct in error paths
There are some situations where ct might be leaked as error paths are
skipping the refcounted check and return immediately. In order to solve
it make sure that the check is always called.

Fixes: be102eb6a0e7 ("netfilter: nf_conncount: rework API to use sk_buff directly")
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Signed-off-by: Florian Westphal <fw@strlen.de>
2025-12-10 11:55:58 +01:00
Jakub Kicinski
6bcb7727d9 Merge branch 'inet-frags-flush-pending-skbs-in-fqdir_pre_exit'
Jakub Kicinski says:

====================
inet: frags: flush pending skbs in fqdir_pre_exit()

Fix the issue reported by NIPA starting on Sep 18th [1], where
pernet_ops_rwsem is constantly held by a reader, preventing writers
from grabbing it (specifically driver modules from loading).

The fact that reports started around that time seems coincidental.
The issue seems to be skbs queued for defrag preventing conntrack
from exiting.

First patch fixes another theoretical issue, it's mostly a leftover
from an attempt to get rid of the inet_frag_queue refcnt, which
I gave up on (still think it's doable but a bit of a time sink).
Second patch is a minor refactor.

The real fix is in the third patch. It's the simplest fix I can
think of which is to flush the frag queues. Perhaps someone has
a better suggestion?

Last patch adds an explicit warning for conntrack getting stuck,
as this seems like something that can easily happen if bugs sneak in.
The warning will hopefully save us the first 20% of the investigation
effort.

Link: https://lore.kernel.org/20251001082036.0fc51440@kernel.org # [1]
====================

Link: https://patch.msgid.link/20251207010942.1672972-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-10 01:15:33 -08:00
Jakub Kicinski
92df4c56cf netfilter: conntrack: warn when cleanup is stuck
nf_conntrack_cleanup_net_list() calls schedule() so it does not
show up as a hung task. Add an explicit check to make debugging
leaked skbs/conntack references more obvious.

Acked-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20251207010942.1672972-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-10 01:15:27 -08:00
Jakub Kicinski
006a5035b4 inet: frags: flush pending skbs in fqdir_pre_exit()
We have been seeing occasional deadlocks on pernet_ops_rwsem since
September in NIPA. The stuck task was usually modprobe (often loading
a driver like ipvlan), trying to take the lock as a Writer.
lockdep does not track readers for rwsems so the read wasn't obvious
from the reports.

On closer inspection the Reader holding the lock was conntrack looping
forever in nf_conntrack_cleanup_net_list(). Based on past experience
with occasional NIPA crashes I looked thru the tests which run before
the crash and noticed that the crash follows ip_defrag.sh. An immediate
red flag. Scouring thru (de)fragmentation queues reveals skbs sitting
around, holding conntrack references.

The problem is that since conntrack depends on nf_defrag_ipv6,
nf_defrag_ipv6 will load first. Since nf_defrag_ipv6 loads first its
netns exit hooks run _after_ conntrack's netns exit hook.

Flush all fragment queue SKBs during fqdir_pre_exit() to release
conntrack references before conntrack cleanup runs. Also flush
the queues in timer expiry handlers when they discover fqdir->dead
is set, in case packet sneaks in while we're running the pre_exit
flush.

The commit under Fixes is not exactly the culprit, but I think
previously the timer firing would eventually unblock the spinning
conntrack.

Fixes: d5dd88794a13 ("inet: fix various use-after-free in defrags units")
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20251207010942.1672972-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-10 01:15:27 -08:00
Jakub Kicinski
1231eec699 inet: frags: add inet_frag_queue_flush()
Instead of exporting inet_frag_rbtree_purge() which requires that
caller takes care of memory accounting, add a new helper. We will
need to call it from a few places in the next patch.

Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20251207010942.1672972-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-10 01:15:27 -08:00
Jakub Kicinski
8ef522c8a5 inet: frags: avoid theoretical race in ip_frag_reinit()
In ip_frag_reinit() we want to move the frag timeout timer into
the future. If the timer fires in the meantime we inadvertently
scheduled it again, and since the timer assumes a ref on frag_queue
we need to acquire one to balance things out.

This is technically racy, we should have acquired the reference
_before_ we touch the timer, it may fire again before we take the ref.
Avoid this entire dance by using mod_timer_pending() which only modifies
the timer if its pending (and which exists since Linux v2.6.30)

Note that this was the only place we ever took a ref on frag_queue
since Eric's conversion to RCU. So we could potentially replace
the whole refcnt field with an atomic flag and a bit more RCU.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20251207010942.1672972-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-10 01:15:27 -08:00
Jakub Kicinski
2f6e056e95 Merge branch 'selftests-fix-build-warnings-and-errors' (part)
Guenter Roeck says:

====================
selftests: Fix build warnings and errors

This series fixes build warnings and errors observed when trying to build
selftests.
====================

Link: https://patch.msgid.link/20251205171010.515236-1-linux@roeck-us.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-10 01:11:12 -08:00
Guenter Roeck
91dc09a609 selftests: net: tfo: Fix build warning
Fix

tfo.c: In function ‘run_server’:
tfo.c:84:9: warning: ignoring return value of ‘read’ declared with attribute ‘warn_unused_result’

by evaluating the return value from read() and displaying an error message
if it reports an error.

Fixes: c65b5bb2329e3 ("selftests: net: add passive TFO test binary")
Cc: David Wei <dw@davidwei.uk>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Link: https://patch.msgid.link/20251205171010.515236-14-linux@roeck-us.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-10 01:11:12 -08:00
Guenter Roeck
59546e8744 selftests: net: Fix build warnings
Fix

ksft.h: In function ‘ksft_ready’:
ksft.h:27:9: warning: ignoring return value of ‘write’ declared with attribute ‘warn_unused_result’

ksft.h: In function ‘ksft_wait’:
ksft.h:51:9: warning: ignoring return value of ‘read’ declared with attribute ‘warn_unused_result’

by checking the return value of the affected functions and displaying
an error message if an error is seen.

Fixes: 2b6d490b82668 ("selftests: drv-net: Factor out ksft C helpers")
Cc: Joe Damato <jdamato@fastly.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Link: https://patch.msgid.link/20251205171010.515236-11-linux@roeck-us.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-10 01:11:12 -08:00
Guenter Roeck
06f7cae92f selftest: af_unix: Support compilers without flex-array-member-not-at-end support
Fix:

gcc: error: unrecognized command-line option ‘-Wflex-array-member-not-at-end’

by making the compiler option dependent on its support.

Fixes: 1838731f1072c ("selftest: af_unix: Add -Wall and -Wflex-array-member-not-at-end to CFLAGS.")
Cc: Kuniyuki Iwashima <kuniyu@google.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Link: https://patch.msgid.link/20251205171010.515236-7-linux@roeck-us.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-10 01:08:51 -08:00
Ankit Khushwaha
9580f6d47d selftests: tls: fix warning of uninitialized variable
In 'poll_partial_rec_async' a uninitialized char variable 'token' with
is used for write/read instruction to synchronize between threads
via a pipe.

tls.c:2833:26: warning: variable 'token' is uninitialized
      		   when passed as a const pointer argument

Initialize 'token' to '\0' to silence compiler warning.

Signed-off-by: Ankit Khushwaha <ankitkhushwaha.linux@gmail.com>
Link: https://patch.msgid.link/20251205163242.14615-1-ankitkhushwaha.linux@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-10 01:03:02 -08:00
Alexey Simakov
50b3db3e11 broadcom: b44: prevent uninitialized value usage
On execution path with raised B44_FLAG_EXTERNAL_PHY, b44_readphy()
leaves bmcr value uninitialized and it is used later in the code.

Add check of this flag at the beginning of the b44_nway_reset() and
exit early of the function with restarting autonegotiation if an
external PHY is used.

Fixes: 753f492093da ("[B44]: port to native ssb support")
Reviewed-by: Jonas Gorski <jonas.gorski@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Alexey Simakov <bigalex934@gmail.com>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20251205155815.4348-1-bigalex934@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-10 01:02:20 -08:00
caoping
6af2a01d65 net/handshake: restore destructor on submit failure
handshake_req_submit() replaces sk->sk_destruct but never restores it when
submission fails before the request is hashed. handshake_sk_destruct() then
returns early and the original destructor never runs, leaking the socket.
Restore sk_destruct on the error path.

Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests")
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: caoping <caoping@cmss.chinamobile.com>
Link: https://patch.msgid.link/20251204091058.1545151-1-caoping@cmss.chinamobile.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-10 00:51:49 -08:00
Arnd Bergmann
9e7477a427 net: ti: icssg-prueth: add PTP_1588_CLOCK_OPTIONAL dependency
The new icssg-prueth driver needs the same dependency as the other parts
that use the ptp-1588:

WARNING: unmet direct dependencies detected for TI_ICSS_IEP
  Depends on [m]: NETDEVICES [=y] && ETHERNET [=y] && NET_VENDOR_TI [=y] && PTP_1588_CLOCK_OPTIONAL [=m] && TI_PRUSS [=y]
  Selected by [y]:
  - TI_PRUETH [=y] && NETDEVICES [=y] && ETHERNET [=y] && NET_VENDOR_TI [=y] && PRU_REMOTEPROC [=y] && NET_SWITCHDEV [=y]

Add the correct dependency on the two drivers missing it, and remove
the pointless 'imply' in the process.

Fixes: e654b85a693e ("net: ti: icssg-prueth: Add ICSSG Ethernet driver for AM65x SR1.0 platforms")
Fixes: 511f6c1ae093 ("net: ti: icssm-prueth: Adds ICSSM Ethernet driver")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20251204100138.1034175-1-arnd@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-10 00:49:56 -08:00
Ilya Maximets
5ace7ef87f net: openvswitch: fix middle attribute validation in push_nsh() action
The push_nsh() action structure looks like this:

 OVS_ACTION_ATTR_PUSH_NSH(OVS_KEY_ATTR_NSH(OVS_NSH_KEY_ATTR_BASE,...))

The outermost OVS_ACTION_ATTR_PUSH_NSH attribute is OK'ed by the
nla_for_each_nested() inside __ovs_nla_copy_actions().  The innermost
OVS_NSH_KEY_ATTR_BASE/MD1/MD2 are OK'ed by the nla_for_each_nested()
inside nsh_key_put_from_nlattr().  But nothing checks if the attribute
in the middle is OK.  We don't even check that this attribute is the
OVS_KEY_ATTR_NSH.  We just do a double unwrap with a pair of nla_data()
calls - first time directly while calling validate_push_nsh() and the
second time as part of the nla_for_each_nested() macro, which isn't
safe, potentially causing invalid memory access if the size of this
attribute is incorrect.  The failure may not be noticed during
validation due to larger netlink buffer, but cause trouble later during
action execution where the buffer is allocated exactly to the size:

 BUG: KASAN: slab-out-of-bounds in nsh_hdr_from_nlattr+0x1dd/0x6a0 [openvswitch]
 Read of size 184 at addr ffff88816459a634 by task a.out/22624

 CPU: 8 UID: 0 PID: 22624 6.18.0-rc7+ #115 PREEMPT(voluntary)
 Call Trace:
  <TASK>
  dump_stack_lvl+0x51/0x70
  print_address_description.constprop.0+0x2c/0x390
  kasan_report+0xdd/0x110
  kasan_check_range+0x35/0x1b0
  __asan_memcpy+0x20/0x60
  nsh_hdr_from_nlattr+0x1dd/0x6a0 [openvswitch]
  push_nsh+0x82/0x120 [openvswitch]
  do_execute_actions+0x1405/0x2840 [openvswitch]
  ovs_execute_actions+0xd5/0x3b0 [openvswitch]
  ovs_packet_cmd_execute+0x949/0xdb0 [openvswitch]
  genl_family_rcv_msg_doit+0x1d6/0x2b0
  genl_family_rcv_msg+0x336/0x580
  genl_rcv_msg+0x9f/0x130
  netlink_rcv_skb+0x11f/0x370
  genl_rcv+0x24/0x40
  netlink_unicast+0x73e/0xaa0
  netlink_sendmsg+0x744/0xbf0
  __sys_sendto+0x3d6/0x450
  do_syscall_64+0x79/0x2c0
  entry_SYSCALL_64_after_hwframe+0x76/0x7e
  </TASK>

Let's add some checks that the attribute is properly sized and it's
the only one attribute inside the action.  Technically, there is no
real reason for OVS_KEY_ATTR_NSH to be there, as we know that we're
pushing an NSH header already, it just creates extra nesting, but
that's how uAPI works today.  So, keeping as it is.

Fixes: b2d0f5d5dc53 ("openvswitch: enable NSH support")
Reported-by: Junvy Yang <zhuque@tencent.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Eelco Chaudron echaudro@redhat.com
Reviewed-by: Aaron Conole <aconole@redhat.com>
Link: https://patch.msgid.link/20251204105334.900379-1-i.maximets@ovn.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-10 00:46:55 -08:00
Marc Kleine-Budde
3e54d3b4a8 can: gs_usb: gs_can_open(): fix error handling
Commit 2603be9e8167 ("can: gs_usb: gs_can_open(): improve error handling")
added missing error handling to the gs_can_open() function.

The driver uses 2 USB anchors to track the allocated URBs: the TX URBs in
struct gs_can::tx_submitted for each netdev and the RX URBs in struct
gs_usb::rx_submitted for the USB device. gs_can_open() allocates the RX
URBs, while TX URBs are allocated during gs_can_start_xmit().

The cleanup in gs_can_open() kills all anchored dev->tx_submitted
URBs (which is not necessary since the netdev is not yet registered), but
misses the parent->rx_submitted URBs.

Fix the problem by killing the rx_submitted instead of the tx_submitted.

Fixes: 2603be9e8167 ("can: gs_usb: gs_can_open(): improve error handling")
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20251210-gs_usb-fix-error-handling-v1-1-d6a5a03f10bb@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-12-10 09:30:31 +01:00
Arnd Bergmann
6abd4577bc can: fix build dependency
A recent bugfix introduced a new problem with Kconfig dependencies:

WARNING: unmet direct dependencies detected for CAN_DEV
  Depends on [n]: NETDEVICES [=n] && CAN [=m]
  Selected by [m]:
  - CAN [=m] && NET [=y]

Since the CAN core code now links into the CAN device code, that
particular function needs to be available, though the rest of it
does not.

Revert the incomplete fix and instead use Makefile logic to avoid
the link failure.

Fixes: cb2dc6d2869a ("can: Kconfig: select CAN driver infrastructure by default")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202512091523.zty3CLmc-lkp@intel.com/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Oliver Hartkopp <socketcan@hartkopp.net>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Link: https://patch.msgid.link/20251204100015.1033688-1-arnd@kernel.org
[mkl: removed module option from CAN_DEV help text (thanks Vincent)]
[mkl: removed '&& CAN' from Kconfig dependency (thanks Vincent)]
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-12-10 09:19:34 +01:00
Jakub Kicinski
186468c67f Merge branch 'mptcp-misc-fixes-for-v6-19-rc1'
Matthieu Baerts says:

====================
mptcp: misc fixes for v6.19-rc1

Here are various unrelated fixes:

- Patches 1-2: ignore unknown in-kernel PM endpoint flags instead of
  pretending they are supported. A fix for v5.7.

- Patch 3: avoid potential unnecessary rtx timer expiration. A fix for
  v5.15.

- Patch 4: avoid a deadlock on fallback in case of SKB creation failure
  while re-injecting.
====================

Link: https://patch.msgid.link/20251205-net-mptcp-misc-fixes-6-19-rc1-v1-0-9e4781a6c1b8@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-08 23:54:06 -08:00
Paolo Abeni
ffb8c27b05 mptcp: avoid deadlock on fallback while reinjecting
Jakub reported an MPTCP deadlock at fallback time:

 WARNING: possible recursive locking detected
 6.18.0-rc7-virtme #1 Not tainted
 --------------------------------------------
 mptcp_connect/20858 is trying to acquire lock:
 ff1100001da18b60 (&msk->fallback_lock){+.-.}-{3:3}, at: __mptcp_try_fallback+0xd8/0x280

 but task is already holding lock:
 ff1100001da18b60 (&msk->fallback_lock){+.-.}-{3:3}, at: __mptcp_retrans+0x352/0xaa0

 other info that might help us debug this:
  Possible unsafe locking scenario:

        CPU0
        ----
   lock(&msk->fallback_lock);
   lock(&msk->fallback_lock);

  *** DEADLOCK ***

  May be due to missing lock nesting notation

 3 locks held by mptcp_connect/20858:
  #0: ff1100001da18290 (sk_lock-AF_INET){+.+.}-{0:0}, at: mptcp_sendmsg+0x114/0x1bc0
  #1: ff1100001db40fd0 (k-sk_lock-AF_INET#2){+.+.}-{0:0}, at: __mptcp_retrans+0x2cb/0xaa0
  #2: ff1100001da18b60 (&msk->fallback_lock){+.-.}-{3:3}, at: __mptcp_retrans+0x352/0xaa0

 stack backtrace:
 CPU: 0 UID: 0 PID: 20858 Comm: mptcp_connect Not tainted 6.18.0-rc7-virtme #1 PREEMPT(full)
 Hardware name: Bochs, BIOS Bochs 01/01/2011
 Call Trace:
  <TASK>
  dump_stack_lvl+0x6f/0xa0
  print_deadlock_bug.cold+0xc0/0xcd
  validate_chain+0x2ff/0x5f0
  __lock_acquire+0x34c/0x740
  lock_acquire.part.0+0xbc/0x260
  _raw_spin_lock_bh+0x38/0x50
  __mptcp_try_fallback+0xd8/0x280
  mptcp_sendmsg_frag+0x16c2/0x3050
  __mptcp_retrans+0x421/0xaa0
  mptcp_release_cb+0x5aa/0xa70
  release_sock+0xab/0x1d0
  mptcp_sendmsg+0xd5b/0x1bc0
  sock_write_iter+0x281/0x4d0
  new_sync_write+0x3c5/0x6f0
  vfs_write+0x65e/0xbb0
  ksys_write+0x17e/0x200
  do_syscall_64+0xbb/0xfd0
  entry_SYSCALL_64_after_hwframe+0x4b/0x53
 RIP: 0033:0x7fa5627cbc5e
 Code: 4d 89 d8 e8 14 bd 00 00 4c 8b 5d f8 41 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 11 c9 c3 0f 1f 80 00 00 00 00 48 8b 45 10 0f 05 <c9> c3 83 e2 39 83 fa 08 75 e7 e8 13 ff ff ff 0f 1f 00 f3 0f 1e fa
 RSP: 002b:00007fff1fe14700 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
 RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007fa5627cbc5e
 RDX: 0000000000001f9c RSI: 00007fff1fe16984 RDI: 0000000000000005
 RBP: 00007fff1fe14710 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000000 R11: 0000000000000202 R12: 00007fff1fe16920
 R13: 0000000000002000 R14: 0000000000001f9c R15: 0000000000001f9c

The packet scheduler could attempt a reinjection after receiving an
MP_FAIL and before the infinite map has been transmitted, causing a
deadlock since MPTCP needs to do the reinjection atomically from WRT
fallback.

Address the issue explicitly avoiding the reinjection in the critical
scenario. Note that this is the only fallback critical section that
could potentially send packets and hit the double-lock.

Reported-by: Jakub Kicinski <kuba@kernel.org>
Closes: https://netdev-ctrl.bots.linux.dev/logs/vmksft/mptcp-dbg/results/412720/1-mptcp-join-sh/stderr
Fixes: f8a1d9b18c5e ("mptcp: make fallback action and fallback decision atomic")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251205-net-mptcp-misc-fixes-6-19-rc1-v1-4-9e4781a6c1b8@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-08 23:54:03 -08:00
Paolo Abeni
2ea6190f42 mptcp: schedule rtx timer only after pushing data
The MPTCP protocol usually schedule the retransmission timer only
when there is some chances for such retransmissions to happen.

With a notable exception: __mptcp_push_pending() currently schedule
such timer unconditionally, potentially leading to unnecessary rtx
timer expiration.

The issue is present since the blamed commit below but become easily
reproducible after commit 27b0e701d387 ("mptcp: drop bogus optimization
in __mptcp_check_push()")

Fixes: 33d41c9cd74c ("mptcp: more accurate timeout")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251205-net-mptcp-misc-fixes-6-19-rc1-v1-3-9e4781a6c1b8@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-08 23:54:03 -08:00
Matthieu Baerts (NGI0)
29f4801e9c selftests: mptcp: pm: ensure unknown flags are ignored
This validates the previous commit: the userspace can set unknown flags
-- the 7th bit is currently unused -- without errors, but only the
supported ones are printed in the endpoints dumps.

The 'Fixes' tag here below is the same as the one from the previous
commit: this patch here is not fixing anything wrong in the selftests,
but it validates the previous fix for an issue introduced by this commit
ID.

Fixes: 01cacb00b35c ("mptcp: add netlink-based PM")
Cc: stable@vger.kernel.org
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251205-net-mptcp-misc-fixes-6-19-rc1-v1-2-9e4781a6c1b8@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-08 23:54:02 -08:00
Matthieu Baerts (NGI0)
0ace3297a7 mptcp: pm: ignore unknown endpoint flags
Before this patch, the kernel was saving any flags set by the userspace,
even unknown ones. This doesn't cause critical issues because the kernel
is only looking at specific ones. But on the other hand, endpoints dumps
could tell the userspace some recent flags seem to be supported on older
kernel versions.

Instead, ignore all unknown flags when parsing them. By doing that, the
userspace can continue to set unsupported flags, but it has a way to
verify what is supported by the kernel.

Note that it sounds better to continue accepting unsupported flags not
to change the behaviour, but also that eases things on the userspace
side by adding "optional" endpoint types only supported by newer kernel
versions without having to deal with the different kernel versions.

A note for the backports: there will be conflicts in mptcp.h on older
versions not having the mentioned flags, the new line should still be
added last, and the '5' needs to be adapted to have the same value as
the last entry.

Fixes: 01cacb00b35c ("mptcp: add netlink-based PM")
Cc: stable@vger.kernel.org
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251205-net-mptcp-misc-fixes-6-19-rc1-v1-1-9e4781a6c1b8@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-08 23:54:02 -08:00
Jakub Kicinski
db6b35cffe tools: ynl: fix build on systems with old kernel headers
The wireguard YNL conversion was missing the customary .deps entry.
NIPA doesn't catch this but my CentOS 9 system complains:

 wireguard-user.c:72:10: error: ‘WGALLOWEDIP_A_FLAGS’ undeclared here
 wireguard-user.c:58:67: error: parameter 1 (‘value’) has incomplete type
 58 | const char *wireguard_wgallowedip_flags_str(enum wgallowedip_flag value)
    |                                             ~~~~~~~~~~~~~~~~~~~~~~^~~~~

And similarly does Ubuntu 22.04.

One extra complication here is that we renamed the header guard,
so we need to compat with both old and new guard define.

Reviewed-by: Asbjørn Sloth Tønnesen <ast@fiberby.net>
Link: https://patch.msgid.link/20251207013848.1692990-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-08 23:53:17 -08:00
Jakub Kicinski
e56cadaa27 ynl: add regen hint to new headers
Recent commit 68e83f347266 ("tools: ynl-gen: add regeneration comment")
added a hint how to regenerate the code to the headers. Update
the new headers from this release cycle to also include it.

Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20251207004740.1657799-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-08 23:52:43 -08:00
Eric Biggers
e9e5047df9 mptcp: select CRYPTO_LIB_UTILS instead of CRYPTO
Since the only crypto functions used by the mptcp code are the SHA-256
library functions and crypto_memneq(), select only the options needed
for those: CRYPTO_LIB_SHA256 and CRYPTO_LIB_UTILS.

Previously, CRYPTO was selected instead of CRYPTO_LIB_UTILS.  That does
pull in CRYPTO_LIB_UTILS as well, but it's unnecessarily broad.

Years ago, the CRYPTO_LIB_* options were visible only when CRYPTO.  That
may be another reason why CRYPTO is selected here.  However, that was
fixed years ago, and the libraries can now be selected directly.

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Link: https://patch.msgid.link/20251204054417.491439-1-ebiggers@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-08 23:44:16 -08:00
Mateusz Guzik
2183a5c8a0 af_unix: annotate unix_gc_lock with __cacheline_aligned_in_smp
Otherwise the lock is susceptible to ever-changing false-sharing due to
unrelated changes. This in particular popped up here where an unrelated
change improved performance:
https://lore.kernel.org/oe-lkp/202511281306.51105b46-lkp@intel.com/

Stabilize it with an explicit annotation which also has a side effect
of furher improving scalability:
> in our oiginal report, 284922f4c5 has a 6.1% performance improvement comparing
> to parent 17d85f33a8.
> we applied your patch directly upon 284922f4c5. as below, now by
> "284922f4c5 + your patch"
> we observe a 12.8% performance improvements (still comparing to 17d85f33a8).

Note nothing was done for the other fields, so some fluctuation is still
possible.

Tested-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20251203100122.291550-1-mjguzik@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-08 23:43:24 -08:00
Michael Chan
0373d5c387 bnxt_en: Fix XDP_TX path
For XDP_TX action in bnxt_rx_xdp(), clearing of the event flags is not
correct.  __bnxt_poll_work() -> bnxt_rx_pkt() -> bnxt_rx_xdp() may be
looping within NAPI and some event flags may be set in earlier
iterations.  In particular, if BNXT_TX_EVENT is set earlier indicating
some XDP_TX packets are ready and pending, it will be cleared if it is
XDP_TX action again.  Normally, we will set BNXT_TX_EVENT again when we
successfully call __bnxt_xmit_xdp().  But if the TX ring has no more
room, the flag will not be set.  This will cause the TX producer to be
ahead but the driver will not hit the TX doorbell.

For multi-buf XDP_TX, there is no need to clear the event flags and set
BNXT_AGG_EVENT.  The BNXT_AGG_EVENT flag should have been set earlier in
bnxt_rx_pkt().

The visible symptom of this is that the RX ring associated with the
TX XDP ring will eventually become empty and all packets will be dropped.
Because this condition will cause the driver to not refill the RX ring
seeing that the TX ring has forever pending XDP_TX packets.

The fix is to only clear BNXT_RX_EVENT when we have successfully
called __bnxt_xmit_xdp().

Fixes: 7f0a168b0441 ("bnxt_en: Add completion ring pointer in TX and RX ring structures")
Reported-by: Pavel Dubovitsky <pdubovitsky@meta.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20251203003024.2246699-1-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-04 17:54:25 -08:00
Tim Hostetler
a479a27f4d gve: Move gve_init_clock to after AQ CONFIGURE_DEVICE_RESOURCES call
commit 46e7860ef941 ("gve: Move ptp_schedule_worker to gve_init_clock")
moved the first invocation of the AQ command REPORT_NIC_TIMESTAMP to
gve_probe(). However, gve_init_clock() invoking REPORT_NIC_TIMESTAMP is
not valid until after gve_probe() invokes the AQ command
CONFIGURE_DEVICE_RESOURCES.

Failure to do so results in the following error:

gve 0000:00:07.0: failed to read NIC clock -11

This was missed earlier because the driver under test was loaded at
runtime instead of boot-time. The boot-time driver had already
initialized the device, causing the runtime driver to successfully call
gve_init_clock() incorrectly.

Fixes: 46e7860ef941 ("gve: Move ptp_schedule_worker to gve_init_clock")
Reviewed-by: Ankit Garg <nktgrg@google.com>
Signed-off-by: Tim Hostetler <thostet@google.com>
Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20251202200207.1434749-1-hramamurthy@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-04 17:54:16 -08:00
René Rebe
dd75c723ef r8169: fix RTL8117 Wake-on-Lan in DASH mode
Wake-on-Lan does currently not work for r8169 in DASH mode, e.g. the
ASUS Pro WS X570-ACE with RTL8168fp/RTL8117.

Fix by not returning early in rtl_prepare_power_down when dash_enabled.
While this fixes WoL, it still kills the OOB RTL8117 remote management
BMC connection. Fix by not calling rtl8168_driver_stop if WoL is enabled.

Fixes: 065c27c184d6 ("r8169: phy power ops")
Signed-off-by: René Rebe <rene@exactco.de>
Cc: stable@vger.kernel.org
Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://patch.msgid.link/20251202.194137.1647877804487085954.rene@exactco.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-04 17:54:07 -08:00
Jakub Kicinski
e7a9530d12 Merge branch 'mlxsw-three-m-router-fixes'
Petr Machata says:

====================
mlxsw: Three (m)router fixes

This patchset contains two fixes in mlxsw Spectrum router code, and one for
the Spectrum multicast router code. Please see the individual patches for
more details.
====================

Link: https://patch.msgid.link/cover.1764695650.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-04 17:53:53 -08:00
Ido Schimmel
8ac1dacec4 mlxsw: spectrum_mr: Fix use-after-free when updating multicast route stats
Cited commit added a dedicated mutex (instead of RTNL) to protect the
multicast route list, so that it will not change while the driver
periodically traverses it in order to update the kernel about multicast
route stats that were queried from the device.

One instance of list entry deletion (during route replace) was missed
and it can result in a use-after-free [1].

Fix by acquiring the mutex before deleting the entry from the list and
releasing it afterwards.

[1]
BUG: KASAN: slab-use-after-free in mlxsw_sp_mr_stats_update+0x4a5/0x540 drivers/net/ethernet/mellanox/mlxsw/spectrum_mr.c:1006 [mlxsw_spectrum]
Read of size 8 at addr ffff8881523c2fa8 by task kworker/2:5/22043

CPU: 2 UID: 0 PID: 22043 Comm: kworker/2:5 Not tainted 6.18.0-rc1-custom-g1a3d6d7cd014 #1 PREEMPT(full)
Hardware name: Mellanox Technologies Ltd. MSN2010/SA002610, BIOS 5.6.5 08/24/2017
Workqueue: mlxsw_core mlxsw_sp_mr_stats_update [mlxsw_spectrum]
Call Trace:
 <TASK>
 dump_stack_lvl+0xba/0x110
 print_report+0x174/0x4f5
 kasan_report+0xdf/0x110
 mlxsw_sp_mr_stats_update+0x4a5/0x540 drivers/net/ethernet/mellanox/mlxsw/spectrum_mr.c:1006 [mlxsw_spectrum]
 process_one_work+0x9cc/0x18e0
 worker_thread+0x5df/0xe40
 kthread+0x3b8/0x730
 ret_from_fork+0x3e9/0x560
 ret_from_fork_asm+0x1a/0x30
 </TASK>

Allocated by task 29933:
 kasan_save_stack+0x30/0x50
 kasan_save_track+0x14/0x30
 __kasan_kmalloc+0x8f/0xa0
 mlxsw_sp_mr_route_add+0xd8/0x4770 [mlxsw_spectrum]
 mlxsw_sp_router_fibmr_event_work+0x371/0xad0 drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:7965 [mlxsw_spectrum]
 process_one_work+0x9cc/0x18e0
 worker_thread+0x5df/0xe40
 kthread+0x3b8/0x730
 ret_from_fork+0x3e9/0x560
 ret_from_fork_asm+0x1a/0x30

Freed by task 29933:
 kasan_save_stack+0x30/0x50
 kasan_save_track+0x14/0x30
 __kasan_save_free_info+0x3b/0x70
 __kasan_slab_free+0x43/0x70
 kfree+0x14e/0x700
 mlxsw_sp_mr_route_add+0x2dea/0x4770 drivers/net/ethernet/mellanox/mlxsw/spectrum_mr.c:444 [mlxsw_spectrum]
 mlxsw_sp_router_fibmr_event_work+0x371/0xad0 drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:7965 [mlxsw_spectrum]
 process_one_work+0x9cc/0x18e0
 worker_thread+0x5df/0xe40
 kthread+0x3b8/0x730
 ret_from_fork+0x3e9/0x560
 ret_from_fork_asm+0x1a/0x30

Fixes: f38656d06725 ("mlxsw: spectrum_mr: Protect multicast route list with a lock")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/f996feecfd59fde297964bfc85040b6d83ec6089.1764695650.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-04 17:53:49 -08:00
Ido Schimmel
8b0e69763e mlxsw: spectrum_router: Fix neighbour use-after-free
We sometimes observe use-after-free when dereferencing a neighbour [1].
The problem seems to be that the driver stores a pointer to the
neighbour, but without holding a reference on it. A reference is only
taken when the neighbour is used by a nexthop.

Fix by simplifying the reference counting scheme. Always take a
reference when storing a neighbour pointer in a neighbour entry. Avoid
taking a referencing when the neighbour is used by a nexthop as the
neighbour entry associated with the nexthop already holds a reference.

Tested by running the test that uncovered the problem over 300 times.
Without this patch the problem was reproduced after a handful of
iterations.

[1]
BUG: KASAN: slab-use-after-free in mlxsw_sp_neigh_entry_update+0x2d4/0x310
Read of size 8 at addr ffff88817f8e3420 by task ip/3929

CPU: 3 UID: 0 PID: 3929 Comm: ip Not tainted 6.18.0-rc4-virtme-g36b21a067510 #3 PREEMPT(full)
Hardware name: Nvidia SN5600/VMOD0013, BIOS 5.13 05/31/2023
Call Trace:
 <TASK>
 dump_stack_lvl+0x6f/0xa0
 print_address_description.constprop.0+0x6e/0x300
 print_report+0xfc/0x1fb
 kasan_report+0xe4/0x110
 mlxsw_sp_neigh_entry_update+0x2d4/0x310
 mlxsw_sp_router_rif_gone_sync+0x35f/0x510
 mlxsw_sp_rif_destroy+0x1ea/0x730
 mlxsw_sp_inetaddr_port_vlan_event+0xa1/0x1b0
 __mlxsw_sp_inetaddr_lag_event+0xcc/0x130
 __mlxsw_sp_inetaddr_event+0xf5/0x3c0
 mlxsw_sp_router_netdevice_event+0x1015/0x1580
 notifier_call_chain+0xcc/0x150
 call_netdevice_notifiers_info+0x7e/0x100
 __netdev_upper_dev_unlink+0x10b/0x210
 netdev_upper_dev_unlink+0x79/0xa0
 vrf_del_slave+0x18/0x50
 do_set_master+0x146/0x7d0
 do_setlink.isra.0+0x9a0/0x2880
 rtnl_newlink+0x637/0xb20
 rtnetlink_rcv_msg+0x6fe/0xb90
 netlink_rcv_skb+0x123/0x380
 netlink_unicast+0x4a3/0x770
 netlink_sendmsg+0x75b/0xc90
 __sock_sendmsg+0xbe/0x160
 ____sys_sendmsg+0x5b2/0x7d0
 ___sys_sendmsg+0xfd/0x180
 __sys_sendmsg+0x124/0x1c0
 do_syscall_64+0xbb/0xfd0
 entry_SYSCALL_64_after_hwframe+0x4b/0x53
[...]

Allocated by task 109:
 kasan_save_stack+0x30/0x50
 kasan_save_track+0x14/0x30
 __kasan_kmalloc+0x7b/0x90
 __kmalloc_noprof+0x2c1/0x790
 neigh_alloc+0x6af/0x8f0
 ___neigh_create+0x63/0xe90
 mlxsw_sp_nexthop_neigh_init+0x430/0x7e0
 mlxsw_sp_nexthop_type_init+0x212/0x960
 mlxsw_sp_nexthop6_group_info_init.constprop.0+0x81f/0x1280
 mlxsw_sp_nexthop6_group_get+0x392/0x6a0
 mlxsw_sp_fib6_entry_create+0x46a/0xfd0
 mlxsw_sp_router_fib6_replace+0x1ed/0x5f0
 mlxsw_sp_router_fib6_event_work+0x10a/0x2a0
 process_one_work+0xd57/0x1390
 worker_thread+0x4d6/0xd40
 kthread+0x355/0x5b0
 ret_from_fork+0x1d4/0x270
 ret_from_fork_asm+0x11/0x20

Freed by task 154:
 kasan_save_stack+0x30/0x50
 kasan_save_track+0x14/0x30
 __kasan_save_free_info+0x3b/0x60
 __kasan_slab_free+0x43/0x70
 kmem_cache_free_bulk.part.0+0x1eb/0x5e0
 kvfree_rcu_bulk+0x1f2/0x260
 kfree_rcu_work+0x130/0x1b0
 process_one_work+0xd57/0x1390
 worker_thread+0x4d6/0xd40
 kthread+0x355/0x5b0
 ret_from_fork+0x1d4/0x270
 ret_from_fork_asm+0x11/0x20

Last potentially related work creation:
 kasan_save_stack+0x30/0x50
 kasan_record_aux_stack+0x8c/0xa0
 kvfree_call_rcu+0x93/0x5b0
 mlxsw_sp_router_neigh_event_work+0x67d/0x860
 process_one_work+0xd57/0x1390
 worker_thread+0x4d6/0xd40
 kthread+0x355/0x5b0
 ret_from_fork+0x1d4/0x270
 ret_from_fork_asm+0x11/0x20

Fixes: 6cf3c971dc84 ("mlxsw: spectrum_router: Add private neigh table")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/92d75e21d95d163a41b5cea67a15cd33f547cba6.1764695650.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-04 17:53:49 -08:00
Ido Schimmel
b6b638bda2 mlxsw: spectrum_router: Fix possible neighbour reference count leak
mlxsw_sp_router_schedule_work() takes a reference on a neighbour,
expecting a work item to release it later on. However, we might fail to
schedule the work item, in which case the neighbour reference count will
be leaked.

Fix by taking the reference just before scheduling the work item. Note
that mlxsw_sp_router_schedule_work() can receive a NULL neighbour
pointer, but neigh_clone() handles that correctly.

Spotted during code review, did not actually observe the reference count
leak.

Fixes: 151b89f6025a ("mlxsw: spectrum_router: Reuse work neighbor initialization in work scheduler")
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/ec2934ae4aca187a8d8c9329a08ce93cca411378.1764695650.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-04 17:53:48 -08:00
Thorsten Blum
c4cdf73762 net: phy: marvell-88q2xxx: Fix clamped value in mv88q2xxx_hwmon_write
The local variable 'val' was never clamped to -75000 or 180000 because
the return value of clamp_val() was not used. Fix this by assigning the
clamped value back to 'val', and use clamp() instead of clamp_val().

Cc: stable@vger.kernel.org
Fixes: a557a92e6881 ("net: phy: marvell-88q2xxx: add support for temperature sensor")
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Reviewed-by: Dimitri Fedrau <dima.fedrau@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20251202172743.453055-3-thorsten.blum@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-04 17:49:31 -08:00
Gerd Bayer
6a107cfe9c net/mlx5: Fix double unregister of HCA_PORTS component
Clear hca_devcom_comp in device's private data after unregistering it in
LAG teardown. Otherwise a slightly lagging second pass through
mlx5_unload_one() might try to unregister it again and trip over
use-after-free.

On s390 almost all PCI level recovery events trigger two passes through
mxl5_unload_one() - one through the poll_health() method and one through
mlx5_pci_err_detected() as callback from generic PCI error recovery.
While testing PCI error recovery paths with more kernel debug features
enabled, this issue reproducibly led to kernel panics with the following
call chain:

 Unable to handle kernel pointer dereference in virtual kernel address space
 Failing address: 6b6b6b6b6b6b6000 TEID: 6b6b6b6b6b6b6803 ESOP-2 FSI
 Fault in home space mode while using kernel ASCE.
 AS:00000000705c4007 R3:0000000000000024
 Oops: 0038 ilc:3 [#1]SMP

 CPU: 14 UID: 0 PID: 156 Comm: kmcheck Kdump: loaded Not tainted
      6.18.0-20251130.rc7.git0.16131a59cab1.300.fc43.s390x+debug #1 PREEMPT

 Krnl PSW : 0404e00180000000 0000020fc86aa1dc (__lock_acquire+0x5c/0x15f0)
            R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
 Krnl GPRS: 0000000000000000 0000020f00000001 6b6b6b6b6b6b6c33 0000000000000000
            0000000000000000 0000000000000000 0000000000000001 0000000000000000
            0000000000000000 0000020fca28b820 0000000000000000 0000010a1ced8100
            0000010a1ced8100 0000020fc9775068 0000018fce14f8b8 0000018fce14f7f8
 Krnl Code: 0000020fc86aa1cc: e3b003400004        lg      %r11,832
            0000020fc86aa1d2: a7840211           brc     8,0000020fc86aa5f4
           *0000020fc86aa1d6: c09000df0b25       larl    %r9,0000020fca28b820
           >0000020fc86aa1dc: d50790002000       clc     0(8,%r9),0(%r2)
            0000020fc86aa1e2: a7840209           brc     8,0000020fc86aa5f4
            0000020fc86aa1e6: c0e001100401       larl    %r14,0000020fca8aa9e8
            0000020fc86aa1ec: c01000e25a00       larl    %r1,0000020fca2f55ec
            0000020fc86aa1f2: a7eb00e8           aghi    %r14,232

 Call Trace:
  __lock_acquire+0x5c/0x15f0
  lock_acquire.part.0+0xf8/0x270
  lock_acquire+0xb0/0x1b0
  down_write+0x5a/0x250
  mlx5_detach_device+0x42/0x110 [mlx5_core]
  mlx5_unload_one_devl_locked+0x50/0xc0 [mlx5_core]
  mlx5_unload_one+0x42/0x60 [mlx5_core]
  mlx5_pci_err_detected+0x94/0x150 [mlx5_core]
  zpci_event_attempt_error_recovery+0xcc/0x388

Fixes: 5a977b5833b7 ("net/mlx5: Lag, move devcom registration to LAG layer")
Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Acked-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20251202-fix_lag-v1-1-59e8177ffce0@linux.ibm.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-04 15:29:13 +01:00
Dmitry Skorodumov
0c57ff008a ipvlan: Ignore PACKET_LOOPBACK in handle_mode_l2()
Packets with pkt_type == PACKET_LOOPBACK are captured by
handle_frame() function, but they don't have L2 header.
We should not process them in handle_mode_l2().

This doesn't affect old L2 functionality, since handling
was anyway incorrect.

Handle them the same way as in br_handle_frame():
just pass the skb.

To observe invalid behaviour, just start "ping -b" on bcast address
of port-interface.

Fixes: 2ad7bf363841 ("ipvlan: Initial check-in of the IPVLAN driver.")
Signed-off-by: Dmitry Skorodumov <skorodumov.dmitry@huawei.com>
Link: https://patch.msgid.link/20251202103906.4087675-1-skorodumov.dmitry@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-04 13:43:12 +01:00
Daniel Golle
5b48f49ee9 net: dsa: mxl-gsw1xx: fix SerDes RX polarity
According to MaxLinear engineer Benny Weng the RX lane of the SerDes
port of the GSW1xx switches is inverted in hardware, and the
SGMII_PHY_RX0_CFG2_INVERT bit is set by default in order to compensate
for that. Hence also set the SGMII_PHY_RX0_CFG2_INVERT bit by default in
gsw1xx_pcs_reset().

Fixes: 22335939ec90 ("net: dsa: add driver for MaxLinear GSW1xx switch family")
Reported-by: Rasmus Villemoes <ravi@prevas.dk>
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://patch.msgid.link/ca10e9f780c0152ecf9ae8cbac5bf975802e8f99.1764668951.git.daniel@makrotopia.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-04 13:29:37 +01:00
Ivan Galkin
4f0638b124 net: phy: RTL8211FVD: Restore disabling of PHY-mode EEE
When support for RTL8211F(D)(I)-VD-CG was introduced in commit
bb726b753f75 ("net: phy: realtek: add support for RTL8211F(D)(I)-VD-CG")
the implementation assumed that this PHY model doesn't have the
control register PHYCR2 (Page 0xa43 Address 0x19). This
assumption was based on the differences in CLKOUT configurations
between RTL8211FVD and the remaining RTL8211F PHYs. In the latter
commit 2c67301584f2
("net: phy: realtek: Avoid PHYCR2 access if PHYCR2 not present")
this assumption was expanded to the PHY-mode EEE.

I performed tests on RTL8211FI-VD-CG and confirmed that disabling
PHY-mode EEE works correctly and is uniform with other PHYs
supported by the driver. To validate the correctness,
I contacted Realtek support. Realtek confirmed that PHY-mode EEE on
RTL8211F(D)(I)-VD-CG is configured via Page 0xa43 Address 0x19 bit 5.

Moreover, Realtek informed me that the most recent datasheet
for RTL8211F(D)(I)-VD-CG v1.1 is incomplete and the naming of
control registers is partly inconsistent. The errata I
received from Realtek corrects the naming as follows:

| Register                | Datasheet v1.1 | Errata |
|-------------------------|----------------|--------|
| Page 0xa44 Address 0x11 | PHYCR2         | PHYCR3 |
| Page 0xa43 Address 0x19 | N/A            | PHYCR2 |

This information confirms that the supposedly missing control register,
PHYCR2, exists in the RTL8211F(D)(I)-VD-CG under the same address and
the same name. It controls widely the same configs as other PHYs from
the RTL8211F series (e.g. PHY-mode EEE). Clock out configuration is an
exception.

Given all this information, restore disabling of the PHY-mode EEE.

Fixes: 2c67301584f2 ("net: phy: realtek: Avoid PHYCR2 access if PHYCR2 not present")
Signed-off-by: Ivan Galkin <ivan.galkin@axis.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20251202-phy_eee-v1-1-fe0bf6ab3df0@axis.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-04 12:10:49 +01:00
Paolo Abeni
eb1e937e00 Merge branch 'mlx5-misc-fixes-2025-12-01'
Tariq Toukan says:

====================
mlx5 misc fixes 2025-12-01

This small patchset provides misc bug fixes from the team to the mlx5
core and Eth drivers.
====================

Link: https://patch.msgid.link/1764602008-1334866-1-git-send-email-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-04 11:55:22 +01:00
Cosmin Ratiu
35e93736f6 net/mlx5e: Avoid unregistering PSP twice
PSP is unregistered twice in:
_mlx5e_remove -> mlx5e_psp_unregister
mlx5e_nic_cleanup -> mlx5e_psp_unregister

This leads to a refcount underflow in some conditions:
------------[ cut here ]------------
refcount_t: underflow; use-after-free.
WARNING: CPU: 2 PID: 1694 at lib/refcount.c:28 refcount_warn_saturate+0xd8/0xe0
[...]
 mlx5e_psp_unregister+0x26/0x50 [mlx5_core]
 mlx5e_nic_cleanup+0x26/0x90 [mlx5_core]
 mlx5e_remove+0xe6/0x1f0 [mlx5_core]
 auxiliary_bus_remove+0x18/0x30
 device_release_driver_internal+0x194/0x1f0
 bus_remove_device+0xc6/0x130
 device_del+0x159/0x3c0
 mlx5_rescan_drivers_locked+0xbc/0x2a0 [mlx5_core]
[...]

Do not directly remove psp from the _mlx5e_remove path, the PSP cleanup
happens as part of profile cleanup.

Fixes: 89ee2d92f66c ("net/mlx5e: Support PSP offload functionality")
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/1764602008-1334866-3-git-send-email-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-04 11:55:20 +01:00
Moshe Shemesh
cd7671ef4c net/mlx5: make enable_mpesw idempotent
The enable_mpesw() function returns -EINVAL if ldev->mode is not
MLX5_LAG_MODE_NONE. This means attempting to enable MPESW mode when it's
already enabled will fail. In contrast, disable_mpesw() properly checks
if the mode is MLX5_LAG_MODE_MPESW before proceeding, making it
naturally idempotent and safe to call multiple times.

Fix enable_mpesw() to return success if mpesw is already enabled.

Fixes: a32327a3a02c ("net/mlx5: Lag, Control MultiPort E-Switch single FDB mode")
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Shay Drori <shayd@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/1764602008-1334866-2-git-send-email-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-04 11:55:20 +01:00
Jamal Hadi Salim
ce052b9402 net/sched: ets: Always remove class from active list before deleting in ets_qdisc_change
zdi-disclosures@trendmicro.com says:

The vulnerability is a race condition between `ets_qdisc_dequeue` and
`ets_qdisc_change`.  It leads to UAF on `struct Qdisc` object.
Attacker requires the capability to create new user and network namespace
in order to trigger the bug.
See my additional commentary at the end of the analysis.

Analysis:

static int ets_qdisc_change(struct Qdisc *sch, struct nlattr *opt,
                          struct netlink_ext_ack *extack)
{
...

      // (1) this lock is preventing .change handler (`ets_qdisc_change`)
      //to race with .dequeue handler (`ets_qdisc_dequeue`)
      sch_tree_lock(sch);

      for (i = nbands; i < oldbands; i++) {
              if (i >= q->nstrict && q->classes[i].qdisc->q.qlen)
                      list_del_init(&q->classes[i].alist);
              qdisc_purge_queue(q->classes[i].qdisc);
      }

      WRITE_ONCE(q->nbands, nbands);
      for (i = nstrict; i < q->nstrict; i++) {
              if (q->classes[i].qdisc->q.qlen) {
		      // (2) the class is added to the q->active
                      list_add_tail(&q->classes[i].alist, &q->active);
                      q->classes[i].deficit = quanta[i];
              }
      }
      WRITE_ONCE(q->nstrict, nstrict);
      memcpy(q->prio2band, priomap, sizeof(priomap));

      for (i = 0; i < q->nbands; i++)
              WRITE_ONCE(q->classes[i].quantum, quanta[i]);

      for (i = oldbands; i < q->nbands; i++) {
              q->classes[i].qdisc = queues[i];
              if (q->classes[i].qdisc != &noop_qdisc)
                      qdisc_hash_add(q->classes[i].qdisc, true);
      }

      // (3) the qdisc is unlocked, now dequeue can be called in parallel
      // to the rest of .change handler
      sch_tree_unlock(sch);

      ets_offload_change(sch);
      for (i = q->nbands; i < oldbands; i++) {
	      // (4) we're reducing the refcount for our class's qdisc and
	      //  freeing it
              qdisc_put(q->classes[i].qdisc);
	      // (5) If we call .dequeue between (4) and (5), we will have
	      // a strong UAF and we can control RIP
              q->classes[i].qdisc = NULL;
              WRITE_ONCE(q->classes[i].quantum, 0);
              q->classes[i].deficit = 0;
              gnet_stats_basic_sync_init(&q->classes[i].bstats);
              memset(&q->classes[i].qstats, 0, sizeof(q->classes[i].qstats));
      }
      return 0;
}

Comment:
This happens because some of the classes have their qdiscs assigned to
NULL, but remain in the active list. This commit fixes this issue by always
removing the class from the active list before deleting and freeing its
associated qdisc

Reproducer Steps
(trimmed version of what was sent by zdi-disclosures@trendmicro.com)

```
DEV="${DEV:-lo}"
ROOT_HANDLE="${ROOT_HANDLE👎}"
BAND2_HANDLE="${BAND2_HANDLE:-20:}"   # child under 1:2
PING_BYTES="${PING_BYTES:-48}"
PING_COUNT="${PING_COUNT:-200000}"
PING_DST="${PING_DST:-127.0.0.1}"

SLOW_TBF_RATE="${SLOW_TBF_RATE:-8bit}"
SLOW_TBF_BURST="${SLOW_TBF_BURST:-100b}"
SLOW_TBF_LAT="${SLOW_TBF_LAT:-1s}"

cleanup() {
  tc qdisc del dev "$DEV" root 2>/dev/null
}
trap cleanup EXIT

ip link set "$DEV" up

tc qdisc del dev "$DEV" root 2>/dev/null || true

tc qdisc add dev "$DEV" root handle "$ROOT_HANDLE" ets bands 2 strict 2

tc qdisc add dev "$DEV" parent 1:2 handle "$BAND2_HANDLE" \
  tbf rate "$SLOW_TBF_RATE" burst "$SLOW_TBF_BURST" latency "$SLOW_TBF_LAT"

tc filter add dev "$DEV" parent 1: protocol all prio 1 u32 match u32 0 0 flowid 1:2
tc -s qdisc ls dev $DEV

ping -I "$DEV" -f -c "$PING_COUNT" -s "$PING_BYTES" -W 0.001 "$PING_DST" \
  >/dev/null 2>&1 &
tc qdisc change dev "$DEV" root handle "$ROOT_HANDLE" ets bands 2 strict 0
tc qdisc change dev "$DEV" root handle "$ROOT_HANDLE" ets bands 2 strict 2
tc -s qdisc ls dev $DEV
tc qdisc del dev "$DEV" parent 1:2 || true
tc -s qdisc ls dev $DEV
tc qdisc change dev "$DEV" root handle "$ROOT_HANDLE" ets bands 1 strict 1
```

KASAN report
```
==================================================================
BUG: KASAN: slab-use-after-free in ets_qdisc_dequeue+0x1071/0x11b0 kernel/net/sched/sch_ets.c:481
Read of size 8 at addr ffff8880502fc018 by task ping/12308
>
CPU: 0 UID: 0 PID: 12308 Comm: ping Not tainted 6.18.0-rc4-dirty #1 PREEMPT(full)
Hardware name: QEMU Ubuntu 25.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
Call Trace:
 <IRQ>
 __dump_stack kernel/lib/dump_stack.c:94
 dump_stack_lvl+0x100/0x190 kernel/lib/dump_stack.c:120
 print_address_description kernel/mm/kasan/report.c:378
 print_report+0x156/0x4c9 kernel/mm/kasan/report.c:482
 kasan_report+0xdf/0x110 kernel/mm/kasan/report.c:595
 ets_qdisc_dequeue+0x1071/0x11b0 kernel/net/sched/sch_ets.c:481
 dequeue_skb kernel/net/sched/sch_generic.c:294
 qdisc_restart kernel/net/sched/sch_generic.c:399
 __qdisc_run+0x1c9/0x1b00 kernel/net/sched/sch_generic.c:417
 __dev_xmit_skb kernel/net/core/dev.c:4221
 __dev_queue_xmit+0x2848/0x4410 kernel/net/core/dev.c:4729
 dev_queue_xmit kernel/./include/linux/netdevice.h:3365
[...]

Allocated by task 17115:
 kasan_save_stack+0x30/0x50 kernel/mm/kasan/common.c:56
 kasan_save_track+0x14/0x30 kernel/mm/kasan/common.c:77
 poison_kmalloc_redzone kernel/mm/kasan/common.c:400
 __kasan_kmalloc+0xaa/0xb0 kernel/mm/kasan/common.c:417
 kasan_kmalloc kernel/./include/linux/kasan.h:262
 __do_kmalloc_node kernel/mm/slub.c:5642
 __kmalloc_node_noprof+0x34e/0x990 kernel/mm/slub.c:5648
 kmalloc_node_noprof kernel/./include/linux/slab.h:987
 qdisc_alloc+0xb8/0xc30 kernel/net/sched/sch_generic.c:950
 qdisc_create_dflt+0x93/0x490 kernel/net/sched/sch_generic.c:1012
 ets_class_graft+0x4fd/0x800 kernel/net/sched/sch_ets.c:261
 qdisc_graft+0x3e4/0x1780 kernel/net/sched/sch_api.c:1196
[...]

Freed by task 9905:
 kasan_save_stack+0x30/0x50 kernel/mm/kasan/common.c:56
 kasan_save_track+0x14/0x30 kernel/mm/kasan/common.c:77
 __kasan_save_free_info+0x3b/0x70 kernel/mm/kasan/generic.c:587
 kasan_save_free_info kernel/mm/kasan/kasan.h:406
 poison_slab_object kernel/mm/kasan/common.c:252
 __kasan_slab_free+0x5f/0x80 kernel/mm/kasan/common.c:284
 kasan_slab_free kernel/./include/linux/kasan.h:234
 slab_free_hook kernel/mm/slub.c:2539
 slab_free kernel/mm/slub.c:6630
 kfree+0x144/0x700 kernel/mm/slub.c:6837
 rcu_do_batch kernel/kernel/rcu/tree.c:2605
 rcu_core+0x7c0/0x1500 kernel/kernel/rcu/tree.c:2861
 handle_softirqs+0x1ea/0x8a0 kernel/kernel/softirq.c:622
 __do_softirq kernel/kernel/softirq.c:656
[...]

Commentary:

1. Maher Azzouzi working with Trend Micro Zero Day Initiative was reported as
the person who found the issue. I requested to get a proper email to add to the
reported-by tag but got no response. For this reason i will credit the person
i exchanged emails with i.e zdi-disclosures@trendmicro.com

2. Neither i nor Victor who did a much more thorough testing was able to
reproduce a UAF with the PoC or other approaches we tried. We were both able to
reproduce a null ptr deref. After exchange with zdi-disclosures@trendmicro.com
they sent a small change to be made to the code to add an extra delay which
was able to simulate the UAF. i.e, this:
   qdisc_put(q->classes[i].qdisc);
   mdelay(90);
   q->classes[i].qdisc = NULL;

I was informed by Thomas Gleixner(tglx@linutronix.de) that adding delays was
acceptable approach for demonstrating the bug, quote:
"Adding such delays is common exploit validation practice"
The equivalent delay could happen "by virt scheduling the vCPU out, SMIs,
NMIs, PREEMPT_RT enabled kernel"

3. I asked the OP to test and report back but got no response and after a
few days gave up and proceeded to submit this fix.

Fixes: de6d25924c2a ("net/sched: sch_ets: don't peek at classes beyond 'nbands'")
Reported-by: zdi-disclosures@trendmicro.com
Tested-by: Victor Nogueira <victor@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Davide Caratti <dcaratti@redhat.com>
Link: https://patch.msgid.link/20251128151919.576920-1-jhs@mojatatu.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-04 11:43:45 +01:00
Shaurya Rane
188e0fa5a6 net/hsr: fix NULL pointer dereference in prp_get_untagged_frame()
prp_get_untagged_frame() calls __pskb_copy() to create frame->skb_std
but doesn't check if the allocation failed. If __pskb_copy() returns
NULL, skb_clone() is called with a NULL pointer, causing a crash:

Oops: general protection fault, probably for non-canonical address 0xdffffc000000000f: 0000 [#1] SMP KASAN NOPTI
KASAN: null-ptr-deref in range [0x0000000000000078-0x000000000000007f]
CPU: 0 UID: 0 PID: 5625 Comm: syz.1.18 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
RIP: 0010:skb_clone+0xd7/0x3a0 net/core/skbuff.c:2041
Code: 03 42 80 3c 20 00 74 08 4c 89 f7 e8 23 29 05 f9 49 83 3e 00 0f 85 a0 01 00 00 e8 94 dd 9d f8 48 8d 6b 7e 49 89 ee 49 c1 ee 03 <43> 0f b6 04 26 84 c0 0f 85 d1 01 00 00 44 0f b6 7d 00 41 83 e7 0c
RSP: 0018:ffffc9000d00f200 EFLAGS: 00010207
RAX: ffffffff892235a1 RBX: 0000000000000000 RCX: ffff88803372a480
RDX: 0000000000000000 RSI: 0000000000000820 RDI: 0000000000000000
RBP: 000000000000007e R08: ffffffff8f7d0f77 R09: 1ffffffff1efa1ee
R10: dffffc0000000000 R11: fffffbfff1efa1ef R12: dffffc0000000000
R13: 0000000000000820 R14: 000000000000000f R15: ffff88805144cc00
FS:  0000555557f6d500(0000) GS:ffff88808d72f000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000555581d35808 CR3: 000000005040e000 CR4: 0000000000352ef0
Call Trace:
 <TASK>
 hsr_forward_do net/hsr/hsr_forward.c:-1 [inline]
 hsr_forward_skb+0x1013/0x2860 net/hsr/hsr_forward.c:741
 hsr_handle_frame+0x6ce/0xa70 net/hsr/hsr_slave.c:84
 __netif_receive_skb_core+0x10b9/0x4380 net/core/dev.c:5966
 __netif_receive_skb_one_core net/core/dev.c:6077 [inline]
 __netif_receive_skb+0x72/0x380 net/core/dev.c:6192
 netif_receive_skb_internal net/core/dev.c:6278 [inline]
 netif_receive_skb+0x1cb/0x790 net/core/dev.c:6337
 tun_rx_batched+0x1b9/0x730 drivers/net/tun.c:1485
 tun_get_user+0x2b65/0x3e90 drivers/net/tun.c:1953
 tun_chr_write_iter+0x113/0x200 drivers/net/tun.c:1999
 new_sync_write fs/read_write.c:593 [inline]
 vfs_write+0x5c9/0xb30 fs/read_write.c:686
 ksys_write+0x145/0x250 fs/read_write.c:738
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xfa/0xfa0 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f0449f8e1ff
Code: 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 f9 92 02 00 48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 44 24 08 e8 4c 93 02 00 48
RSP: 002b:00007ffd7ad94c90 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00007f044a1e5fa0 RCX: 00007f0449f8e1ff
RDX: 000000000000003e RSI: 0000200000000500 RDI: 00000000000000c8
RBP: 00007ffd7ad94d20 R08: 0000000000000000 R09: 0000000000000000
R10: 000000000000003e R11: 0000000000000293 R12: 0000000000000001
R13: 00007f044a1e5fa0 R14: 00007f044a1e5fa0 R15: 0000000000000003
 </TASK>

Add a NULL check immediately after __pskb_copy() to handle allocation
failures gracefully.

Reported-by: syzbot+2fa344348a579b779e05@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=2fa344348a579b779e05
Fixes: f266a683a480 ("net/hsr: Better frame dispatch")
Cc: stable@vger.kernel.org
Signed-off-by: Shaurya Rane <ssrane_b23@ee.vjti.ac.in>
Reviewed-by: Felix Maurer <fmaurer@redhat.com>
Tested-by: Felix Maurer <fmaurer@redhat.com>
Link: https://patch.msgid.link/20251129093718.25320-1-ssrane_b23@ee.vjti.ac.in
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-04 11:15:13 +01:00
Wang Liang
613d12dd79 netrom: Fix memory leak in nr_sendmsg()
syzbot reported a memory leak [1].

When function sock_alloc_send_skb() return NULL in nr_output(), the
original skb is not freed, which was allocated in nr_sendmsg(). Fix this
by freeing it before return.

[1]
BUG: memory leak
unreferenced object 0xffff888129f35500 (size 240):
  comm "syz.0.17", pid 6119, jiffies 4294944652
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 10 52 28 81 88 ff ff  ..........R(....
  backtrace (crc 1456a3e4):
    kmemleak_alloc_recursive include/linux/kmemleak.h:44 [inline]
    slab_post_alloc_hook mm/slub.c:4983 [inline]
    slab_alloc_node mm/slub.c:5288 [inline]
    kmem_cache_alloc_node_noprof+0x36f/0x5e0 mm/slub.c:5340
    __alloc_skb+0x203/0x240 net/core/skbuff.c:660
    alloc_skb include/linux/skbuff.h:1383 [inline]
    alloc_skb_with_frags+0x69/0x3f0 net/core/skbuff.c:6671
    sock_alloc_send_pskb+0x379/0x3e0 net/core/sock.c:2965
    sock_alloc_send_skb include/net/sock.h:1859 [inline]
    nr_sendmsg+0x287/0x450 net/netrom/af_netrom.c:1105
    sock_sendmsg_nosec net/socket.c:727 [inline]
    __sock_sendmsg net/socket.c:742 [inline]
    sock_write_iter+0x293/0x2a0 net/socket.c:1195
    new_sync_write fs/read_write.c:593 [inline]
    vfs_write+0x45d/0x710 fs/read_write.c:686
    ksys_write+0x143/0x170 fs/read_write.c:738
    do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
    do_syscall_64+0xa4/0xfa0 arch/x86/entry/syscall_64.c:94
    entry_SYSCALL_64_after_hwframe+0x77/0x7f

Reported-by: syzbot+d7abc36bbbb6d7d40b58@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=d7abc36bbbb6d7d40b58
Tested-by: syzbot+d7abc36bbbb6d7d40b58@syzkaller.appspotmail.com
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Wang Liang <wangliang74@huawei.com>
Link: https://patch.msgid.link/20251129041315.1550766-1-wangliang74@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-04 11:01:17 +01:00
Wei Fang
e8e032cd24 net: fec: ERR007885 Workaround for XDP TX path
The ERR007885 will lead to a TDAR race condition for mutliQ when the
driver sets TDAR and the UDMA clears TDAR simultaneously or in a small
window (2-4 cycles). And it will cause the udma_tx and udma_tx_arbiter
state machines to hang. Therefore, the commit 53bb20d1faba ("net: fec:
add variable reg_desc_active to speed things up") and the commit
a179aad12bad ("net: fec: ERR007885 Workaround for conventional TX") have
added the workaround to fix the potential issue for the conventional TX
path. Similarly, the XDP TX path should also have the potential hang
issue, so add the workaround for XDP TX path.

Fixes: 6d6b39f180b8 ("net: fec: add initial XDP support")
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Link: https://patch.msgid.link/20251128025915.2486943-1-wei.fang@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-04 09:54:13 +01:00
110 changed files with 926 additions and 356 deletions

View File

@ -1987,6 +1987,7 @@ D: netfilter: TCP window tracking code
D: netfilter: raw table
D: netfilter: iprange match
D: netfilter: new logging interfaces
D: netfilter: ipset
D: netfilter: various other hacks
S: Tata
S: Hungary

View File

@ -18028,7 +18028,6 @@ F: drivers/net/ethernet/neterion/
NETFILTER
M: Pablo Neira Ayuso <pablo@netfilter.org>
M: Jozsef Kadlecsik <kadlec@netfilter.org>
M: Florian Westphal <fw@strlen.de>
R: Phil Sutter <phil@nwl.cc>
L: netfilter-devel@vger.kernel.org

View File

@ -48,6 +48,13 @@ DEFINE_LOCK_GUARD_1(ksimd,
kernel_neon_begin(_T->lock),
kernel_neon_end(_T->lock))
#define scoped_ksimd() scoped_guard(ksimd, &(struct user_fpsimd_state){})
#define __scoped_ksimd(_label) \
for (struct user_fpsimd_state __uninitialized __st; \
true; ({ goto _label; })) \
if (0) { \
_label: break; \
} else scoped_guard(ksimd, &__st)
#define scoped_ksimd() __scoped_ksimd(__UNIQUE_ID(label))
#endif

View File

@ -52,7 +52,7 @@ acpi_pcc_address_space_setup(acpi_handle region_handle, u32 function,
struct pcc_data *data;
struct acpi_pcc_info *ctx = handler_context;
struct pcc_mbox_chan *pcc_chan;
static acpi_status ret;
acpi_status ret;
data = kzalloc(sizeof(*data), GFP_KERNEL);
if (!data)

View File

@ -1366,7 +1366,8 @@ int cppc_get_perf_caps(int cpunum, struct cppc_perf_caps *perf_caps)
/* Are any of the regs PCC ?*/
if (CPC_IN_PCC(highest_reg) || CPC_IN_PCC(lowest_reg) ||
CPC_IN_PCC(lowest_non_linear_reg) || CPC_IN_PCC(nominal_reg) ||
CPC_IN_PCC(low_freq_reg) || CPC_IN_PCC(nom_freq_reg)) {
CPC_IN_PCC(low_freq_reg) || CPC_IN_PCC(nom_freq_reg) ||
CPC_IN_PCC(guaranteed_reg)) {
if (pcc_ss_id < 0) {
pr_debug("Invalid pcc_ss_id\n");
return -ENODEV;

View File

@ -1868,16 +1868,18 @@ void pm_runtime_init(struct device *dev)
*/
void pm_runtime_reinit(struct device *dev)
{
if (!pm_runtime_enabled(dev)) {
if (dev->power.runtime_status == RPM_ACTIVE)
pm_runtime_set_suspended(dev);
if (dev->power.irq_safe) {
spin_lock_irq(&dev->power.lock);
dev->power.irq_safe = 0;
spin_unlock_irq(&dev->power.lock);
if (dev->parent)
pm_runtime_put(dev->parent);
}
if (pm_runtime_enabled(dev))
return;
if (dev->power.runtime_status == RPM_ACTIVE)
pm_runtime_set_suspended(dev);
if (dev->power.irq_safe) {
spin_lock_irq(&dev->power.lock);
dev->power.irq_safe = 0;
spin_unlock_irq(&dev->power.lock);
if (dev->parent)
pm_runtime_put(dev->parent);
}
/*
* Clear power.needs_force_resume in case it has been set by

View File

@ -495,7 +495,11 @@ int iopt_map_file_pages(struct iommufd_ctx *ictx, struct io_pagetable *iopt,
return -EOVERFLOW;
start_byte = start - ALIGN_DOWN(start, PAGE_SIZE);
dmabuf = dma_buf_get(fd);
if (IS_ENABLED(CONFIG_DMA_SHARED_BUFFER))
dmabuf = dma_buf_get(fd);
else
dmabuf = ERR_PTR(-ENXIO);
if (!IS_ERR(dmabuf)) {
pages = iopt_alloc_dmabuf_pages(ictx, dmabuf, start_byte, start,
length,

View File

@ -1184,14 +1184,20 @@ static int iommufd_test_add_reserved(struct iommufd_ucmd *ucmd,
unsigned int mockpt_id,
unsigned long start, size_t length)
{
unsigned long last;
struct iommufd_ioas *ioas;
int rc;
if (!length)
return -EINVAL;
if (check_add_overflow(start, length - 1, &last))
return -EOVERFLOW;
ioas = iommufd_get_ioas(ucmd->ictx, mockpt_id);
if (IS_ERR(ioas))
return PTR_ERR(ioas);
down_write(&ioas->iopt.iova_rwsem);
rc = iopt_reserve_iova(&ioas->iopt, start, start + length - 1, NULL);
rc = iopt_reserve_iova(&ioas->iopt, start, last, NULL);
up_write(&ioas->iopt.iova_rwsem);
iommufd_put_object(ucmd->ictx, &ioas->obj);
return rc;
@ -1215,8 +1221,10 @@ static int iommufd_test_md_check_pa(struct iommufd_ucmd *ucmd,
page_size = 1 << __ffs(mock->domain.pgsize_bitmap);
if (iova % page_size || length % page_size ||
(uintptr_t)uptr % page_size ||
check_add_overflow((uintptr_t)uptr, (uintptr_t)length, &end))
return -EINVAL;
check_add_overflow((uintptr_t)uptr, (uintptr_t)length, &end)) {
rc = -EINVAL;
goto out_put;
}
for (; length; length -= page_size) {
struct page *pages[1];

View File

@ -1,7 +1,7 @@
# SPDX-License-Identifier: GPL-2.0-only
menuconfig CAN_DEV
tristate "CAN Device Drivers"
bool "CAN Device Drivers"
default y
depends on CAN
help
@ -17,10 +17,7 @@ menuconfig CAN_DEV
virtual ones. If you own such devices or plan to use the virtual CAN
interfaces to develop applications, say Y here.
To compile as a module, choose M here: the module will be called
can-dev.
if CAN_DEV
if CAN_DEV && CAN
config CAN_VCAN
tristate "Virtual Local CAN Interface (vcan)"

View File

@ -7,7 +7,7 @@ obj-$(CONFIG_CAN_VCAN) += vcan.o
obj-$(CONFIG_CAN_VXCAN) += vxcan.o
obj-$(CONFIG_CAN_SLCAN) += slcan/
obj-y += dev/
obj-$(CONFIG_CAN_DEV) += dev/
obj-y += esd/
obj-y += rcar/
obj-y += rockchip/

View File

@ -1,9 +1,8 @@
# SPDX-License-Identifier: GPL-2.0
obj-$(CONFIG_CAN_DEV) += can-dev.o
can-dev-y += skb.o
obj-$(CONFIG_CAN) += can-dev.o
can-dev-$(CONFIG_CAN_DEV) += skb.o
can-dev-$(CONFIG_CAN_CALC_BITTIMING) += calc_bittiming.o
can-dev-$(CONFIG_CAN_NETLINK) += bittiming.o
can-dev-$(CONFIG_CAN_NETLINK) += dev.o

View File

@ -1074,7 +1074,7 @@ out_usb_free_urb:
usb_free_urb(urb);
out_usb_kill_anchored_urbs:
if (!parent->active_channels) {
usb_kill_anchored_urbs(&dev->tx_submitted);
usb_kill_anchored_urbs(&parent->rx_submitted);
if (dev->feature & GS_CAN_FEATURE_HW_TIMESTAMP)
gs_usb_timestamp_stop(parent);

View File

@ -444,9 +444,6 @@ static void gswip_remove(struct platform_device *pdev)
if (!priv)
return;
/* disable the switch */
gswip_disable_switch(priv);
dsa_unregister_switch(priv->ds);
for (i = 0; i < priv->num_gphy_fw; i++)

View File

@ -294,8 +294,6 @@ struct gswip_priv {
u16 version;
};
void gswip_disable_switch(struct gswip_priv *priv);
int gswip_probe_common(struct gswip_priv *priv, u32 version);
#endif /* __LANTIQ_GSWIP_H */

View File

@ -752,6 +752,13 @@ static int gswip_setup(struct dsa_switch *ds)
return 0;
}
static void gswip_teardown(struct dsa_switch *ds)
{
struct gswip_priv *priv = ds->priv;
regmap_clear_bits(priv->mdio, GSWIP_MDIO_GLOB, GSWIP_MDIO_GLOB_ENABLE);
}
static enum dsa_tag_protocol gswip_get_tag_protocol(struct dsa_switch *ds,
int port,
enum dsa_tag_protocol mp)
@ -1629,6 +1636,7 @@ static const struct phylink_mac_ops gswip_phylink_mac_ops = {
static const struct dsa_switch_ops gswip_switch_ops = {
.get_tag_protocol = gswip_get_tag_protocol,
.setup = gswip_setup,
.teardown = gswip_teardown,
.port_setup = gswip_port_setup,
.port_enable = gswip_port_enable,
.port_disable = gswip_port_disable,
@ -1656,12 +1664,6 @@ static const struct dsa_switch_ops gswip_switch_ops = {
.port_hsr_leave = dsa_port_simple_hsr_leave,
};
void gswip_disable_switch(struct gswip_priv *priv)
{
regmap_clear_bits(priv->mdio, GSWIP_MDIO_GLOB, GSWIP_MDIO_GLOB_ENABLE);
}
EXPORT_SYMBOL_GPL(gswip_disable_switch);
static int gswip_validate_cpu_port(struct dsa_switch *ds)
{
struct gswip_priv *priv = ds->priv;
@ -1718,15 +1720,14 @@ int gswip_probe_common(struct gswip_priv *priv, u32 version)
err = gswip_validate_cpu_port(priv->ds);
if (err)
goto disable_switch;
goto unregister_switch;
dev_info(priv->dev, "probed GSWIP version %lx mod %lx\n",
GSWIP_VERSION_REV(version), GSWIP_VERSION_MOD(version));
return 0;
disable_switch:
gswip_disable_switch(priv);
unregister_switch:
dsa_unregister_switch(priv->ds);
return err;

View File

@ -11,10 +11,12 @@
#include <linux/bits.h>
#include <linux/delay.h>
#include <linux/jiffies.h>
#include <linux/module.h>
#include <linux/of_device.h>
#include <linux/of_mdio.h>
#include <linux/regmap.h>
#include <linux/workqueue.h>
#include <net/dsa.h>
#include "lantiq_gswip.h"
@ -29,6 +31,7 @@ struct gsw1xx_priv {
struct regmap *clk;
struct regmap *shell;
struct phylink_pcs pcs;
struct delayed_work clear_raneg;
phy_interface_t tbi_interface;
struct gswip_priv gswip;
};
@ -145,7 +148,9 @@ static void gsw1xx_pcs_disable(struct phylink_pcs *pcs)
{
struct gsw1xx_priv *priv = pcs_to_gsw1xx(pcs);
/* Assert SGMII shell reset */
cancel_delayed_work_sync(&priv->clear_raneg);
/* Assert SGMII shell reset (will also clear RANEG bit) */
regmap_set_bits(priv->shell, GSW1XX_SHELL_RST_REQ,
GSW1XX_RST_REQ_SGMII_SHELL);
@ -255,10 +260,16 @@ static int gsw1xx_pcs_reset(struct gsw1xx_priv *priv)
FIELD_PREP(GSW1XX_SGMII_PHY_RX0_CFG2_FILT_CNT,
GSW1XX_SGMII_PHY_RX0_CFG2_FILT_CNT_DEF);
/* TODO: Take care of inverted RX pair once generic property is
/* RX lane seems to be inverted internally, so bit
* GSW1XX_SGMII_PHY_RX0_CFG2_INVERT needs to be set for normal
* (ie. non-inverted) operation.
*
* TODO: Take care of inverted RX pair once generic property is
* available
*/
val |= GSW1XX_SGMII_PHY_RX0_CFG2_INVERT;
ret = regmap_write(priv->sgmii, GSW1XX_SGMII_PHY_RX0_CFG2, val);
if (ret < 0)
return ret;
@ -422,12 +433,29 @@ static int gsw1xx_pcs_config(struct phylink_pcs *pcs, unsigned int neg_mode,
return 0;
}
static void gsw1xx_pcs_clear_raneg(struct work_struct *work)
{
struct gsw1xx_priv *priv =
container_of(work, struct gsw1xx_priv, clear_raneg.work);
regmap_clear_bits(priv->sgmii, GSW1XX_SGMII_TBI_ANEGCTL,
GSW1XX_SGMII_TBI_ANEGCTL_RANEG);
}
static void gsw1xx_pcs_an_restart(struct phylink_pcs *pcs)
{
struct gsw1xx_priv *priv = pcs_to_gsw1xx(pcs);
cancel_delayed_work_sync(&priv->clear_raneg);
regmap_set_bits(priv->sgmii, GSW1XX_SGMII_TBI_ANEGCTL,
GSW1XX_SGMII_TBI_ANEGCTL_RANEG);
/* despite being documented as self-clearing, the RANEG bit
* sometimes remains set, preventing auto-negotiation from happening.
* MaxLinear advises to manually clear the bit after 10ms.
*/
schedule_delayed_work(&priv->clear_raneg, msecs_to_jiffies(10));
}
static void gsw1xx_pcs_link_up(struct phylink_pcs *pcs,
@ -630,6 +658,8 @@ static int gsw1xx_probe(struct mdio_device *mdiodev)
if (ret)
return ret;
INIT_DELAYED_WORK(&priv->clear_raneg, gsw1xx_pcs_clear_raneg);
ret = gswip_probe_common(&priv->gswip, version);
if (ret)
return ret;
@ -642,25 +672,31 @@ static int gsw1xx_probe(struct mdio_device *mdiodev)
static void gsw1xx_remove(struct mdio_device *mdiodev)
{
struct gswip_priv *priv = dev_get_drvdata(&mdiodev->dev);
struct gsw1xx_priv *gsw1xx_priv;
if (!priv)
return;
gswip_disable_switch(priv);
dsa_unregister_switch(priv->ds);
gsw1xx_priv = container_of(priv, struct gsw1xx_priv, gswip);
cancel_delayed_work_sync(&gsw1xx_priv->clear_raneg);
}
static void gsw1xx_shutdown(struct mdio_device *mdiodev)
{
struct gswip_priv *priv = dev_get_drvdata(&mdiodev->dev);
struct gsw1xx_priv *gsw1xx_priv;
if (!priv)
return;
dsa_switch_shutdown(priv->ds);
dev_set_drvdata(&mdiodev->dev, NULL);
gswip_disable_switch(priv);
gsw1xx_priv = container_of(priv, struct gsw1xx_priv, gswip);
cancel_delayed_work_sync(&gsw1xx_priv->clear_raneg);
}
static const struct gswip_hw_info gsw12x_data = {

View File

@ -1790,6 +1790,9 @@ static int b44_nway_reset(struct net_device *dev)
u32 bmcr;
int r;
if (bp->flags & B44_FLAG_EXTERNAL_PHY)
return phy_ethtool_nway_reset(dev);
spin_lock_irq(&bp->lock);
b44_readphy(bp, MII_BMCR, &bmcr);
b44_readphy(bp, MII_BMCR, &bmcr);

View File

@ -268,13 +268,11 @@ bool bnxt_rx_xdp(struct bnxt *bp, struct bnxt_rx_ring_info *rxr, u16 cons,
case XDP_TX:
rx_buf = &rxr->rx_buf_ring[cons];
mapping = rx_buf->mapping - bp->rx_dma_offset;
*event &= BNXT_TX_CMP_EVENT;
if (unlikely(xdp_buff_has_frags(xdp))) {
struct skb_shared_info *sinfo = xdp_get_shared_info_from_buff(xdp);
tx_needed += sinfo->nr_frags;
*event = BNXT_AGG_EVENT;
}
if (tx_avail < tx_needed) {
@ -287,6 +285,7 @@ bool bnxt_rx_xdp(struct bnxt *bp, struct bnxt_rx_ring_info *rxr, u16 cons,
dma_sync_single_for_device(&pdev->dev, mapping + offset, *len,
bp->rx_dir);
*event &= ~BNXT_RX_EVENT;
*event |= BNXT_TX_EVENT;
__bnxt_xmit_xdp(bp, txr, mapping + offset, *len,
NEXT_RX(rxr->rx_prod), xdp);

View File

@ -1787,7 +1787,8 @@ int enetc_xdp_xmit(struct net_device *ndev, int num_frames,
int xdp_tx_bd_cnt, i, k;
int xdp_tx_frm_cnt = 0;
if (unlikely(test_bit(ENETC_TX_DOWN, &priv->flags)))
if (unlikely(test_bit(ENETC_TX_DOWN, &priv->flags) ||
!netif_carrier_ok(ndev)))
return -ENETDOWN;
enetc_lock_mdio();

View File

@ -3933,7 +3933,12 @@ static int fec_enet_txq_xmit_frame(struct fec_enet_private *fep,
txq->bd.cur = bdp;
/* Trigger transmission start */
writel(0, txq->bd.reg_desc_active);
if (!(fep->quirks & FEC_QUIRK_ERR007885) ||
!readl(txq->bd.reg_desc_active) ||
!readl(txq->bd.reg_desc_active) ||
!readl(txq->bd.reg_desc_active) ||
!readl(txq->bd.reg_desc_active))
writel(0, txq->bd.reg_desc_active);
return 0;
}

View File

@ -647,12 +647,9 @@ static int gve_setup_device_resources(struct gve_priv *priv)
err = gve_alloc_counter_array(priv);
if (err)
goto abort_with_rss_config_cache;
err = gve_init_clock(priv);
if (err)
goto abort_with_counter;
err = gve_alloc_notify_blocks(priv);
if (err)
goto abort_with_clock;
goto abort_with_counter;
err = gve_alloc_stats_report(priv);
if (err)
goto abort_with_ntfy_blocks;
@ -683,10 +680,16 @@ static int gve_setup_device_resources(struct gve_priv *priv)
}
}
err = gve_init_clock(priv);
if (err) {
dev_err(&priv->pdev->dev, "Failed to init clock");
goto abort_with_ptype_lut;
}
err = gve_init_rss_config(priv, priv->rx_cfg.num_queues);
if (err) {
dev_err(&priv->pdev->dev, "Failed to init RSS config");
goto abort_with_ptype_lut;
goto abort_with_clock;
}
err = gve_adminq_report_stats(priv, priv->stats_report_len,
@ -698,6 +701,8 @@ static int gve_setup_device_resources(struct gve_priv *priv)
gve_set_device_resources_ok(priv);
return 0;
abort_with_clock:
gve_teardown_clock(priv);
abort_with_ptype_lut:
kvfree(priv->ptype_lut_dqo);
priv->ptype_lut_dqo = NULL;
@ -705,8 +710,6 @@ abort_with_stats_report:
gve_free_stats_report(priv);
abort_with_ntfy_blocks:
gve_free_notify_blocks(priv);
abort_with_clock:
gve_teardown_clock(priv);
abort_with_counter:
gve_free_counter_array(priv);
abort_with_rss_config_cache:

View File

@ -10555,6 +10555,9 @@ int hclge_set_vlan_filter(struct hnae3_handle *handle, __be16 proto,
bool writen_to_tbl = false;
int ret = 0;
if (vlan_id >= VLAN_N_VID)
return -EINVAL;
/* When device is resetting or reset failed, firmware is unable to
* handle mailbox. Just record the vlan id, and remove it after
* reset finished.

View File

@ -193,10 +193,10 @@ static int hclge_get_ring_chain_from_mbx(
return -EINVAL;
for (i = 0; i < ring_num; i++) {
if (req->msg.param[i].tqp_index >= vport->nic.kinfo.rss_size) {
if (req->msg.param[i].tqp_index >= vport->nic.kinfo.num_tqps) {
dev_err(&hdev->pdev->dev, "tqp index(%u) is out of range(0-%u)\n",
req->msg.param[i].tqp_index,
vport->nic.kinfo.rss_size - 1U);
vport->nic.kinfo.num_tqps - 1U);
return -EINVAL;
}
}

View File

@ -368,12 +368,12 @@ static int hclgevf_knic_setup(struct hclgevf_dev *hdev)
new_tqps = kinfo->rss_size * num_tc;
kinfo->num_tqps = min(new_tqps, hdev->num_tqps);
kinfo->tqp = devm_kcalloc(&hdev->pdev->dev, kinfo->num_tqps,
kinfo->tqp = devm_kcalloc(&hdev->pdev->dev, hdev->num_tqps,
sizeof(struct hnae3_queue *), GFP_KERNEL);
if (!kinfo->tqp)
return -ENOMEM;
for (i = 0; i < kinfo->num_tqps; i++) {
for (i = 0; i < hdev->num_tqps; i++) {
hdev->htqp[i].q.handle = &hdev->nic;
hdev->htqp[i].q.tqp_index = i;
kinfo->tqp[i] = &hdev->htqp[i].q;

View File

@ -197,6 +197,11 @@ static int mlx5_devlink_reload_down(struct devlink *devlink, bool netns_change,
struct pci_dev *pdev = dev->pdev;
int ret = 0;
if (mlx5_fw_reset_in_progress(dev)) {
NL_SET_ERR_MSG_MOD(extack, "Can't reload during firmware reset");
return -EBUSY;
}
if (mlx5_dev_is_lightweight(dev)) {
if (action != DEVLINK_RELOAD_ACTION_DRIVER_REINIT)
return -EOPNOTSUPP;

View File

@ -33,6 +33,7 @@
#include "lib/eq.h"
#include "fw_tracer.h"
#include "fw_tracer_tracepoint.h"
#include <linux/ctype.h>
static int mlx5_query_mtrc_caps(struct mlx5_fw_tracer *tracer)
{
@ -358,6 +359,47 @@ static const char *VAL_PARM = "%llx";
static const char *REPLACE_64_VAL_PARM = "%x%x";
static const char *PARAM_CHAR = "%";
static bool mlx5_is_valid_spec(const char *str)
{
/* Parse format specifiers to find the actual type.
* Structure: %[flags][width][.precision][length]type
* Skip flags, width, precision & length.
*/
while (isdigit(*str) || *str == '#' || *str == '.' || *str == 'l')
str++;
/* Check if it's a valid integer/hex specifier or %%:
* Valid formats: %x, %d, %i, %u, etc.
*/
if (*str != 'x' && *str != 'X' && *str != 'd' && *str != 'i' &&
*str != 'u' && *str != 'c' && *str != '%')
return false;
return true;
}
static bool mlx5_tracer_validate_params(const char *str)
{
const char *substr = str;
if (!str)
return false;
substr = strstr(substr, PARAM_CHAR);
while (substr) {
if (!mlx5_is_valid_spec(substr + 1))
return false;
if (*(substr + 1) == '%')
substr = strstr(substr + 2, PARAM_CHAR);
else
substr = strstr(substr + 1, PARAM_CHAR);
}
return true;
}
static int mlx5_tracer_message_hash(u32 message_id)
{
return jhash_1word(message_id, 0) & (MESSAGE_HASH_SIZE - 1);
@ -419,6 +461,10 @@ static int mlx5_tracer_get_num_of_params(char *str)
char *substr, *pstr = str;
int num_of_params = 0;
/* Validate that all parameters are valid before processing */
if (!mlx5_tracer_validate_params(str))
return -EINVAL;
/* replace %llx with %x%x */
substr = strstr(pstr, VAL_PARM);
while (substr) {
@ -427,11 +473,15 @@ static int mlx5_tracer_get_num_of_params(char *str)
substr = strstr(pstr, VAL_PARM);
}
/* count all the % characters */
/* count all the % characters, but skip %% (escaped percent) */
substr = strstr(str, PARAM_CHAR);
while (substr) {
num_of_params += 1;
str = substr + 1;
if (*(substr + 1) != '%') {
num_of_params += 1;
str = substr + 1;
} else {
str = substr + 2;
}
substr = strstr(str, PARAM_CHAR);
}
@ -570,14 +620,17 @@ void mlx5_tracer_print_trace(struct tracer_string_format *str_frmt,
{
char tmp[512];
snprintf(tmp, sizeof(tmp), str_frmt->string,
str_frmt->params[0],
str_frmt->params[1],
str_frmt->params[2],
str_frmt->params[3],
str_frmt->params[4],
str_frmt->params[5],
str_frmt->params[6]);
if (str_frmt->invalid_string)
snprintf(tmp, sizeof(tmp), "BAD_FORMAT: %s", str_frmt->string);
else
snprintf(tmp, sizeof(tmp), str_frmt->string,
str_frmt->params[0],
str_frmt->params[1],
str_frmt->params[2],
str_frmt->params[3],
str_frmt->params[4],
str_frmt->params[5],
str_frmt->params[6]);
trace_mlx5_fw(dev->tracer, trace_timestamp, str_frmt->lost,
str_frmt->event_id, tmp);
@ -609,6 +662,13 @@ static int mlx5_tracer_handle_raw_string(struct mlx5_fw_tracer *tracer,
return 0;
}
static void mlx5_tracer_handle_bad_format_string(struct mlx5_fw_tracer *tracer,
struct tracer_string_format *cur_string)
{
cur_string->invalid_string = true;
list_add_tail(&cur_string->list, &tracer->ready_strings_list);
}
static int mlx5_tracer_handle_string_trace(struct mlx5_fw_tracer *tracer,
struct tracer_event *tracer_event)
{
@ -619,12 +679,18 @@ static int mlx5_tracer_handle_string_trace(struct mlx5_fw_tracer *tracer,
if (!cur_string)
return mlx5_tracer_handle_raw_string(tracer, tracer_event);
cur_string->num_of_params = mlx5_tracer_get_num_of_params(cur_string->string);
cur_string->last_param_num = 0;
cur_string->event_id = tracer_event->event_id;
cur_string->tmsn = tracer_event->string_event.tmsn;
cur_string->timestamp = tracer_event->string_event.timestamp;
cur_string->lost = tracer_event->lost_event;
cur_string->last_param_num = 0;
cur_string->num_of_params = mlx5_tracer_get_num_of_params(cur_string->string);
if (cur_string->num_of_params < 0) {
pr_debug("%s Invalid format string parameters\n",
__func__);
mlx5_tracer_handle_bad_format_string(tracer, cur_string);
return 0;
}
if (cur_string->num_of_params == 0) /* trace with no params */
list_add_tail(&cur_string->list, &tracer->ready_strings_list);
} else {
@ -634,6 +700,11 @@ static int mlx5_tracer_handle_string_trace(struct mlx5_fw_tracer *tracer,
__func__, tracer_event->string_event.tmsn);
return mlx5_tracer_handle_raw_string(tracer, tracer_event);
}
if (cur_string->num_of_params < 0) {
pr_debug("%s string parameter of invalid string, dumping\n",
__func__);
return 0;
}
cur_string->last_param_num += 1;
if (cur_string->last_param_num > TRACER_MAX_PARAMS) {
pr_debug("%s Number of params exceeds the max (%d)\n",

View File

@ -125,6 +125,7 @@ struct tracer_string_format {
struct list_head list;
u32 timestamp;
bool lost;
bool invalid_string;
};
enum mlx5_fw_tracer_ownership_state {

View File

@ -69,7 +69,7 @@ struct page_pool;
#define MLX5E_METADATA_ETHER_TYPE (0x8CE4)
#define MLX5E_METADATA_ETHER_LEN 8
#define MLX5E_ETH_HARD_MTU (ETH_HLEN + PSP_ENCAP_HLEN + PSP_TRL_SIZE + VLAN_HLEN + ETH_FCS_LEN)
#define MLX5E_ETH_HARD_MTU (ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN)
#define MLX5E_HW2SW_MTU(params, hwmtu) ((hwmtu) - ((params)->hard_mtu))
#define MLX5E_SW2HW_MTU(params, swmtu) ((swmtu) + ((params)->hard_mtu))

View File

@ -342,9 +342,8 @@ static void mlx5e_ipsec_init_macs(struct mlx5e_ipsec_sa_entry *sa_entry,
rt_dst_entry = &rt->dst;
break;
case AF_INET6:
rt_dst_entry = ipv6_stub->ipv6_dst_lookup_flow(
dev_net(netdev), NULL, &fl6, NULL);
if (IS_ERR(rt_dst_entry))
if (!IS_ENABLED(CONFIG_IPV6) ||
ip6_dst_lookup(dev_net(netdev), NULL, &rt_dst_entry, &fl6))
goto neigh;
break;
default:
@ -359,6 +358,9 @@ static void mlx5e_ipsec_init_macs(struct mlx5e_ipsec_sa_entry *sa_entry,
neigh_ha_snapshot(addr, n, netdev);
ether_addr_copy(dst, addr);
if (attrs->dir == XFRM_DEV_OFFLOAD_OUT &&
is_zero_ether_addr(addr))
neigh_event_send(n, NULL);
dst_release(rt_dst_entry);
neigh_release(n);
return;

View File

@ -6825,7 +6825,6 @@ static void _mlx5e_remove(struct auxiliary_device *adev)
* is already unregistered before changing to NIC profile.
*/
if (priv->netdev->reg_state == NETREG_REGISTERED) {
mlx5e_psp_unregister(priv);
unregister_netdev(priv->netdev);
_mlx5e_suspend(adev, false);
} else {

View File

@ -939,7 +939,11 @@ void mlx5e_free_txqsq_descs(struct mlx5e_txqsq *sq)
sq->dma_fifo_cc = dma_fifo_cc;
sq->cc = sqcc;
netdev_tx_completed_queue(sq->txq, npkts, nbytes);
/* Do not update BQL for TXQs that got replaced by new active ones, as
* netdev_tx_reset_queue() is called for them in mlx5e_activate_txqsq().
*/
if (sq == sq->priv->txq2sq[sq->txq_ix])
netdev_tx_completed_queue(sq->txq, npkts, nbytes);
}
#ifdef CONFIG_MLX5_CORE_IPOIB

View File

@ -52,6 +52,7 @@
#include "devlink.h"
#include "lag/lag.h"
#include "en/tc/post_meter.h"
#include "fw_reset.h"
/* There are two match-all miss flows, one for unicast dst mac and
* one for multicast.
@ -3991,6 +3992,11 @@ int mlx5_devlink_eswitch_mode_set(struct devlink *devlink, u16 mode,
if (IS_ERR(esw))
return PTR_ERR(esw);
if (mlx5_fw_reset_in_progress(esw->dev)) {
NL_SET_ERR_MSG_MOD(extack, "Can't change eswitch mode during firmware reset");
return -EBUSY;
}
if (esw_mode_from_devlink(mode, &mlx5_mode))
return -EINVAL;

View File

@ -15,6 +15,7 @@ enum {
MLX5_FW_RESET_FLAGS_DROP_NEW_REQUESTS,
MLX5_FW_RESET_FLAGS_RELOAD_REQUIRED,
MLX5_FW_RESET_FLAGS_UNLOAD_EVENT,
MLX5_FW_RESET_FLAGS_RESET_IN_PROGRESS,
};
struct mlx5_fw_reset {
@ -128,6 +129,16 @@ int mlx5_fw_reset_query(struct mlx5_core_dev *dev, u8 *reset_level, u8 *reset_ty
return mlx5_reg_mfrl_query(dev, reset_level, reset_type, NULL, NULL);
}
bool mlx5_fw_reset_in_progress(struct mlx5_core_dev *dev)
{
struct mlx5_fw_reset *fw_reset = dev->priv.fw_reset;
if (!fw_reset)
return false;
return test_bit(MLX5_FW_RESET_FLAGS_RESET_IN_PROGRESS, &fw_reset->reset_flags);
}
static int mlx5_fw_reset_get_reset_method(struct mlx5_core_dev *dev,
u8 *reset_method)
{
@ -243,6 +254,8 @@ static void mlx5_fw_reset_complete_reload(struct mlx5_core_dev *dev)
BIT(DEVLINK_RELOAD_ACTION_FW_ACTIVATE));
devl_unlock(devlink);
}
clear_bit(MLX5_FW_RESET_FLAGS_RESET_IN_PROGRESS, &fw_reset->reset_flags);
}
static void mlx5_stop_sync_reset_poll(struct mlx5_core_dev *dev)
@ -462,27 +475,48 @@ static void mlx5_sync_reset_request_event(struct work_struct *work)
struct mlx5_fw_reset *fw_reset = container_of(work, struct mlx5_fw_reset,
reset_request_work);
struct mlx5_core_dev *dev = fw_reset->dev;
bool nack_request = false;
struct devlink *devlink;
int err;
err = mlx5_fw_reset_get_reset_method(dev, &fw_reset->reset_method);
if (err)
if (err) {
nack_request = true;
mlx5_core_warn(dev, "Failed reading MFRL, err %d\n", err);
} else if (!mlx5_is_reset_now_capable(dev, fw_reset->reset_method) ||
test_bit(MLX5_FW_RESET_FLAGS_NACK_RESET_REQUEST,
&fw_reset->reset_flags)) {
nack_request = true;
}
if (err || test_bit(MLX5_FW_RESET_FLAGS_NACK_RESET_REQUEST, &fw_reset->reset_flags) ||
!mlx5_is_reset_now_capable(dev, fw_reset->reset_method)) {
devlink = priv_to_devlink(dev);
/* For external resets, try to acquire devl_lock. Skip if devlink reset is
* pending (lock already held)
*/
if (nack_request ||
(!test_bit(MLX5_FW_RESET_FLAGS_PENDING_COMP,
&fw_reset->reset_flags) &&
!devl_trylock(devlink))) {
err = mlx5_fw_reset_set_reset_sync_nack(dev);
mlx5_core_warn(dev, "PCI Sync FW Update Reset Nack %s",
err ? "Failed" : "Sent");
return;
}
if (mlx5_sync_reset_set_reset_requested(dev))
return;
goto unlock;
set_bit(MLX5_FW_RESET_FLAGS_RESET_IN_PROGRESS, &fw_reset->reset_flags);
err = mlx5_fw_reset_set_reset_sync_ack(dev);
if (err)
mlx5_core_warn(dev, "PCI Sync FW Update Reset Ack Failed. Error code: %d\n", err);
else
mlx5_core_warn(dev, "PCI Sync FW Update Reset Ack. Device reset is expected.\n");
unlock:
if (!test_bit(MLX5_FW_RESET_FLAGS_PENDING_COMP, &fw_reset->reset_flags))
devl_unlock(devlink);
}
static int mlx5_pci_link_toggle(struct mlx5_core_dev *dev, u16 dev_id)
@ -722,6 +756,8 @@ static void mlx5_sync_reset_abort_event(struct work_struct *work)
if (mlx5_sync_reset_clear_reset_requested(dev, true))
return;
clear_bit(MLX5_FW_RESET_FLAGS_RESET_IN_PROGRESS, &fw_reset->reset_flags);
mlx5_core_warn(dev, "PCI Sync FW Update Reset Aborted.\n");
}
@ -758,6 +794,7 @@ static void mlx5_sync_reset_timeout_work(struct work_struct *work)
if (mlx5_sync_reset_clear_reset_requested(dev, true))
return;
clear_bit(MLX5_FW_RESET_FLAGS_RESET_IN_PROGRESS, &fw_reset->reset_flags);
mlx5_core_warn(dev, "PCI Sync FW Update Reset Timeout.\n");
}
@ -844,7 +881,8 @@ void mlx5_drain_fw_reset(struct mlx5_core_dev *dev)
cancel_work_sync(&fw_reset->reset_reload_work);
cancel_work_sync(&fw_reset->reset_now_work);
cancel_work_sync(&fw_reset->reset_abort_work);
cancel_delayed_work(&fw_reset->reset_timeout_work);
if (test_bit(MLX5_FW_RESET_FLAGS_RESET_REQUESTED, &fw_reset->reset_flags))
mlx5_sync_reset_clear_reset_requested(dev, true);
}
static const struct devlink_param mlx5_fw_reset_devlink_params[] = {

View File

@ -10,6 +10,7 @@ int mlx5_fw_reset_query(struct mlx5_core_dev *dev, u8 *reset_level, u8 *reset_ty
int mlx5_fw_reset_set_reset_sync(struct mlx5_core_dev *dev, u8 reset_type_sel,
struct netlink_ext_ack *extack);
int mlx5_fw_reset_set_live_patch(struct mlx5_core_dev *dev);
bool mlx5_fw_reset_in_progress(struct mlx5_core_dev *dev);
int mlx5_fw_reset_wait_reset_done(struct mlx5_core_dev *dev);
void mlx5_sync_reset_unload_flow(struct mlx5_core_dev *dev, bool locked);

View File

@ -1413,6 +1413,7 @@ static int __mlx5_lag_dev_add_mdev(struct mlx5_core_dev *dev)
static void mlx5_lag_unregister_hca_devcom_comp(struct mlx5_core_dev *dev)
{
mlx5_devcom_unregister_component(dev->priv.hca_devcom_comp);
dev->priv.hca_devcom_comp = NULL;
}
static int mlx5_lag_register_hca_devcom_comp(struct mlx5_core_dev *dev)

View File

@ -67,12 +67,19 @@ err_metadata:
static int enable_mpesw(struct mlx5_lag *ldev)
{
int idx = mlx5_lag_get_dev_index_by_seq(ldev, MLX5_LAG_P1);
struct mlx5_core_dev *dev0;
int err;
int idx;
int i;
if (idx < 0 || ldev->mode != MLX5_LAG_MODE_NONE)
if (ldev->mode == MLX5_LAG_MODE_MPESW)
return 0;
if (ldev->mode != MLX5_LAG_MODE_NONE)
return -EINVAL;
idx = mlx5_lag_get_dev_index_by_seq(ldev, MLX5_LAG_P1);
if (idx < 0)
return -EINVAL;
dev0 = ldev->pf[idx].dev;

View File

@ -2231,6 +2231,7 @@ static void shutdown(struct pci_dev *pdev)
mlx5_core_info(dev, "Shutdown was called\n");
set_bit(MLX5_BREAK_FW_WAIT, &dev->intf_state);
mlx5_drain_fw_reset(dev);
mlx5_drain_health_wq(dev);
err = mlx5_try_fast_unload(dev);
if (err)

View File

@ -440,7 +440,9 @@ int mlxsw_sp_mr_route_add(struct mlxsw_sp_mr_table *mr_table,
rhashtable_remove_fast(&mr_table->route_ht,
&mr_orig_route->ht_node,
mlxsw_sp_mr_route_ht_params);
mutex_lock(&mr_table->route_list_lock);
list_del(&mr_orig_route->node);
mutex_unlock(&mr_table->route_list_lock);
mlxsw_sp_mr_route_destroy(mr_table, mr_orig_route);
}

View File

@ -2265,6 +2265,7 @@ mlxsw_sp_neigh_entry_alloc(struct mlxsw_sp *mlxsw_sp, struct neighbour *n,
if (!neigh_entry)
return NULL;
neigh_hold(n);
neigh_entry->key.n = n;
neigh_entry->rif = rif;
INIT_LIST_HEAD(&neigh_entry->nexthop_list);
@ -2274,6 +2275,7 @@ mlxsw_sp_neigh_entry_alloc(struct mlxsw_sp *mlxsw_sp, struct neighbour *n,
static void mlxsw_sp_neigh_entry_free(struct mlxsw_sp_neigh_entry *neigh_entry)
{
neigh_release(neigh_entry->key.n);
kfree(neigh_entry);
}
@ -2858,6 +2860,11 @@ static int mlxsw_sp_router_schedule_work(struct net *net,
if (!net_work)
return NOTIFY_BAD;
/* Take a reference to ensure the neighbour won't be destructed until
* we drop the reference in the work item.
*/
neigh_clone(n);
INIT_WORK(&net_work->work, cb);
net_work->mlxsw_sp = router->mlxsw_sp;
net_work->n = n;
@ -2881,11 +2888,6 @@ static int mlxsw_sp_router_schedule_neigh_work(struct mlxsw_sp_router *router,
struct net *net;
net = neigh_parms_net(n->parms);
/* Take a reference to ensure the neighbour won't be destructed until we
* drop the reference in delayed work.
*/
neigh_clone(n);
return mlxsw_sp_router_schedule_work(net, router, n,
mlxsw_sp_router_neigh_event_work);
}
@ -4320,6 +4322,8 @@ mlxsw_sp_nexthop_dead_neigh_replace(struct mlxsw_sp *mlxsw_sp,
if (err)
goto err_neigh_entry_insert;
neigh_release(old_n);
read_lock_bh(&n->lock);
nud_state = n->nud_state;
dead = n->dead;
@ -4328,14 +4332,10 @@ mlxsw_sp_nexthop_dead_neigh_replace(struct mlxsw_sp *mlxsw_sp,
list_for_each_entry(nh, &neigh_entry->nexthop_list,
neigh_list_node) {
neigh_release(old_n);
neigh_clone(n);
__mlxsw_sp_nexthop_neigh_update(nh, !entry_connected);
mlxsw_sp_nexthop_group_refresh(mlxsw_sp, nh->nhgi->nh_grp);
}
neigh_release(n);
return 0;
err_neigh_entry_insert:
@ -4428,6 +4428,11 @@ static int mlxsw_sp_nexthop_neigh_init(struct mlxsw_sp *mlxsw_sp,
}
}
/* Release the reference taken by neigh_lookup() / neigh_create() since
* neigh_entry already holds one.
*/
neigh_release(n);
/* If that is the first nexthop connected to that neigh, add to
* nexthop_neighs_list
*/
@ -4454,11 +4459,9 @@ static void mlxsw_sp_nexthop_neigh_fini(struct mlxsw_sp *mlxsw_sp,
struct mlxsw_sp_nexthop *nh)
{
struct mlxsw_sp_neigh_entry *neigh_entry = nh->neigh_entry;
struct neighbour *n;
if (!neigh_entry)
return;
n = neigh_entry->key.n;
__mlxsw_sp_nexthop_neigh_update(nh, true);
list_del(&nh->neigh_list_node);
@ -4472,8 +4475,6 @@ static void mlxsw_sp_nexthop_neigh_fini(struct mlxsw_sp *mlxsw_sp,
if (!neigh_entry->connected && list_empty(&neigh_entry->nexthop_list))
mlxsw_sp_neigh_entry_destroy(mlxsw_sp, neigh_entry);
neigh_release(n);
}
static bool mlxsw_sp_ipip_netdev_ul_up(struct net_device *ol_dev)

View File

@ -2655,9 +2655,6 @@ static void rtl_wol_enable_rx(struct rtl8169_private *tp)
static void rtl_prepare_power_down(struct rtl8169_private *tp)
{
if (tp->dash_enabled)
return;
if (tp->mac_version == RTL_GIGA_MAC_VER_32 ||
tp->mac_version == RTL_GIGA_MAC_VER_33)
rtl_ephy_write(tp, 0x19, 0xff64);
@ -4812,7 +4809,7 @@ static void rtl8169_down(struct rtl8169_private *tp)
rtl_disable_exit_l1(tp);
rtl_prepare_power_down(tp);
if (tp->dash_type != RTL_DASH_NONE)
if (tp->dash_type != RTL_DASH_NONE && !tp->saved_wolopts)
rtl8168_driver_stop(tp);
}

View File

@ -209,6 +209,7 @@ config TI_ICSSG_PRUETH_SR1
depends on PRU_REMOTEPROC
depends on NET_SWITCHDEV
depends on ARCH_K3 && OF && TI_K3_UDMA_GLUE_LAYER
depends on PTP_1588_CLOCK_OPTIONAL
help
Support dual Gigabit Ethernet ports over the ICSSG PRU Subsystem.
This subsystem is available on the AM65 SR1.0 platform.
@ -234,7 +235,7 @@ config TI_PRUETH
depends on PRU_REMOTEPROC
depends on NET_SWITCHDEV
select TI_ICSS_IEP
imply PTP_1588_CLOCK
depends on PTP_1588_CLOCK_OPTIONAL
help
Some TI SoCs has Programmable Realtime Unit (PRU) cores which can
support Single or Dual Ethernet ports with the help of firmware code

View File

@ -737,6 +737,9 @@ static rx_handler_result_t ipvlan_handle_mode_l2(struct sk_buff **pskb,
struct ethhdr *eth = eth_hdr(skb);
rx_handler_result_t ret = RX_HANDLER_PASS;
if (unlikely(skb->pkt_type == PACKET_LOOPBACK))
return RX_HANDLER_PASS;
if (is_multicast_ether_addr(eth->h_dest)) {
if (ipvlan_external_frame(skb, port)) {
struct sk_buff *nskb = skb_clone(skb, GFP_ATOMIC);

View File

@ -698,7 +698,7 @@ static int mv88q2xxx_hwmon_write(struct device *dev,
switch (attr) {
case hwmon_temp_max:
clamp_val(val, -75000, 180000);
val = clamp(val, -75000, 180000);
val = (val / 1000) + 75;
val = FIELD_PREP(MDIO_MMD_PCS_MV_TEMP_SENSOR3_INT_THRESH_MASK,
val);

View File

@ -691,10 +691,6 @@ static int rtl8211f_config_aldps(struct phy_device *phydev)
static int rtl8211f_config_phy_eee(struct phy_device *phydev)
{
/* RTL8211FVD has no PHYCR2 register */
if (phydev->drv->phy_id == RTL_8211FVD_PHYID)
return 0;
/* Disable PHY-mode EEE so LPI is passed to the MAC */
return phy_modify_paged(phydev, RTL8211F_PHYCR_PAGE, RTL8211F_PHYCR2,
RTL8211F_PHYCR2_PHY_EEE_ENABLE, 0);

View File

@ -497,6 +497,8 @@ static const struct sfp_quirk sfp_quirks[] = {
SFP_QUIRK("ALCATELLUCENT", "3FE46541AA", sfp_quirk_2500basex,
sfp_fixup_nokia),
SFP_QUIRK_F("BIDB", "X-ONU-SFPP", sfp_fixup_potron),
// FLYPRO SFP-10GT-CS-30M uses Rollball protocol to talk to the PHY.
SFP_QUIRK_F("FLYPRO", "SFP-10GT-CS-30M", sfp_fixup_rollball),

View File

@ -406,7 +406,7 @@ static int pn533_acr122_poweron_rdr(struct pn533_usb_phy *phy)
if (rc || (transferred != sizeof(cmd))) {
nfc_err(&phy->udev->dev,
"Reader power on cmd error %d\n", rc);
return rc;
return rc ?: -EINVAL;
}
rc = usb_submit_urb(phy->in_urb, GFP_KERNEL);

View File

@ -2032,7 +2032,7 @@ end:
return ret;
}
int rapl_package_add_pmu(struct rapl_package *rp)
int rapl_package_add_pmu_locked(struct rapl_package *rp)
{
struct rapl_package_pmu_data *data = &rp->pmu_data;
int idx;
@ -2040,8 +2040,6 @@ int rapl_package_add_pmu(struct rapl_package *rp)
if (rp->has_pmu)
return -EEXIST;
guard(cpus_read_lock)();
for (idx = 0; idx < rp->nr_domains; idx++) {
struct rapl_domain *rd = &rp->domains[idx];
int domain = rd->id;
@ -2091,17 +2089,23 @@ int rapl_package_add_pmu(struct rapl_package *rp)
return rapl_pmu_update(rp);
}
EXPORT_SYMBOL_GPL(rapl_package_add_pmu_locked);
int rapl_package_add_pmu(struct rapl_package *rp)
{
guard(cpus_read_lock)();
return rapl_package_add_pmu_locked(rp);
}
EXPORT_SYMBOL_GPL(rapl_package_add_pmu);
void rapl_package_remove_pmu(struct rapl_package *rp)
void rapl_package_remove_pmu_locked(struct rapl_package *rp)
{
struct rapl_package *pos;
if (!rp->has_pmu)
return;
guard(cpus_read_lock)();
list_for_each_entry(pos, &rapl_packages, plist) {
/* PMU is still needed */
if (pos->has_pmu && pos != rp)
@ -2111,6 +2115,14 @@ void rapl_package_remove_pmu(struct rapl_package *rp)
perf_pmu_unregister(&rapl_pmu.pmu);
memset(&rapl_pmu, 0, sizeof(struct rapl_pmu));
}
EXPORT_SYMBOL_GPL(rapl_package_remove_pmu_locked);
void rapl_package_remove_pmu(struct rapl_package *rp)
{
guard(cpus_read_lock)();
rapl_package_remove_pmu_locked(rp);
}
EXPORT_SYMBOL_GPL(rapl_package_remove_pmu);
#endif

View File

@ -82,7 +82,7 @@ static int rapl_cpu_online(unsigned int cpu)
if (IS_ERR(rp))
return PTR_ERR(rp);
if (rapl_msr_pmu)
rapl_package_add_pmu(rp);
rapl_package_add_pmu_locked(rp);
}
cpumask_set_cpu(cpu, &rp->cpumask);
return 0;
@ -101,7 +101,7 @@ static int rapl_cpu_down_prep(unsigned int cpu)
lead_cpu = cpumask_first(&rp->cpumask);
if (lead_cpu >= nr_cpu_ids) {
if (rapl_msr_pmu)
rapl_package_remove_pmu(rp);
rapl_package_remove_pmu_locked(rp);
rapl_remove_package_cpuslocked(rp);
} else if (rp->lead_cpu == cpu) {
rp->lead_cpu = lead_cpu;

View File

@ -68,7 +68,7 @@ static ssize_t show_constraint_##_attr(struct device *dev, \
int id; \
struct powercap_zone_constraint *pconst;\
\
if (!sscanf(dev_attr->attr.name, "constraint_%d_", &id)) \
if (sscanf(dev_attr->attr.name, "constraint_%d_", &id) != 1) \
return -EINVAL; \
if (id >= power_zone->const_id_cnt) \
return -EINVAL; \
@ -93,7 +93,7 @@ static ssize_t store_constraint_##_attr(struct device *dev,\
int id; \
struct powercap_zone_constraint *pconst;\
\
if (!sscanf(dev_attr->attr.name, "constraint_%d_", &id)) \
if (sscanf(dev_attr->attr.name, "constraint_%d_", &id) != 1) \
return -EINVAL; \
if (id >= power_zone->const_id_cnt) \
return -EINVAL; \
@ -162,7 +162,7 @@ static ssize_t show_constraint_name(struct device *dev,
ssize_t len = -ENODATA;
struct powercap_zone_constraint *pconst;
if (!sscanf(dev_attr->attr.name, "constraint_%d_", &id))
if (sscanf(dev_attr->attr.name, "constraint_%d_", &id) != 1)
return -EINVAL;
if (id >= power_zone->const_id_cnt)
return -EINVAL;
@ -625,17 +625,23 @@ struct powercap_control_type *powercap_register_control_type(
INIT_LIST_HEAD(&control_type->node);
control_type->dev.class = &powercap_class;
dev_set_name(&control_type->dev, "%s", name);
result = device_register(&control_type->dev);
if (result) {
put_device(&control_type->dev);
return ERR_PTR(result);
}
idr_init(&control_type->idr);
mutex_lock(&powercap_cntrl_list_lock);
list_add_tail(&control_type->node, &powercap_cntrl_list);
mutex_unlock(&powercap_cntrl_list_lock);
result = device_register(&control_type->dev);
if (result) {
mutex_lock(&powercap_cntrl_list_lock);
list_del(&control_type->node);
mutex_unlock(&powercap_cntrl_list_lock);
idr_destroy(&control_type->idr);
put_device(&control_type->dev);
return ERR_PTR(result);
}
return control_type;
}
EXPORT_SYMBOL_GPL(powercap_register_control_type);

View File

@ -503,7 +503,8 @@ static const struct pci_device_id proc_thermal_pci_ids[] = {
{ PCI_DEVICE_DATA(INTEL, WCL_THERMAL, PROC_THERMAL_FEATURE_MSI_SUPPORT |
PROC_THERMAL_FEATURE_RAPL | PROC_THERMAL_FEATURE_DLVR |
PROC_THERMAL_FEATURE_DVFS | PROC_THERMAL_FEATURE_WT_HINT |
PROC_THERMAL_FEATURE_POWER_FLOOR | PROC_THERMAL_FEATURE_PTC) },
PROC_THERMAL_FEATURE_POWER_FLOOR | PROC_THERMAL_FEATURE_PTC |
PROC_THERMAL_FEATURE_SOC_POWER_SLIDER) },
{ PCI_DEVICE_DATA(INTEL, NVL_H_THERMAL, PROC_THERMAL_FEATURE_RAPL |
PROC_THERMAL_FEATURE_DLVR | PROC_THERMAL_FEATURE_DVFS |
PROC_THERMAL_FEATURE_MSI_SUPPORT | PROC_THERMAL_FEATURE_WT_HINT |

View File

@ -500,7 +500,7 @@ void thermal_zone_set_trip_hyst(struct thermal_zone_device *tz,
WRITE_ONCE(trip->hysteresis, hyst);
thermal_notify_tz_trip_change(tz, trip);
/*
* If the zone temperature is above or at the trip tmperature, the trip
* If the zone temperature is above or at the trip temperature, the trip
* is in the trips_reached list and its threshold is equal to its low
* temperature. It needs to stay in that list, but its threshold needs
* to be updated and the list ordering may need to be restored.
@ -1043,7 +1043,7 @@ static void thermal_cooling_device_init_complete(struct thermal_cooling_device *
* @np: a pointer to a device tree node.
* @type: the thermal cooling device type.
* @devdata: device private data.
* @ops: standard thermal cooling devices callbacks.
* @ops: standard thermal cooling devices callbacks.
*
* This interface function adds a new thermal cooling device (fan/processor/...)
* to /sys/class/thermal/ folder as cooling_device[0-*]. It tries to bind itself

View File

@ -2,6 +2,7 @@
#include <linux/fs.h>
#include <linux/security.h>
#include <linux/fscrypt.h>
#include <linux/fsnotify.h>
#include <linux/fileattr.h>
#include <linux/export.h>
#include <linux/syscalls.h>
@ -298,6 +299,7 @@ int vfs_fileattr_set(struct mnt_idmap *idmap, struct dentry *dentry,
err = inode->i_op->fileattr_set(idmap, dentry, fa);
if (err)
goto out;
fsnotify_xattr(dentry);
}
out:

View File

@ -270,8 +270,15 @@ int __fsnotify_parent(struct dentry *dentry, __u32 mask, const void *data,
/*
* Include parent/name in notification either if some notification
* groups require parent info or the parent is interested in this event.
* The parent interest in ACCESS/MODIFY events does not apply to special
* files, where read/write are not on the filesystem of the parent and
* events can provide an undesirable side-channel for information
* exfiltration.
*/
parent_interested = mask & p_mask & ALL_FSNOTIFY_EVENTS;
parent_interested = mask & p_mask & ALL_FSNOTIFY_EVENTS &&
!(data_type == FSNOTIFY_EVENT_PATH &&
d_is_special(dentry) &&
(mask & (FS_ACCESS | FS_MODIFY)));
if (parent_needed || parent_interested) {
/* When notifying parent, child should be passed as data */
WARN_ON_ONCE(inode != fsnotify_data_inode(data, data_type));

View File

@ -145,6 +145,6 @@ extern const struct export_operations cifs_export_ops;
#endif /* CONFIG_CIFS_NFSD_EXPORT */
/* when changing internal version - update following two lines at same time */
#define SMB3_PRODUCT_BUILD 57
#define CIFS_VERSION "2.57"
#define SMB3_PRODUCT_BUILD 58
#define CIFS_VERSION "2.58"
#endif /* _CIFSFS_H */

View File

@ -12,7 +12,7 @@
#include <net/sock.h>
#include <linux/unaligned.h>
#include "../common/smbfsctl.h"
#include "../common/smb2pdu.h"
#include "../common/smb1pdu.h"
#define CIFS_PROT 0
#define POSIX_PROT (CIFS_PROT+1)

56
fs/smb/common/smb1pdu.h Normal file
View File

@ -0,0 +1,56 @@
/* SPDX-License-Identifier: LGPL-2.1 */
/*
*
* Copyright (C) International Business Machines Corp., 2002,2009
* 2018 Samsung Electronics Co., Ltd.
* Author(s): Steve French <sfrench@us.ibm.com>
* Namjae Jeon <linkinjeon@kernel.org>
*
*/
#ifndef _COMMON_SMB1_PDU_H
#define _COMMON_SMB1_PDU_H
#define SMB1_PROTO_NUMBER cpu_to_le32(0x424d53ff)
/*
* See MS-CIFS 2.2.3.1
* MS-SMB 2.2.3.1
*/
struct smb_hdr {
__u8 Protocol[4];
__u8 Command;
union {
struct {
__u8 ErrorClass;
__u8 Reserved;
__le16 Error;
} __packed DosError;
__le32 CifsError;
} __packed Status;
__u8 Flags;
__le16 Flags2; /* note: le */
__le16 PidHigh;
union {
struct {
__le32 SequenceNumber; /* le */
__u32 Reserved; /* zero */
} __packed Sequence;
__u8 SecuritySignature[8]; /* le */
} __packed Signature;
__u8 pad[2];
__u16 Tid;
__le16 Pid;
__u16 Uid;
__le16 Mid;
__u8 WordCount;
} __packed;
/* See MS-CIFS 2.2.4.52.1 */
typedef struct smb_negotiate_req {
struct smb_hdr hdr; /* wct = 0 */
__le16 ByteCount;
unsigned char DialectsArray[];
} __packed SMB_NEGOTIATE_REQ;
#endif /* _COMMON_SMB1_PDU_H */

View File

@ -1293,6 +1293,7 @@ struct create_durable_handle_reconnect_v2 {
struct create_context_hdr ccontext;
__u8 Name[8];
struct durable_reconnect_context_v2 dcontext;
__u8 Pad[4];
} __packed;
/* See MS-SMB2 2.2.14.2.12 */
@ -1985,39 +1986,6 @@ struct smb2_lease_ack {
__le64 LeaseDuration;
} __packed;
/*
* See MS-CIFS 2.2.3.1
* MS-SMB 2.2.3.1
*/
struct smb_hdr {
__u8 Protocol[4];
__u8 Command;
union {
struct {
__u8 ErrorClass;
__u8 Reserved;
__le16 Error;
} __packed DosError;
__le32 CifsError;
} __packed Status;
__u8 Flags;
__le16 Flags2; /* note: le */
__le16 PidHigh;
union {
struct {
__le32 SequenceNumber; /* le */
__u32 Reserved; /* zero */
} __packed Sequence;
__u8 SecuritySignature[8]; /* le */
} __packed Signature;
__u8 pad[2];
__u16 Tid;
__le16 Pid;
__u16 Uid;
__le16 Mid;
__u8 WordCount;
} __packed;
#define OP_BREAK_STRUCT_SIZE_20 24
#define OP_BREAK_STRUCT_SIZE_21 36
@ -2122,11 +2090,4 @@ struct smb_hdr {
#define SET_MINIMUM_RIGHTS (FILE_READ_EA | FILE_READ_ATTRIBUTES \
| READ_CONTROL | SYNCHRONIZE)
/* See MS-CIFS 2.2.4.52.1 */
typedef struct smb_negotiate_req {
struct smb_hdr hdr; /* wct = 0 */
__le16 ByteCount;
unsigned char DialectsArray[];
} __packed SMB_NEGOTIATE_REQ;
#endif /* _COMMON_SMB2PDU_H */

View File

@ -11,8 +11,6 @@
#ifndef _COMMON_SMB_GLOB_H
#define _COMMON_SMB_GLOB_H
#define SMB1_PROTO_NUMBER cpu_to_le32(0x424d53ff)
struct smb_version_values {
char *version_string;
__u16 protocol_id;

View File

@ -10,6 +10,7 @@
#include "glob.h"
#include "../common/smbglob.h"
#include "../common/smb1pdu.h"
#include "../common/smb2pdu.h"
#include "../common/fscc.h"
#include "smb2pdu.h"

View File

@ -214,10 +214,14 @@ void rapl_remove_package(struct rapl_package *rp);
#ifdef CONFIG_PERF_EVENTS
int rapl_package_add_pmu(struct rapl_package *rp);
int rapl_package_add_pmu_locked(struct rapl_package *rp);
void rapl_package_remove_pmu(struct rapl_package *rp);
void rapl_package_remove_pmu_locked(struct rapl_package *rp);
#else
static inline int rapl_package_add_pmu(struct rapl_package *rp) { return 0; }
static inline int rapl_package_add_pmu_locked(struct rapl_package *rp) { return 0; }
static inline void rapl_package_remove_pmu(struct rapl_package *rp) { }
static inline void rapl_package_remove_pmu_locked(struct rapl_package *rp) { }
#endif
#endif /* __INTEL_RAPL_H__ */

View File

@ -123,27 +123,15 @@ void inet_frags_fini(struct inet_frags *);
int fqdir_init(struct fqdir **fqdirp, struct inet_frags *f, struct net *net);
static inline void fqdir_pre_exit(struct fqdir *fqdir)
{
/* Prevent creation of new frags.
* Pairs with READ_ONCE() in inet_frag_find().
*/
WRITE_ONCE(fqdir->high_thresh, 0);
/* Pairs with READ_ONCE() in inet_frag_kill(), ip_expire()
* and ip6frag_expire_frag_queue().
*/
WRITE_ONCE(fqdir->dead, true);
}
void fqdir_pre_exit(struct fqdir *fqdir);
void fqdir_exit(struct fqdir *fqdir);
void inet_frag_kill(struct inet_frag_queue *q, int *refs);
void inet_frag_destroy(struct inet_frag_queue *q);
struct inet_frag_queue *inet_frag_find(struct fqdir *fqdir, void *key);
/* Free all skbs in the queue; return the sum of their truesizes. */
unsigned int inet_frag_rbtree_purge(struct rb_root *root,
enum skb_drop_reason reason);
void inet_frag_queue_flush(struct inet_frag_queue *q,
enum skb_drop_reason reason);
static inline void inet_frag_putn(struct inet_frag_queue *q, int refs)
{

View File

@ -69,9 +69,6 @@ ip6frag_expire_frag_queue(struct net *net, struct frag_queue *fq)
int refs = 1;
rcu_read_lock();
/* Paired with the WRITE_ONCE() in fqdir_pre_exit(). */
if (READ_ONCE(fq->q.fqdir->dead))
goto out_rcu_unlock;
spin_lock(&fq->q.lock);
if (fq->q.flags & INET_FRAG_COMPLETE)
@ -80,6 +77,12 @@ ip6frag_expire_frag_queue(struct net *net, struct frag_queue *fq)
fq->q.flags |= INET_FRAG_DROP;
inet_frag_kill(&fq->q, &refs);
/* Paired with the WRITE_ONCE() in fqdir_pre_exit(). */
if (READ_ONCE(fq->q.fqdir->dead)) {
inet_frag_queue_flush(&fq->q, 0);
goto out;
}
dev = dev_get_by_index_rcu(net, fq->iif);
if (!dev)
goto out;

View File

@ -1091,6 +1091,29 @@ struct nft_rule_blob {
__attribute__((aligned(__alignof__(struct nft_rule_dp))));
};
enum nft_chain_types {
NFT_CHAIN_T_DEFAULT = 0,
NFT_CHAIN_T_ROUTE,
NFT_CHAIN_T_NAT,
NFT_CHAIN_T_MAX
};
/**
* struct nft_chain_validate_state - validation state
*
* If a chain is encountered again during table validation it is
* possible to avoid revalidation provided the calling context is
* compatible. This structure stores relevant calling context of
* previous validations.
*
* @hook_mask: the hook numbers and locations the chain is linked to
* @depth: the deepest call chain level the chain is linked to
*/
struct nft_chain_validate_state {
u8 hook_mask[NFT_CHAIN_T_MAX];
u8 depth;
};
/**
* struct nft_chain - nf_tables chain
*
@ -1109,6 +1132,7 @@ struct nft_rule_blob {
* @udlen: user data length
* @udata: user data in the chain
* @blob_next: rule blob pointer to the next in the chain
* @vstate: validation state
*/
struct nft_chain {
struct nft_rule_blob __rcu *blob_gen_0;
@ -1128,9 +1152,10 @@ struct nft_chain {
/* Only used during control plane commit phase: */
struct nft_rule_blob *blob_next;
struct nft_chain_validate_state vstate;
};
int nft_chain_validate(const struct nft_ctx *ctx, const struct nft_chain *chain);
int nft_chain_validate(const struct nft_ctx *ctx, struct nft_chain *chain);
int nft_setelem_validate(const struct nft_ctx *ctx, struct nft_set *set,
const struct nft_set_iter *iter,
struct nft_elem_priv *elem_priv);
@ -1138,13 +1163,6 @@ int nft_set_catchall_validate(const struct nft_ctx *ctx, struct nft_set *set);
int nf_tables_bind_chain(const struct nft_ctx *ctx, struct nft_chain *chain);
void nf_tables_unbind_chain(const struct nft_ctx *ctx, struct nft_chain *chain);
enum nft_chain_types {
NFT_CHAIN_T_DEFAULT = 0,
NFT_CHAIN_T_ROUTE,
NFT_CHAIN_T_NAT,
NFT_CHAIN_T_MAX
};
/**
* struct nft_chain_type - nf_tables chain type info
*

View File

@ -2,6 +2,7 @@
/* Do not edit directly, auto-generated from: */
/* Documentation/netlink/specs/em.yaml */
/* YNL-GEN uapi header */
/* To regenerate run: tools/net/ynl/ynl-regen.sh */
#ifndef _UAPI_LINUX_ENERGY_MODEL_H
#define _UAPI_LINUX_ENERGY_MODEL_H

View File

@ -40,6 +40,7 @@
#define MPTCP_PM_ADDR_FLAG_FULLMESH _BITUL(3)
#define MPTCP_PM_ADDR_FLAG_IMPLICIT _BITUL(4)
#define MPTCP_PM_ADDR_FLAG_LAMINAR _BITUL(5)
#define MPTCP_PM_ADDR_FLAGS_MASK GENMASK(5, 0)
struct mptcp_info {
__u8 mptcpi_subflows;

View File

@ -2,6 +2,7 @@
/* Do not edit directly, auto-generated from: */
/* Documentation/netlink/specs/em.yaml */
/* YNL-GEN kernel source */
/* To regenerate run: tools/net/ynl/ynl-regen.sh */
#include <net/netlink.h>
#include <net/genetlink.h>

View File

@ -2,6 +2,7 @@
/* Do not edit directly, auto-generated from: */
/* Documentation/netlink/specs/em.yaml */
/* YNL-GEN kernel header */
/* To regenerate run: tools/net/ynl/ynl-regen.sh */
#ifndef _LINUX_EM_GEN_H
#define _LINUX_EM_GEN_H

2
lib/crypto/riscv/.gitignore vendored Normal file
View File

@ -0,0 +1,2 @@
# SPDX-License-Identifier: GPL-2.0-only
poly1305-core.S

View File

@ -92,8 +92,15 @@ static int cffrml_receive(struct cflayer *layr, struct cfpkt *pkt)
len = le16_to_cpu(tmp);
/* Subtract for FCS on length if FCS is not used. */
if (!this->dofcs)
if (!this->dofcs) {
if (len < 2) {
++cffrml_rcv_error;
pr_err("Invalid frame length (%d)\n", len);
cfpkt_destroy(pkt);
return -EPROTO;
}
len -= 2;
}
if (cfpkt_setlen(pkt, len) < 0) {
++cffrml_rcv_error;

View File

@ -5,7 +5,6 @@
menuconfig CAN
tristate "CAN bus subsystem support"
select CAN_DEV
help
Controller Area Network (CAN) is a slow (up to 1Mbit/s) serial
communications protocol. Development of the CAN bus started in

View File

@ -482,6 +482,12 @@ static int j1939_sk_bind(struct socket *sock, struct sockaddr_unsized *uaddr, in
goto out_release_sock;
}
if (ndev->reg_state != NETREG_REGISTERED) {
dev_put(ndev);
ret = -ENODEV;
goto out_release_sock;
}
can_ml = can_get_ml_priv(ndev);
if (!can_ml) {
dev_put(ndev);

View File

@ -1567,6 +1567,8 @@ int j1939_session_activate(struct j1939_session *session)
if (active) {
j1939_session_put(active);
ret = -EAGAIN;
} else if (priv->ndev->reg_state != NETREG_REGISTERED) {
ret = -ENODEV;
} else {
WARN_ON_ONCE(session->state != J1939_SESSION_NEW);
list_add_tail(&session->active_session_list_entry,

View File

@ -2383,7 +2383,10 @@ static int ethtool_get_strings(struct net_device *dev, void __user *useraddr)
return -ENOMEM;
WARN_ON_ONCE(!ret);
gstrings.len = ret;
if (gstrings.len && gstrings.len != ret)
gstrings.len = 0;
else
gstrings.len = ret;
if (gstrings.len) {
data = vzalloc(array_size(gstrings.len, ETH_GSTRING_LEN));
@ -2509,10 +2512,13 @@ static int ethtool_get_stats(struct net_device *dev, void __user *useraddr)
if (copy_from_user(&stats, useraddr, sizeof(stats)))
return -EFAULT;
stats.n_stats = n_stats;
if (stats.n_stats && stats.n_stats != n_stats)
stats.n_stats = 0;
else
stats.n_stats = n_stats;
if (n_stats) {
data = vzalloc(array_size(n_stats, sizeof(u64)));
if (stats.n_stats) {
data = vzalloc(array_size(stats.n_stats, sizeof(u64)));
if (!data)
return -ENOMEM;
ops->get_ethtool_stats(dev, &stats, data);
@ -2524,7 +2530,9 @@ static int ethtool_get_stats(struct net_device *dev, void __user *useraddr)
if (copy_to_user(useraddr, &stats, sizeof(stats)))
goto out;
useraddr += sizeof(stats);
if (n_stats && copy_to_user(useraddr, data, array_size(n_stats, sizeof(u64))))
if (stats.n_stats &&
copy_to_user(useraddr, data,
array_size(stats.n_stats, sizeof(u64))))
goto out;
ret = 0;
@ -2560,6 +2568,10 @@ static int ethtool_get_phy_stats_phydev(struct phy_device *phydev,
return -EOPNOTSUPP;
n_stats = phy_ops->get_sset_count(phydev);
if (stats->n_stats && stats->n_stats != n_stats) {
stats->n_stats = 0;
return 0;
}
ret = ethtool_vzalloc_stats_array(n_stats, data);
if (ret)
@ -2580,6 +2592,10 @@ static int ethtool_get_phy_stats_ethtool(struct net_device *dev,
return -EOPNOTSUPP;
n_stats = ops->get_sset_count(dev, ETH_SS_PHY_STATS);
if (stats->n_stats && stats->n_stats != n_stats) {
stats->n_stats = 0;
return 0;
}
ret = ethtool_vzalloc_stats_array(n_stats, data);
if (ret)
@ -2616,7 +2632,9 @@ static int ethtool_get_phy_stats(struct net_device *dev, void __user *useraddr)
}
useraddr += sizeof(stats);
if (copy_to_user(useraddr, data, array_size(stats.n_stats, sizeof(u64))))
if (stats.n_stats &&
copy_to_user(useraddr, data,
array_size(stats.n_stats, sizeof(u64))))
ret = -EFAULT;
out:

View File

@ -276,6 +276,8 @@ int handshake_req_submit(struct socket *sock, struct handshake_req *req,
out_unlock:
spin_unlock(&hn->hn_lock);
out_err:
/* Restore original destructor so socket teardown still runs on failure */
req->hr_sk->sk_destruct = req->hr_odestruct;
trace_handshake_submit_err(net, req, req->hr_sk, ret);
handshake_req_destroy(req);
return ret;
@ -324,7 +326,11 @@ bool handshake_req_cancel(struct sock *sk)
hn = handshake_pernet(net);
if (hn && remove_pending(hn, req)) {
/* Request hadn't been accepted */
/* Request hadn't been accepted - mark cancelled */
if (test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED, &req->hr_flags)) {
trace_handshake_cancel_busy(net, req, sk);
return false;
}
goto out_true;
}
if (test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED, &req->hr_flags)) {

View File

@ -205,6 +205,8 @@ struct sk_buff *prp_get_untagged_frame(struct hsr_frame_info *frame,
__pskb_copy(frame->skb_prp,
skb_headroom(frame->skb_prp),
GFP_ATOMIC);
if (!frame->skb_std)
return NULL;
} else {
/* Unexpected */
WARN_ONCE(1, "%s:%d: Unexpected frame received (port_src %s)\n",

View File

@ -218,6 +218,41 @@ static int __init inet_frag_wq_init(void)
pure_initcall(inet_frag_wq_init);
void fqdir_pre_exit(struct fqdir *fqdir)
{
struct inet_frag_queue *fq;
struct rhashtable_iter hti;
/* Prevent creation of new frags.
* Pairs with READ_ONCE() in inet_frag_find().
*/
WRITE_ONCE(fqdir->high_thresh, 0);
/* Pairs with READ_ONCE() in inet_frag_kill(), ip_expire()
* and ip6frag_expire_frag_queue().
*/
WRITE_ONCE(fqdir->dead, true);
rhashtable_walk_enter(&fqdir->rhashtable, &hti);
rhashtable_walk_start(&hti);
while ((fq = rhashtable_walk_next(&hti))) {
if (IS_ERR(fq)) {
if (PTR_ERR(fq) != -EAGAIN)
break;
continue;
}
spin_lock_bh(&fq->lock);
if (!(fq->flags & INET_FRAG_COMPLETE))
inet_frag_queue_flush(fq, 0);
spin_unlock_bh(&fq->lock);
}
rhashtable_walk_stop(&hti);
rhashtable_walk_exit(&hti);
}
EXPORT_SYMBOL(fqdir_pre_exit);
void fqdir_exit(struct fqdir *fqdir)
{
INIT_WORK(&fqdir->destroy_work, fqdir_work_fn);
@ -263,8 +298,8 @@ static void inet_frag_destroy_rcu(struct rcu_head *head)
kmem_cache_free(f->frags_cachep, q);
}
unsigned int inet_frag_rbtree_purge(struct rb_root *root,
enum skb_drop_reason reason)
static unsigned int
inet_frag_rbtree_purge(struct rb_root *root, enum skb_drop_reason reason)
{
struct rb_node *p = rb_first(root);
unsigned int sum = 0;
@ -284,7 +319,17 @@ unsigned int inet_frag_rbtree_purge(struct rb_root *root,
}
return sum;
}
EXPORT_SYMBOL(inet_frag_rbtree_purge);
void inet_frag_queue_flush(struct inet_frag_queue *q,
enum skb_drop_reason reason)
{
unsigned int sum;
reason = reason ?: SKB_DROP_REASON_FRAG_REASM_TIMEOUT;
sum = inet_frag_rbtree_purge(&q->rb_fragments, reason);
sub_frag_mem_limit(q->fqdir, sum);
}
EXPORT_SYMBOL(inet_frag_queue_flush);
void inet_frag_destroy(struct inet_frag_queue *q)
{
@ -327,7 +372,9 @@ static struct inet_frag_queue *inet_frag_alloc(struct fqdir *fqdir,
timer_setup(&q->timer, f->frag_expire, 0);
spin_lock_init(&q->lock);
/* One reference for the timer, one for the hash table. */
/* One reference for the timer, one for the hash table.
* We never take any extra references, only decrement this field.
*/
refcount_set(&q->refcnt, 2);
return q;

View File

@ -134,11 +134,6 @@ static void ip_expire(struct timer_list *t)
net = qp->q.fqdir->net;
rcu_read_lock();
/* Paired with WRITE_ONCE() in fqdir_pre_exit(). */
if (READ_ONCE(qp->q.fqdir->dead))
goto out_rcu_unlock;
spin_lock(&qp->q.lock);
if (qp->q.flags & INET_FRAG_COMPLETE)
@ -146,6 +141,13 @@ static void ip_expire(struct timer_list *t)
qp->q.flags |= INET_FRAG_DROP;
inet_frag_kill(&qp->q, &refs);
/* Paired with WRITE_ONCE() in fqdir_pre_exit(). */
if (READ_ONCE(qp->q.fqdir->dead)) {
inet_frag_queue_flush(&qp->q, 0);
goto out;
}
__IP_INC_STATS(net, IPSTATS_MIB_REASMFAILS);
__IP_INC_STATS(net, IPSTATS_MIB_REASMTIMEOUT);
@ -240,16 +242,10 @@ static int ip_frag_too_far(struct ipq *qp)
static int ip_frag_reinit(struct ipq *qp)
{
unsigned int sum_truesize = 0;
if (!mod_timer(&qp->q.timer, jiffies + qp->q.fqdir->timeout)) {
refcount_inc(&qp->q.refcnt);
if (!mod_timer_pending(&qp->q.timer, jiffies + qp->q.fqdir->timeout))
return -ETIMEDOUT;
}
sum_truesize = inet_frag_rbtree_purge(&qp->q.rb_fragments,
SKB_DROP_REASON_FRAG_TOO_FAR);
sub_frag_mem_limit(qp->q.fqdir, sum_truesize);
inet_frag_queue_flush(&qp->q, SKB_DROP_REASON_FRAG_TOO_FAR);
qp->q.flags = 0;
qp->q.len = 0;

View File

@ -4,7 +4,7 @@ config MPTCP
depends on INET
select SKB_EXTENSIONS
select CRYPTO_LIB_SHA256
select CRYPTO
select CRYPTO_LIB_UTILS
help
Multipath TCP (MPTCP) connections send and receive data over multiple
subflows in order to utilize multiple network paths. Each subflow

View File

@ -119,7 +119,8 @@ int mptcp_pm_parse_entry(struct nlattr *attr, struct genl_info *info,
}
if (tb[MPTCP_PM_ADDR_ATTR_FLAGS])
entry->flags = nla_get_u32(tb[MPTCP_PM_ADDR_ATTR_FLAGS]);
entry->flags = nla_get_u32(tb[MPTCP_PM_ADDR_ATTR_FLAGS]) &
MPTCP_PM_ADDR_FLAGS_MASK;
if (tb[MPTCP_PM_ADDR_ATTR_PORT])
entry->addr.port = htons(nla_get_u16(tb[MPTCP_PM_ADDR_ATTR_PORT]));

View File

@ -1623,7 +1623,7 @@ void __mptcp_push_pending(struct sock *sk, unsigned int flags)
struct mptcp_sendmsg_info info = {
.flags = flags,
};
bool do_check_data_fin = false;
bool copied = false;
int push_count = 1;
while (mptcp_send_head(sk) && (push_count > 0)) {
@ -1665,7 +1665,7 @@ void __mptcp_push_pending(struct sock *sk, unsigned int flags)
push_count--;
continue;
}
do_check_data_fin = true;
copied = true;
}
}
}
@ -1674,11 +1674,14 @@ void __mptcp_push_pending(struct sock *sk, unsigned int flags)
if (ssk)
mptcp_push_release(ssk, &info);
/* ensure the rtx timer is running */
if (!mptcp_rtx_timer_pending(sk))
mptcp_reset_rtx_timer(sk);
if (do_check_data_fin)
/* Avoid scheduling the rtx timer if no data has been pushed; the timer
* will be updated on positive acks by __mptcp_cleanup_una().
*/
if (copied) {
if (!mptcp_rtx_timer_pending(sk))
mptcp_reset_rtx_timer(sk);
mptcp_check_send_data_fin(sk);
}
}
static void __mptcp_subflow_push_pending(struct sock *sk, struct sock *ssk, bool first)
@ -2766,10 +2769,13 @@ static void __mptcp_retrans(struct sock *sk)
/*
* make the whole retrans decision, xmit, disallow
* fallback atomic
* fallback atomic, note that we can't retrans even
* when an infinite fallback is in progress, i.e. new
* subflows are disallowed.
*/
spin_lock_bh(&msk->fallback_lock);
if (__mptcp_check_fallback(msk)) {
if (__mptcp_check_fallback(msk) ||
!msk->allow_subflows) {
spin_unlock_bh(&msk->fallback_lock);
release_sock(ssk);
goto clear_scheduled;

View File

@ -408,6 +408,9 @@ err_put:
return -1;
err_unreach:
if (!skb->dev)
skb->dev = skb_dst(skb)->dev;
dst_link_failure(skb);
return -1;
}

View File

@ -172,14 +172,14 @@ static int __nf_conncount_add(struct net *net,
struct nf_conn *found_ct;
unsigned int collect = 0;
bool refcounted = false;
int err = 0;
if (!get_ct_or_tuple_from_skb(net, skb, l3num, &ct, &tuple, &zone, &refcounted))
return -ENOENT;
if (ct && nf_ct_is_confirmed(ct)) {
if (refcounted)
nf_ct_put(ct);
return -EEXIST;
err = -EEXIST;
goto out_put;
}
if ((u32)jiffies == list->last_gc)
@ -231,12 +231,16 @@ static int __nf_conncount_add(struct net *net,
}
add_new_node:
if (WARN_ON_ONCE(list->count > INT_MAX))
return -EOVERFLOW;
if (WARN_ON_ONCE(list->count > INT_MAX)) {
err = -EOVERFLOW;
goto out_put;
}
conn = kmem_cache_alloc(conncount_conn_cachep, GFP_ATOMIC);
if (conn == NULL)
return -ENOMEM;
if (conn == NULL) {
err = -ENOMEM;
goto out_put;
}
conn->tuple = tuple;
conn->zone = *zone;
@ -249,7 +253,7 @@ add_new_node:
out_put:
if (refcounted)
nf_ct_put(ct);
return 0;
return err;
}
int nf_conncount_add_skb(struct net *net,
@ -456,11 +460,10 @@ restart:
rb_link_node_rcu(&rbconn->node, parent, rbnode);
rb_insert_color(&rbconn->node, root);
if (refcounted)
nf_ct_put(ct);
}
out_unlock:
if (refcounted)
nf_ct_put(ct);
spin_unlock_bh(&nf_conncount_locks[hash]);
return count;
}

View File

@ -2487,6 +2487,7 @@ void nf_conntrack_cleanup_net(struct net *net)
void nf_conntrack_cleanup_net_list(struct list_head *net_exit_list)
{
struct nf_ct_iter_data iter_data = {};
unsigned long start = jiffies;
struct net *net;
int busy;
@ -2507,6 +2508,8 @@ i_see_dead_people:
busy = 1;
}
if (busy) {
DEBUG_NET_WARN_ONCE(time_after(jiffies, start + 60 * HZ),
"conntrack cleanup blocked for 60s");
schedule();
goto i_see_dead_people;
}

View File

@ -250,6 +250,9 @@ static void nft_dev_forward_path(const struct nft_pktinfo *pkt,
if (nft_dev_fill_forward_path(route, dst, ct, dir, ha, &stack) >= 0)
nft_dev_path_info(&stack, &info, ha, &ft->data);
if (info.outdev)
route->tuple[dir].out.ifindex = info.outdev->ifindex;
if (!info.indev || !nft_flowtable_find_dev(info.indev, ft))
return;
@ -269,7 +272,6 @@ static void nft_dev_forward_path(const struct nft_pktinfo *pkt,
route->tuple[!dir].in.num_encaps = info.num_encaps;
route->tuple[!dir].in.ingress_vlans = info.ingress_vlans;
route->tuple[dir].out.ifindex = info.outdev->ifindex;
if (info.xmit_type == FLOW_OFFLOAD_XMIT_DIRECT) {
memcpy(route->tuple[dir].out.h_source, info.h_source, ETH_ALEN);

View File

@ -294,25 +294,13 @@ nf_nat_used_tuple_new(const struct nf_conntrack_tuple *tuple,
ct = nf_ct_tuplehash_to_ctrack(thash);
/* NB: IP_CT_DIR_ORIGINAL should be impossible because
* nf_nat_used_tuple() handles origin collisions.
*
* Handle remote chance other CPU confirmed its ct right after.
*/
if (thash->tuple.dst.dir != IP_CT_DIR_REPLY)
goto out;
/* clashing connection subject to NAT? Retry with new tuple. */
if (READ_ONCE(ct->status) & uses_nat)
goto out;
if (nf_ct_tuple_equal(&ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple,
&ignored_ct->tuplehash[IP_CT_DIR_REPLY].tuple) &&
nf_ct_tuple_equal(&ct->tuplehash[IP_CT_DIR_REPLY].tuple,
&ignored_ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple)) {
&ignored_ct->tuplehash[IP_CT_DIR_REPLY].tuple))
taken = false;
goto out;
}
out:
nf_ct_put(ct);
return taken;

View File

@ -123,6 +123,29 @@ static void nft_validate_state_update(struct nft_table *table, u8 new_validate_s
table->validate_state = new_validate_state;
}
static bool nft_chain_vstate_valid(const struct nft_ctx *ctx,
const struct nft_chain *chain)
{
const struct nft_base_chain *base_chain;
enum nft_chain_types type;
u8 hooknum;
if (WARN_ON_ONCE(!nft_is_base_chain(ctx->chain)))
return false;
base_chain = nft_base_chain(ctx->chain);
hooknum = base_chain->ops.hooknum;
type = base_chain->type->type;
/* chain is already validated for this call depth */
if (chain->vstate.depth >= ctx->level &&
chain->vstate.hook_mask[type] & BIT(hooknum))
return true;
return false;
}
static void nf_tables_trans_destroy_work(struct work_struct *w);
static void nft_trans_gc_work(struct work_struct *work);
@ -4079,6 +4102,29 @@ static void nf_tables_rule_release(const struct nft_ctx *ctx, struct nft_rule *r
nf_tables_rule_destroy(ctx, rule);
}
static void nft_chain_vstate_update(const struct nft_ctx *ctx, struct nft_chain *chain)
{
const struct nft_base_chain *base_chain;
enum nft_chain_types type;
u8 hooknum;
/* ctx->chain must hold the calling base chain. */
if (WARN_ON_ONCE(!nft_is_base_chain(ctx->chain))) {
memset(&chain->vstate, 0, sizeof(chain->vstate));
return;
}
base_chain = nft_base_chain(ctx->chain);
hooknum = base_chain->ops.hooknum;
type = base_chain->type->type;
BUILD_BUG_ON(BIT(NF_INET_NUMHOOKS) > U8_MAX);
chain->vstate.hook_mask[type] |= BIT(hooknum);
if (chain->vstate.depth < ctx->level)
chain->vstate.depth = ctx->level;
}
/** nft_chain_validate - loop detection and hook validation
*
* @ctx: context containing call depth and base chain
@ -4088,15 +4134,25 @@ static void nf_tables_rule_release(const struct nft_ctx *ctx, struct nft_rule *r
* and set lookups until either the jump limit is hit or all reachable
* chains have been validated.
*/
int nft_chain_validate(const struct nft_ctx *ctx, const struct nft_chain *chain)
int nft_chain_validate(const struct nft_ctx *ctx, struct nft_chain *chain)
{
struct nft_expr *expr, *last;
struct nft_rule *rule;
int err;
BUILD_BUG_ON(NFT_JUMP_STACK_SIZE > 255);
if (ctx->level == NFT_JUMP_STACK_SIZE)
return -EMLINK;
if (ctx->level > 0) {
/* jumps to base chains are not allowed. */
if (nft_is_base_chain(chain))
return -ELOOP;
if (nft_chain_vstate_valid(ctx, chain))
return 0;
}
list_for_each_entry(rule, &chain->rules, list) {
if (fatal_signal_pending(current))
return -EINTR;
@ -4115,8 +4171,11 @@ int nft_chain_validate(const struct nft_ctx *ctx, const struct nft_chain *chain)
if (err < 0)
return err;
}
cond_resched();
}
nft_chain_vstate_update(ctx, chain);
return 0;
}
EXPORT_SYMBOL_GPL(nft_chain_validate);
@ -4128,7 +4187,7 @@ static int nft_table_validate(struct net *net, const struct nft_table *table)
.net = net,
.family = table->family,
};
int err;
int err = 0;
list_for_each_entry(chain, &table->chains, list) {
if (!nft_is_base_chain(chain))
@ -4137,12 +4196,14 @@ static int nft_table_validate(struct net *net, const struct nft_table *table)
ctx.chain = chain;
err = nft_chain_validate(&ctx, chain);
if (err < 0)
return err;
cond_resched();
goto err;
}
return 0;
err:
list_for_each_entry(chain, &table->chains, list)
memset(&chain->vstate, 0, sizeof(chain->vstate));
return err;
}
int nft_setelem_validate(const struct nft_ctx *ctx, struct nft_set *set,
@ -11676,21 +11737,10 @@ static int nft_validate_register_store(const struct nft_ctx *ctx,
enum nft_data_types type,
unsigned int len)
{
int err;
switch (reg) {
case NFT_REG_VERDICT:
if (type != NFT_DATA_VERDICT)
return -EINVAL;
if (data != NULL &&
(data->verdict.code == NFT_GOTO ||
data->verdict.code == NFT_JUMP)) {
err = nft_chain_validate(ctx, data->verdict.chain);
if (err < 0)
return err;
}
break;
default:
if (type != NFT_DATA_VALUE)

View File

@ -43,8 +43,10 @@ void nr_output(struct sock *sk, struct sk_buff *skb)
frontlen = skb_headroom(skb);
while (skb->len > 0) {
if ((skbn = sock_alloc_send_skb(sk, frontlen + NR_MAX_PACKET_SIZE, 0, &err)) == NULL)
if ((skbn = sock_alloc_send_skb(sk, frontlen + NR_MAX_PACKET_SIZE, 0, &err)) == NULL) {
kfree_skb(skb);
return;
}
skb_reserve(skbn, frontlen);

View File

@ -2802,13 +2802,20 @@ static int validate_and_copy_set_tun(const struct nlattr *attr,
return err;
}
static bool validate_push_nsh(const struct nlattr *attr, bool log)
static bool validate_push_nsh(const struct nlattr *a, bool log)
{
struct nlattr *nsh_key = nla_data(a);
struct sw_flow_match match;
struct sw_flow_key key;
/* There must be one and only one NSH header. */
if (!nla_ok(nsh_key, nla_len(a)) ||
nla_total_size(nla_len(nsh_key)) != nla_len(a) ||
nla_type(nsh_key) != OVS_KEY_ATTR_NSH)
return false;
ovs_match_init(&match, &key, true, NULL);
return !nsh_key_put_from_nlattr(attr, &match, false, true, log);
return !nsh_key_put_from_nlattr(nsh_key, &match, false, true, log);
}
/* Return false if there are any non-masked bits set.
@ -3389,7 +3396,7 @@ static int __ovs_nla_copy_actions(struct net *net, const struct nlattr *attr,
return -EINVAL;
}
mac_proto = MAC_PROTO_NONE;
if (!validate_push_nsh(nla_data(a), log))
if (!validate_push_nsh(a, log))
return -EINVAL;
break;

View File

@ -281,6 +281,15 @@ static int tcf_mirred_to_dev(struct sk_buff *skb, struct tcf_mirred *m,
want_ingress = tcf_mirred_act_wants_ingress(m_eaction);
if (dev == skb->dev && want_ingress == at_ingress) {
pr_notice_once("tc mirred: Loop (%s:%s --> %s:%s)\n",
netdev_name(skb->dev),
at_ingress ? "ingress" : "egress",
netdev_name(dev),
want_ingress ? "ingress" : "egress");
goto err_cant_do;
}
/* All mirred/redirected skbs should clear previous ct info */
nf_reset_ct(skb_to_send);
if (want_ingress && !at_ingress) /* drop dst for egress -> ingress */

View File

@ -652,7 +652,7 @@ static int ets_qdisc_change(struct Qdisc *sch, struct nlattr *opt,
sch_tree_lock(sch);
for (i = nbands; i < oldbands; i++) {
if (i >= q->nstrict && q->classes[i].qdisc->q.qlen)
if (cl_is_active(&q->classes[i]))
list_del_init(&q->classes[i].alist);
qdisc_purge_queue(q->classes[i].qdisc);
}
@ -664,6 +664,10 @@ static int ets_qdisc_change(struct Qdisc *sch, struct nlattr *opt,
q->classes[i].deficit = quanta[i];
}
}
for (i = q->nstrict; i < nstrict; i++) {
if (cl_is_active(&q->classes[i]))
list_del_init(&q->classes[i].alist);
}
WRITE_ONCE(q->nstrict, nstrict);
memcpy(q->prio2band, priomap, sizeof(priomap));

View File

@ -492,6 +492,8 @@ static void sctp_v6_copy_ip_options(struct sock *sk, struct sock *newsk)
struct ipv6_pinfo *newnp, *np = inet6_sk(sk);
struct ipv6_txoptions *opt;
inet_sk(newsk)->inet_opt = NULL;
newnp = inet6_sk(newsk);
rcu_read_lock();

View File

@ -4863,8 +4863,6 @@ static struct sock *sctp_clone_sock(struct sock *sk,
newsp->pf->to_sk_daddr(&asoc->peer.primary_addr, newsk);
newinet->inet_dport = htons(asoc->peer.port);
newsp->pf->copy_ip_options(sk, newsk);
atomic_set(&newinet->inet_id, get_random_u16());
inet_set_bit(MC_LOOP, newsk);
@ -4874,17 +4872,20 @@ static struct sock *sctp_clone_sock(struct sock *sk,
#if IS_ENABLED(CONFIG_IPV6)
if (sk->sk_family == AF_INET6) {
struct ipv6_pinfo *newnp = inet6_sk(newsk);
struct ipv6_pinfo *newnp;
newinet->pinet6 = &((struct sctp6_sock *)newsk)->inet6;
newinet->ipv6_fl_list = NULL;
newnp = inet6_sk(newsk);
memcpy(newnp, inet6_sk(sk), sizeof(struct ipv6_pinfo));
newnp->ipv6_mc_list = NULL;
newnp->ipv6_ac_list = NULL;
}
#endif
newsp->pf->copy_ip_options(sk, newsk);
newsp->do_auto_asconf = 0;
skb_queue_head_init(&newsp->pd_lobby);

View File

@ -199,7 +199,7 @@ static void unix_free_vertices(struct scm_fp_list *fpl)
}
}
static DEFINE_SPINLOCK(unix_gc_lock);
static __cacheline_aligned_in_smp DEFINE_SPINLOCK(unix_gc_lock);
void unix_add_edges(struct scm_fp_list *fpl, struct unix_sock *receiver)
{

View File

@ -13,6 +13,7 @@ UAPI_PATH:=../../../../include/uapi/
# need the explicit -D matching what's in /usr, to avoid multiple definitions.
get_hdr_inc=-D$(1) -include $(UAPI_PATH)/linux/$(2)
get_hdr_inc2=-D$(1) -D$(2) -include $(UAPI_PATH)/linux/$(3)
CFLAGS_devlink:=$(call get_hdr_inc,_LINUX_DEVLINK_H_,devlink.h)
CFLAGS_dpll:=$(call get_hdr_inc,_LINUX_DPLL_H,dpll.h)
@ -48,3 +49,4 @@ CFLAGS_tc:= $(call get_hdr_inc,__LINUX_RTNETLINK_H,rtnetlink.h) \
$(call get_hdr_inc,_TC_SKBEDIT_H,tc_act/tc_skbedit.h) \
$(call get_hdr_inc,_TC_TUNNEL_KEY_H,tc_act/tc_tunnel_key.h)
CFLAGS_tcp_metrics:=$(call get_hdr_inc,_LINUX_TCP_METRICS_H,tcp_metrics.h)
CFLAGS_wireguard:=$(call get_hdr_inc2,_LINUX_WIREGUARD_H,_WG_UAPI_WIREGUARD_H,wireguard.h)

View File

@ -755,9 +755,6 @@ TEST_F(iommufd_ioas, get_hw_info)
struct iommu_test_hw_info info;
uint64_t trailing_bytes;
} buffer_larger;
struct iommu_test_hw_info_buffer_smaller {
__u32 flags;
} buffer_smaller;
if (self->device_id) {
uint8_t max_pasid = 0;
@ -789,8 +786,9 @@ TEST_F(iommufd_ioas, get_hw_info)
* the fields within the size range still gets updated.
*/
test_cmd_get_hw_info(self->device_id,
IOMMU_HW_INFO_TYPE_DEFAULT,
&buffer_smaller, sizeof(buffer_smaller));
IOMMU_HW_INFO_TYPE_DEFAULT, &buffer_exact,
offsetofend(struct iommu_test_hw_info,
flags));
test_cmd_get_hw_info_pasid(self->device_id, &max_pasid);
ASSERT_EQ(0, max_pasid);
if (variant->pasid_capable) {

View File

@ -1,4 +1,9 @@
CFLAGS += $(KHDR_INCLUDES) -Wall -Wflex-array-member-not-at-end
top_srcdir := ../../../../..
include $(top_srcdir)/scripts/Makefile.compiler
cc-option = $(call __cc-option, $(CC),,$(1),$(2))
CFLAGS += $(KHDR_INCLUDES) -Wall $(call cc-option,-Wflex-array-member-not-at-end)
TEST_GEN_PROGS := \
diag_uid \

View File

@ -29,6 +29,7 @@ CONFIG_NET_ACT_VLAN=m
CONFIG_NET_CLS_BASIC=m
CONFIG_NET_CLS_FLOWER=m
CONFIG_NET_CLS_MATCHALL=m
CONFIG_NET_CLS_U32=m
CONFIG_NET_EMATCH=y
CONFIG_NET_EMATCH_META=m
CONFIG_NETFILTER=y

View File

@ -138,13 +138,18 @@ install_capture()
defer tc qdisc del dev "$dev" clsact
tc filter add dev "$dev" ingress proto ip pref 104 \
flower skip_hw ip_proto udp dst_port "$VXPORT" \
action pass
u32 match ip protocol 0x11 0xff \
match u16 "$VXPORT" 0xffff at 0x16 \
match u16 0x0800 0xffff at 0x30 \
action pass
defer tc filter del dev "$dev" ingress proto ip pref 104
tc filter add dev "$dev" ingress proto ipv6 pref 106 \
flower skip_hw ip_proto udp dst_port "$VXPORT" \
action pass
u32 match ip6 protocol 0x11 0xff \
match u16 "$VXPORT" 0xffff at 0x2a \
match u16 0x86dd 0xffff at 0x44 \
match u8 0x11 0xff at 0x4c \
action pass
defer tc filter del dev "$dev" ingress proto ipv6 pref 106
}
@ -248,13 +253,6 @@ vx_create()
}
export -f vx_create
vx_wait()
{
# Wait for all the ARP, IGMP etc. noise to settle down so that the
# tunnel is clear for measurements.
sleep 10
}
vx10_create()
{
vx_create vx10 10 id 1000 "$@"
@ -267,18 +265,6 @@ vx20_create()
}
export -f vx20_create
vx10_create_wait()
{
vx10_create "$@"
vx_wait
}
vx20_create_wait()
{
vx20_create "$@"
vx_wait
}
ns_init_common()
{
local ns=$1; shift
@ -554,7 +540,7 @@ ipv4_nomcroute()
# Install a misleading (S,G) rule to attempt to trick the system into
# pushing the packets elsewhere.
adf_install_broken_sg
vx10_create_wait local 192.0.2.100 group "$GROUP4" dev "$swp2"
vx10_create local 192.0.2.100 group "$GROUP4" dev "$swp2"
do_test 4 10 0 "IPv4 nomcroute"
}
@ -562,7 +548,7 @@ ipv6_nomcroute()
{
# Like for IPv4, install a misleading (S,G).
adf_install_broken_sg
vx20_create_wait local 2001:db8:4::1 group "$GROUP6" dev "$swp2"
vx20_create local 2001:db8:4::1 group "$GROUP6" dev "$swp2"
do_test 6 10 0 "IPv6 nomcroute"
}
@ -581,35 +567,35 @@ ipv6_nomcroute_rx()
ipv4_mcroute()
{
adf_install_sg
vx10_create_wait local 192.0.2.100 group "$GROUP4" dev "$IPMR" mcroute
vx10_create local 192.0.2.100 group "$GROUP4" dev "$IPMR" mcroute
do_test 4 10 10 "IPv4 mcroute"
}
ipv6_mcroute()
{
adf_install_sg
vx20_create_wait local 2001:db8:4::1 group "$GROUP6" dev "$IPMR" mcroute
vx20_create local 2001:db8:4::1 group "$GROUP6" dev "$IPMR" mcroute
do_test 6 10 10 "IPv6 mcroute"
}
ipv4_mcroute_rx()
{
adf_install_sg
vx10_create_wait local 192.0.2.100 group "$GROUP4" dev "$IPMR" mcroute
vx10_create local 192.0.2.100 group "$GROUP4" dev "$IPMR" mcroute
ipv4_do_test_rx 0 "IPv4 mcroute ping"
}
ipv6_mcroute_rx()
{
adf_install_sg
vx20_create_wait local 2001:db8:4::1 group "$GROUP6" dev "$IPMR" mcroute
vx20_create local 2001:db8:4::1 group "$GROUP6" dev "$IPMR" mcroute
ipv6_do_test_rx 0 "IPv6 mcroute ping"
}
ipv4_mcroute_changelink()
{
adf_install_sg
vx10_create_wait local 192.0.2.100 group "$GROUP4" dev "$IPMR"
vx10_create local 192.0.2.100 group "$GROUP4" dev "$IPMR"
ip link set dev vx10 type vxlan mcroute
sleep 1
do_test 4 10 10 "IPv4 mcroute changelink"
@ -618,7 +604,7 @@ ipv4_mcroute_changelink()
ipv6_mcroute_changelink()
{
adf_install_sg
vx20_create_wait local 2001:db8:4::1 group "$GROUP6" dev "$IPMR" mcroute
vx20_create local 2001:db8:4::1 group "$GROUP6" dev "$IPMR" mcroute
ip link set dev vx20 type vxlan mcroute
sleep 1
do_test 6 10 10 "IPv6 mcroute changelink"
@ -627,47 +613,47 @@ ipv6_mcroute_changelink()
ipv4_mcroute_starg()
{
adf_install_starg
vx10_create_wait local 192.0.2.100 group "$GROUP4" dev "$IPMR" mcroute
vx10_create local 192.0.2.100 group "$GROUP4" dev "$IPMR" mcroute
do_test 4 10 10 "IPv4 mcroute (*,G)"
}
ipv6_mcroute_starg()
{
adf_install_starg
vx20_create_wait local 2001:db8:4::1 group "$GROUP6" dev "$IPMR" mcroute
vx20_create local 2001:db8:4::1 group "$GROUP6" dev "$IPMR" mcroute
do_test 6 10 10 "IPv6 mcroute (*,G)"
}
ipv4_mcroute_starg_rx()
{
adf_install_starg
vx10_create_wait local 192.0.2.100 group "$GROUP4" dev "$IPMR" mcroute
vx10_create local 192.0.2.100 group "$GROUP4" dev "$IPMR" mcroute
ipv4_do_test_rx 0 "IPv4 mcroute (*,G) ping"
}
ipv6_mcroute_starg_rx()
{
adf_install_starg
vx20_create_wait local 2001:db8:4::1 group "$GROUP6" dev "$IPMR" mcroute
vx20_create local 2001:db8:4::1 group "$GROUP6" dev "$IPMR" mcroute
ipv6_do_test_rx 0 "IPv6 mcroute (*,G) ping"
}
ipv4_mcroute_noroute()
{
vx10_create_wait local 192.0.2.100 group "$GROUP4" dev "$IPMR" mcroute
vx10_create local 192.0.2.100 group "$GROUP4" dev "$IPMR" mcroute
do_test 4 0 0 "IPv4 mcroute, no route"
}
ipv6_mcroute_noroute()
{
vx20_create_wait local 2001:db8:4::1 group "$GROUP6" dev "$IPMR" mcroute
vx20_create local 2001:db8:4::1 group "$GROUP6" dev "$IPMR" mcroute
do_test 6 0 0 "IPv6 mcroute, no route"
}
ipv4_mcroute_fdb()
{
adf_install_sg
vx10_create_wait local 192.0.2.100 dev "$IPMR" mcroute
vx10_create local 192.0.2.100 dev "$IPMR" mcroute
bridge fdb add dev vx10 \
00:00:00:00:00:00 self static dst "$GROUP4" via "$IPMR"
do_test 4 10 10 "IPv4 mcroute FDB"
@ -676,7 +662,7 @@ ipv4_mcroute_fdb()
ipv6_mcroute_fdb()
{
adf_install_sg
vx20_create_wait local 2001:db8:4::1 dev "$IPMR" mcroute
vx20_create local 2001:db8:4::1 dev "$IPMR" mcroute
bridge -6 fdb add dev vx20 \
00:00:00:00:00:00 self static dst "$GROUP6" via "$IPMR"
do_test 6 10 10 "IPv6 mcroute FDB"
@ -686,7 +672,7 @@ ipv6_mcroute_fdb()
ipv4_mcroute_fdb_oif0()
{
adf_install_sg
vx10_create_wait local 192.0.2.100 group "$GROUP4" dev "$IPMR" mcroute
vx10_create local 192.0.2.100 group "$GROUP4" dev "$IPMR" mcroute
bridge fdb del dev vx10 00:00:00:00:00:00
bridge fdb add dev vx10 00:00:00:00:00:00 self static dst "$GROUP4"
do_test 4 10 10 "IPv4 mcroute oif=0"
@ -703,7 +689,7 @@ ipv6_mcroute_fdb_oif0()
defer ip -6 route del table local multicast "$GROUP6/128" dev "$IPMR"
adf_install_sg
vx20_create_wait local 2001:db8:4::1 group "$GROUP6" dev "$IPMR" mcroute
vx20_create local 2001:db8:4::1 group "$GROUP6" dev "$IPMR" mcroute
bridge -6 fdb del dev vx20 00:00:00:00:00:00
bridge -6 fdb add dev vx20 00:00:00:00:00:00 self static dst "$GROUP6"
do_test 6 10 10 "IPv6 mcroute oif=0"
@ -716,7 +702,7 @@ ipv4_mcroute_fdb_oif0_sep()
adf_install_sg_sep
adf_ip_addr_add lo 192.0.2.120/28
vx10_create_wait local 192.0.2.120 group "$GROUP4" dev "$IPMR" mcroute
vx10_create local 192.0.2.120 group "$GROUP4" dev "$IPMR" mcroute
bridge fdb del dev vx10 00:00:00:00:00:00
bridge fdb add dev vx10 00:00:00:00:00:00 self static dst "$GROUP4"
do_test 4 10 10 "IPv4 mcroute TX!=RX oif=0"
@ -727,7 +713,7 @@ ipv4_mcroute_fdb_oif0_sep_rx()
adf_install_sg_sep_rx lo
adf_ip_addr_add lo 192.0.2.120/28
vx10_create_wait local 192.0.2.120 group "$GROUP4" dev "$IPMR" mcroute
vx10_create local 192.0.2.120 group "$GROUP4" dev "$IPMR" mcroute
bridge fdb del dev vx10 00:00:00:00:00:00
bridge fdb add dev vx10 00:00:00:00:00:00 self static dst "$GROUP4"
ipv4_do_test_rx 0 "IPv4 mcroute TX!=RX oif=0 ping"
@ -738,7 +724,7 @@ ipv4_mcroute_fdb_sep_rx()
adf_install_sg_sep_rx lo
adf_ip_addr_add lo 192.0.2.120/28
vx10_create_wait local 192.0.2.120 group "$GROUP4" dev "$IPMR" mcroute
vx10_create local 192.0.2.120 group "$GROUP4" dev "$IPMR" mcroute
bridge fdb del dev vx10 00:00:00:00:00:00
bridge fdb add \
dev vx10 00:00:00:00:00:00 self static dst "$GROUP4" via lo
@ -750,7 +736,7 @@ ipv6_mcroute_fdb_sep_rx()
adf_install_sg_sep_rx "X$IPMR"
adf_ip_addr_add "X$IPMR" 2001:db8:5::1/64
vx20_create_wait local 2001:db8:5::1 group "$GROUP6" dev "$IPMR" mcroute
vx20_create local 2001:db8:5::1 group "$GROUP6" dev "$IPMR" mcroute
bridge -6 fdb del dev vx20 00:00:00:00:00:00
bridge -6 fdb add dev vx20 00:00:00:00:00:00 \
self static dst "$GROUP6" via "X$IPMR"

View File

@ -280,7 +280,8 @@ tc_rule_stats_get()
local selector=${1:-.packets}; shift
tc -j -s filter show dev $dev $dir pref $pref \
| jq ".[1].options.actions[].stats$selector"
| jq ".[] | select(.options.actions) |
.options.actions[].stats$selector"
}
tc_rule_handle_stats_get()

View File

@ -24,7 +24,8 @@ static inline void ksft_ready(void)
fd = STDOUT_FILENO;
}
write(fd, msg, sizeof(msg));
if (write(fd, msg, sizeof(msg)) < 0)
perror("write()");
if (fd != STDOUT_FILENO)
close(fd);
}
@ -48,7 +49,8 @@ static inline void ksft_wait(void)
fd = STDIN_FILENO;
}
read(fd, &byte, sizeof(byte));
if (read(fd, &byte, sizeof(byte)) < 0)
perror("read()");
if (fd != STDIN_FILENO)
close(fd);
}

Some files were not shown because too many files have changed in this diff Show More