mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-01-12 01:20:14 +00:00

12 Commits

Author SHA1 Message Date
Eric Biggers
862445d3b9 lib/crypto: s390/sha3: Add optimized one-shot SHA-3 digest functions
Some z/Architecture processors can compute a SHA-3 digest in a single
instruction.  arch/s390/crypto/ already uses this capability to optimize
the SHA-3 crypto_shash algorithms.

Use this capability to implement the sha3_224(), sha3_256(), sha3_384(),
and sha3_512() library functions too.
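
For illustration, here is a minimal sketch of what such a one-shot
helper can look like on top of the CPACF KLMD instruction.  The
cpacf_klmd() helper and the CPACF_KLMD_SHA3_256 function code are from
arch/s390/include/asm/cpacf.h, but the surrounding glue is a hedged
sketch, not the code from this commit:

    /* Sketch only: one-shot SHA3-256 via KLMD.  The instruction
     * absorbs the whole message, applies the SHA-3 padding, and runs
     * the Keccak permutations; the digest is then read back out of
     * the 200-byte Keccak state used as the parameter block. */
    static void sha3_256_one_shot(const u8 *data, size_t len,
                                  u8 out[SHA3_256_DIGEST_SIZE])
    {
            u8 param[200];          /* 1600-bit Keccak state */

            memset(param, 0, sizeof(param)); /* SHA-3 IV is all-zero */
            cpacf_klmd(CPACF_KLMD_SHA3_256, param, data, len);
            memcpy(out, param, SHA3_256_DIGEST_SIZE);
    }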

SHA3-256 benchmark results provided by Harald Freudenberger
(https://lore.kernel.org/r/4188d18bfcc8a64941c5ebd8de10ede2@linux.ibm.com/)
on a z/Architecture machine with "facility 86" (MSA level 12):

    Length (bytes)    Before (MB/s)   After (MB/s)
    ==============    =============   ============
          16                212             225
          64                820             915
         256               1850            3350
        1024               5400            8300
        4096              11200           11300

Note: the original data from Harald was given in the form of a graph for
each length, showing the distribution of throughputs from 500 runs.  I
guesstimated the peak of each one.

Harald also reported that the generic SHA-3 code was at most 259 MB/s
(https://lore.kernel.org/r/c39f6b6c110def0095e5da5becc12085@linux.ibm.com/).
So as expected, the earlier commit that optimized sha3_absorb_blocks()
and sha3_keccakf() is the more important one; it optimized the Keccak
permutation, which is the most performance-critical part of SHA-3.
Still, this additional commit notably improves performance further at
some lengths.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Link: https://lore.kernel.org/r/20251026055032.1413733-13-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:30:41 -08:00
Eric Biggers
04171105d3 lib/crypto: s390/sha3: Add optimized Keccak functions
Implement sha3_absorb_blocks() and sha3_keccakf() using the hardware-
accelerated SHA-3 support in Message-Security-Assist Extension 6.

This accelerates the SHA3-224, SHA3-256, SHA3-384, SHA3-512, and
SHAKE256 library functions.

Note that arch/s390/crypto/ already has SHA-3 code that uses this
extension, but it is exposed only via crypto_shash.  This commit brings
the same acceleration to the SHA-3 library.  The arch/s390/crypto/
version will become redundant and be removed in later changes.
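
As a hedged illustration of the mapping, sha3_absorb_blocks() can be
expressed almost directly in terms of the KIMD instruction.  The
cpacf_kimd() helper and the CPACF_KIMD_SHA3_* function codes are from
arch/s390/include/asm/cpacf.h; the function signature and state layout
below are assumptions made for the example:

    /* Sketch only: the KIMD function code encodes the rate, so derive
     * it from the block size.  KIMD then absorbs every block and runs
     * Keccak-f[1600] once per block, updating the 200-byte state in
     * place.  (SHAKE256 shares the SHA3-256 rate of 136 bytes.) */
    static void sha3_absorb_blocks(u64 state[25], const u8 *data,
                                   size_t nblocks, size_t block_size)
    {
            unsigned long fc;

            switch (block_size) {
            case SHA3_224_BLOCK_SIZE: fc = CPACF_KIMD_SHA3_224; break;
            case SHA3_256_BLOCK_SIZE: fc = CPACF_KIMD_SHA3_256; break;
            case SHA3_384_BLOCK_SIZE: fc = CPACF_KIMD_SHA3_384; break;
            default:                  fc = CPACF_KIMD_SHA3_512; break;
            }
            cpacf_kimd(fc, state, data, nblocks * block_size);
    }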

Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251026055032.1413733-11-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:02:35 -08:00
Eric Biggers
13cecc526d lib/crypto: chacha: Consolidate into single module
Consolidate the ChaCha code into a single module (excluding
chacha-block-generic.c which remains always built-in for random.c),
similar to various other algorithms:

- Each arch now provides a header file lib/crypto/$(SRCARCH)/chacha.h,
  replacing lib/crypto/$(SRCARCH)/chacha*.c.  The header defines
  chacha_crypt_arch() and hchacha_block_arch().  It is included by
  lib/crypto/chacha.c, and thus the code gets built into the single
  libchacha module, with improved inlining in some cases.

- Whether arch-optimized ChaCha is buildable is now controlled centrally
  by lib/crypto/Kconfig instead of by lib/crypto/$(SRCARCH)/Kconfig.
  The conditions for enabling it remain the same as before, and it
  remains enabled by default.

- Any additional arch-specific translation units for the optimized
  ChaCha code, such as assembly files, are now compiled by
  lib/crypto/Makefile instead of lib/crypto/$(SRCARCH)/Makefile.

This removes the last use for the Makefile and Kconfig files in the
arm64, mips, powerpc, riscv, and s390 subdirectories of lib/crypto/.  So
also remove those files and the references to them.
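
For reference, the per-arch header pattern from the first bullet can be
sketched as follows.  Only chacha_crypt_arch() and hchacha_block_arch()
are named by this commit; the static key, chacha_crypt_simd(), and the
kernel_fpu_begin()/end() pairing are illustrative assumptions borrowed
from the usual arch-glue shape:

    /* Sketch of a lib/crypto/$(SRCARCH)/chacha.h, which is textually
     * included by lib/crypto/chacha.c and so built into libchacha. */
    static __ro_after_init DEFINE_STATIC_KEY_FALSE(have_chacha_simd);

    static void chacha_crypt_arch(struct chacha_state *state, u8 *dst,
                                  const u8 *src, unsigned int bytes,
                                  int nrounds)
    {
            if (static_branch_likely(&have_chacha_simd) &&
                crypto_simd_usable()) {
                    kernel_fpu_begin();     /* arch SIMD region begin */
                    chacha_crypt_simd(state, dst, src, bytes, nrounds);
                    kernel_fpu_end();
            } else {
                    chacha_crypt_generic(state, dst, src, bytes, nrounds);
            }
    }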

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250827151131.27733-7-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-08-29 09:50:19 -07:00
Eric Biggers
c4b846ff6e lib/crypto: chacha: Remove unused function chacha_is_arch_optimized()
chacha_is_arch_optimized() is no longer used, so remove it.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250827151131.27733-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-08-29 09:50:19 -07:00
Eric Biggers
5012bd2dc6 lib/crypto: Drop inline from all *_mod_init_arch() functions
Drop 'inline' from all the *_mod_init_arch() functions so that the
compiler will warn about any bugs where they are unused due to not being
wired up properly.  (There are no such bugs currently, so this just
establishes a more robust convention for the future.  Of course, these
functions also tend to get inlined anyway, regardless of the keyword.)
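
For illustration of why this works: GCC and Clang exempt unused
'static inline' functions from -Wunused-function, but do warn about a
plain unused 'static' function.  So with a hypothetical arch header
fragment like:

    /* lib/crypto/$(SRCARCH)/sha256.h -- hypothetical fragment */
    static void sha256_mod_init_arch(void)         /* no 'inline' */
    {
            if (cpu_has_sha256_insns())   /* hypothetical feature test */
                    static_branch_enable(&have_sha256_insns);
    }

a lib/crypto/sha256.c that includes the header but never calls
sha256_mod_init_arch() now gets a -Wunused-function warning instead of
silently dropping the function, which is exactly the wiring bug this
convention is meant to catch.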

Link: https://lore.kernel.org/r/20250816020457.432040-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-08-27 08:15:35 -07:00
Linus Torvalds
bc46b7cbc5 s390 updates for 6.17 merge window

Merge tag 's390-6.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 updates from Alexander Gordeev:

 - Standardize on the __ASSEMBLER__ macro that is provided by the GCC
   and Clang compilers, replacing __ASSEMBLY__ with __ASSEMBLER__ in
   both uapi and non-uapi headers (an illustrative guard fragment
   follows this list)

 - Explicitly include <linux/export.h> in architecture and driver files
   which contain an EXPORT_SYMBOL() and remove the include from the
   files which do not contain the EXPORT_SYMBOL()

 - Use the full title of "z/Architecture Principles of Operation" manual
   and the name of a section where facility bits are listed

 - Use -D__DISABLE_EXPORTS for files in arch/s390/boot to avoid
   unnecessary slowing down of the build and confusing external kABI
   tools that process symtypes data

 - Print additional unrecoverable machine check information to make the
   root cause analysis easier

 - Move cmpxchg_user_key() handling to uaccess library code, since the
   generated code is large anyway and there is no benefit if it is
   inlined

 - Fix a problem when cmpxchg_user_key() executes code with a
   non-default key: if a system is IPL-ed with "LOAD NORMAL", the
   previous system used storage keys with the fetch-protection bit set
   for some pages, and cmpxchg_user_key() is located within such a
   page, a protection exception happens

 - Either the external call or the emergency signal order is used to
   send an IPI to a remote CPU. Use the external call order only, since
   it is at least as good as, and sometimes better than, the emergency
   signal

 - In case of an early crash the early program check handler prints a
   more or less random value as the last breaking event address, since
   it is not initialized properly. Copy the last breaking event address
   from the lowcore to pt_regs to address this

 - During the STP synchronization check udelay() cannot be used, since
   the first CPU modifies tod_clock_base and get_tod_clock_monotonic()
   might return a non-monotonic time. Instead, busy-loop on the other
   CPUs while the first CPU actually handles the synchronization
   operation

 - When debugging the early kernel boot using QEMU with the -S flag and
   GDB attached, skip the decompressor and start directly in the kernel

 - Rename PAI Crypto event 4210 according to z16 and z17 "z/Architecture
   Principles of Operation" manual

 - Remove the in-kernel time steering support in favour of the new s390
   PTP driver, which allows the kernel clock to be steered more
   precisely

 - Remove a possible false-positive warning in pte_free_defer(), which
   could be triggered in a valid case while a KVM guest process is
   initializing
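
As flagged in the first item above, the switch is mechanical; an
illustrative header-guard fragment (the declaration is a placeholder):

    /* Before: relies on the kernel build passing -D__ASSEMBLY__ when
     * preprocessing assembly files. */
    #ifndef __ASSEMBLY__
    void machine_check_print_info(void);    /* C-only declaration */
    #endif

    /* After: __ASSEMBLER__ is predefined by both GCC and Clang
     * whenever they preprocess assembly, so no -D flag is needed. */
    #ifndef __ASSEMBLER__
    void machine_check_print_info(void);    /* C-only declaration */
    #endif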

* tag 's390-6.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (29 commits)
  s390/mm: Remove possible false-positive warning in pte_free_defer()
  s390/stp: Default to enabled
  s390/stp: Remove leap second support
  s390/time: Remove in-kernel time steering
  s390/sclp: Use monotonic clock in sclp_sync_wait()
  s390/smp: Use monotonic clock in smp_emergency_stop()
  s390/time: Use monotonic clock in get_cycles()
  s390/pai_crypto: Rename PAI Crypto event 4210
  scripts/gdb/symbols: make lx-symbols skip the s390 decompressor
  s390/boot: Introduce jump_to_kernel() function
  s390/stp: Remove udelay from stp_sync_clock()
  s390/early: Copy last breaking event address to pt_regs
  s390/smp: Remove conditional emergency signal order code usage
  s390/uaccess: Merge cmpxchg_user_key() inline assemblies
  s390/uaccess: Prevent kprobes on cmpxchg_user_key() functions
  s390/uaccess: Initialize code pages executed with non-default access key
  s390/skey: Provide infrastructure for executing with non-default access key
  s390/uaccess: Make cmpxchg_user_key() library code
  s390/page: Add memory clobber to page_set_storage_key()
  s390/page: Cleanup page_set_storage_key() inline assemblies
  ...
2025-07-29 20:17:08 -07:00
Eric Biggers
377982d561 lib/crypto: s390/sha1: Migrate optimized code into library
Instead of exposing the s390-optimized SHA-1 code via s390-specific
crypto_shash algorithms, just implement the sha1_blocks()
library function.  This is much simpler, it makes the SHA-1 library
functions be s390-optimized, and it fixes the longstanding issue where
the s390-optimized SHA-1 code was disabled by default.  SHA-1 still
remains available through crypto_shash, but individual architectures no
longer need to handle it.
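
As a hedged sketch of the shape this takes: cpacf_kimd() and
CPACF_KIMD_SHA_1 are real CPACF helpers whose SHA-1 parameter block is
the 20-byte chaining state, but the static key and fallback below
follow the generic lib/crypto glue pattern rather than this exact
commit:

    static void sha1_blocks(struct sha1_block_state *state,
                            const u8 *data, size_t nblocks)
    {
            if (static_branch_likely(&have_cpacf_sha1)) /* hypothetical key */
                    cpacf_kimd(CPACF_KIMD_SHA_1, state, data,
                               nblocks * SHA1_BLOCK_SIZE);
            else
                    sha1_blocks_generic(state, data, nblocks);
    }

The SHA-512 migration further down this log (commit b7b366087e) follows
the same pattern with the CPACF SHA-512 function code.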

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-12-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-14 11:11:49 -07:00
Eric Biggers
e96cb9507f lib/crypto: sha256: Consolidate into single module
Consolidate the CPU-based SHA-256 code into a single module, following
what I did with SHA-512:

- Each arch now provides a header file lib/crypto/$(SRCARCH)/sha256.h,
  replacing lib/crypto/$(SRCARCH)/sha256.c.  The header defines
  sha256_blocks() and optionally sha256_mod_init_arch().  It is included
  by lib/crypto/sha256.c, and thus the code gets built into the single
  libsha256 module, with proper inlining and dead code elimination.

- sha256_blocks_generic() is moved from lib/crypto/sha256-generic.c into
  lib/crypto/sha256.c.  It's now a static function marked with
  __maybe_unused, so the compiler automatically eliminates it in any
  cases where it's not used.

- Whether arch-optimized SHA-256 is buildable is now controlled
  centrally by lib/crypto/Kconfig instead of by
  lib/crypto/$(SRCARCH)/Kconfig.  The conditions for enabling it remain
  the same as before, and it remains enabled by default.

- Any additional arch-specific translation units for the optimized
  SHA-256 code (such as assembly files) are now compiled by
  lib/crypto/Makefile instead of lib/crypto/$(SRCARCH)/Makefile.
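
The rough shape of the consolidated lib/crypto/sha256.c, with the
Kconfig symbol and include mechanics shown as assumptions rather than
verbatim code:

    /* Generic compression function: static and __maybe_unused, so the
     * compiler discards it when the arch header never falls back. */
    static void __maybe_unused
    sha256_blocks_generic(struct sha256_block_state *state,
                          const u8 *data, size_t nblocks)
    {
            /* portable C implementation */
    }

    #ifdef CONFIG_CRYPTO_LIB_SHA256_ARCH
    #include "sha256.h"  /* resolves to lib/crypto/$(SRCARCH)/sha256.h */
    #else
    /* One possible way to wire the fallback when no arch code exists: */
    #define sha256_blocks sha256_blocks_generic
    #endif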

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160645.3198-13-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-04 10:23:11 -07:00
Eric Biggers
9f9846a72e lib/crypto: sha256: Remove sha256_is_arch_optimized()
Remove sha256_is_arch_optimized(), since it is no longer used.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160645.3198-12-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-04 10:23:11 -07:00
Eric Biggers
4c855d5069 lib/crypto: sha256: Propagate sha256_block_state type to implementations
The previous commit made the SHA-256 compression function state be
strongly typed, but it wasn't propagated all the way down to the
implementations of it.  Do that now.
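
Concretely, the strongly typed state is just the eight 32-bit SHA-256
chaining words, and propagating it means every implementation now takes
the struct instead of a raw u32 pointer (shown per the library's
definition; treat details as illustrative):

    struct sha256_block_state {
            u32 h[8];       /* H0..H7, the SHA-256 chaining variables */
    };

    void sha256_blocks(struct sha256_block_state *state,
                       const u8 *data, size_t nblocks);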

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160645.3198-8-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-04 10:22:57 -07:00
Eric Biggers
b8456f7aaf lib/crypto: s390: Move arch/s390/lib/crypto/ into lib/crypto/
Move the contents of arch/s390/lib/crypto/ into lib/crypto/s390/.

The new code organization makes a lot more sense for how this code
actually works and is developed.  In particular, it makes it possible to
build each algorithm as a single module, with better inlining and dead
code elimination.  For a more detailed explanation, see the patchset
which did this for the CRC library code:
https://lore.kernel.org/r/20250607200454.73587-1-ebiggers@kernel.org/.
Also see the patchset which did this for SHA-512:
https://lore.kernel.org/linux-crypto/20250616014019.415791-1-ebiggers@kernel.org/

This is just a preparatory commit, which does the move to get the files
into their new location but keeps them building the same way as before.
Later commits will make the actual improvements to the way the
arch-optimized code is integrated for each algorithm.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20250619191908.134235-7-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-06-30 09:26:20 -07:00
Eric Biggers
b7b366087e lib/crypto: s390/sha512: Migrate optimized SHA-512 code to library
Instead of exposing the s390-optimized SHA-512 code via s390-specific
crypto_shash algorithms, just implement the sha512_blocks()
library function.  This is much simpler, it makes the SHA-512 (and
SHA-384) library functions be s390-optimized, and it fixes the
longstanding issue where the s390-optimized SHA-512 code was disabled by
default.  SHA-512 still remains available through crypto_shash, but
individual architectures no longer need to handle it.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160320.2888-13-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-06-30 09:26:20 -07:00