Compare commits

...

269 Commits

Linus Torvalds
3f9f025213 Merge tag 'random-6.19-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random
Pull random number generator updates from Jason Donenfeld:

 - Dynamically allocate cpumasks off the stack if the kernel is
   configured for a lot of CPUs, to handle a -Wframe-larger-than case
   (see the sketch after this list)

 - The removal of next_pseudo_random32() after the last user was
   switched over to the prandom interface

 - The removal of get_random_u{8,16,32,64}_wait() functions, as there
   were no users of those at all

 - Some housekeeping changes - a few grammar cleanups in the
   comments, system_unbound_wq was renamed to system_dfl_wq, and
   static_key_initialized no longer needs to be checked
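
A hedged sketch of the off-stack cpumask pattern referenced in the first
item above (function and variable names are illustrative, not taken from
the actual patch):

  #include <linux/cpumask.h>
  #include <linux/errno.h>
  #include <linux/slab.h>

  /* With CONFIG_CPUMASK_OFFSTACK=y (large NR_CPUS), cpumask_var_t is a
   * pointer and zalloc_cpumask_var() allocates the mask from the heap,
   * so it no longer inflates the stack frame; with the option disabled,
   * the mask stays on the stack and the alloc/free calls are no-ops. */
  static int frob_cpus_example(void)
  {
          cpumask_var_t mask;

          if (!zalloc_cpumask_var(&mask, GFP_KERNEL))
                  return -ENOMEM;

          /* ... populate and use 'mask' with cpumask_set_cpu() etc. ... */

          free_cpumask_var(mask);
          return 0;
  }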

* tag 'random-6.19-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
  random: complete sentence of comment
  random: drop check for static_key_initialized
  random: remove unused get_random_var_wait functions
  random: replace use of system_unbound_wq with system_dfl_wq
  random: use offstack cpumask when necessary
  prandom: remove next_pseudo_random32
  media: vivid: use prandom
  random: add missing words in function comments
2025-12-02 19:00:26 -08:00
Linus Torvalds
f617d24606 Merge tag 'fpsimd-on-stack-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull arm64 FPSIMD on-stack buffer updates from Eric Biggers:
 "This is a core arm64 change. However, I was asked to take this because
  most uses of kernel-mode FPSIMD are in crypto or CRC code.

  In v6.8, the size of task_struct on arm64 increased by 528 bytes due
  to the new 'kernel_fpsimd_state' field. This field was added to allow
  kernel-mode FPSIMD code to be preempted.

  Unfortunately, 528 bytes is kind of a lot for task_struct. This
  regression in the task_struct size was noticed and reported.

  Recover that space by making this state be allocated on the stack at
  the beginning of each kernel-mode FPSIMD section.

  To make it easier for all the users of kernel-mode FPSIMD to do that
  correctly, introduce and use a 'scoped_ksimd' abstraction"
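
As a rough illustration of the 'scoped_ksimd' idea (the worker function
below is hypothetical, and the exact macro shape is inferred from the
description rather than quoted from the series):

  /* Before: explicit begin/end around a kernel-mode FPSIMD/NEON section,
   * with the preemption save area living in task_struct. */
  kernel_neon_begin();
  do_simd_work(dst, src, len);            /* hypothetical worker */
  kernel_neon_end();

  /* After: a scope-based guard; the save area is an on-stack buffer that
   * exists only for the duration of the braces and is released
   * automatically when the scope is left. */
  scoped_ksimd() {
          do_simd_work(dst, src, len);
  }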

* tag 'fpsimd-on-stack-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (23 commits)
  lib/crypto: arm64: Move remaining algorithms to scoped ksimd API
  lib/crypto: arm/blake2b: Move to scoped ksimd API
  arm64/fpsimd: Allocate kernel mode FP/SIMD buffers on the stack
  arm64/fpu: Enforce task-context only for generic kernel mode FPU
  net/mlx5: Switch to more abstract scoped ksimd guard API on arm64
  arm64/xorblocks:  Switch to 'ksimd' scoped guard API
  crypto/arm64: sm4 - Switch to 'ksimd' scoped guard API
  crypto/arm64: sm3 - Switch to 'ksimd' scoped guard API
  crypto/arm64: sha3 - Switch to 'ksimd' scoped guard API
  crypto/arm64: polyval - Switch to 'ksimd' scoped guard API
  crypto/arm64: nhpoly1305 - Switch to 'ksimd' scoped guard API
  crypto/arm64: aes-gcm - Switch to 'ksimd' scoped guard API
  crypto/arm64: aes-blk - Switch to 'ksimd' scoped guard API
  crypto/arm64: aes-ccm - Switch to 'ksimd' scoped guard API
  raid6: Move to more abstract 'ksimd' guard API
  crypto: aegis128-neon - Move to more abstract 'ksimd' guard API
  crypto/arm64: sm4-ce-gcm - Avoid pointless yield of the NEON unit
  crypto/arm64: sm4-ce-ccm - Avoid pointless yield of the NEON unit
  crypto/arm64: aes-ce-ccm - Avoid pointless yield of the NEON unit
  lib/crc: Switch ARM and arm64 to 'ksimd' scoped guard API
  ...
2025-12-02 18:53:50 -08:00
Linus Torvalds
906003e151 Merge tag 'libcrypto-at-least-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull 'at_least' array size update from Eric Biggers:
 "C supports lower bounds on the sizes of array parameters, using the
  static keyword as follows: 'void f(int a[static 32]);'. This allows
  the compiler to warn about a too-small array being passed.

  As discussed, this reuse of the 'static' keyword, while standard, is a
  bit obscure. Therefore, add an alias 'at_least' to compiler_types.h.

  Then, add this 'at_least' annotation to the array parameters of
  various crypto library functions"
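
A small standalone C sketch of the feature described above; the 'at_least'
definition shown here is an assumption (a plain alias for 'static'), since
the actual compiler_types.h change is not quoted in this log:

  /* Standard C: callers passing an array known to hold fewer than 32
   * elements can be diagnosed by GCC and Clang. */
  void fill_key(unsigned char key[static 32]);

  #define at_least static                 /* assumed shape of the alias */
  void fill_iv(unsigned char iv[at_least 16]);

  void example(void)
  {
          unsigned char small[8];

          fill_key(small);        /* compiler can warn: array too small */
  }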

* tag 'libcrypto-at-least-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
  lib/crypto: sha2: Add at_least decoration to fixed-size array params
  lib/crypto: sha1: Add at_least decoration to fixed-size array params
  lib/crypto: poly1305: Add at_least decoration to fixed-size array params
  lib/crypto: md5: Add at_least decoration to fixed-size array params
  lib/crypto: curve25519: Add at_least decoration to fixed-size array params
  lib/crypto: chacha: Add at_least decoration to fixed-size array params
  lib/crypto: chacha20poly1305: Statically check fixed array lengths
  compiler_types: introduce at_least parameter decoration pseudo keyword
  wifi: iwlwifi: trans: rename at_least variable to min_mode
2025-12-02 18:26:54 -08:00
Linus Torvalds
8f4c9978de Merge tag 'aes-gcm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull AES-GCM optimizations from Eric Biggers:
 "More optimizations and cleanups for the x86_64 AES-GCM code:

   - Add a VAES+AVX2 optimized implementation of AES-GCM. This is very
     helpful on CPUs that have VAES but not AVX512, such as AMD Zen 3.

   - Make the VAES+AVX512 optimized implementation of AES-GCM handle
     large amounts of associated data efficiently.

   - Remove the "avx10_256" implementation of AES-GCM. It's superseded
     by the VAES+AVX2 optimized implementation.

   - Rename the "avx10_512" implementation to "avx512"

  Overall, this fills in a gap where AES-GCM wasn't fully optimized on
  some recent CPUs. It also drops code that won't be as useful as
  initially expected due to AVX10/256 being dropped from the AVX10 spec"

* tag 'aes-gcm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
  crypto: x86/aes-gcm-vaes-avx2 - initialize full %rax return register
  crypto: x86/aes-gcm - optimize long AAD processing with AVX512
  crypto: x86/aes-gcm - optimize AVX512 precomputation of H^2 from H^1
  crypto: x86/aes-gcm - revise some comments in AVX512 code
  crypto: x86/aes-gcm - reorder AVX512 precompute and aad_update functions
  crypto: x86/aes-gcm - clean up AVX512 code to assume 512-bit vectors
  crypto: x86/aes-gcm - rename avx10 and avx10_512 to avx512
  crypto: x86/aes-gcm - remove VAES+AVX10/256 optimized code
  crypto: x86/aes-gcm - add VAES+AVX2 optimized code
2025-12-02 18:24:35 -08:00
Linus Torvalds
db425f7a0b Merge tag 'libcrypto-tests-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull crypto library test updates from Eric Biggers:

 - Add KUnit test suites for SHA-3, BLAKE2b, and POLYVAL. These are the
   algorithms that have new crypto library interfaces this cycle.

 - Remove the crypto_shash POLYVAL tests. They're no longer needed
   because POLYVAL support was removed from crypto_shash. Better POLYVAL
   test coverage is now provided via the KUnit test suite.

* tag 'libcrypto-tests-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
  crypto: testmgr - Remove polyval tests
  lib/crypto: tests: Add KUnit tests for POLYVAL
  lib/crypto: tests: Add additional SHAKE tests
  lib/crypto: tests: Add SHA3 kunit tests
  lib/crypto: tests: Add KUnit tests for BLAKE2b
2025-12-02 18:20:06 -08:00
Linus Torvalds
5abe8d8efc Merge tag 'libcrypto-updates-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull crypto library updates from Eric Biggers:
 "This is the main crypto library pull request for 6.19. It includes:

   - Add SHA-3 support to lib/crypto/, including support for both the
     hash functions and the extendable-output functions. Reimplement the
     existing SHA-3 crypto_shash support on top of the library.

     This is motivated mainly by the upcoming support for the ML-DSA
     signature algorithm, which needs the SHAKE128 and SHAKE256
     functions. But even on its own it's a useful cleanup.

     This also fixes the longstanding issue where the
     architecture-optimized SHA-3 code was disabled by default.

   - Add BLAKE2b support to lib/crypto/, and reimplement the existing
     BLAKE2b crypto_shash support on top of the library.

     This is motivated mainly by btrfs, which supports BLAKE2b
     checksums. With this change, all btrfs checksum algorithms now have
     library APIs. btrfs is planned to start just using the library
     directly.

     This refactor also improves consistency between the BLAKE2b code
     and BLAKE2s code. And as usual, it also fixes the issue where the
     architecture-optimized BLAKE2b code was disabled by default.

   - Add POLYVAL support to lib/crypto/, replacing the existing POLYVAL
     support in crypto_shash. Reimplement HCTR2 on top of the library.

     This simplifies the code and improves HCTR2 performance. As usual,
     it also makes the architecture-optimized code be enabled by
     default. The generic implementation of POLYVAL is greatly improved
     as well.

   - Clean up the BLAKE2s code

   - Add FIPS self-tests for SHA-1, SHA-2, and SHA-3"

* tag 'libcrypto-updates-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (37 commits)
  fscrypt: Drop obsolete recommendation to enable optimized POLYVAL
  crypto: polyval - Remove the polyval crypto_shash
  crypto: hctr2 - Convert to use POLYVAL library
  lib/crypto: x86/polyval: Migrate optimized code into library
  lib/crypto: arm64/polyval: Migrate optimized code into library
  lib/crypto: polyval: Add POLYVAL library
  crypto: polyval - Rename conflicting functions
  lib/crypto: x86/blake2s: Use vpternlogd for 3-input XORs
  lib/crypto: x86/blake2s: Avoid writing back unchanged 'f' value
  lib/crypto: x86/blake2s: Improve readability
  lib/crypto: x86/blake2s: Use local labels for data
  lib/crypto: x86/blake2s: Drop check for nblocks == 0
  lib/crypto: x86/blake2s: Fix 32-bit arg treated as 64-bit
  lib/crypto: arm, arm64: Drop filenames from file comments
  lib/crypto: arm/blake2s: Fix some comments
  crypto: s390/sha3 - Remove superseded SHA-3 code
  crypto: sha3 - Reimplement using library API
  crypto: jitterentropy - Use default sha3 implementation
  lib/crypto: s390/sha3: Add optimized one-shot SHA-3 digest functions
  lib/crypto: sha3: Support arch overrides of one-shot digest functions
  ...
2025-12-02 18:01:03 -08:00
Linus Torvalds
619f4edc8d Merge tag 'thermal-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull thermal control updates from Rafael Wysocki:
 "These add Nova Lake processor support to the Intel thermal drivers and
  DPTF code, update thermal control documentation, simplify the ACPI
  DPTF code related to thermal control, add QCS8300 compatible to the
  tsens thermal DT bindings, add DT bindings for NXP i.MX91 thermal
  module and add support for it to the imx91 thermal driver, update a
  few other thermal drivers and fix a format string issue in a thermal
  utility:

   - Add Nova Lake processor thermal device to the int340x
     processor_thermal driver, add DLVR support for Nova Lake to it, add
     Nova Lake support to the ACPI DPTF code, document thermal
     throttling on Intel platforms, and update workload type hint
     interface documentation (Srinivas Pandruvada)

   - Remove int340x thermal scan handler from the ACPI DPTF code because
     it turned out to be unnecessary (Slawomir Rosek)

   - Clean up the Intel int340x thermal driver (Kaushlendra Kumar)

   - Document the RZ/V2H TSU DT bindings (Ovidiu Panait)

   - Document the Kaanapali Temperature Sensor (Manaf Meethalavalappu
     Pallikunhi)

   - Document R-Car Gen4 and RZ/G2 support in driver comment (Marek
     Vasut)

   - Convert to DEFINE_SIMPLE_DEV_PM_OPS() in R-Car [Gen3] (Geert
     Uytterhoeven)

   - Fix format string bug in thermal-engine (Malaya Kumar Rout)

   - Make ipq5018 tsens standalone compatible (George Moussalem)

   - Add the QCS8300 compatible for QCom Tsens (Gaurav Kohli)

   - Add support for the NXP i.MX91 thermal module, including the DT
     bindings (Pengfei Li)"

* tag 'thermal-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  thermal/drivers/imx91: Add support for i.MX91 thermal monitoring unit
  dt-bindings: thermal: fsl,imx91-tmu: add bindings for NXP i.MX91 thermal module
  dt-bindings: thermal: tsens: Add QCS8300 compatible
  dt-bindings: thermal: qcom-tsens: make ipq5018 tsens standalone compatible
  tools/thermal/thermal-engine: Fix format string bug in thermal-engine
  docs: driver-api/thermal/intel_dptf: Add new workload type hint
  thermal/drivers/rcar_gen3: Convert to DEFINE_SIMPLE_DEV_PM_OPS()
  thermal/drivers/rcar: Convert to DEFINE_SIMPLE_DEV_PM_OPS()
  Documentation: thermal: Document thermal throttling on Intel platforms
  ACPI: DPTF: Support Nova Lake
  thermal: intel: int340x: Add DLVR support for Nova Lake
  thermal: int340x: processor_thermal: Add Nova Lake processor thermal device
  thermal: intel: int340x: Replace sprintf() with sysfs_emit()
  thermal: intel: int340x: Use symbolic constant for UUID comparison
  thermal/drivers/rcar_gen3: Document R-Car Gen4 and RZ/G2 support in driver comment
  dt-bindings: thermal: qcom-tsens: document the Kaanapali Temperature Sensor
  dt-bindings: thermal: r9a09g047-tsu: Document RZ/V2H TSU
  ACPI: DPTF: Remove int340x thermal scan handler
  thermal: intel: Select INT340X_THERMAL from INTEL_SOC_DTS_THERMAL
2025-12-02 17:49:12 -08:00
Linus Torvalds
d348c22394 Merge tag 'pm-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
 "There are quite a few interesting things here, including new hardware
  support, new features, some bug fixes and documentation updates. In
   addition, there is the usual bunch of minor fixes and cleanups all
  over.

  In the new hardware support category, there are intel_pstate and
  intel_rapl driver updates to support new processors, Panther Lake,
   Wildcat Lake, Nova Lake, and Diamond Rapids in the OOB mode, OPP and
  bandwidth allocation support in the tegra186 cpufreq driver, and
  JH7110S SOC support in dt-platdev cpufreq.

  The new features are the PM QoS CPU latency limit for suspend-to-idle,
  the netlink support for the energy model management, support for
  terminating system suspend via a wakeup event during the sync of file
  systems, configurable number of hibernation compression threads, the
  runtime PM auto-cleanup macros, and the "poweroff" PM event that is
  expected to be used during system shutdown.

  Bugs are mostly fixed in cpuidle governors, but there are also fixes
  elsewhere, like in the amd-pstate cpufreq driver.

  Documentation updates include, but are not limited to, a new doc on
  debugging shutdown hangs, cross-referencing fixes and cleanups in the
  intel_pstate documentation, and updates of comments in the core
  hibernation code.

  Specifics:

   - Introduce and document a QoS limit on CPU exit latency during
     wakeup from suspend-to-idle (Ulf Hansson)

   - Add support for building libcpupower statically (Zuo An)

   - Add support for sending netlink notifications to user space on
     energy model updates (Changwoo Min, Peng Fan)

   - Minor improvements to the Rust OPP interface (Tamir Duberstein)

   - Fixes to scope-based pointers in the OPP library (Viresh Kumar)

   - Use residency threshold in polling state override decisions in the
     menu cpuidle governor (Aboorva Devarajan)

   - Add sanity check for exit latency and target residency in the
     cpuidle core (Rafael Wysocki)

   - Use this_cpu_ptr() where possible in the teo governor (Christian
     Loehle)

   - Rework the handling of tick wakeups in the teo cpuidle governor to
     increase the likelihood of stopping the scheduler tick in the cases
     when tick wakeups can be counted as non-timer ones (Rafael Wysocki)

   - Fix a reverse condition in the teo cpuidle governor and drop a
     misguided target residency check from it (Rafael Wysocki)

   - Clean up multiple minor defects in the teo cpuidle governor (Rafael
     Wysocki)

   - Update header inclusion to make it follow the Include What You Use
     principle (Andy Shevchenko)

   - Enable MSR-based RAPL PMU support in the intel_rapl power capping
     driver and arrange for using it on the Panther Lake and Wildcat
     Lake processors (Kuppuswamy Sathyanarayanan)

   - Add support for Nova Lake and Wildcat Lake processors to the
     intel_rapl power capping driver (Kaushlendra Kumar, Srinivas
     Pandruvada)

   - Add OPP and bandwidth support for Tegra186 (Aaron Kling)

   - Optimizations for parameter array handling in the amd-pstate
     cpufreq driver (Mario Limonciello)

   - Fix for mode changes with offline CPUs in the amd-pstate cpufreq
     driver (Gautham Shenoy)

   - Preserve freq_table_sorted across suspend/hibernate in the cpufreq
     core (Zihuan Zhang)

   - Adjust energy model rules for Intel hybrid platforms in the
     intel_pstate cpufreq driver and improve printing of debug messages
     in it (Rafael Wysocki)

   - Replace deprecated strcpy() in cpufreq_unregister_governor()
     (Thorsten Blum)

   - Fix duplicate hyperlink target errors in the intel_pstate cpufreq
     driver documentation and use :ref: directive for internal linking
     in it (Swaraj Gaikwad, Bagas Sanjaya)

   - Add Diamond Rapids OOB mode support to the intel_pstate cpufreq
     driver (Kuppuswamy Sathyanarayanan)

   - Use mutex guard for driver locking in the intel_pstate driver and
     eliminate some code duplication from it (Rafael Wysocki)

   - Replace udelay() with usleep_range() in ACPI cpufreq (Kaushlendra
     Kumar)

   - Minor improvements to various cpufreq drivers (Christian Marangi,
     Hal Feng, Jie Zhan, Marco Crivellari, Miaoqian Lin, and Shuhao Fu)

   - Replace snprintf() with scnprintf() in show_trace_dev_match()
     (Kaushlendra Kumar)

   - Fix memory allocation error handling in pm_vt_switch_required()
     (Malaya Kumar Rout)

   - Introduce CALL_PM_OP() macro and use it to simplify code in generic
     PM operations (Kaushlendra Kumar)

   - Add module param to backtrace all CPUs in the device power
     management watchdog (Sergey Senozhatsky)

   - Rework message printing in swsusp_save() (Rafael Wysocki)

   - Make it possible to change the number of hibernation compression
     threads (Xueqin Luo)

   - Clarify that only cgroup1 freezer uses PM freezer (Tejun Heo)

   - Add document on debugging shutdown hangs to PM documentation and
     correct a mistaken configuration option in it (Mario Limonciello)

   - Shut down wakeup source timer before removing the wakeup source
     from the list (Kaushlendra Kumar, Rafael Wysocki)

   - Introduce new PMSG_POWEROFF event for system shutdown handling with
     the help of PM device callbacks (Mario Limonciello)

   - Make pm_test delay interruptible by wakeup events (Riwen Lu)

   - Clean up kernel-doc comment style usage in the core hibernation
     code and remove useless comments from it (Sunday Adelodun, Rafael
     Wysocki)

   - Add support for handling wakeup events and aborting the suspend
     process while it is syncing file systems (Samuel Wu, Rafael
     Wysocki)

   - Add WQ_UNBOUND to pm_wq workqueue (Marco Crivellari)

   - Add runtime PM wrapper macros for ACQUIRE()/ACQUIRE_ERR() and use
     them in the PCI core and the ACPI TAD driver (Rafael Wysocki)

   - Improve runtime PM in the ACPI TAD driver (Rafael Wysocki)

   - Update pm_runtime_allow/forbid() documentation (Rafael Wysocki)

   - Fix typos in runtime.c comments (Malaya Kumar Rout)

   - Move governor.h from devfreq under include/linux/ and rename to
     devfreq-governor.h to allow devfreq governor definitions outside of
     drivers/devfreq/ (Dmitry Baryshkov)

   - Use min() to improve readability in tegra30-devfreq.c (Thorsten
     Blum)

   - Fix potential use-after-free issue of OPP handling in
     hisi_uncore_freq.c (Pengjie Zhang)

   - Fix typo in DFSO_DOWNDIFFERENTIAL macro name in
     governor_simpleondemand.c in devfreq (Riwen Lu)"

* tag 'pm-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (96 commits)
  PM / devfreq: Fix typo in DFSO_DOWNDIFFERENTIAL macro name
  cpuidle: Warn instead of bailing out if target residency check fails
  cpuidle: Update header inclusion
  Documentation: power/cpuidle: Document the CPU system wakeup latency QoS
  cpuidle: Respect the CPU system wakeup QoS limit for cpuidle
  sched: idle: Respect the CPU system wakeup QoS limit for s2idle
  pmdomain: Respect the CPU system wakeup QoS limit for cpuidle
  pmdomain: Respect the CPU system wakeup QoS limit for s2idle
  PM: QoS: Introduce a CPU system wakeup QoS limit
  cpuidle: governors: teo: Add missing space to the description
  PM: hibernate: Extra cleanup of comments in swap handling code
  PM / devfreq: tegra30: use min to simplify actmon_cpu_to_emc_rate
  PM / devfreq: hisi: Fix potential UAF in OPP handling
  PM / devfreq: Move governor.h to a public header location
  powercap: intel_rapl: Enable MSR-based RAPL PMU support
  powercap: intel_rapl: Prepare read_raw() interface for atomic-context callers
  cpufreq: qcom-nvmem: fix compilation warning for qcom_cpufreq_ipq806x_match_list
  PM: sleep: Call pm_sleep_fs_sync() instead of ksys_sync_helper()
  PM: sleep: Add support for wakeup during filesystem sync
  cpufreq: ACPI: Replace udelay() with usleep_range()
  ...
2025-12-02 17:31:22 -08:00
Linus Torvalds
959bfe496b Merge tag 'acpi-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI updates from Rafael Wysocki:
 "These add Microsoft fan extensions support to the ACPI fan driver, fix
  a bug in ACPICA, update other ACPI drivers (processor, time and alarm
  device), update ACPI power management code and ACPI device properties
  management, and fix an ACPI utility:

   - Avoid walking the ACPI namespace in the AML interpreter if the
     starting node cannot be determined (Cryolitia PukNgae)

   - Use min() instead of min_t() in the ACPI device properties handling
     code to avoid discarding significant bits (David Laight)

   - Fix potential fwnode refcount leak in
     acpi_fwnode_graph_parse_endpoint() that may prevent the parent
     fwnode from being released (Haotian Zhang)

   - Rework acpi_graph_get_next_endpoint() to use ACPI functions only,
     remove unnecessary conditionals from it to make it easier to
     follow, and make acpi_get_next_subnode() static (Sakari Ailus)

   - Drop unused function acpi_get_lps0_constraint(), make some
     Low-Power S0 callback functions for suspend-to-idle static, and
     rearrange the code retrieving Low-Power S0 constraints so it only
     runs when the constraints are actually used (Rafael Wysocki)

   - Drop redundant locking from the ACPI battery driver (Rafael
     Wysocki)

   - Improve runtime PM in the ACPI time and alarm device (TAD) driver
     using guard macros and rearrange code related to runtime PM in
     acpi_tad_remove() (Rafael Wysocki)

   - Add support for Microsoft fan extensions to the ACPI fan driver
     along with notification support and work around a 64-bit firmware
     bug in that driver (Armin Wolf)

   - Use ACPI_FREE() to free ACPI buffer in the ACPI DPTF code
     (Kaushlendra Kumar)

   - Fix a memory leak and a resource leak in the ACPI pfrut utility
     (Malaya Kumar Rout)

   - Replace `core::mem::zeroed` with `pin_init::zeroed` in the ACPI
     Rust code (Siyuan Huang)

   - Update the ACPI code to use the new style of allocating workqueues
     and new global workqueues (Marco Crivellari)

   - Fix two spelling mistakes in the ACPI code (Chu Guangqing)

   - Fix ISAPNP to generate uevents to auto-load modules (René Rebe)

   - Relocate the state flags initialization in the ACPI processor idle
     driver and drop redundant C-state count checks from it (Huisong Li)

   - Fix map_x2apic_id() in the ACPI processor core driver for
     amd-pstate on am4 (René Rebe)"

* tag 'acpi-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (30 commits)
  ACPI: PM: Fix a spelling mistake
  ACPI: LPSS: Fix a spelling mistake
  ACPI: processor_core: fix map_x2apic_id for amd-pstate on am4
  ACPICA: Avoid walking the Namespace if start_node is NULL
  ACPI: tools: pfrut: fix memory leak and resource leak in pfrut.c
  ACPI: property: use min() instead of min_t()
  PNP: Fix ISAPNP to generate uevents to auto-load modules
  ACPI: property: Fix fwnode refcount leak in acpi_fwnode_graph_parse_endpoint()
  ACPI: DPTF: Use ACPI_FREE() for ACPI buffer deallocation
  ACPI: processor: idle: Drop redundant C-state count checks
  ACPI: thermal: Add WQ_PERCPU to alloc_workqueue() users
  ACPI: OSL: Add WQ_PERCPU to alloc_workqueue() users
  ACPI: EC: Add WQ_PERCPU to alloc_workqueue() users
  ACPI: OSL: replace use of system_wq with system_percpu_wq
  ACPI: scan: replace use of system_unbound_wq with system_dfl_wq
  ACPI: fan: Add support for Microsoft fan extensions
  ACPI: fan: Add hwmon notification support
  ACPI: fan: Add basic notification support
  ACPI: TAD: Improve runtime PM using guard macros
  ACPI: TAD: Rearrange runtime PM operations in acpi_tad_remove()
  ...
2025-12-02 17:24:03 -08:00
Rafael J. Wysocki
7cede21e9f Merge branches 'pm-qos' and 'pm-tools'
Merge PM QoS updates and a cpupower utility update for 6.19-rc1:

 - Introduce and document a QoS limit on CPU exit latency during wakeup
   from suspend-to-idle (Ulf Hansson)

 - Add support for building libcpupower statically (Zuo An)

* pm-qos:
  Documentation: power/cpuidle: Document the CPU system wakeup latency QoS
  cpuidle: Respect the CPU system wakeup QoS limit for cpuidle
  sched: idle: Respect the CPU system wakeup QoS limit for s2idle
  pmdomain: Respect the CPU system wakeup QoS limit for cpuidle
  pmdomain: Respect the CPU system wakeup QoS limit for s2idle
  PM: QoS: Introduce a CPU system wakeup QoS limit

* pm-tools:
  tools/power/cpupower: Support building libcpupower statically
2025-11-28 16:50:45 +01:00
Rafael J. Wysocki
638757c9c9 Merge branches 'pm-em' and 'pm-opp'
Merge energy model management updates and operating performance points
(OPP) library changes for 6.19-rc1:

 - Add support for sending netlink notifications to user space on energy
   model updates (Changwoo Min, Peng Fan)

 - Minor improvements to the Rust OPP interface (Tamir Duberstein)

 - Fixes to scope-based pointers in the OPP library (Viresh Kumar)

* pm-em:
  PM: EM: Add to em_pd_list only when no failure
  PM: EM: Notify an event when the performance domain changes
  PM: EM: Implement em_notify_pd_created/updated()
  PM: EM: Implement em_notify_pd_deleted()
  PM: EM: Implement em_nl_get_pd_table_doit()
  PM: EM: Implement em_nl_get_pds_doit()
  PM: EM: Add an iterator and accessor for the performance domain
  PM: EM: Add a skeleton code for netlink notification
  PM: EM: Add em.yaml and autogen files
  PM: EM: Expose the ID of a performance domain via debugfs
  PM: EM: Assign a unique ID when creating a performance domain

* pm-opp:
  rust: opp: simplify callers of `to_c_str_array`
  OPP: Initialize scope-based pointers inline
  rust: opp: fix broken rustdoc link
2025-11-28 16:44:00 +01:00
Rafael J. Wysocki
bf7ae1773e Merge branches 'pm-cpuidle' and 'pm-powercap'
Merge cpuidle and power capping updates for 6.19-rc1:

 - Use residency threshold in polling state override decisions in the
   menu cpuidle governor (Aboorva Devarajan)

 - Add sanity check for exit latency and target residency in the cpuidle
   core (Rafael Wysocki)

 - Use this_cpu_ptr() where possible in the teo governor (Christian
   Loehle)

 - Rework the handling of tick wakeups in the teo cpuidle governor to
   increase the likelihood of stopping the scheduler tick in the cases
   when tick wakeups can be counted as non-timer ones (Rafael Wysocki)

 - Fix a reverse condition in the teo cpuidle governor and drop a
   misguided target residency check from it (Rafael Wysocki)

 - Clean up multiple minor defects in the teo cpuidle governor (Rafael
   Wysocki)

 - Update header inclusion to make it follow the Include What You Use
   principle (Andy Shevchenko)

 - Enable MSR-based RAPL PMU support in the intel_rapl power capping
   driver and arrange for using it on the Panther Lake and Wildcat Lake
   processors (Kuppuswamy Sathyanarayanan)

 - Add support for Nova Lake and Wildcat Lake processors to the
   intel_rapl power capping driver (Kaushlendra Kumar, Srinivas
   Pandruvada)

* pm-cpuidle:
  cpuidle: Warn instead of bailing out if target residency check fails
  cpuidle: Update header inclusion
  cpuidle: governors: teo: Add missing space to the description
  cpuidle: governors: teo: Simplify intercepts-based state lookup
  cpuidle: governors: teo: Fix tick_intercepts handling in teo_update()
  cpuidle: governors: teo: Rework the handling of tick wakeups
  cpuidle: governors: teo: Decay metrics below DECAY_SHIFT threshold
  cpuidle: governors: teo: Use s64 consistently in teo_update()
  cpuidle: governors: teo: Drop redundant function parameter
  cpuidle: governors: teo: Drop misguided target residency check
  cpuidle: teo: Use this_cpu_ptr() where possible
  cpuidle: Add sanity check for exit latency and target residency
  cpuidle: menu: Use residency threshold in polling state override decisions

* pm-powercap:
  powercap: intel_rapl: Enable MSR-based RAPL PMU support
  powercap: intel_rapl: Prepare read_raw() interface for atomic-context callers
  powercap: intel_rapl: Add support for Nova Lake processors
  powercap: intel_rapl: Add support for Wildcat Lake platform
2025-11-28 16:29:41 +01:00
Rafael J. Wysocki
1fe2523713 Merge branch 'pm-cpufreq'
Merge cpufreq updates for 6.19-rc1:

 - Add OPP and bandwidth support for Tegra186 (Aaron Kling)

 - Optimizations for parameter array handling in the amd-pstate cpufreq
   driver (Mario Limonciello)

 - Fix for mode changes with offline CPUs in the amd-pstate cpufreq
   driver (Gautham Shenoy)

 - Preserve freq_table_sorted across suspend/hibernate in the cpufreq
   core (Zihuan Zhang)

 - Adjust energy model rules for Intel hybrid platforms in the
   intel_pstate cpufreq driver and improve printing of debug messages
   in it (Rafael Wysocki)

 - Replace deprecated strcpy() in cpufreq_unregister_governor()
   (Thorsten Blum)

 - Fix duplicate hyperlink target errors in the intel_pstate cpufreq
   driver documentation and use :ref: directive for internal linking in
   it (Swaraj Gaikwad, Bagas Sanjaya)

 - Add Diamond Rapids OOB mode support to the intel_pstate cpufreq
   driver (Kuppuswamy Sathyanarayanan)

 - Use mutex guard for driver locking in the intel_pstate driver and
   eliminate some code duplication from it (Rafael Wysocki)

 - Replace udelay() with usleep_range() in ACPI cpufreq (Kaushlendra
   Kumar)

 - Minor improvements to various cpufreq drivers (Christian Marangi, Hal
   Feng, Jie Zhan, Marco Crivellari, Miaoqian Lin, and Shuhao Fu)

* pm-cpufreq: (27 commits)
  cpufreq: qcom-nvmem: fix compilation warning for qcom_cpufreq_ipq806x_match_list
  cpufreq: ACPI: Replace udelay() with usleep_range()
  cpufreq: intel_pstate: Eliminate some code duplication
  cpufreq: intel_pstate: Use mutex guard for driver locking
  cpufreq/amd-pstate: Call cppc_set_auto_sel() only for online CPUs
  cpufreq/amd-pstate: Add static asserts for EPP indices
  cpufreq/amd-pstate: Fix some whitespace issues
  cpufreq/amd-pstate: Adjust return values in amd_pstate_update_status()
  cpufreq/amd-pstate: Make amd_pstate_get_mode_string() never return NULL
  cpufreq/amd-pstate: Drop NULL value from amd_pstate_mode_string
  cpufreq/amd-pstate: Use sysfs_match_string() for epp
  cpufreq: tegra194: add WQ_PERCPU to alloc_workqueue users
  cpufreq: qcom-nvmem: add compatible fallback for ipq806x for no SMEM
  Documentation: intel-pstate: Use :ref: directive for internal linking
  cpufreq: intel_pstate: Add Diamond Rapids OOB mode support
  Documentation: intel_pstate: fix duplicate hyperlink target errors
  cpufreq: CPPC: Don't warn if FIE init fails to read counters
  cpufreq: nforce2: fix reference count leak in nforce2
  cpufreq: tegra186: add OPP support and set bandwidth
  cpufreq: dt-platdev: Add JH7110S SOC to the allowlist
  ...
2025-11-28 16:15:38 +01:00
Rafael J. Wysocki
f086594adb Merge branch 'pm-sleep'
Merge updates related to system suspend and hibernation for 6.19-rc1:

 - Replace snprintf() with scnprintf() in show_trace_dev_match()
   (Kaushlendra Kumar)

 - Fix memory allocation error handling in pm_vt_switch_required()
   (Malaya Kumar Rout)

 - Introduce CALL_PM_OP() macro and use it to simplify code in
   generic PM operations (Kaushlendra Kumar)

 - Add module param to backtrace all CPUs in the device power management
   watchdog (Sergey Senozhatsky)

 - Rework message printing in swsusp_save() (Rafael Wysocki)

 - Make it possible to change the number of hibernation compression
   threads (Xueqin Luo)

 - Clarify that only cgroup1 freezer uses PM freezer (Tejun Heo)

 - Add document on debugging shutdown hangs to PM documentation and
   correct a mistaken configuration option in it (Mario Limonciello)

 - Shut down wakeup source timer before removing the wakeup source from
   the list (Kaushlendra Kumar, Rafael Wysocki)

 - Introduce new PMSG_POWEROFF event for system shutdown handling with
   the help of PM device callbacks (Mario Limonciello)

 - Make pm_test delay interruptible by wakeup events (Riwen Lu)

 - Clean up kernel-doc comment style usage in the core hibernation
   code and remove useless comments from it (Sunday Adelodun, Rafael
   Wysocki)

 - Add support for handling wakeup events and aborting the suspend
   process while it is syncing file systems (Samuel Wu, Rafael Wysocki)

* pm-sleep: (21 commits)
  PM: hibernate: Extra cleanup of comments in swap handling code
  PM: sleep: Call pm_sleep_fs_sync() instead of ksys_sync_helper()
  PM: sleep: Add support for wakeup during filesystem sync
  PM: hibernate: Clean up kernel-doc comment style usage
  PM: suspend: Make pm_test delay interruptible by wakeup events
  usb: sl811-hcd: Add PM_EVENT_POWEROFF into suspend callbacks
  scsi: Add PM_EVENT_POWEROFF into suspend callbacks
  PM: Introduce new PMSG_POWEROFF event
  PM: wakeup: Update after recent wakeup source removal ordering change
  PM: wakeup: Delete timer before removing wakeup source from list
  Documentation: power: Correct a mistaken configuration option
  Documentation: power: Add document on debugging shutdown hangs
  freezer: Clarify that only cgroup1 freezer uses PM freezer
  PM: hibernate: add sysfs interface for hibernate_compression_threads
  PM: hibernate: make compression threads configurable
  PM: hibernate: dynamically allocate crc->unc_len/unc for configurable threads
  PM: hibernate: Rework message printing in swsusp_save()
  PM: dpm_watchdog: add module param to backtrace all CPUs
  PM: sleep: Introduce CALL_PM_OP() macro to simplify code
  PM: console: Fix memory allocation error handling in pm_vt_switch_required()
  ...
2025-11-28 16:01:13 +01:00
Rafael J. Wysocki
60d69a7ed1 Merge branches 'pm-core' and 'pm-runtime'
Merge a core power management update and runtime PM framework updates
for 6.19-rc1:

 - Add WQ_UNBOUND to pm_wq workqueue (Marco Crivellari)

 - Add runtime PM wrapper macros for ACQUIRE()/ACQUIRE_ERR() and use
   them in the PCI core and the ACPI TAD driver (Rafael Wysocki)

 - Improve runtime PM in the ACPI TAD driver (Rafael Wysocki)

 - Update pm_runtime_allow/forbid() documentation (Rafael Wysocki)

 - Fix typos in runtime.c comments (Malaya Kumar Rout)

* pm-core:
  PM: WQ_UNBOUND added to pm_wq workqueue

* pm-runtime:
  PCI/sysfs: Use PM_RUNTIME_ACQUIRE()/PM_RUNTIME_ACQUIRE_ERR()
  ACPI: TAD: Use PM_RUNTIME_ACQUIRE()/PM_RUNTIME_ACQUIRE_ERR()
  PM: runtime: Wrapper macros for ACQUIRE()/ACQUIRE_ERR()
  PM: runtime: fix typos in runtime.c comments
  ACPI: TAD: Improve runtime PM using guard macros
  ACPI: TAD: Rearrange runtime PM operations in acpi_tad_remove()
  PM: runtime: docs: Update pm_runtime_allow/forbid() documentation
2025-11-28 15:56:09 +01:00
Rafael J. Wysocki
af47d98064 Merge branches 'acpi-misc' and 'pnp'
Merge miscellaneous ACPI support updates and a PNP update for 6.19-rc1:

 - Replace `core::mem::zeroed` with `pin_init::zeroed` in the ACPI Rust
   code (Siyuan Huang)

 - Update the ACPI code to use the new style of allocating workqueues
   and new global workqueues (Marco Crivellari)

 - Fix two spelling mistakes in the ACPI code (Chu Guangqing)

 - Fix ISAPNP to generate uevents to auto-load modules (René Rebe)

* acpi-misc:
  ACPI: PM: Fix a spelling mistake
  ACPI: LPSS: Fix a spelling mistake
  ACPI: thermal: Add WQ_PERCPU to alloc_workqueue() users
  ACPI: OSL: Add WQ_PERCPU to alloc_workqueue() users
  ACPI: EC: Add WQ_PERCPU to alloc_workqueue() users
  ACPI: OSL: replace use of system_wq with system_percpu_wq
  ACPI: scan: replace use of system_unbound_wq with system_dfl_wq
  rust: acpi: replace `core::mem::zeroed` with `pin_init::zeroed`

* pnp:
  PNP: Fix ISAPNP to generate uevents to auto-load modules
2025-11-28 15:08:38 +01:00
Rafael J. Wysocki
ba9aeba053 Merge branches 'acpi-tad', 'acpi-fan', 'acpi-dptf' and 'acpi-tools'
Merge updates of the ACPI time and alarm device (TAD) driver, ACPI fan
driver, ACPI DPTF code and an ACPI utility update for 6.19-rc1:

 - Improve runtime PM in the ACPI time and alarm device (TAD) driver
   using guard macros and rearrange code related to runtime PM in
   acpi_tad_remove() (Rafael Wysocki)

 - Add support for Microsoft fan extensions to the ACPI fan driver along
   with notification support and work around a 64-bit firmware bug in
   that driver (Armin Wolf)

 - Use ACPI_FREE() to free ACPI buffer in the ACPI DPTF code (Kaushlendra
   Kumar)

 - Fix a memory leak and a resource leak in the ACPI pfrut utility (Malaya
   Kumar Rout)

* acpi-tad:
  ACPI: TAD: Improve runtime PM using guard macros
  ACPI: TAD: Rearrange runtime PM operations in acpi_tad_remove()

* acpi-fan:
  ACPI: fan: Add support for Microsoft fan extensions
  ACPI: fan: Add hwmon notification support
  ACPI: fan: Add basic notification support
  ACPI: fan: Workaround for 64-bit firmware bug

* acpi-dptf:
  ACPI: DPTF: Use ACPI_FREE() for ACPI buffer deallocation

* acpi-tools:
  ACPI: tools: pfrut: fix memory leak and resource leak in pfrut.c
2025-11-28 15:01:07 +01:00
Rafael J. Wysocki
24d268add6 Merge branches 'acpica', 'acpi-property', 'acpi-pm' and 'acpi-battery'
Merge an ACPICA change, device ACPI properties handling update, ACPI
power management updates, and an ACPI battery driver update for
6.19-rc1:

 - Avoid walking the ACPI namespace in the AML interpreter if the
   starting node cannot be determined (Cryolitia PukNgae)

 - Use min() instead of min_t() in the ACPI device properties handling
   code to avoid discarding significant bits (David Laight)

 - Fix potential fwnode refcount leak in acpi_fwnode_graph_parse_endpoint()
   that may prevent the parent fwnode from being released (Haotian Zhang)

 - Rework acpi_graph_get_next_endpoint() to use ACPI functions only, remove
   unnecessary conditionals from it to make it easier to follow, and make
   acpi_get_next_subnode() static (Sakari Ailus)

 - Drop unused function acpi_get_lps0_constraint(), make some Low-Power
   S0 callback functions for suspend-to-idle static, and rearrange the
   code retrieving Low-Power S0 constraints so it only runs when the
   constraints are actually used (Rafael Wysocki)

 - Drop redundant locking from the ACPI battery driver (Rafael Wysocki)

* acpica:
  ACPICA: Avoid walking the Namespace if start_node is NULL

* acpi-property:
  ACPI: property: use min() instead of min_t()
  ACPI: property: Fix fwnode refcount leak in acpi_fwnode_graph_parse_endpoint()
  ACPI: property: Rework acpi_graph_get_next_endpoint()
  ACPI: property: Use ACPI functions in acpi_graph_get_next_endpoint() only
  ACPI: property: Make acpi_get_next_subnode() static

* acpi-pm:
  ACPI: PM: s2idle: Only retrieve constraints when needed
  ACPI: PM: s2idle: Staticise LPS0 callback functions
  ACPI: PM: s2idle: Drop acpi_get_lps0_constraint()

* acpi-battery:
  ACPI: battery: Drop redundant locking
2025-11-28 14:44:36 +01:00
Rafael J. Wysocki
63d26c3811 Merge tag 'thermal-v6.19-rc1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/thermal/linux
Pull thermal control changes for 6.19-rc1 from Daniel Lezcano:

"- Document the RZ/V2H TSU DT bindings (Ovidiu Panait)

 - Document the Kaanapali Temperature Sensor (Manaf Meethalavalappu
   Pallikunhi)

 - Document R-Car Gen4 and RZ/G2 support in driver comment (Marek Vasut)

 - Convert to DEFINE_SIMPLE_DEV_PM_OPS in the R-Car [Gen3] (Geert
   Uytterhoeven)

 - Fix format string bug in thermal-engine (Malaya Kumar Rout)

 - Make ipq5018 tsens standalone compatible (George Moussalem)

 - Add the QCS8300 compatible for the QCom Tsens (Gaurav Kohli)

 - Add support for the NXP i.MX91 thermal module, including the DT
   bindings (Pengfei Li)"

* tag 'thermal-v6.19-rc1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/thermal/linux:
  thermal/drivers/imx91: Add support for i.MX91 thermal monitoring unit
  dt-bindings: thermal: fsl,imx91-tmu: add bindings for NXP i.MX91 thermal module
  dt-bindings: thermal: tsens: Add QCS8300 compatible
  dt-bindings: thermal: qcom-tsens: make ipq5018 tsens standalone compatible
  tools/thermal/thermal-engine: Fix format string bug in thermal-engine
  thermal/drivers/rcar_gen3: Convert to DEFINE_SIMPLE_DEV_PM_OPS()
  thermal/drivers/rcar: Convert to DEFINE_SIMPLE_DEV_PM_OPS()
  thermal/drivers/rcar_gen3: Document R-Car Gen4 and RZ/G2 support in driver comment
  dt-bindings: thermal: qcom-tsens: document the Kaanapali Temperature Sensor
  dt-bindings: thermal: r9a09g047-tsu: Document RZ/V2H TSU
2025-11-28 13:02:50 +01:00
Rafael J. Wysocki
9fac2a114b Merge back ACPI processor driver changes for 6.19 2025-11-28 12:59:01 +01:00
Rafael J. Wysocki
5e8b7b58b2 Merge tag 'devfreq-next-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux
Pull devfreq changes for v6.19 from Chanwoo Choi:

"- Move governor.h under include/linux/ and rename to devfreq-governor.h
   in order to allow devfreq governor definitions outside of drivers/devfreq/.

 - Fix potential use-after-free issue of OPP handling on hisi_uncore_freq.c

 - Use min() to improve the readability on tegra30-devfreq.c

 - Fix typo in DFSO_DOWNDIFFERENTIAL macro name on governor_simpleondemand.c"

* tag 'devfreq-next-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux:
  PM / devfreq: Fix typo in DFSO_DOWNDIFFERENTIAL macro name
  PM / devfreq: tegra30: use min to simplify actmon_cpu_to_emc_rate
  PM / devfreq: hisi: Fix potential UAF in OPP handling
  PM / devfreq: Move governor.h to a public header location
2025-11-27 16:28:16 +01:00
Chu Guangqing
a508939e15 ACPI: PM: Fix a spelling mistake
Fix spelling by replacing "interrups" with "interrupts".

Signed-off-by: Chu Guangqing <chuguangqing@inspur.com>
[ rjw: Changelog edits ]
Link: https://patch.msgid.link/20251125022403.2614-1-chuguangqing@inspur.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-27 14:11:19 +01:00
Chu Guangqing
037dada8bb ACPI: LPSS: Fix a spelling mistake
Fix spelling by replacing "successfull" with "successful".

Signed-off-by: Chu Guangqing <chuguangqing@inspur.com>
[ rjw: Changelog edits ]
Link: https://patch.msgid.link/20251125021431.2243-1-chuguangqing@inspur.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-27 14:08:26 +01:00
René Rebe
17e7972979 ACPI: processor_core: fix map_x2apic_id for amd-pstate on am4
On all AMD AM4 systems I have seen, e.g. ASUS X470-i, Pro WS X570 Ace
and equivalent Gigabyte boards, amd-pstate does not initialize when the
x2apic is enabled in the BIOS. Kernel debug messages include:

[    0.315438] acpi LNXCPU:00: Failed to get CPU physical ID.
[    0.354756] ACPI CPPC: No CPC descriptor for CPU:0
[    0.714951] amd_pstate: the _CPC object is not present in SBIOS or ACPI disabled

I tracked this down to map_x2apic_id() checking device_declaration,
which is passed in via the type argument of acpi_get_phys_id() through
map_madt_entry(), while map_lapic_id() does not.

It appears these BIOSes use Processor statements for declaring the CPUs
in the ACPI namespace instead of processor device objects (which should
have been used). CPU declarations via Processor statements were
deprecated in ACPI 6.0, which was released 10 years ago. They should not
be used any more in any contemporary platform firmware.

I tried to contact Asus support multiple times, but never received a
reply nor did any BIOS update ever change this.

Fix amd-pstate w/ x2apic on am4 by allowing map_x2apic_id() to work with
CPUs declared via Processor statements for IDs less than 255, which is
consistent with ACPI 5.0 that still allowed Processor statements to be
used for declaring CPUs.
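
A rough sketch of the resulting match logic, paraphrased from the
description above using the MADT x2APIC entry fields (this is not the
literal patch):

  /* In map_x2apic_id(): previously the entry only matched when the CPU
   * was declared as a Device object (device_declaration != 0); now also
   * accept Processor-statement CPUs whose ID fits the legacy < 255 range. */
  if ((device_declaration || apic->uid < 255) && apic->uid == acpi_id) {
          *apic_id = apic->local_apic_id;
          return 0;
  }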

Fixes: 7237d3de78 ("x86, ACPI: add support for x2apic ACPI extensions")
Signed-off-by: René Rebe <rene@exactco.de>
[ rjw: Changelog edits ]
Link: https://patch.msgid.link/20251126.165513.1373131139292726554.rene@exactco.de
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-26 18:06:25 +01:00
Pengfei Li
c411d8bf06 thermal/drivers/imx91: Add support for i.MX91 thermal monitoring unit
Introduce support for the i.MX91 thermal monitoring unit, which features a
single sensor for the CPU. The register layout differs from other chips,
necessitating the creation of a dedicated file for this.

This sensor provides a resolution of 1/64°C (6-bit fraction). For actual
accuracy, refer to the datasheet, as it varies depending on the chip grade.
The unit provides an interrupt for end of measurement and threshold
violation, and contains temperature threshold comparators, in normal and
secure address space, with direction and threshold programmability.
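
As a worked example of the 1/64°C resolution (the function name and the
signed 16-bit raw format are assumptions for illustration, not taken from
the driver):

  #include <stdint.h>

  /* Convert a raw reading with a 6-bit fractional part into millidegrees
   * Celsius, the granularity the thermal core works with. */
  static int raw_to_millicelsius(int16_t raw)
  {
          return ((int)raw * 1000) / 64;  /* e.g. 0x0A40 (41.0 °C) -> 41000 */
  }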

Datasheet Link: https://www.nxp.com/docs/en/data-sheet/IMX91CEC.pdf

Signed-off-by: Pengfei Li <pengfei.li_1@nxp.com>
Signed-off-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://patch.msgid.link/20251020-imx91tmu-v7-2-48d7d9f25055@nxp.com
2025-11-26 15:51:28 +01:00
Pengfei Li
f32aedc575 dt-bindings: thermal: fsl,imx91-tmu: add bindings for NXP i.MX91 thermal module
Add bindings documentation for i.MX91 thermal modules.

Signed-off-by: Pengfei Li <pengfei.li_1@nxp.com>
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://patch.msgid.link/20251020-imx91tmu-v7-1-48d7d9f25055@nxp.com
2025-11-26 15:51:16 +01:00
Gaurav Kohli
1ee90870ce dt-bindings: thermal: tsens: Add QCS8300 compatible
Add compatibility string for the thermal sensors on QCS8300 platform.

Signed-off-by: Gaurav Kohli <quic_gkohli@quicinc.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Bjorn Andersson <andersson@kernel.org>
Reviewed-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
Link: https://patch.msgid.link/20250822042316.1762153-2-quic_gkohli@quicinc.com
2025-11-26 15:50:59 +01:00
Rafael J. Wysocki
6e757fd548 Merge back ACPI processor driver changes for 6.19 2025-11-26 13:56:30 +01:00
Riwen Lu
d9600d5766 PM / devfreq: Fix typo in DFSO_DOWNDIFFERENTIAL macro name
Correct the spelling error in the DFSO_DOWNDIFFERENTIAL macro
definition and update the corresponding variable assignment.

The macro was previously misspelled as DFSO_DOWNDIFFERENCTIAL.
This change ensures consistent and correct spelling throughout
the simpleondemand governor implementation.

Signed-off-by: Riwen Lu <luriwen@kylinos.cn>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Link: https://patchwork.kernel.org/project/linux-pm/patch/20251118032339.2799230-1-luriwen@kylinos.cn/
2025-11-26 13:58:59 +09:00
Cryolitia PukNgae
9d6c58dae8 ACPICA: Avoid walking the Namespace if start_node is NULL
Although commit 0c9992315e ("ACPICA: Avoid walking the ACPI Namespace
if it is not there") fixed the situation when both start_node and
acpi_gbl_root_node are NULL, the mainline Linux kernel still crashes
on the Honor Magicbook 14 Pro [1].

That happens due to an access to a member of parent_node in
acpi_ns_get_next_node().  The NULL pointer dereference always happens,
no matter whether or not start_node is equal to ACPI_ROOT_OBJECT, so
move the check of start_node being NULL out of the if block.
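
A simplified sketch of the reordering (structure assumed from the
description, not the literal ACPICA diff):

  if (start_node == ACPI_ROOT_OBJECT)
          start_node = acpi_gbl_root_node;

  /* Moved out of the if block: now catches both a NULL root node and a
   * NULL start_node passed in directly. */
  if (!start_node)
          return_ACPI_STATUS(AE_NO_NAMESPACE);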

Unfortunately, all the attempts to contact Honor have failed; they
refused to provide any technical support for Linux.

A dump of the bad DSDT table can be found on GitHub [2].

DMI: HONOR FMB-P/FMB-P-PCB, BIOS 1.13 05/08/2025

Link: 1c1b57b9eb
Link: https://gist.github.com/Cryolitia/a860ffc97437dcd2cd988371d5b73ed7 [1]
Link: https://github.com/denis-bb/honor-fmb-p-dsdt [2]
Signed-off-by: Cryolitia PukNgae <cryolitia.pukngae@linux.dev>
Reviewed-by: WangYuli <wangyl5933@chinaunicom.cn>
[ rjw: Subject adjustment, changelog edits ]
Link: https://patch.msgid.link/20251125-acpica-v1-1-99e63b1b25f8@linux.dev
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-25 22:14:11 +01:00
Rafael J. Wysocki
4bf944f3fc cpuidle: Warn instead of bailing out if target residency check fails
It turns out that the change in commit 76934e495c ("cpuidle: Add
sanity check for exit latency and target residency") goes too far
because there are systems in the field on which the check introduced
by that commit does not pass.

For this reason, change __cpuidle_driver_init() return type back to void
and make it print a warning when the check mentioned above does not
pass.

Fixes: 76934e495c ("cpuidle: Add sanity check for exit latency and target residency")
Reported-by: Val Packett <val@packett.cool>
Closes: https://lore.kernel.org/linux-pm/20251121010756.6687-1-val@packett.cool/
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Christian Loehle <christian.loehle@arm.com>
Link: https://patch.msgid.link/2808566.mvXUDI8C0e@rafael.j.wysocki
2025-11-25 19:06:38 +01:00
Andy Shevchenko
6d96ceff9a cpuidle: Update header inclusion
While cleaning up some headers, I got a build error on this file:

drivers/cpuidle/poll_state.c:52:2: error: call to undeclared library function 'snprintf' with type 'int (char *restrict, unsigned long, const char *restrict, ...)'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]

Update header inclusions to follow IWYU (Include What You Use)
principle.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://patch.msgid.link/20251124205752.1328701-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-25 19:04:29 +01:00
Ulf Hansson
c19dfb267c Documentation: power/cpuidle: Document the CPU system wakeup latency QoS
Let's document how the new CPU system wakeup latency QoS limit can be used
from user space, along with how the constraint is taken into account for
s2idle and cpuidle.

Reviewed-by: Dhruva Gole <d-gole@ti.com>
Reviewed-by: Kevin Hilman (TI) <khilman@baylibre.com>
Tested-by: Kevin Hilman (TI) <khilman@baylibre.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Link: https://patch.msgid.link/20251125112650.329269-7-ulf.hansson@linaro.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-25 19:01:30 +01:00
Ulf Hansson
2b8d594742 cpuidle: Respect the CPU system wakeup QoS limit for cpuidle
The CPU system wakeup QoS limit must be respected for the regular cpuidle
state selection. Therefore, let's extend the common governor helper
cpuidle_governor_latency_req(), to take the constraint into account.

Reviewed-by: Dhruva Gole <d-gole@ti.com>
Reviewed-by: Kevin Hilman (TI) <khilman@baylibre.com>
Tested-by: Kevin Hilman (TI) <khilman@baylibre.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Link: https://patch.msgid.link/20251125112650.329269-6-ulf.hansson@linaro.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-25 19:01:29 +01:00
Ulf Hansson
99b42445f4 sched: idle: Respect the CPU system wakeup QoS limit for s2idle
A CPU system wakeup QoS limit may have been requested by user space. To
avoid breaking this constraint when entering a low power state during
s2idle, let's start to take into account the QoS limit.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dhruva Gole <d-gole@ti.com>
Reviewed-by: Kevin Hilman (TI) <khilman@baylibre.com>
Tested-by: Kevin Hilman (TI) <khilman@baylibre.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Link: https://patch.msgid.link/20251125112650.329269-5-ulf.hansson@linaro.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-25 19:01:29 +01:00
Ulf Hansson
e2e4695f01 pmdomain: Respect the CPU system wakeup QoS limit for cpuidle
The CPU system wakeup QoS limit must be respected for the regular cpuidle
state selection. Therefore, let's extend the genpd governor for CPUs to
take the constraint into account when it selects a domain idle state for
the corresponding PM domain.

Reviewed-by: Dhruva Gole <d-gole@ti.com>
Reviewed-by: Kevin Hilman (TI) <khilman@baylibre.com>
Tested-by: Kevin Hilman (TI) <khilman@baylibre.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Link: https://patch.msgid.link/20251125112650.329269-4-ulf.hansson@linaro.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-25 19:01:29 +01:00
Ulf Hansson
8e7de6dc42 pmdomain: Respect the CPU system wakeup QoS limit for s2idle
A CPU system wakeup QoS limit may have been requested by user space. To
avoid breaking this constraint when entering a low power state during
s2idle through genpd, let's extend the corresponding genpd governor for
CPUs. More precisely, during s2idle let the genpd governor select a
suitable domain idle state, by taking into account the QoS limit.

Reviewed-by: Dhruva Gole <d-gole@ti.com>
Reviewed-by: Kevin Hilman (TI) <khilman@baylibre.com>
Tested-by: Kevin Hilman (TI) <khilman@baylibre.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Link: https://patch.msgid.link/20251125112650.329269-3-ulf.hansson@linaro.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-25 19:01:29 +01:00
Ulf Hansson
a4e6512a79 PM: QoS: Introduce a CPU system wakeup QoS limit
Some platforms support multiple low power states for CPUs that can be used
when entering system-wide suspend. Currently we are always selecting the
deepest possible state for the CPUs, which can break the system wakeup
latency constraint that may be required for a use case.

Let's take the first step towards addressing this problem by introducing
an interface for user space that allows us to specify the CPU system
wakeup QoS limit. Subsequent changes will start taking into account the new
QoS limit.

Reviewed-by: Dhruva Gole <d-gole@ti.com>
Reviewed-by: Kevin Hilman (TI) <khilman@baylibre.com>
Tested-by: Kevin Hilman (TI) <khilman@baylibre.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Link: https://patch.msgid.link/20251125112650.329269-2-ulf.hansson@linaro.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-25 19:01:29 +01:00
Rafael J. Wysocki
30a8e0a32e Merge tag 'linux-cpupower-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux
Pull a cpupower utility update for 6.19-rc1 from Shuah Khan:

"Adds support for building libcpupower statically when STATIC=true is
 specified during build."

* tag 'linux-cpupower-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux:
  tools/power/cpupower: Support building libcpupower statically
2025-11-25 17:11:45 +01:00
Rafael J. Wysocki
8dfa8bb652 Merge tag 'opp-updates-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm
Pull OPP updates for 6.19 from Viresh Kumar:

"- Minor improvements to the Rust interface (Tamir Duberstein).

 - Fixes to scope-based pointers (Viresh Kumar)."

* tag 'opp-updates-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm:
  rust: opp: simplify callers of `to_c_str_array`
  OPP: Initialize scope-based pointers inline
  rust: opp: fix broken rustdoc link
2025-11-25 17:08:06 +01:00
Rafael J. Wysocki
ded4feb14d Merge tag 'cpufreq-arm-updates-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm
Pull CPUFreq updates for 6.19 from Viresh Kumar:

"- tegra186: Add OPP / bandwidth support for Tegra186 (Aaron Kling).

 - Minor improvements to various cpufreq drivers (Christian Marangi, Hal
   Feng, Jie Zhan, Marco Crivellari, Miaoqian Lin, and Shuhao Fu)."

* tag 'cpufreq-arm-updates-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm:
  cpufreq: qcom-nvmem: fix compilation warning for qcom_cpufreq_ipq806x_match_list
  cpufreq: tegra194: add WQ_PERCPU to alloc_workqueue users
  cpufreq: qcom-nvmem: add compatible fallback for ipq806x for no SMEM
  cpufreq: CPPC: Don't warn if FIE init fails to read counters
  cpufreq: nforce2: fix reference count leak in nforce2
  cpufreq: tegra186: add OPP support and set bandwidth
  cpufreq: dt-platdev: Add JH7110S SOC to the allowlist
  cpufreq: s5pv210: fix refcount leak
2025-11-25 17:06:04 +01:00
George Moussalem
8d6f8d5c58 dt-bindings: thermal: qcom-tsens: make ipq5018 tsens standalone compatible
The tsens IP found in the IPQ5018 SoC should not use qcom,tsens-v1 as a
fallback: it has no RPM and, as such, must deviate from the standard v1
init routine because this version of tsens needs to be explicitly reset
and enabled by the driver.

So let's make qcom,ipq5018-tsens a standalone compatible in the bindings.

Fixes: 77c6d28192 ("dt-bindings: thermal: qcom-tsens: Add ipq5018 compatible")
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: George Moussalem <george.moussalem@outlook.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://patch.msgid.link/20250818-ipq5018-tsens-fix-v1-1-0f08cf09182d@outlook.com
2025-11-25 16:07:57 +01:00
Malaya Kumar Rout
16e802667e tools/thermal/thermal-engine: Fix format string bug in thermal-engine
The error message in the daemon() failure path uses the %p format specifier
without providing a corresponding pointer argument, resulting in undefined
behavior and garbage values being printed.

Replace %p with %m to properly print the errno error message, which is
the intended behavior when daemon() fails.

This fix ensures proper error reporting when daemonization fails.
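
For illustration, a minimal user-space sketch of the difference (hypothetical code, not the thermal-engine source): in glibc-style printf functions, %m expands to strerror(errno) and needs no extra argument, whereas %p consumes a pointer argument that the original call never supplied.

    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        if (daemon(0, 0) < 0) {
            /* %m prints the errno message; no extra argument is needed */
            fprintf(stderr, "failed to daemonize: %m\n");
            return 1;
        }
        return 0;
    }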

Signed-off-by: Malaya Kumar Rout <mrout@redhat.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://patch.msgid.link/20251124104401.374856-1-mrout@redhat.com
2025-11-25 11:00:28 +01:00
Jason A. Donenfeld
90fb9b98fc random: complete sentence of comment
Complete the sentence by adding "is set", rather than having it dangle
as a sentence fragment.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2025-11-25 02:54:37 +01:00
Malaya Kumar Rout
8974573ba4 ACPI: tools: pfrut: fix memory leak and resource leak in pfrut.c
Static analysis found an issue in pfrut.c

cppcheck output before this patch:
tools/power/acpi/tools/pfrut/pfrut.c:225:3: error: Resource leak: fd_update [resourceLeak]
tools/power/acpi/tools/pfrut/pfrut.c:269:3: error: Resource leak: fd_update [resourceLeak]
tools/power/acpi/tools/pfrut/pfrut.c:269:3: error: Resource leak: fd_update_log [resourceLeak]
tools/power/acpi/tools/pfrut/pfrut.c:365:4: error: Memory leak: addr_map_capsule [memleak]
tools/power/acpi/tools/pfrut/pfrut.c:424:4: error: Memory leak: log_buf [memleak]

cppcheck output after this patch:
No resource leaks found

Fix by closing file descriptors and freeing allocated memory.

Signed-off-by: Malaya Kumar Rout <mrout@redhat.com>
Link: https://patch.msgid.link/20251120170001.251968-1-mrout@redhat.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-24 20:50:15 +01:00
David Laight
c964081d60 ACPI: property: use min() instead of min_t()
min_t(unsigned int, a, b) casts an 'unsigned long' to 'unsigned int'.
Use min(a, b) instead as it promotes any 'unsigned int' to 'unsigned long'
and so cannot discard significant bits.

In this case the 'unsigned long' value is small enough that the result
is ok.

Detected by an extra check added to min_t().
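
A small stand-alone sketch of the hazard (plain C on an LP64 system, not the ACPI code): casting an unsigned long down to unsigned int before comparing can discard the high bits, so the computed minimum can be wrong.

    #include <stdio.h>

    /* simplified stand-in for min_t(unsigned int, a, b) */
    #define MIN_T_UINT(a, b) \
        ((unsigned int)(a) < (unsigned int)(b) ? (unsigned int)(a) : (unsigned int)(b))

    int main(void)
    {
        unsigned long big = 0x100000001UL;  /* truncates to 1 as unsigned int */
        unsigned long small = 16;

        printf("truncating min: %u\n", MIN_T_UINT(big, small));      /* prints 1 */
        printf("expected min:   %lu\n", big < small ? big : small);  /* prints 16 */
        return 0;
    }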

Signed-off-by: David Laight <david.laight.linux@gmail.com>
[ rjw: Subject adjustment ]
Link: https://patch.msgid.link/20251119224140.8616-14-david.laight.linux@gmail.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-24 20:45:38 +01:00
Rafael J. Wysocki
15bfdadd61 cpuidle: governors: teo: Add missing space to the description
There is a missing space in the governor description comment, so add it.

No functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/5059034.31r3eYUQgx@rafael.j.wysocki
2025-11-24 20:43:32 +01:00
Rafael J. Wysocki
c03aef8833 PM: hibernate: Extra cleanup of comments in swap handling code
Continue recent cleanups of comments in the swap handling code.

Unify the use of white space in the comments, drop some unhelpful
comments outside function bodies, and move some other comments into
function bodies.

No functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/5943864.DvuYhMxLoT@rafael.j.wysocki
2025-11-24 20:41:06 +01:00
Eric Biggers
4f0382b090 lib/crypto: sha2: Add at_least decoration to fixed-size array params
Add the at_least (i.e. 'static') decoration to the fixed-size array
parameters of the sha2 library functions.  This causes clang to warn
when a too-small array of known size is passed.
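
As a hedged illustration of the kind of declaration change involved (the prototype below is simplified and hypothetical, not copied from the actual header), the decoration on the output array is what lets the compiler flag callers that pass a smaller buffer:

    #include <stddef.h>

    /* stand-in for the definition from compiler_types.h */
    #define at_least static
    #define SHA256_DIGEST_SIZE 32

    void sha256_example(const unsigned char *data, size_t len,
                        unsigned char out[at_least SHA256_DIGEST_SIZE]);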

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: "Jason A. Donenfeld" <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20251122194206.31822-7-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-23 12:19:47 -08:00
Eric Biggers
d5cc4e731d lib/crypto: sha1: Add at_least decoration to fixed-size array params
Add the at_least (i.e. 'static') decoration to the fixed-size array
parameters of the sha1 library functions.  This causes clang to warn
when a too-small array of known size is passed.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: "Jason A. Donenfeld" <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20251122194206.31822-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-23 12:19:47 -08:00
Eric Biggers
c2099fa616 lib/crypto: poly1305: Add at_least decoration to fixed-size array params
Add the at_least (i.e. 'static') decoration to the fixed-size array
parameters of the poly1305 library functions.  This causes clang to warn
when a too-small array of known size is passed.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: "Jason A. Donenfeld" <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20251122194206.31822-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-23 12:19:47 -08:00
Eric Biggers
580f1d31df lib/crypto: md5: Add at_least decoration to fixed-size array params
Add the at_least (i.e. 'static') decoration to the fixed-size array
parameters of the md5 library functions.  This causes clang to warn when
a too-small array of known size is passed.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: "Jason A. Donenfeld" <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20251122194206.31822-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-23 12:19:47 -08:00
Eric Biggers
2143d622cd lib/crypto: curve25519: Add at_least decoration to fixed-size array params
Add the at_least (i.e. 'static') decoration to the fixed-size array
parameters of the curve25519 library functions.  This causes clang to
warn when a too-small array of known size is passed.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: "Jason A. Donenfeld" <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20251122194206.31822-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-23 12:19:47 -08:00
Eric Biggers
1b31b43bf5 lib/crypto: chacha: Add at_least decoration to fixed-size array params
Add the at_least (i.e. 'static') decoration to the fixed-size array
parameters of the chacha library functions.  This causes clang to warn
when a too-small array of known size is passed.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: "Jason A. Donenfeld" <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20251122194206.31822-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-23 12:19:47 -08:00
Jason A. Donenfeld
ac653d57ad lib/crypto: chacha20poly1305: Statically check fixed array lengths
Several parameters of the chacha20poly1305 functions require arrays of
an exact length. Use the new at_least keyword to instruct gcc and
clang to statically check that the caller is passing an object of at
least that length.

Here it is in action, with this faulty patch to wireguard's cookie.h:

     struct cookie_checker {
     	u8 secret[NOISE_HASH_LEN];
    -	u8 cookie_encryption_key[NOISE_SYMMETRIC_KEY_LEN];
    +	u8 cookie_encryption_key[NOISE_SYMMETRIC_KEY_LEN - 1];
     	u8 message_mac1_key[NOISE_SYMMETRIC_KEY_LEN];

If I try compiling this code, I get this helpful warning:

  CC      drivers/net/wireguard/cookie.o
drivers/net/wireguard/cookie.c: In function ‘wg_cookie_message_create’:
drivers/net/wireguard/cookie.c:193:9: warning: ‘xchacha20poly1305_encrypt’ reading 32 bytes from a region of size 31 [-Wstringop-overread]
  193 |         xchacha20poly1305_encrypt(dst->encrypted_cookie, cookie, COOKIE_LEN,
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  194 |                                   macs->mac1, COOKIE_LEN, dst->nonce,
      |                                   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  195 |                                   checker->cookie_encryption_key);
      |                                   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/net/wireguard/cookie.c:193:9: note: referencing argument 7 of type ‘const u8 *’ {aka ‘const unsigned char *’}
In file included from drivers/net/wireguard/messages.h:10,
                 from drivers/net/wireguard/cookie.h:9,
                 from drivers/net/wireguard/cookie.c:6:
include/crypto/chacha20poly1305.h:28:6: note: in a call to function ‘xchacha20poly1305_encrypt’
   28 | void xchacha20poly1305_encrypt(u8 *dst, const u8 *src, const size_t src_len,

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: "Jason A. Donenfeld" <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20251123054819.2371989-4-Jason@zx2c4.com
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-23 12:19:21 -08:00
Jason A. Donenfeld
074e16d58e compiler_types: introduce at_least parameter decoration pseudo keyword
Clang and recent gcc support warning if they are able to prove that the
user is passing to a function an array that is too short in size. For
example:

    void blah(unsigned char herp[at_least 7]);
    static void schma(void)
    {
        unsigned char good[] = { 1, 2, 3, 4, 5, 6, 7 };
        unsigned char bad[] = { 1, 2, 3, 4, 5, 6 };
        blah(good);
        blah(bad);
    }

The notation here, `static 7`, which this commit makes explicit by
allowing us to write it as `at_least 7`, means that it's incorrect to
pass anything less than 7 elements. This is section 6.7.5.3 of C99:

    If the keyword static also appears within the [ and ] of the array
    type derivation, then for each call to the function, the value of
    the corresponding actual argument shall provide access to the first
    element of an array with at least as many elements as specified by
    the size expression.

Here is the output from gcc 15:

    zx2c4@thinkpad /tmp $ gcc -c a.c
    a.c: In function ‘schma’:
    a.c:9:9: warning: ‘blah’ accessing 7 bytes in a region of size 6 [-Wstringop-overflow=]
        9 |         blah(bad);
          |         ^~~~~~~~~
    a.c:9:9: note: referencing argument 1 of type ‘unsigned char[7]’
    a.c:2:6: note: in a call to function ‘blah’
        2 | void blah(unsigned char herp[at_least 7]);
          |      ^~~~

And from clang 21:

    zx2c4@thinkpad /tmp $ clang -c a.c
    a.c:9:2: warning: array argument is too small; contains 6 elements, callee requires at least 7
          [-Warray-bounds]
        9 |         blah(bad);
          |         ^    ~~~
    a.c:2:25: note: callee declares array parameter as static here
        2 | void blah(unsigned char herp[at_least 7]);
          |                         ^   ~~~~~~~~~~
    1 warning generated.

So these are covered by, variously, -Wstringop-overflow and
-Warray-bounds.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: "Jason A. Donenfeld" <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20251123054819.2371989-3-Jason@zx2c4.com
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-23 12:18:36 -08:00
Jason A. Donenfeld
d96f562054 wifi: iwlwifi: trans: rename at_least variable to min_mode
The subsequent commit is going to add a macro that redefines `at_least`
to mean something else. Given that the usage here in iwlwifi is the only
use of that identifier in the whole kernel, just rename it to a more
fitting name, `min_mode`.

Cc: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: "Jason A. Donenfeld" <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20251123054819.2371989-1-Jason@zx2c4.com
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-23 12:18:20 -08:00
Thorsten Blum
dc30fe7a0a PM / devfreq: tegra30: use min to simplify actmon_cpu_to_emc_rate
Use min() to improve the readability of actmon_cpu_to_emc_rate() and
remove any unnecessary curly braces.

Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Link: https://patchwork.kernel.org/project/linux-pm/patch/20251112172121.3741-2-thorsten.blum@linux.dev/
2025-11-24 00:02:07 +09:00
Pengjie Zhang
26dd44a400 PM / devfreq: hisi: Fix potential UAF in OPP handling
Ensure all required data is acquired before calling dev_pm_opp_put(opp)
to maintain correct resource acquisition and release order.
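
A hedged kernel-style sketch of the ordering (hypothetical driver code, not the HiSilicon implementation): read everything needed from the OPP before dev_pm_opp_put(), so the data is never touched after the reference may have been dropped.

    #include <linux/device.h>
    #include <linux/err.h>
    #include <linux/pm_opp.h>

    static int example_set_target(struct device *dev, unsigned long *target_freq)
    {
        struct dev_pm_opp *opp;
        unsigned long freq, volt;

        opp = dev_pm_opp_find_freq_ceil(dev, target_freq);
        if (IS_ERR(opp))
            return PTR_ERR(opp);

        freq = dev_pm_opp_get_freq(opp);    /* acquire the data first ... */
        volt = dev_pm_opp_get_voltage(opp);
        dev_pm_opp_put(opp);                /* ... then drop the reference */

        dev_dbg(dev, "programming %lu Hz at %lu uV\n", freq, volt);
        return 0;
    }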

Fixes: 7da2fdaaa1 ("PM / devfreq: Add HiSilicon uncore frequency scaling driver")
Signed-off-by: Pengjie Zhang <zhangpengjie2@huawei.com>
Reviewed-by: Jie Zhan <zhanjie9@hisilicon.com>
Acked-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Link: https://patchwork.kernel.org/project/linux-pm/patch/20250915062135.748653-1-zhangpengjie2@huawei.com/
2025-11-24 00:02:07 +09:00
Dmitry Baryshkov
447c4e8338 PM / devfreq: Move governor.h to a public header location
Some device drivers (and out-of-tree modules) might want to define
device-specific devfreq governors. Rather than restricting all of them to
be a part of drivers/devfreq/ (which is not possible for out-of-tree
drivers anyway), move governor.h to include/linux/devfreq-governor.h and
update all drivers to use it.

The devfreq_cpu_data is only used internally, by the passive governor,
so it is moved to the driver source rather than being a part of the
public interface.

Reported-by: Robie Basak <robibasa@qti.qualcomm.com>
Acked-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Reviewed-by: Bjorn Andersson <andersson@kernel.org>
Acked-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Link: https://patchwork.kernel.org/project/linux-pm/patch/20251030-governor-public-v2-1-432a11a9975a@oss.qualcomm.com/
2025-11-24 00:02:01 +09:00
Kuppuswamy Sathyanarayanan
748d6ba43a powercap: intel_rapl: Enable MSR-based RAPL PMU support
Currently, RAPL PMU support requires adding CPU model entries to
arch/x86/events/rapl.c for each new generation. However, RAPL MSRs are
not architectural and require platform-specific customization, making
arch/x86 an inappropriate location for this functionality.

The powercap subsystem already handles RAPL functionality and is the
natural place to consolidate all RAPL features. The powercap RAPL
driver already includes PMU support for TPMI-based RAPL interfaces,
making it straightforward to extend this support to MSR-based RAPL
interfaces as well.

This consolidation eliminates the need to maintain RAPL support in
multiple subsystems and provides a unified approach for both TPMI and
MSR-based RAPL implementations.

The MSR-based PMU support includes the following updates:

 1. Register MSR-based PMU support for the supported platforms
    and unregister it when no online CPUs remain in the package.

 2. Remove existing checks that restrict RAPL PMU support to TPMI-based
    interfaces and extend the logic to allow MSR-based RAPL interfaces.

 3. Define a CPU model list to determine which processors should
    register RAPL PMU interface through the powercap driver for
    MSR-based RAPL, excluding those that support TPMI interface.
    This list prevents conflicts with existing arch/x86 PMU code
    that already registers RAPL PMU for some processors. Add
    Panther Lake & Wildcat Lake to the CPU models list.

Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
[ rjw: Changelog edits ]
Link: https://patch.msgid.link/20251121000539.386069-3-sathyanarayanan.kuppuswamy@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-21 21:47:08 +01:00
Kuppuswamy Sathyanarayanan
1d6c915819 powercap: intel_rapl: Prepare read_raw() interface for atomic-context callers
The current read_raw() implementation of the TPMI, MMIO and MSR
interfaces does not distinguish between atomic and non-atomic callers.

rapl_msr_read_raw() uses rdmsrq_safe_on_cpu(), which can sleep and
issue cross CPU calls. When MSR-based RAPL PMU support is enabled, PMU
event handlers can invoke this function from atomic context where
sleeping or rescheduling is not allowed. In atomic context, the caller
is already executing on the target CPU, so a direct rdmsrq() is
sufficient.

To support such usage, introduce an atomic flag to the read_raw()
interface to allow callers to pass the context information. Modify the
common RAPL code to propagate this flag and set it to reflect the calling
context.

Utilize the atomic flag in rapl_msr_read_raw() to perform a direct MSR
read with rdmsrq() when running in atomic context, and add a sanity check
to ensure that the target CPU matches the current CPU in such cases.

The TPMI and MMIO implementations do not require special atomic
handling, so the flag is ignored in those paths.

This is a preparatory patch for adding MSR-based RAPL PMU support.
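
A hedged sketch of the resulting calling convention (illustrative only; apart from rdmsrq() and rdmsrq_safe_on_cpu(), the names are made up): atomic callers are assumed to already run on the target CPU, so a direct MSR read replaces the sleeping cross-CPU helper.

    #include <linux/bug.h>
    #include <linux/errno.h>
    #include <linux/smp.h>
    #include <linux/types.h>
    #include <asm/msr.h>

    static int example_read_raw(int cpu, u32 msr, u64 *val, bool atomic)
    {
        if (atomic) {
            /* PMU event handlers call this on the target CPU itself */
            if (WARN_ON_ONCE(cpu != smp_processor_id()))
                return -EINVAL;
            rdmsrq(msr, *val);          /* direct read, no sleeping */
            return 0;
        }
        return rdmsrq_safe_on_cpu(cpu, msr, val);  /* may sleep */
    }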

Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
[ rjw: Subject tweak ]
Link: https://patch.msgid.link/20251121000539.386069-2-sathyanarayanan.kuppuswamy@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-21 21:47:08 +01:00
Christian Marangi
c3852d2ca4 cpufreq: qcom-nvmem: fix compilation warning for qcom_cpufreq_ipq806x_match_list
If CONFIG_OF is not enabled, of_match_node() is set as NULL and
qcom_cpufreq_ipq806x_match_list won't be used causing a compilation
warning.

Flag qcom_cpufreq_ipq806x_match_list as __maybe_unused to fix the
compilation warning.

While at it, also flag it as __initconst since it's used only in probe
context and can be freed after probe.

This follows the pattern of the usual of_device_id variables.
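
A hedged sketch of the annotation (hypothetical table, not the qcom-nvmem one): __maybe_unused tells the compiler not to warn when CONFIG_OF is disabled and the only user of the table compiles away.

    #include <linux/compiler.h>
    #include <linux/mod_devicetable.h>

    static const struct of_device_id example_match_list[] __maybe_unused = {
        { .compatible = "vendor,example-soc" },     /* placeholder entry */
        { /* sentinel */ }
    };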

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202511202119.6zvvFMup-lkp@intel.com/
Fixes: 58f5d39d5e ("cpufreq: qcom-nvmem: add compatible fallback for ipq806x for no SMEM")
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
[ Viresh: Drop __initconst ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-11-21 10:21:13 +05:30
Samuel Wu
8e2d57e653 PM: sleep: Call pm_sleep_fs_sync() instead of ksys_sync_helper()
Replace the direct calls to ksys_sync_helper() with the new
pm_sleep_fs_sync() in suspend and hibernation code paths.

This enables the new mechanism allowing the filesystem sync phase
to be interrupted.

Suggested-by: Saravana Kannan <saravanak@google.com>
Signed-off-by: Samuel Wu <wusamuel@google.com>
Co-developed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
[ rjw: Subject and changelog edits, tags adjustment ]
Link: https://patch.msgid.link/20251119171426.4086783-3-wusamuel@google.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-20 22:29:40 +01:00
Samuel Wu
bf8867eae1 PM: sleep: Add support for wakeup during filesystem sync
Add helper function pm_sleep_fs_sync() and related data structures
as a preparation for allowing system suspend and hibernation to be
aborted by wakeup events while syncing file systems.

The new function, to be called by the suspend process in order to
sync file systems, uses a dedicated ordered workqueue to run
ksys_sync_helper() in parallel with the calling process.  Next, it
waits for the completion of the filesystem sync and periodically
checks if any system wakeup events are pending, in which case it will
return an error.

If that happens while the filesystem sync is still in progress, it
will continue, possibly after pm_sleep_fs_sync() has returned, and if
that function is called again before the sync is complete, a new work
item to run ksys_sync_helper() again will be queued (and waited for)
to increase the likelihood of writing all of the dirty pages in memory
back to persistent storage.
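
A heavily simplified sketch of that flow (the work item, completion and polling interval below are illustrative assumptions; only ksys_sync_helper() and pm_wakeup_pending() come from the description above):

    #include <linux/completion.h>
    #include <linux/errno.h>
    #include <linux/jiffies.h>
    #include <linux/suspend.h>
    #include <linux/workqueue.h>

    static DECLARE_COMPLETION(example_fs_sync_done);

    static void example_fs_sync_work_fn(struct work_struct *work)
    {
        ksys_sync_helper();                 /* sync file systems */
        complete(&example_fs_sync_done);
    }
    static DECLARE_WORK(example_fs_sync_work, example_fs_sync_work_fn);

    static int example_fs_sync(struct workqueue_struct *wq)
    {
        queue_work(wq, &example_fs_sync_work);

        /* wait for the sync, but bail out if a wakeup event shows up */
        while (!wait_for_completion_timeout(&example_fs_sync_done, HZ / 10)) {
            if (pm_wakeup_pending())
                return -EBUSY;              /* the sync keeps running */
        }
        return 0;
    }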

Suggested-by: Saravana Kannan <saravanak@google.com>
Signed-off-by: Samuel Wu <wusamuel@google.com>
Co-developed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
[ rjw: Subject and changelog rewrite, tags adjustment ]
Link: https://patch.msgid.link/20251119171426.4086783-2-wusamuel@google.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-20 22:29:40 +01:00
Rafael J. Wysocki
a857b530b3 Merge back material related to system sleep for 6.19 2025-11-20 22:28:23 +01:00
Kaushlendra Kumar
1b541e10ee cpufreq: ACPI: Replace udelay() with usleep_range()
Replace udelay() with usleep_range() in check_freqs() to allow
CPU scheduling during frequency polling.

Signed-off-by: Kaushlendra Kumar <kaushlendra.kumar@intel.com>
[ rjw: Changelog edits ]
Link: https://patch.msgid.link/20251119031109.134583-1-kaushlendra.kumar@intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-20 21:50:08 +01:00
Srinivas Pandruvada
8538e7ee09 docs: driver-api/thermal/intel_dptf: Add new workload type hint
Add documentation for longer term classification of workload type for
power or performance.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://patch.msgid.link/20251118223620.554798-1-srinivas.pandruvada@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-20 21:32:52 +01:00
Rafael J. Wysocki
d834e68a0e cpuidle: governors: teo: Simplify intercepts-based state lookup
Simplify the loop looking up a candidate idle state in the case when an
intercept is likely to occur by adding a search for the state index limit
if the tick is stopped before it.

First, call tick_nohz_tick_stopped() just once and if it returns true,
look for the shallowest state index below the current candidate one with
target residency at least equal to the tick period length.

Next, simply look for a state that is not shallower than the one found
in the previous step and satisfies the intercepts majority condition (if
there are no such states, the shallowest state that is not shallower
than the one found in the previous step becomes the new candidate).

Since teo_state_ok() has no callers any more after the above changes,
drop it.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Christian Loehle <christian.loehle@arm.com>
[ rjw: Changelog clarification and code comment edit ]
Link: https://patch.msgid.link/2418792.ElGaqSPkdT@rafael.j.wysocki
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-20 16:32:48 +01:00
Geert Uytterhoeven
a6eb177102 thermal/drivers/rcar_gen3: Convert to DEFINE_SIMPLE_DEV_PM_OPS()
Convert the Renesas R-Car Gen3 thermal driver from SIMPLE_DEV_PM_OPS()
to DEFINE_SIMPLE_DEV_PM_OPS() and pm_sleep_ptr().  This lets us drop the
__maybe_unused annotation from its resume callback, and reduces kernel
size in case CONFIG_PM or CONFIG_PM_SLEEP is disabled.
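
A hedged sketch of what such a conversion typically looks like (hypothetical driver, not the R-Car code):

    #include <linux/platform_device.h>
    #include <linux/pm.h>

    static int example_suspend(struct device *dev) { return 0; }
    static int example_resume(struct device *dev) { return 0; }

    /* was: static SIMPLE_DEV_PM_OPS(example_pm_ops, example_suspend, example_resume); */
    static DEFINE_SIMPLE_DEV_PM_OPS(example_pm_ops, example_suspend, example_resume);

    static struct platform_driver example_driver = {
        .driver = {
            .name = "example",
            /* evaluates to NULL when CONFIG_PM_SLEEP is disabled */
            .pm = pm_sleep_ptr(&example_pm_ops),
        },
    };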

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Link: https://patch.msgid.link/813ad36fdc8561cf1c396230436e8ff3ff903a1f.1763117455.git.geert+renesas@glider.be
2025-11-20 15:33:45 +01:00
Geert Uytterhoeven
186b5c2726 thermal/drivers/rcar: Convert to DEFINE_SIMPLE_DEV_PM_OPS()
Convert the Renesas R-Car thermal driver from SIMPLE_DEV_PM_OPS() to
DEFINE_SIMPLE_DEV_PM_OPS() and pm_sleep_ptr().  This lets us drop the
check for CONFIG_PM_SLEEP, and reduces kernel size in case CONFIG_PM or
CONFIG_PM_SLEEP is disabled, while increasing build coverage.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Link: https://patch.msgid.link/ee03ec71d10fd589e7458fa1b0ada3d3c19dbb54.1763117351.git.geert+renesas@glider.be
2025-11-20 15:32:14 +01:00
Rafael J. Wysocki
50db438231 cpuidle: governors: teo: Fix tick_intercepts handling in teo_update()
The condition deciding whether or not to increase cpu_data->tick_intercepts
in teo_update() is reversed, so fix it.

Fixes: d619b5cc67 ("cpuidle: teo: Simplify counting events used for tick management")
Cc: 6.14+ <stable@vger.kernel.org> # 6.14+: 0796ddf4a7: cpuidle: teo: Use this_cpu_ptr() where possible
Cc: 6.14+ <stable@vger.kernel.org> # 6.14+: 8f3f01082d: cpuidle: governors: teo: Use s64 consistently in teo_update()
Cc: 6.14+ <stable@vger.kernel.org> # 6.14+: b54df61c74: cpuidle: governors: teo: Decay metrics below DECAY_SHIFT threshold
Cc: 6.14+ <stable@vger.kernel.org> # 6.14+: 083654ded5: cpuidle: governors: teo: Rework the handling of tick wakeups
Cc: 6.14+ <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Christian Loehle <christian.loehle@arm.com>
Link: https://patch.msgid.link/5085160.31r3eYUQgx@rafael.j.wysocki
2025-11-20 14:54:08 +01:00
Rafael J. Wysocki
083654ded5 cpuidle: governors: teo: Rework the handling of tick wakeups
If the wakeup pattern is clearly dominated by tick wakeups, count those
wakeups as hits on the deepest available idle state to increase the
likelihood of stopping the tick, especially on systems where there are
only 2 usable idle states and the tick can only be stopped when the
deeper state is selected.

This change is expected to reduce power on some systems where state 0 is
selected relatively often even though they are almost idle.  Without it,
the governor may end up selecting the shallowest idle state all the time
even if the system is almost completely idle due to all tick wakeups being
counted as hits on that state and preventing the tick from being stopped
at all.

Fixes: 4b20b07ce7 ("cpuidle: teo: Don't count non-existent intercepts")
Reported-by: Reka Norman <rekanorman@chromium.org>
Closes: https://lore.kernel.org/linux-pm/CAEmPcwsNMNnNXuxgvHTQ93Mx-q3Oz9U57THQsU_qdcCx1m4w5g@mail.gmail.com/
Tested-by: Reka Norman <rekanorman@chromium.org>
Tested-by: Christian Loehle <christian.loehle@arm.com>
Cc: 6.11+ <stable@vger.kernel.org> # 6.11+: 92ce5c07b7: cpuidle: teo: Reorder candidate state index checks
Cc: 6.11+ <stable@vger.kernel.org> # 6.11+: ea185406d1: cpuidle: teo: Combine candidate state index checks against 0
Cc: 6.11+ <stable@vger.kernel.org> # 6.11+: b9a6af26bd: cpuidle: teo: Drop local variable prev_intercept_idx
Cc: 6.11+ <stable@vger.kernel.org> # 6.11+: e24f8a55de: cpuidle: teo: Clarify two code comments
Cc: 6.11+ <stable@vger.kernel.org> # 6.11+: d619b5cc67: cpuidle: teo: Simplify counting events used for tick management
Cc: 6.11+ <stable@vger.kernel.org> # 6.11+: 13ed5c4a6d: cpuidle: teo: Skip getting the sleep length if wakeups are very frequent
Cc: 6.11+ <stable@vger.kernel.org> # 6.11+: ddcfa79646: cpuidle: teo: Simplify handling of total events count
Cc: 6.11+ <stable@vger.kernel.org> # 6.11+: 65e18e6544: cpuidle: teo: Replace time_span_ns with a flag
Cc: 6.11+ <stable@vger.kernel.org> # 6.11+: 0796ddf4a7: cpuidle: teo: Use this_cpu_ptr() where possible
Cc: 6.11+ <stable@vger.kernel.org> # 6.11+: 8f3f01082d: cpuidle: governors: teo: Use s64 consistently in teo_update()
Cc: 6.11+ <stable@vger.kernel.org> # 6.11+: b54df61c74: cpuidle: governors: teo: Decay metrics below DECAY_SHIFT threshold
Cc: 6.11+ <stable@vger.kernel.org> # 6.11+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
[ rjw: Rebase on commit 0796ddf4a7, changelog update ]
Link: https://patch.msgid.link/6228387.lOV4Wx5bFT@rafael.j.wysocki
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-20 14:49:57 +01:00
René Rebe
e96190da17 PNP: Fix ISAPNP to generate uevents to auto-load modules
Currently, ISAPNP devices do not generate a uevent, so udev cannot
auto-load the driver modules needed for Creative SoundBlaster or Gravis
UltraSound cards to just work.

Signed-off-by: René Rebe <rene@exactco.de>
[ rjw: Subject edits ]
Link: https://patch.msgid.link/20251118.145942.1445519082574147037.rene@exactco.de
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-18 17:35:36 +01:00
Rafael J. Wysocki
b20a374902 cpufreq: intel_pstate: Eliminate some code duplication
To eliminate some code duplication from the intel_pstate driver,
move the core_get_val() function body to a new function called
get_perf_ctl_val() and make both core_get_val() and atom_get_val()
invoke it to carry out the same computation.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/2829273.mvXUDI8C0e@rafael.j.wysocki
2025-11-18 15:51:31 +01:00
Kaushlendra Kumar
58075aec92 powercap: intel_rapl: Add support for Nova Lake processors
Add RAPL support for Intel Nova Lake and Nova Lake L processors using
the core defaults configuration.

Signed-off-by: Kaushlendra Kumar <kaushlendra.kumar@intel.com>
[ rjw: Subject and changelog edits, rebase ]
Link: https://patch.msgid.link/20251028101814.3482508-1-kaushlendra.kumar@intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-18 15:39:29 +01:00
Sunday Adelodun
46fc75a29b PM: hibernate: Clean up kernel-doc comment style usage
Several static functions in kernel/power/swap.c were described using the
kernel-doc comment style (/** ... */) even though they are not exported
or referenced by generated documentation. This led to kernel-doc warnings
and stylistic inconsistencies.

Convert these unnecessary kernel-doc blocks to regular C comments,
remove comment blocks that are no longer useful, relocate comments to
more appropriate positions where needed, and fix a few "Return:"
descriptions that were either missing or incorrectly formatted.

No functional changes.

Signed-off-by: Sunday Adelodun <adelodunolaoluwa@yahoo.com>
[ rjw: Subject adjustment, changelog edits, comment edits ]
Link: https://patch.msgid.link/20251114220438.52448-1-adelodunolaoluwa@yahoo.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-17 20:16:56 +01:00
Rafael J. Wysocki
37d6d92fe0 Merge back earlier material related to system sleep for 6.19 2025-11-17 16:55:55 +01:00
Rafael J. Wysocki
07f42f8290 PCI/sysfs: Use PM_RUNTIME_ACQUIRE()/PM_RUNTIME_ACQUIRE_ERR()
Use new PM_RUNTIME_ACQUIRE() and PM_RUNTIME_ACQUIRE_ERR() wrapper macros
to make the code look more straightforward.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Dhruva Gole <d-gole@ti.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
[ rjw: Typo fix in the changelog ]
Link: https://patch.msgid.link/3932581.kQq0lBPeGt@rafael.j.wysocki
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-14 21:25:15 +01:00
Rafael J. Wysocki
70dcad3400 ACPI: TAD: Use PM_RUNTIME_ACQUIRE()/PM_RUNTIME_ACQUIRE_ERR()
Use new PM_RUNTIME_ACQUIRE() and PM_RUNTIME_ACQUIRE_ERR() wrapper macros
to make the code look more straightforward.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Dhruva Gole <d-gole@ti.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
[ rjw: Typo fix in the changelog ]
Link: https://patch.msgid.link/2040585.PYKUYFuaPT@rafael.j.wysocki
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-14 21:25:15 +01:00
Rafael J. Wysocki
ef8057b07c PM: runtime: Wrapper macros for ACQUIRE()/ACQUIRE_ERR()
Add wrapper macros for ACQUIRE()/ACQUIRE_ERR() and runtime PM
usage counter guards introduced recently: pm_runtime_active_try,
pm_runtime_active_auto_try, pm_runtime_active_try_enabled, and
pm_runtime_active_auto_try_enabled.

The new macros should be more straightforward to use.

For example, they can be used for rewriting a piece of code like below:

        ACQUIRE(pm_runtime_active_try, pm)(dev);
        if ((ret = ACQUIRE_ERR(pm_runtime_active_try, &pm)))
                return ret;

in the following way:

        PM_RUNTIME_ACQUIRE(dev, pm);
        if ((ret = PM_RUNTIME_ACQUIRE_ERR(&pm)))
                return ret;

If the original code does not care about the specific error code
returned when attempting to resume the device:

        ACQUIRE(pm_runtime_active_try, pm)(dev);
        if (ACQUIRE_ERR(pm_runtime_active_try, &pm))
                return -ENXIO;

it may be changed like this:

        PM_RUNTIME_ACQUIRE(dev, pm);
        if (PM_RUNTIME_ACQUIRE_ERR(&pm))
                return -ENXIO;

Link: https://lore.kernel.org/linux-pm/5068916.31r3eYUQgx@rafael.j.wysocki/
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Dhruva Gole <d-gole@ti.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/3400866.aeNJFYEL58@rafael.j.wysocki
2025-11-14 21:24:54 +01:00
Srinivas Pandruvada
3402bc010d Documentation: thermal: Document thermal throttling on Intel platforms
Add documentation for Intel thermal throttling reporting events.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
[ rjw: Subject adjustment, file name change, minor edits ]
Link: https://patch.msgid.link/20251113212104.221632-1-srinivas.pandruvada@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-14 17:22:45 +01:00
Riwen Lu
a10ad1b104 PM: suspend: Make pm_test delay interruptible by wakeup events
Modify the suspend_test() function to allow the test delay to be
interrupted by wakeup events.

This improves the responsiveness of the system during suspend testing,
allowing the suspend process to proceed without waiting for the full test
delay to complete when wakeup events are detected.

Additionally, using msleep() instead of mdelay() avoids potential soft
lockup "CPU stuck" issues when long test delays are configured.

Co-developed-by: xiongxin <xiongxin@kylinos.cn>
Signed-off-by: xiongxin <xiongxin@kylinos.cn>
Signed-off-by: Riwen Lu <luriwen@kylinos.cn>
[ rjw: Changelog edits ]
Link: https://patch.msgid.link/20251113012638.1362013-1-luriwen@kylinos.cn
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-14 17:09:16 +01:00
Mario Limonciello (AMD)
7b9725b3d1 usb: sl811-hcd: Add PM_EVENT_POWEROFF into suspend callbacks
When the PM core uses hibernation callbacks for shutdown, drivers
will receive PM_EVENT_POWEROFF and should handle it the same way as
they would handle PM_EVENT_HIBERNATE.

Tested-by: Eric Naim <dnaim@cachyos.org>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
[ rjw: Changelog adjustment ]
Link: https://patch.msgid.link/20251112224025.2051702-4-superm1@kernel.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-14 17:05:53 +01:00
Mario Limonciello (AMD)
988dd0bd91 scsi: Add PM_EVENT_POWEROFF into suspend callbacks
If the PM core uses hibernation callbacks for powering off the
system, drivers will receive PM_EVENT_POWEROFF and should handle
it the same as they previously handled PM_EVENT_HIBERNATE.

Support this case in the scsi driver.  No functional changes.
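
A hedged sketch of the pattern (hypothetical legacy suspend callback, not the actual SCSI change): the new PM_EVENT_POWEROFF case simply falls through to the existing hibernation handling.

    #include <linux/device.h>
    #include <linux/errno.h>
    #include <linux/pm.h>

    static int example_suspend(struct device *dev, pm_message_t msg)
    {
        switch (msg.event) {
        case PM_EVENT_SUSPEND:
            return 0;   /* regular suspend path */
        case PM_EVENT_HIBERNATE:
        case PM_EVENT_POWEROFF:     /* new event, handled like hibernation */
            return 0;   /* prepare for power removal */
        default:
            return -EINVAL;
        }
    }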

Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Tested-by: Eric Naim <dnaim@cachyos.org>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Link: https://patch.msgid.link/20251112224025.2051702-3-superm1@kernel.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-14 17:05:53 +01:00
Mario Limonciello (AMD)
0ca04993da PM: Introduce new PMSG_POWEROFF event
PMSG_POWEROFF will be used for the PM core to allow differentiating between
a hibernation or shutdown sequence when re-using callbacks for common code.

Hibernation is started by writing the hibernation method to use (such as
'platform', 'shutdown', or 'reboot') into /sys/power/disk and then writing
'disk' to /sys/power/state.

Shutdown is initiated with the reboot() syscall with arguments on whether
to halt the system or power it off.

Tested-by: Eric Naim <dnaim@cachyos.org>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Link: https://patch.msgid.link/20251112224025.2051702-2-superm1@kernel.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-14 17:05:53 +01:00
Rafael J. Wysocki
bdfacf441b Merge back earlier runtime PM changes for 6.19 2025-11-14 16:56:40 +01:00
Rafael J. Wysocki
b54df61c74 cpuidle: governors: teo: Decay metrics below DECAY_SHIFT threshold
If a given governor metric falls below a certain value (8 for
DECAY_SHIFT equal to 3), it will not decay any more due to the
simplistic decay implementation.  This may in some cases lead to
subtle inconsistencies in the governor behavior, so change the
decay implementation to take it into account and set the metric
at hand to 0 in that case.
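
A tiny stand-alone illustration of the issue (assuming the decay has the form metric -= metric >> DECAY_SHIFT, as implied above): any value below 1 << DECAY_SHIFT shifts to zero, so the subtraction stops making progress.

    #include <stdio.h>

    #define DECAY_SHIFT 3

    int main(void)
    {
        unsigned int metric = 7;    /* below 1 << DECAY_SHIFT == 8 */

        for (int i = 0; i < 3; i++) {
            metric -= metric >> DECAY_SHIFT;    /* 7 >> 3 == 0: no change */
            printf("after decay %d: %u\n", i + 1, metric);
        }
        return 0;
    }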

Suggested-by: Christian Loehle <christian.loehle@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Christian Loehle <christian.loehle@arm.com>
Tested-by: Christian Loehle <christian.loehle@arm.com>
Link: https://patch.msgid.link/2819353.mvXUDI8C0e@rafael.j.wysocki
2025-11-14 15:20:01 +01:00
Rafael J. Wysocki
8f3f01082d cpuidle: governors: teo: Use s64 consistently in teo_update()
Two local variables in teo_update() are defined as u64, but their
values are then compared with s64 values, so it is more consistent
to use s64 as their data type.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Christian Loehle <christian.loehle@arm.com>
Tested-by: Christian Loehle <christian.loehle@arm.com>
Link: https://patch.msgid.link/3026616.e9J7NaK4W3@rafael.j.wysocki
2025-11-14 15:20:01 +01:00
Rafael J. Wysocki
17673f64a0 cpuidle: governors: teo: Drop redundant function parameter
The last no_poll parameter of teo_find_shallower_state() is always
false, so drop it.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Christian Loehle <christian.loehle@arm.com>
Tested-by: Christian Loehle <christian.loehle@arm.com>
Link: https://patch.msgid.link/2253109.irdbgypaU6@rafael.j.wysocki
2025-11-14 15:20:01 +01:00
Rafael J. Wysocki
a03b201180 cpuidle: governors: teo: Drop misguided target residency check
When the target residency of the current candidate idle state is
greater than the expected time till the closest timer (the sleep
length), it does not matter whether or not the tick has already been
stopped or if it is going to be stopped.  The closest timer will
trigger anyway at its due time, so if an idle state with target
residency above the sleep length is selected, energy will be wasted
and there may be excess latency.

Of course, if the closest timer were canceled before it could trigger,
a deeper idle state would be more suitable, but this is not expected
to happen (generally speaking, hrtimers are not expected to be
canceled as a rule).

Accordingly, the teo_state_ok() check done in that case causes energy to
be wasted more often than it allows any energy to be saved (if it allows
any energy to be saved at all), so drop it and let the governor use the
teo_find_shallower_state() return value as the new candidate idle state
index.

Fixes: 21d28cd2fa ("cpuidle: teo: Do not call tick_nohz_get_sleep_length() upfront")
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Christian Loehle <christian.loehle@arm.com>
Tested-by: Christian Loehle <christian.loehle@arm.com>
Link: https://patch.msgid.link/5955081.DvuYhMxLoT@rafael.j.wysocki
2025-11-14 15:19:39 +01:00
Haotian Zhang
593ee49222 ACPI: property: Fix fwnode refcount leak in acpi_fwnode_graph_parse_endpoint()
acpi_fwnode_graph_parse_endpoint() calls fwnode_get_parent() to obtain the
parent fwnode but returns without calling fwnode_handle_put() on it. This
potentially leads to a fwnode refcount leak and prevents the parent node
from being released properly.

Call fwnode_handle_put() on the parent fwnode before returning to prevent
the leak from occurring.

Fixes: 3b27d00e7b ("device property: Move fwnode graph ops to firmware specific locations")
Signed-off-by: Haotian Zhang <vulab@iscas.ac.cn>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
[ rjw: Changelog edits ]
Link: https://patch.msgid.link/20251111075000.1828-1-vulab@iscas.ac.cn
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-12 21:23:56 +01:00
Srinivas Pandruvada
172880f7c9 ACPI: DPTF: Support Nova Lake
Add Nova Lake ACPI IDs for DPTF.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/20251111004552.137984-3-srinivas.pandruvada@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-12 21:14:21 +01:00
Srinivas Pandruvada
dbd911a07f thermal: intel: int340x: Add DLVR support for Nova Lake
Add support for DLVR (Digital Linear Voltage Regulator) for Nova Lake.

There are no new sysfs attributes or differences in operation compared
to prior generations.

The MMIO offset and bit positions have changed. Also, no mapping is
required as the units are already in MHz.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/20251111004552.137984-2-srinivas.pandruvada@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-12 21:14:21 +01:00
Srinivas Pandruvada
af1b80b941 thermal: int340x: processor_thermal: Add Nova Lake processor thermal device
Add PCI IDs for Nova Lake processor thermal device.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/20251111004552.137984-1-srinivas.pandruvada@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-12 21:14:21 +01:00
Kaushlendra Kumar
347d92a795 thermal: intel: int340x: Replace sprintf() with sysfs_emit()
Replace sprintf() calls with sysfs_emit() in sysfs "show" functions to
follow current kernel coding standards.

sysfs_emit() is the preferred method for formatting sysfs output as it
provides better bounds checking and is more secure.
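
A hedged kernel-style sketch of the substitution (hypothetical attribute, not the int340x code):

    #include <linux/device.h>
    #include <linux/sysfs.h>

    static int example_value;   /* hypothetical value exposed through sysfs */

    static ssize_t example_show(struct device *dev,
                                struct device_attribute *attr, char *buf)
    {
        /* was: return sprintf(buf, "%d\n", example_value); */
        return sysfs_emit(buf, "%d\n", example_value);  /* bounded to PAGE_SIZE */
    }
    static DEVICE_ATTR_RO(example);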

Signed-off-by: Kaushlendra Kumar <kaushlendra.kumar@intel.com>
[ rjw: Subject adjustments, changelog edits ]
Link: https://patch.msgid.link/20251030053410.311656-1-kaushlendra.kumar@intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-12 21:13:13 +01:00
Kaushlendra Kumar
9019907816 thermal: intel: int340x: Use symbolic constant for UUID comparison
Replace sizeof() with a symbolic constant for UUID matching to maintain
existing ABI behavior while improving code clarity. The current behavior
of comparing only the first 7 characters is sufficient to distinguish
all UUIDs and changing to full string comparison would alter the kernel
ABI, potentially breaking existing userspace applications.

Use a defined constant to make the truncated comparison explicit and
maintainable.
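
A hedged sketch of the idea (names are illustrative, not taken from the driver): keep the truncated comparison, but make its length an explicit named constant rather than a sizeof() expression.

    #include <string.h>

    /* comparing 7 characters is enough to tell the supported UUIDs apart */
    #define EXAMPLE_UUID_PREFIX_LEN 7

    static int example_uuid_match(const char *a, const char *b)
    {
        return !strncmp(a, b, EXAMPLE_UUID_PREFIX_LEN);
    }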

Signed-off-by: Kaushlendra Kumar <kaushlendra.kumar@intel.com>
[ rjw: Subject adjustments ]
Link: https://patch.msgid.link/20251030035955.62171-1-kaushlendra.kumar@intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-12 21:13:13 +01:00
Christian Loehle
0796ddf4a7 cpuidle: teo: Use this_cpu_ptr() where possible
The cpuidle governor callbacks for update, select and reflect
are always running on the actual idle entering/exiting CPU, so
use the more optimized this_cpu_ptr() to access the internal teo
data.

This brings down the latency-critical teo_reflect() from
static void teo_reflect(struct cpuidle_device *dev, int state)
{
ffffffc080ffcff0:	hint	#0x19
ffffffc080ffcff4:	stp	x29, x30, [sp, #-48]!
	struct teo_cpu *cpu_data = per_cpu_ptr(&teo_cpus, dev->cpu);
ffffffc080ffcff8:	adrp	x2, ffffffc0848c0000 <gicv5_global_data+0x28>
{
ffffffc080ffcffc:	add	x29, sp, #0x0
ffffffc080ffd000:	stp	x19, x20, [sp, #16]
ffffffc080ffd004:	orr	x20, xzr, x0
	struct teo_cpu *cpu_data = per_cpu_ptr(&teo_cpus, dev->cpu);
ffffffc080ffd008:	add	x0, x2, #0xc20
{
ffffffc080ffd00c:	stp	x21, x22, [sp, #32]
	struct teo_cpu *cpu_data = per_cpu_ptr(&teo_cpus, dev->cpu);
ffffffc080ffd010:	adrp	x19, ffffffc083eb5000 <cpu_devices+0x78>
ffffffc080ffd014:	add	x19, x19, #0xbb0
ffffffc080ffd018:	ldr	w3, [x20, #4]

	dev->last_state_idx = state;

to

static void teo_reflect(struct cpuidle_device *dev, int state)
{
ffffffc080ffd034:	hint	#0x19
ffffffc080ffd038:	stp	x29, x30, [sp, #-48]!
ffffffc080ffd03c:	add	x29, sp, #0x0
ffffffc080ffd040:	stp	x19, x20, [sp, #16]
ffffffc080ffd044:	orr	x20, xzr, x0
	struct teo_cpu *cpu_data = this_cpu_ptr(&teo_cpus);
ffffffc080ffd048:	adrp	x19, ffffffc083eb5000 <cpu_devices+0x78>
{
ffffffc080ffd04c:	stp	x21, x22, [sp, #32]
	struct teo_cpu *cpu_data = this_cpu_ptr(&teo_cpus);
ffffffc080ffd050:	add	x19, x19, #0xbb0

	dev->last_state_idx = state;

This saves us:
	adrp    x2, ffffffc0848c0000 <gicv5_global_data+0x28>
	add     x0, x2, #0xc20
	ldr     w3, [x20, #4]

Signed-off-by: Christian Loehle <christian.loehle@arm.com>
[ rjw: Subject tweak ]
Link: https://patch.msgid.link/20251110120819.714560-1-christian.loehle@arm.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-12 21:02:07 +01:00
Rafael J. Wysocki
76934e495c cpuidle: Add sanity check for exit latency and target residency
Make __cpuidle_driver_init() fail if the exit latency of one of the
driver's idle states is less than its target residency which would
break cpuidle assumptions.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Reviewed-by: Christian Loehle <christian.loehle@arm.com>
[ rjw: Changelog fix ]
Link: https://patch.msgid.link/12779486.O9o76ZdvQC@rafael.j.wysocki
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-12 21:00:00 +01:00
Rafael J. Wysocki
9cf02802d6 PM: wakeup: Update after recent wakeup source removal ordering change
After a recent change, wakeup_source_activate() will warn that the given
wakeup source is "unregistered" after its timer has been shut down
in wakeup_source_remove() which may be somewhat confusing, so change
the warning message to say that the wakeup source is "unusable".

Accordingly, rename wakeup_source_not_registered() to
wakeup_source_not_usable() and update the comment in it
to also mention the removal of the wakeup source.

Also restore the comment in wakeup_source_remove() regarding the warning
in wakeup_source_activate() that may trigger after shutting down the
wakeup source timer.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/12788103.O9o76ZdvQC@rafael.j.wysocki
2025-11-12 20:56:25 +01:00
Rafael J. Wysocki
62c95ea763 cpufreq: intel_pstate: Use mutex guard for driver locking
Use guard(mutex)(&intel_pstate_driver_lock), or the scoped variant of
it, wherever intel_pstate_driver_lock needs to be held.

This allows some local variables and goto statements to be dropped as
they are not necessary any more.
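
A hedged sketch of the pattern (hypothetical function, not the intel_pstate code): the scope-based guard drops the mutex automatically on every return path, which is what makes the extra locals and goto labels unnecessary.

    #include <linux/cleanup.h>
    #include <linux/errno.h>
    #include <linux/mutex.h>

    static DEFINE_MUTEX(example_driver_lock);   /* hypothetical lock */

    static int example_update(int new_value)
    {
        guard(mutex)(&example_driver_lock);

        if (new_value < 0)
            return -EINVAL;     /* mutex released automatically here */

        /* ... update state under the lock ... */
        return 0;               /* ... and here */
    }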

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Link: https://patch.msgid.link/2807232.mvXUDI8C0e@rafael.j.wysocki
2025-11-12 20:54:57 +01:00
Rafael J. Wysocki
25ca66300a Merge tag 'amd-pstate-v6.19-2025-11-10' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux
Pull amd-pstate content for 6.19 (11/10/25) from Mario Limonciello:

"* optimizations for parameter array handling
 * fix for mode changes with offline CPUs"

* tag 'amd-pstate-v6.19-2025-11-10' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux:
  cpufreq/amd-pstate: Call cppc_set_auto_sel() only for online CPUs
  cpufreq/amd-pstate: Add static asserts for EPP indices
  cpufreq/amd-pstate: Fix some whitespace issues
  cpufreq/amd-pstate: Adjust return values in amd_pstate_update_status()
  cpufreq/amd-pstate: Make amd_pstate_get_mode_string() never return NULL
  cpufreq/amd-pstate: Drop NULL value from amd_pstate_mode_string
  cpufreq/amd-pstate: Use sysfs_match_string() for epp
2025-11-12 20:51:53 +01:00
Rafael J. Wysocki
377e38859c Merge back cpufreq material for 6.19 2025-11-12 20:50:58 +01:00
Eric Biggers
5dc8d27752 Merge tag 'arm64-fpsimd-on-stack-for-v6.19' into libcrypto-fpsimd-on-stack
Pull fpsimd-on-stack changes from Ard Biesheuvel:

  "Shared tag/branch for arm64 FP/SIMD changes going through libcrypto"

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-12 10:15:07 -08:00
Ard Biesheuvel
8dcac98a47 lib/crypto: arm64: Move remaining algorithms to scoped ksimd API
Move the arm64 implementations of SHA-3 and POLYVAL to the newly
introduced scoped ksimd API, which replaces kernel_neon_begin() and
kernel_neon_end(). On arm64, this is needed because the latter API
will change in an incompatible manner.
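
A hedged sketch of the shape of such a conversion (hypothetical helper; the header providing scoped_ksimd() is an assumption): the scoped form replaces the explicit begin/end pair and confines kernel-mode SIMD use to the guarded statement.

    #include <asm/neon.h>
    #include <asm/simd.h>   /* assumed location of scoped_ksimd() */

    void example_transform_neon(void *dst, const void *src, int blocks);

    static void example_transform(void *dst, const void *src, int blocks)
    {
        /* was:
         *    kernel_neon_begin();
         *    example_transform_neon(dst, src, blocks);
         *    kernel_neon_end();
         */
        scoped_ksimd()
            example_transform_neon(dst, src, blocks);
    }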

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-12 10:14:11 -08:00
Ard Biesheuvel
c0d597e016 lib/crypto: arm/blake2b: Move to scoped ksimd API
Even though ARM's versions of kernel_neon_begin()/_end() are not being
changed, update the newly migrated ARM blake2b code to the scoped ksimd API
so that all ARM and arm64 code in lib/crypto remains consistent in this
respect.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-12 09:57:52 -08:00
Eric Biggers
065f040010 Merge tag 'scoped-ksimd-for-arm-arm64' into libcrypto-fpsimd-on-stack
Pull scoped ksimd API for ARM and arm64 from Ard Biesheuvel:

  "Introduce a more strict replacement API for
   kernel_neon_begin()/kernel_neon_end() on both ARM and arm64, and
   replace occurrences of the latter pair appearing in lib/crypto"

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-12 09:55:55 -08:00
Ard Biesheuvel
4fa617cc68 arm64/fpsimd: Allocate kernel mode FP/SIMD buffers on the stack
Commit aefbab8e77

  ("arm64: fpsimd: Preserve/restore kernel mode NEON at context switch")

added a 'kernel_fpsimd_state' field to struct thread_struct, which is
the arch-specific portion of struct task_struct, and is allocated for
each task in the system. The size of this field is 528 bytes, resulting
in non-negligible bloat of task_struct, and the associated memory
overhead may impact performance on systems with many processes.

This allocation is only used if the task is scheduled out or interrupted
by a softirq while using the FP/SIMD unit in kernel mode, and so it is
possible to transparently allocate this buffer on the caller's stack
instead.

So tweak the 'ksimd' scoped guard implementation so that a stack buffer
is allocated and passed to both kernel_neon_begin() and
kernel_neon_end(), and either record it in the task struct, or use it
directly to preserve the task mode kernel FP/SIMD when running in
softirq context. Passing the address to both functions and checking the
addresses for consistency ensures that callers of the updated bare
begin/end API use it in a manner that is consistent with the new context
switch semantics.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-11-12 09:52:03 +01:00
Ard Biesheuvel
103728a716 arm64/fpu: Enforce task-context only for generic kernel mode FPU
The generic kernel mode FPU API, which is used by the AMDGPU driver to
perform floating point calculations, is modeled after the most
restrictive architecture that supports it. This means it doesn't support
preemption, and can only be used from task context.

The arm64 implementation is a bit more flexible, but supporting that in
the generic API complicates matters slightly, and for no good reason,
given that the only user does not need it.

So enforce that kernel_fpu_begin() can only be called from task context,
and [redundantly] disable preemption. This removes the need for users of
this API to provide a kernel mode FP/SIMD state after a future patch
that makes that compulsory for preemptible task context.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-11-12 09:52:03 +01:00
Ard Biesheuvel
9dc106fa1e net/mlx5: Switch to more abstract scoped ksimd guard API on arm64
Instead of calling kernel_neon_begin/end directly, switch to the scoped
guard API which encapsulates those calls. This is needed because the
prototypes of those APIs are going to be modified and will require a
kernel mode FP/SIMD buffer to be provided, which the scoped guard API
will do transparently.

Cc: Saeed Mahameed <saeedm@nvidia.com>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Tariq Toukan <tariqt@nvidia.com>
Cc: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-11-12 09:52:02 +01:00
Ard Biesheuvel
ab5718f06b arm64/xorblocks: Switch to 'ksimd' scoped guard API
Switch to the more abstract 'scoped_ksimd()' API, which will be modified
in a future patch to transparently allocate a kernel mode FP/SIMD state
buffer on the stack, so that kernel mode FP/SIMD code remains
preemptible in principle, but without the memory overhead that adds 528
bytes to the size of struct task_struct.

Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-11-12 09:52:02 +01:00
Ard Biesheuvel
03bc4768fb crypto/arm64: sm4 - Switch to 'ksimd' scoped guard API
Switch to the more abstract 'scoped_ksimd()' API, which will be modified
in a future patch to transparently allocate a kernel mode FP/SIMD state
buffer on the stack, so that kernel mode FP/SIMD code remains
preemptible in principle, but without the memory overhead that adds 528
bytes to the size of struct task_struct.

Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-11-12 09:52:02 +01:00
Ard Biesheuvel
ab9615b501 crypto/arm64: sm3 - Switch to 'ksimd' scoped guard API
Switch to the more abstract 'scoped_ksimd()' API, which will be modified
in a future patch to transparently allocate a kernel mode FP/SIMD state
buffer on the stack, so that kernel mode FP/SIMD code remains
preemptible in principle, but without the memory overhead that adds 528
bytes to the size of struct task_struct.

Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-11-12 09:52:02 +01:00
Ard Biesheuvel
a6b4084455 crypto/arm64: sha3 - Switch to 'ksimd' scoped guard API
Switch to the more abstract 'scoped_ksimd()' API, which will be modified
in a future patch to transparently allocate a kernel mode FP/SIMD state
buffer on the stack, so that kernel mode FP/SIMD code remains
preemptible in principle, but without the memory overhead that adds 528
bytes to the size of struct task_struct.

Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-11-12 09:52:02 +01:00
Ard Biesheuvel
931ceb5785 crypto/arm64: polyval - Switch to 'ksimd' scoped guard API
Switch to the more abstract 'scoped_ksimd()' API, which will be modified
in a future patch to transparently allocate a kernel mode FP/SIMD state
buffer on the stack, so that kernel mode FP/SIMD code remains
preemptible in principle, but without the memory overhead that adds 528
bytes to the size of struct task_struct.

Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-11-12 09:52:02 +01:00
Ard Biesheuvel
72cb51233b crypto/arm64: nhpoly1305 - Switch to 'ksimd' scoped guard API
Switch to the more abstract 'scoped_ksimd()' API, which will be modified
in a future patch to transparently allocate a kernel mode FP/SIMD state
buffer on the stack, so that kernel mode FP/SIMD code remains
preemptible in principle, but without the memory overhead that adds 528
bytes to the size of struct task_struct.

Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-11-12 09:52:02 +01:00
Ard Biesheuvel
87c9b04e71 crypto/arm64: aes-gcm - Switch to 'ksimd' scoped guard API
Switch to the more abstract 'scoped_ksimd()' API, which will be modified
in a future patch to transparently allocate a kernel mode FP/SIMD state
buffer on the stack, so that kernel mode FP/SIMD code remains
preemptible in principle, but without the memory overhead that adds 528
bytes to the size of struct task_struct.

Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-11-12 09:52:02 +01:00
Ard Biesheuvel
ba3c1b3b5a crypto/arm64: aes-blk - Switch to 'ksimd' scoped guard API
Switch to the more abstract 'scoped_ksimd()' API, which will be modified
in a future patch to transparently allocate a kernel mode FP/SIMD state
buffer on the stack, so that kernel mode FP/SIMD code remains
preemptible in principle, but without the memory overhead that adds 528
bytes to the size of struct task_struct.

Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-11-12 09:52:01 +01:00
Ard Biesheuvel
b044c7e4c7 crypto/arm64: aes-ccm - Switch to 'ksimd' scoped guard API
Switch to the more abstract 'scoped_ksimd()' API, which will be modified
in a future patch to transparently allocate a kernel mode FP/SIMD state
buffer on the stack, so that kernel mode FP/SIMD code remains
preemptible in principle, but without the memory overhead that adds 528
bytes to the size of struct task_struct.

Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-11-12 09:52:01 +01:00
Ard Biesheuvel
3142ec4af2 raid6: Move to more abstract 'ksimd' guard API
Move away from calling kernel_neon_begin() and kernel_neon_end()
directly, and instead, use the newly introduced scoped_ksimd() API. This
permits arm64 to modify the kernel mode NEON API without affecting code
that is shared between ARM and arm64.

Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-11-12 09:52:01 +01:00
Ard Biesheuvel
88a7999e80 crypto: aegis128-neon - Move to more abstract 'ksimd' guard API
Move away from calling kernel_neon_begin() and kernel_neon_end()
directly, and instead, use the newly introduced scoped_ksimd() API. This
permits arm64 to modify the kernel mode NEON API without affecting code
that is shared between ARM and arm64.

Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-11-12 09:52:01 +01:00
Ard Biesheuvel
c13aebfeee crypto/arm64: sm4-ce-gcm - Avoid pointless yield of the NEON unit
Kernel mode NEON sections are now preemptible on arm64, and so there is
no need to yield it when calling APIs that may sleep.

Also, move the calls to kernel_neon_end() to the same scope as
kernel_neon_begin(). This is needed for a subsequent change where a
stack buffer is allocated transparently and passed to
kernel_neon_begin().

While at it, simplify the logic.

Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-11-12 09:52:01 +01:00
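A hedged sketch of the restructuring these "avoid pointless yield" entries describe; do_simd_rounds() and tail are placeholders, and skcipher_walk_done() stands in for whichever possibly-sleeping API the driver calls:

    /* Before: the NEON unit was yielded around the sleeping call */
    kernel_neon_begin();
    do_simd_rounds(&ctx, walk.src.virt.addr, walk.dst.virt.addr, nblocks);
    kernel_neon_end();
    err = skcipher_walk_done(&walk, tail);

    /* After: one begin/end pair in a single scope; preemptible kernel
     * mode NEON makes the intermediate yield unnecessary */
    kernel_neon_begin();
    do_simd_rounds(&ctx, walk.src.virt.addr, walk.dst.virt.addr, nblocks);
    err = skcipher_walk_done(&walk, tail);
    kernel_neon_end();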
Ard Biesheuvel
9520ef3771 crypto/arm64: sm4-ce-ccm - Avoid pointless yield of the NEON unit
Kernel mode NEON sections are now preemptible on arm64, and so there is
no need to yield it when calling APIs that may sleep.

Also, move the calls to kernel_neon_end() to the same scope as
kernel_neon_begin(). This is needed for a subsequent change where a
stack buffer is allocated transparently and passed to
kernel_neon_begin().

Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-11-12 09:52:01 +01:00
Ard Biesheuvel
e9426f3e6b crypto/arm64: aes-ce-ccm - Avoid pointless yield of the NEON unit
Kernel mode NEON sections are now preemptible on arm64, and so there is
no need to yield it explicitly in order to prevent scheduling latency
spikes.

Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-11-12 09:52:01 +01:00
Ard Biesheuvel
4fb623074e lib/crc: Switch ARM and arm64 to 'ksimd' scoped guard API
Before modifying the prototypes of kernel_neon_begin() and
kernel_neon_end() to accommodate kernel mode FP/SIMD state buffers
allocated on the stack, move arm64 to the new 'ksimd' scoped guard API,
which encapsulates the calls to those functions.

For symmetry, do the same for 32-bit ARM too.

Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-11-12 09:52:01 +01:00
Ard Biesheuvel
f53d18a4e6 lib/crypto: Switch ARM and arm64 to 'ksimd' scoped guard API
Before modifying the prototypes of kernel_neon_begin() and
kernel_neon_end() to accommodate kernel mode FP/SIMD state buffers
allocated on the stack, move arm64 to the new 'ksimd' scoped guard API,
which encapsulates the calls to those functions.

For symmetry, do the same for 32-bit ARM too.

Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-11-12 09:51:13 +01:00
Ard Biesheuvel
814f5415d3 ARM/simd: Add scoped guard API for kernel mode SIMD
Implement the ksimd scoped guard API so that it can be used by code that
supports both ARM and arm64.

Reviewed-by: Kees Cook <kees@kernel.org>
Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-11-12 09:22:39 +01:00
Ard Biesheuvel
c5b91a17cc arm64/simd: Add scoped guard API for kernel mode SIMD
Encapsulate kernel_neon_begin() and kernel_neon_end() using a 'ksimd'
cleanup guard. This hides the prototype of those functions, allowing
them to be changed for arm64 but not ARM, without breaking code that is
shared between those architectures (RAID6, AEGIS-128).

It probably makes sense to expose this API more widely across
architectures, as it affords more flexibility to the arch code to
plumb it in, while imposing more rigid rules regarding the start/end
bookends appearing in matched pairs.

Reviewed-by: Kees Cook <kees@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-11-12 09:22:39 +01:00
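A minimal sketch of the conversion the 'ksimd' guard enables, assuming only what the entries above state (scoped_ksimd() encapsulates kernel_neon_begin()/kernel_neon_end()); do_neon_work() and its arguments are placeholders:

    /* Before: explicit begin/end bookends around the SIMD region */
    kernel_neon_begin();
    do_neon_work(dst, src, len);
    kernel_neon_end();

    /* After: the scoped guard emits the bookends itself and, once the
     * later on-stack buffer patches land, also supplies the kernel mode
     * FP/SIMD state buffer transparently */
    scoped_ksimd() {
        do_neon_work(dst, src, len);
    }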
Eric Biggers
578fe3ff3d crypto: testmgr - Remove polyval tests
These are no longer used, since polyval support has been removed from
the crypto_shash API.

POLYVAL remains supported via lib/crypto/, where it has a KUnit test
suite instead.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251109234726.638437-9-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-11 11:07:52 -08:00
Eric Biggers
b3aed551b3 lib/crypto: tests: Add KUnit tests for POLYVAL
Add a test suite for the POLYVAL library, including:

- All the standard tests and the benchmark from hash-test-template.h
- Comparison with a test vector from the RFC
- Test with key and message containing all one bits
- Additional tests related to the key struct

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251109234726.638437-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-11 11:07:52 -08:00
Eric Biggers
b2210f3516 lib/crypto: tests: Add additional SHAKE tests
Add the following test cases to cover gaps in the SHAKE testing:

    - test_shake_all_lens_up_to_4096()
    - test_shake_multiple_squeezes()
    - test_shake_with_guarded_bufs()

Remove test_shake256_tiling() and test_shake256_tiling2() since they are
superseded by test_shake_multiple_squeezes().  It provides better test
coverage by using randomized testing.  E.g., it's able to generate a
zero-length squeeze followed by a nonzero-length squeeze, which the
first 7 versions of the SHA-3 patchset handled incorrectly.

Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251026055032.1413733-7-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-11 11:07:36 -08:00
David Howells
15c64c47e4 lib/crypto: tests: Add SHA3 kunit tests
Add a SHA3 kunit test suite, providing the following:

 (*) A simple test of each of SHA3-224, SHA3-256, SHA3-384, SHA3-512,
     SHAKE128 and SHAKE256.

 (*) NIST 0- and 1600-bit test vectors for SHAKE128 and SHAKE256.

 (*) Output tiling (multiple squeezing) tests for SHAKE256.

 (*) Standard hash template test for SHA3-256.  To make this possible,
     gen-hash-testvecs.py is modified to support sha3-256.

 (*) Standard benchmark test for SHA3-256.

[EB: dropped some unnecessary changes to gen-hash-testvecs.py, moved
     addition of Testing section in doc file into this commit, and
     other small cleanups]

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Link: https://lore.kernel.org/r/20251026055032.1413733-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-11 11:07:36 -08:00
Eric Biggers
6401fd334d lib/crypto: tests: Add KUnit tests for BLAKE2b
Add a KUnit test suite for the BLAKE2b library API, mirroring the
BLAKE2s test suite very closely.

As with the BLAKE2s test suite, a benchmark is included.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251018043106.375964-9-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-11 11:07:36 -08:00
Eric Biggers
2dbb6f4a25 fscrypt: Drop obsolete recommendation to enable optimized POLYVAL
CONFIG_CRYPTO_POLYVAL_ARM64_CE and CONFIG_CRYPTO_POLYVAL_CLMUL_NI no
longer exist.  The architecture-optimized POLYVAL code is now just
enabled automatically when HCTR2 support is enabled.  Update the fscrypt
documentation accordingly.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251109234726.638437-10-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-11 11:03:39 -08:00
Eric Biggers
fd36de5749 crypto: polyval - Remove the polyval crypto_shash
Remove polyval support from crypto_shash.  It no longer has any user now
that the HCTR2 code uses the POLYVAL library instead.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251109234726.638437-8-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-11 11:03:38 -08:00
Eric Biggers
d35abc0b1d crypto: hctr2 - Convert to use POLYVAL library
The "hash function" in hctr2 is fixed at POLYVAL; it can never vary.
Just use the POLYVAL library, which is much easier to use than the
crypto_shash API.  It's faster, uses fixed-size structs, and never fails
(all the functions return void).

Note that this eliminates the only known user of the polyval support in
crypto_shash.  A later commit will remove support for polyval from
crypto_shash, given that the library API is sufficient.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251109234726.638437-7-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-11 11:03:38 -08:00
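A hedged sketch of the kind of init/update/final flow such a conversion implies. Only the <crypto/polyval.h> header, polyval_init(), and polyval_update() appear in the log; the struct names, polyval_preparekey(), polyval_final(), and POLYVAL_DIGEST_SIZE are assumed here for illustration:

    #include <crypto/polyval.h>

    /* fixed-size structs; the library calls return void and never fail */
    struct polyval_key key;              /* assumed type name */
    struct polyval_ctx ctx;              /* assumed type name */
    u8 out[POLYVAL_DIGEST_SIZE];         /* assumed constant name */

    polyval_preparekey(&key, raw_key);   /* assumed function name */
    polyval_init(&ctx, &key);
    polyval_update(&ctx, data, data_len);
    polyval_final(&ctx, out);            /* assumed function name */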
Eric Biggers
4d8da35579 lib/crypto: x86/polyval: Migrate optimized code into library
Migrate the x86_64 implementation of POLYVAL into lib/crypto/, wiring it
up to the POLYVAL library interface.  This makes the POLYVAL library be
properly optimized on x86_64.

This drops the x86_64 optimizations of polyval in the crypto_shash API.
That's fine, since polyval will be removed from crypto_shash entirely
since it is unneeded there.  But even if it comes back, the crypto_shash
API could just be implemented on top of the library API, as usual.

Adjust the names and prototypes of the assembly functions to align more
closely with the rest of the library code.

Also replace a movaps instruction with movups to remove the assumption
that the key struct is 16-byte aligned.  Users can still align the key
if they want (and at least in this case, movups is just as fast as
movaps), but it's inconvenient to require it.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251109234726.638437-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-11 11:03:38 -08:00
Eric Biggers
37919e239e lib/crypto: arm64/polyval: Migrate optimized code into library
Migrate the arm64 implementation of POLYVAL into lib/crypto/, wiring it
up to the POLYVAL library interface.  This makes the POLYVAL library be
properly optimized on arm64.

This drops the arm64 optimizations of polyval in the crypto_shash API.
That's fine, since polyval will be removed from crypto_shash entirely
since it is unneeded there.  But even if it comes back, the crypto_shash
API could just be implemented on top of the library API, as usual.

Adjust the names and prototypes of the assembly functions to align more
closely with the rest of the library code.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251109234726.638437-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-11 11:03:38 -08:00
Eric Biggers
3d176751e5 lib/crypto: polyval: Add POLYVAL library
Add support for POLYVAL to lib/crypto/.

This will replace the polyval crypto_shash algorithm and its use in the
hctr2 template, simplifying the code and reducing overhead.

Specifically, this commit introduces the POLYVAL library API and a
generic implementation of it.  Later commits will migrate the existing
architecture-optimized implementations of POLYVAL into lib/crypto/ and
add a KUnit test suite.

I've also rewritten the generic implementation completely, using a more
modern approach instead of the traditional table-based approach.  It's
now constant-time, requires no precomputation or dynamic memory
allocations, decreases the per-key memory usage from 4096 bytes to 16
bytes, and is faster than the old polyval-generic even on bulk data
reusing the same key (at least on x86_64, where I measured 15% faster).
We should do this for GHASH too, but for now just do it for POLYVAL.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251109234726.638437-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-11 11:03:38 -08:00
Eric Biggers
e1c3608497 crypto: polyval - Rename conflicting functions
Rename polyval_init() and polyval_update(), in preparation for adding
library functions with the same name to <crypto/polyval.h>.

Note that polyval-generic.c will be removed later, as it will be
superseded by the library.  This commit just keeps the kernel building
for the initial introduction of the library.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251109234726.638437-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-11 11:03:38 -08:00
Gautham R. Shenoy
bb31fef0d0 cpufreq/amd-pstate: Call cppc_set_auto_sel() only for online CPUs
amd_pstate_change_mode_without_dvr_change() calls cppc_set_auto_sel()
for all the present CPUs.

However, this callpath eventually calls cppc_set_reg_val() which
accesses the per-cpu cpc_desc_ptr object. This object is initialized
only for online CPUs via acpi_soft_cpu_online() -->
__acpi_processor_start() --> acpi_cppc_processor_probe().

Hence, restrict calling cppc_set_auto_sel() to only the online CPUs.

Fixes: 3ca7bc818d ("cpufreq: amd-pstate: Add guided mode control support via sysfs")
Suggested-by: Mario Limonciello (AMD) (kernel.org) <superm1@kernel.org>
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2025-11-10 23:35:20 -06:00
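A hedged sketch of the restriction described above; the loop shape is illustrative and the real call site in amd_pstate_change_mode_without_dvr_change() may differ:

    int cpu;

    /* cpc_desc_ptr is only set up for online CPUs, so skip the rest */
    for_each_online_cpu(cpu)
        cppc_set_auto_sel(cpu, enable);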
Mario Limonciello (AMD)
077f23573d cpufreq/amd-pstate: Add static asserts for EPP indices
In case a new index is introduced, add a static assert to make sure
that the strings and values are updated.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2025-11-10 23:35:20 -06:00
Mario Limonciello (AMD)
e9d62ca86a cpufreq/amd-pstate: Fix some whitespace issues
Add whitespace around the equals and remove leading space.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2025-11-10 23:35:20 -06:00
Mario Limonciello (AMD)
92d6146a40 cpufreq/amd-pstate: Adjust return values in amd_pstate_update_status()
get_mode_idx_from_str() already checks the upper boundary for the string
passed in.  Drop the extra check in amd_pstate_update_status() and pass
on the return code if there is a failure.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2025-11-10 23:35:20 -06:00
Mario Limonciello (AMD)
baf106f3a7 cpufreq/amd-pstate: Make amd_pstate_get_mode_string() never return NULL
amd_pstate_get_mode_string() is only used by amd-pstate-ut.  Set the
failure path to use AMD_PSTATE_UNDEFINED ("undefined") to avoid showing
"(null)" as a string when running the test suite.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2025-11-10 23:35:20 -06:00
Mario Limonciello (AMD)
06791bc017 cpufreq/amd-pstate: Drop NULL value from amd_pstate_mode_string
None of the users actually look for the NULL value.  To avoid the risk
of a regression when a new value is introduced but a corresponding string
is forgotten, add a static assert to verify that AMD_PSTATE_MAX matches
the array size.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2025-11-10 23:35:20 -06:00
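An illustrative form of the assertion described in the two entries above, tying the string table to the enum it mirrors:

    /* fails the build if a new mode is added without a matching string */
    static_assert(ARRAY_SIZE(amd_pstate_mode_string) == AMD_PSTATE_MAX);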
Mario Limonciello (AMD)
7e17f48667 cpufreq/amd-pstate: Use sysfs_match_string() for epp
Rather than scanning the buffer and manually matching the string
use the sysfs macros.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2025-11-10 23:35:20 -06:00
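A short sketch of the replacement; energy_perf_strings is assumed to be the driver's existing EPP string table:

    /* returns the matching index or a negative errno, replacing the
     * hand-rolled buffer scan */
    ret = sysfs_match_string(energy_perf_strings, buf);
    if (ret < 0)
        return ret;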
Thomas Weißschuh
2db833312d random: drop check for static_key_initialized
Commit e871abcda3 ("random: handle creditable entropy from atomic
process context") added the use of workqueues, which meant testing
whether the workqueue is valid, but it did not remove the existing check
of whether static keys have been initialized. This static key check is
unnecessary because workqueues are initialized long after it. And
semantically it doesn't make much sense either, because it's not really
directly calling a static key function in the condition.

Remove the now unnecessary check.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
[Jason: rewrite commit message with different explanation, rebase on
        random.git, and update code comment.]
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2025-11-11 01:25:31 +01:00
Marek Vasut
b1c4c05bb0 thermal/drivers/rcar_gen3: Document R-Car Gen4 and RZ/G2 support in driver comment
The R-Car Gen3 thermal driver supports both R-Car Gen3 and Gen4 SoCs
as well as RZ/G2. Update the driver comment. No functional change.

Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org>
Link: https://patch.msgid.link/20251110143029.10940-1-marek.vasut+renesas@mailbox.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2025-11-10 15:56:49 +01:00
Manaf Meethalavalappu Pallikunhi
e1304efc19 dt-bindings: thermal: qcom-tsens: document the Kaanapali Temperature Sensor
Document the Temperature Sensor (TSENS) on the Kaanapali Platform.

Signed-off-by: Manaf Meethalavalappu Pallikunhi <manaf.pallikunhi@oss.qualcomm.com>
Signed-off-by: Jingyi Wang <jingyi.wang@oss.qualcomm.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://patch.msgid.link/20251021-b4-knp-tsens-v2-1-7b662e2e71b4@oss.qualcomm.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2025-11-10 13:01:40 +01:00
Ovidiu Panait
30183a67a8 dt-bindings: thermal: r9a09g047-tsu: Document RZ/V2H TSU
The Renesas RZ/V2H SoC includes a Thermal Sensor Unit (TSU) block designed
to measure the junction temperature. The device provides real-time
temperature measurements for thermal management, utilizing two dedicated
channels for temperature sensing.

The Renesas RZ/V2H SoC uses the same TSU IP found on the RZ/G3E SoC,
the only difference being that it has two channels instead of one.

Add new compatible string "renesas,r9a09g057-tsu" for RZ/V2H and use
"renesas,r9a09g047-tsu" as a fallback compatible to indicate hardware
compatibility with the RZ/G3E implementation.

Signed-off-by: Ovidiu Panait <ovidiu.panait.rb@renesas.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://patch.msgid.link/20251020143107.13974-3-ovidiu.panait.rb@renesas.com
2025-11-10 12:56:17 +01:00
Marco Crivellari
47c303ba6e cpufreq: tegra194: add WQ_PERCPU to alloc_workqueue users
Currently, if a user enqueues a work item using schedule_delayed_work(), the
wq used is "system_wq" (a per-cpu wq), while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work(), which uses system_wq, and queue_work(), which again uses
WORK_CPU_UNBOUND.
This lack of consistency cannot be addressed without refactoring the API.

alloc_workqueue() treats all queues as per-CPU by default, while unbound
workqueues must opt-in via WQ_UNBOUND.

This default is suboptimal: most workloads benefit from unbound queues,
allowing the scheduler to place worker threads where they’re needed and
reducing noise when CPUs are isolated.

This continues the effort to refactor workqueue APIs, which began with
the introduction of new workqueues and a new alloc_workqueue flag in:

commit 128ea9f6cc ("workqueue: Add system_percpu_wq and system_dfl_wq")
commit 930c2ea566 ("workqueue: Add new WQ_PERCPU flag")

This change adds a new WQ_PERCPU flag to explicitly request
alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified.

With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND
must now use WQ_PERCPU.

Once migration is complete, WQ_UNBOUND can be removed and unbound will
become the implicit default.

Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
[ Viresh: Fixed Subject ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-11-10 16:18:48 +05:30
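A minimal illustration of the flag change this series applies to alloc_workqueue() callers; the queue name and max_active value are placeholders:

    /* Before: per-CPU behaviour was the implicit default */
    wq = alloc_workqueue("tegra194-cpufreq", 0, 1);

    /* After: per-CPU behaviour is requested explicitly, so the default
     * can later flip to unbound without changing this caller */
    wq = alloc_workqueue("tegra194-cpufreq", WQ_PERCPU, 1);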
Christian Marangi
58f5d39d5e cpufreq: qcom-nvmem: add compatible fallback for ipq806x for no SMEM
On some IPQ806x SoCs, SMEM might not be initialized by SBL. This is the
case for some Google devices (the OnHub family) that can't make use of
SMEM to detect the SoC ID (and socinfo can't be used either, as it
depends on SMEM presence).

To handle this specific case, check whether SMEM is uninitialized (by
checking if qcom_smem_get_soc_id() returns -ENODEV) and fall back to
OF machine compatible checking to identify the SoC variant.

Suggested-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-11-10 16:16:52 +05:30
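A hedged sketch of the fallback path described above; the compatible string and SoC ID constant are placeholders, and only qcom_smem_get_soc_id() and the -ENODEV check come from the log:

    u32 soc_id;
    int ret = qcom_smem_get_soc_id(&soc_id);

    if (ret == -ENODEV) {
        /* SMEM not initialized by SBL: fall back to DT matching */
        if (of_machine_is_compatible("qcom,ipq8064"))   /* placeholder */
            soc_id = QCOM_ID_IPQ8064;                   /* placeholder */
    } else if (ret) {
        return ret;
    }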
Kaushlendra Kumar
352899fd91 PM: wakeup: Delete timer before removing wakeup source from list
Replace timer_delete_sync() with timer_shutdown_sync() and move
it before list_del_rcu() in wakeup_source_remove() to improve the
cleanup ordering and code clarity.

This ensures that the timer is stopped before removing the wakeup
source from the events list, providing a more logical cleanup
sequence.

While the current ordering is functionally correct, stopping the
timer first makes the cleanup flow more intuitive and follows the
general pattern of disabling active components before removing data
structures.

Signed-off-by: Kaushlendra Kumar <kaushlendra.kumar@intel.com>
[ rjw: Subject and changelog edits ]
Link: https://patch.msgid.link/20251027044127.2456365-1-kaushlendra.kumar@intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-08 12:17:28 +01:00
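A small ordering sketch of the change described above, assuming the usual wakeup_source field names; ws is a placeholder pointer:

    /* stop (and permanently shut down) the timer first ... */
    timer_shutdown_sync(&ws->timer);
    /* ... then unlink the wakeup source from the events list */
    list_del_rcu(&ws->entry);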
Kaushlendra Kumar
2f58be82fc ACPI: DPTF: Use ACPI_FREE() for ACPI buffer deallocation
Replace kfree() with ACPI_FREE() in pch_fivr_read() to follow ACPICA
memory management conventions.

While functionally equivalent in Linux (ACPI_FREE() is implemented
as kfree()), using ACPI_FREE() maintains consistency with ACPICA
coding standards for deallocating ACPI buffer objects.

Signed-off-by: Kaushlendra Kumar <kaushlendra.kumar@intel.com>
[ rjw: Subject and changelog edits ]
Link: https://patch.msgid.link/20251028051554.2862049-1-kaushlendra.kumar@intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-07 21:24:54 +01:00
Slawomir Rosek
966c9e65ba ACPI: DPTF: Remove int340x thermal scan handler
Using the IS_ENABLED() macro in the int340x_thermal_handler_attach()
forces the kernel to be recompiled when thermal drivers are enabled
or disabled, which is a significant limitation of its modularity.

The IS_ENABLED() macro is particularly problematic for the Android
Generic Kernel Image (GKI) project which uses unified core kernel
while SoC/board support is moved to loadable vendor modules.

The Intel Dynamic Platform and Thermal Framework (DPTF) requires
thermal drivers to be loaded at runtime, thus the ACPI bus scan handler
is not needed and acpi_default_enumeration() may create all platform
devices, regardless of the actual setting of CONFIG_INT340X_THERMAL.

Signed-off-by: Slawomir Rosek <srosek@google.com>
Link: https://patch.msgid.link/20251103162516.2606158-3-srosek@google.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-07 20:45:37 +01:00
Slawomir Rosek
13a96342d5 thermal: intel: Select INT340X_THERMAL from INTEL_SOC_DTS_THERMAL
The IRQ used by the Intel SoC DTS thermal device for critical
overheating notification is listed in _CRS of device INT3401 which
therefore needs to be enumerated for Intel SoC DTS thermal to work.

The enumeration happens by binding the int3401_thermal driver to the
INT3401 platform device. Thus CONFIG_INT340X_THERMAL is in fact
necessary for enumerating it, so checking CONFIG_INTEL_SOC_DTS_THERMAL
in int340x_thermal_handler_attach() is pointless and INT340X_THERMAL
may as well be selected by INTEL_SOC_DTS_THERMAL.

Signed-off-by: Slawomir Rosek <srosek@google.com>
[ rjw: New subject ]
Link: https://patch.msgid.link/20251103162516.2606158-2-srosek@google.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-07 20:45:37 +01:00
Huisong Li
77ca1612b8 ACPI: processor: idle: Drop redundant C-state count checks
acpi_processor_setup_cstates() and acpi_processor_setup_cpuidle_cx()
are called after successfully obtaining power information. Among other
things, these setup functions check the C-state count against zero.

However, that check is already done by acpi_processor_get_power_info_cst(),
which will cause acpi_processor_get_power_info() to fail if it does
not pass, so the checks in the two functions mentioned above are
redundant.

Drop those redundant checks.

No intentional functional impact.

Signed-off-by: Huisong Li <lihuisong@huawei.com>
[ rjw: Subject and changelog rewrite ]
Link: https://patch.msgid.link/20251105093647.3557248-1-lihuisong@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-07 18:47:58 +01:00
Mario Limonciello (AMD)
39ce15a48f Documentation: power: Correct a mistaken configuration option
Somehow CONFIG_PSTORE_FIRMWARE ended up in this document when I intended
it to be CONFIG_CHROMEOS_PSTORE.  Correct the configuration option and
make it clear that not all options are required.

Fixes: b1f02f005a ("Documentation: power: Add document on debugging shutdown hangs")
Reported-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
[ rjw: Fixes: tag ]
Link: https://patch.msgid.link/20251106142524.3841343-1-superm1@kernel.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-07 16:19:14 +01:00
Eric Biggers
8ba60c5914 lib/crypto: x86/blake2s: Use vpternlogd for 3-input XORs
AVX-512 supports 3-input XORs via the vpternlogd (or vpternlogq)
instruction with immediate 0x96.  This approach, vs. the alternative of
two vpxor instructions, is already used in the CRC, AES-GCM, and AES-XTS
code, since it reduces the instruction count and is faster on some CPUs.
Make blake2s_compress_avx512() take advantage of it too.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251102234209.62133-7-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:30:52 -08:00
Eric Biggers
cd5528621a lib/crypto: x86/blake2s: Avoid writing back unchanged 'f' value
Just before returning, blake2s_compress_ssse3() and
blake2s_compress_avx512() store updated values to the 'h', 't', and 'f'
fields of struct blake2s_ctx.  But 'f' is always unchanged (which is
correct; only the C code changes it).  So, there's no need to write to
'f'.  Use 64-bit stores (movq and vmovq) instead of 128-bit stores
(movdqu and vmovdqu) so that only 't' is written.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251102234209.62133-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:30:52 -08:00
Eric Biggers
a7acd77ebd lib/crypto: x86/blake2s: Improve readability
Various cleanups for readability.  No change to the generated code:

- Add some comments
- Add #defines for arguments
- Rename some labels
- Use decimal constants instead of hex where it makes sense.
  (The pshufd immediates intentionally remain as hex.)
- Add blank lines when there's a logical break

The round loop still could use some work, but this is at least a start.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251102234209.62133-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:30:52 -08:00
Eric Biggers
83c1a867c9 lib/crypto: x86/blake2s: Use local labels for data
Following the usual practice, prefix the names of the data labels with
".L" so that the assembler treats them as truly local.  This more
clearly expresses the intent and is less error-prone.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251102234209.62133-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:30:52 -08:00
Eric Biggers
c19bdf24cc lib/crypto: x86/blake2s: Drop check for nblocks == 0
Since blake2s_compress() is always passed nblocks != 0, remove the
unnecessary check for nblocks == 0 from blake2s_compress_ssse3().

Note that this makes it consistent with blake2s_compress_avx512() in the
same file as well as the arm32 blake2s_compress().

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251102234209.62133-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:30:52 -08:00
Eric Biggers
2f22115709 lib/crypto: x86/blake2s: Fix 32-bit arg treated as 64-bit
In the C code, the 'inc' argument to the assembly functions
blake2s_compress_ssse3() and blake2s_compress_avx512() is declared with
type u32, matching blake2s_compress().  The assembly code then reads it
from the 64-bit %rcx.  However, the ABI doesn't guarantee zero-extension
to 64 bits, nor do gcc or clang guarantee it.  Therefore, fix these
functions to read this argument from the 32-bit %ecx.

In theory, this bug could have caused the wrong 'inc' value to be used,
causing incorrect BLAKE2s hashes.  In practice, probably not: I've fixed
essentially this same bug in many other assembly files too, but there's
never been a real report of it having caused a problem.  In x86_64, all
writes to 32-bit registers are zero-extended to 64 bits.  That results
in zero-extension in nearly all situations.  I've only been able to
demonstrate a lack of zero-extension with a somewhat contrived example
involving truncation, e.g. when the C code has a u64 variable holding
0x1234567800000040 and passes it as a u32 expecting it to be truncated
to 0x40 (64).  But that's not what the real code does, of course.

Fixes: ed0356eda1 ("crypto: blake2s - x86_64 SIMD implementation")
Cc: stable@vger.kernel.org
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251102234209.62133-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:30:51 -08:00
Eric Biggers
95ce85de0b lib/crypto: arm, arm64: Drop filenames from file comments
Remove self-references to filenames from assembly files in
lib/crypto/arm/ and lib/crypto/arm64/.  This follows the recommended
practice and eliminates an outdated reference to sha2-ce-core.S.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251102014809.170713-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:30:51 -08:00
Eric Biggers
b8b816ec04 lib/crypto: arm/blake2s: Fix some comments
Fix the indices in some comments in blake2s-core.S.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251102021553.176587-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:30:51 -08:00
Eric Biggers
496df7cd64 crypto: s390/sha3 - Remove superseded SHA-3 code
The SHA-3 library now utilizes the same s390 SHA-3 acceleration
capabilities as the arch/s390/crypto/ SHA-3 crypto_shash algorithms.
Moreover, crypto/sha3.c now uses the SHA-3 library.  The result is that
all SHA-3 APIs are now s390-accelerated without any need for the old
SHA-3 code in arch/s390/crypto/.  Remove this superseded code.

Also update the s390 defconfig and debug_defconfig files to enable
CONFIG_CRYPTO_SHA3 instead of CONFIG_CRYPTO_SHA3_256_S390 and
CONFIG_CRYPTO_SHA3_512_S390.  This makes it so that the s390-optimized
SHA-3 continues to be built when either of these defconfigs is used.

Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251026055032.1413733-16-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:30:51 -08:00
Eric Biggers
f1799d1728 crypto: sha3 - Reimplement using library API
Replace sha3_generic.c with a new file sha3.c which implements the SHA-3
crypto_shash algorithms on top of the SHA-3 library API.

Change the driver name suffix from "-generic" to "-lib" to reflect that
these algorithms now just use the (possibly arch-optimized) library.

This closely mirrors crypto/{md5,sha1,sha256,sha512,blake2b}.c.

Implement export_core and import_core, since crypto/hmac.c expects these
to be present.  (Note that there is no security purpose in wrapping
SHA-3 with HMAC.  HMAC was designed for older algorithms that don't
resist length extension attacks.  But since someone could be using
"hmac(sha3-*)" via crypto_shash anyway, keep supporting it for now.)

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Link: https://lore.kernel.org/r/20251026055032.1413733-15-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:30:51 -08:00
Eric Biggers
d280d4d56a crypto: jitterentropy - Use default sha3 implementation
Make jitterentropy use "sha3-256" instead of "sha3-256-generic", as the
ability to explicitly request the generic code is going away.  It's not
worth providing a special generic API just for jitterentropy.  There are
many other solutions available to it, such as doing more iterations or
using a more effective jitter collection method.

Moreover, the status quo is that SHA-3 is quite slow anyway.  Currently
only arm64 and s390 have architecture-optimized SHA-3 code.  I'm not
familiar with the performance of the s390 one, but the arm64 one isn't
actually that much faster than the generic code anyway.

Note that jitterentropy should just use the library API instead of
crypto_shash.  But that belongs in a separate change later.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Link: https://lore.kernel.org/r/20251026055032.1413733-14-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:30:51 -08:00
Eric Biggers
862445d3b9 lib/crypto: s390/sha3: Add optimized one-shot SHA-3 digest functions
Some z/Architecture processors can compute a SHA-3 digest in a single
instruction.  arch/s390/crypto/ already uses this capability to optimize
the SHA-3 crypto_shash algorithms.

Use this capability to implement the sha3_224(), sha3_256(), sha3_384(),
and sha3_512() library functions too.

SHA3-256 benchmark results provided by Harald Freudenberger
(https://lore.kernel.org/r/4188d18bfcc8a64941c5ebd8de10ede2@linux.ibm.com/)
on a z/Architecture machine with "facility 86" (MSA level 12):

    Length (bytes)    Before (MB/s)   After (MB/s)
    ==============    =============   ============
          16                212             225
          64                820             915
         256               1850            3350
        1024               5400            8300
        4096              11200           11300

Note: the original data from Harald was given in the form of a graph for
each length, showing the distribution of throughputs from 500 runs.  I
guesstimated the peak of each one.

Harald also reported that the generic SHA-3 code was at most 259 MB/s
(https://lore.kernel.org/r/c39f6b6c110def0095e5da5becc12085@linux.ibm.com/).
So as expected, the earlier commit that optimized sha3_absorb_blocks()
and sha3_keccakf() is the more important one; it optimized the Keccak
permutation which is the most performance-critical part of SHA-3.
Still, this additional commit does notably improve performance further
on some lengths.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Link: https://lore.kernel.org/r/20251026055032.1413733-13-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:30:41 -08:00
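The one-shot helpers named above lend themselves to a short usage sketch; the exact prototype is an assumption here, taken to mirror the other lib/crypto one-shot digest functions:

    #include <crypto/sha3.h>

    u8 digest[SHA3_256_DIGEST_SIZE];

    /* single call computes the whole digest; on machines with the
     * facility described above this maps to the accelerated path */
    sha3_256(data, data_len, digest);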
Eric Biggers
0354d3c1f1 lib/crypto: sha3: Support arch overrides of one-shot digest functions
Add support for architecture-specific overrides of sha3_224(),
sha3_256(), sha3_384(), and sha3_512().  This will be used to implement
these functions more efficiently on s390 than is possible via the usual
init + update + final flow.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Link: https://lore.kernel.org/r/20251026055032.1413733-12-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:02:35 -08:00
Eric Biggers
04171105d3 lib/crypto: s390/sha3: Add optimized Keccak functions
Implement sha3_absorb_blocks() and sha3_keccakf() using the hardware-
accelerated SHA-3 support in Message-Security-Assist Extension 6.

This accelerates the SHA3-224, SHA3-256, SHA3-384, SHA3-512, and
SHAKE256 library functions.

Note that arch/s390/crypto/ already has SHA-3 code that uses this
extension, but it is exposed only via crypto_shash.  This commit brings
the same acceleration to the SHA-3 library.  The arch/s390/crypto/
version will become redundant and be removed in later changes.

Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251026055032.1413733-11-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:02:35 -08:00
Eric Biggers
1e29a75057 lib/crypto: arm64/sha3: Migrate optimized code into library
Instead of exposing the arm64-optimized SHA-3 code via arm64-specific
crypto_shash algorithms, just implement the sha3_absorb_blocks()
and sha3_keccakf() library functions.  This is much simpler, it makes
the SHA-3 library functions be arm64-optimized, and it fixes the
longstanding issue where the arm64-optimized SHA-3 code was disabled by
default.  SHA-3 still remains available through crypto_shash, but
individual architectures no longer need to handle it.

Note: to see the diff from arch/arm64/crypto/sha3-ce-glue.c to
lib/crypto/arm64/sha3.h, view this commit with 'git show -M10'.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251026055032.1413733-10-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:02:35 -08:00
Eric Biggers
be755eb2b0 crypto: arm64/sha3 - Update sha3_ce_transform() to prepare for library
- Use size_t lengths, to match the library.

- Pass the block size instead of digest size, and add support for the
  block size that SHAKE128 uses.  This allows the code to be used with
  SHAKE128 and SHAKE256, which don't have the concept of a digest size.
  SHAKE256 has the same block size as SHA3-256, but SHAKE128 has a
  unique block size.  Thus, there are now 5 supported block sizes.

Don't bother changing the "glue" code arm64_sha3_update() too much, as
it gets deleted when the SHA-3 code is migrated into lib/crypto/ anyway.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251026055032.1413733-9-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:02:35 -08:00
Eric Biggers
6fa873641c lib/crypto: sha3: Add FIPS cryptographic algorithm self-test
Since the SHA-3 algorithms are FIPS-approved, add the boot-time
self-test which is apparently required.  This closely follows the
corresponding SHA-1, SHA-256, and SHA-512 tests.

Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251026055032.1413733-8-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:02:35 -08:00
David Howells
c0db39e253 lib/crypto: sha3: Move SHA3 Iota step mapping into round function
In crypto/sha3_generic.c, the keccakf() function calls keccakf_round()
to do four of Keccak-f's five step mappings.  However, it does not do
the Iota step mapping - presumably because that is dependent on round
number, whereas Theta, Rho, Pi and Chi are not.

Note that the keccakf_round() function needs to be explicitly
non-inlined on certain architectures as gcc's produced output will (or
used to) use over 1KiB of stack space if inlined.

Now, this code was copied more or less verbatim into lib/crypto/sha3.c,
so that has the same aesthetic issue.  Fix this there by passing the
round number into sha3_keccakf_one_round_generic() and doing the Iota
step mapping there.

crypto/sha3_generic.c is left untouched as that will be converted to use
lib/crypto/sha3.c at some point.

Suggested-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251026055032.1413733-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:02:35 -08:00
David Howells
0593447248 lib/crypto: sha3: Add SHA-3 support
Add SHA-3 support to lib/crypto/.  All six algorithms in the SHA-3
family are supported: four digests (SHA3-224, SHA3-256, SHA3-384, and
SHA3-512) and two extendable-output functions (SHAKE128 and SHAKE256).

The SHAKE algorithms will be required for ML-DSA.

[EB: simplified the API to use fewer types and functions, fixed bug that
     sometimes caused incorrect SHAKE output, cleaned up the
     documentation, dropped an ad-hoc test that was inconsistent with
     the rest of lib/crypto/, and many other cleanups]

Signed-off-by: David Howells <dhowells@redhat.com>
Co-developed-by: Eric Biggers <ebiggers@kernel.org>
Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251026055032.1413733-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:02:32 -08:00
Zuo An
059835bbfa tools/power/cpupower: Support building libcpupower statically
The cpupower Makefile built and installed libcpupower as a shared
library (libcpupower.so) without passing `STATIC=true`, but did not
build a static version of the library even with `STATIC=true`. (Only the
programs were static). Thus, out-of-tree programs using libcpupower
were unable to link statically against the library without having access
to intermediate object files produced during the build.

This fixes that situation by ensuring that libcpupower.a is built and
installed when `STATIC=true` is specified.

Link: https://lore.kernel.org/r/x7geegquiks3zndiavw2arihdc2rk7e2dx3lk7yxkewqii6zpg@tzjijqxyzwmu
Signed-off-by: Zuo An <zuoan.penguin@gmail.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2025-11-05 09:56:01 -07:00
Mario Limonciello (AMD)
b1f02f005a Documentation: power: Add document on debugging shutdown hangs
If the kernel hangs while shutting down, ideally a UART log should
be captured to debug the problem.  However if one isn't available,
users can use the pstore functionality to retrieve logs.

Add a document explaining how this works to make it more accessible
to users.

Tested-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Link: https://patch.msgid.link/20251025004341.2386868-1-superm1@kernel.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-03 20:55:07 +01:00
Bagas Sanjaya
4ab25c9214 Documentation: intel-pstate: Use :ref: directive for internal linking
The intel_pstate docs use the standard reST construct (`Section title`_) for
cross-referencing sections (internal linking), rather than for external
links. Incorrect cross-references are not caught when they are written
in that syntax, however (fortunately, docutils 0.22 raises duplicate
target warnings, which were fixed in cb908f8b0a ("Documentation:
intel_pstate: fix duplicate hyperlink target errors")).

Convert the cross-references to use :ref: directive, which doesn't
exhibit this problem.

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
[ rjw: Changelog tweak ]
Link: https://patch.msgid.link/20251101055614.32270-1-bagasdotme@gmail.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-03 19:20:53 +01:00
Marco Crivellari
2817e6fa84 ACPI: thermal: Add WQ_PERCPU to alloc_workqueue() users
Currently, if a user enqueues a work item using schedule_delayed_work(), the
wq used is "system_wq" (a per-cpu wq), while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work(), which uses system_wq, and queue_work(), which again uses
WORK_CPU_UNBOUND.
This lack of consistency cannot be addressed without refactoring the API.

alloc_workqueue() treats all queues as per-CPU by default, while unbound
workqueues must opt-in via WQ_UNBOUND.

This default is suboptimal: most workloads benefit from unbound queues,
allowing the scheduler to place worker threads where they’re needed and
reducing noise when CPUs are isolated.

This change adds a new WQ_PERCPU flag to explicitly request
alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified.

With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND
must now use WQ_PERCPU.

Once migration is complete, WQ_UNBOUND can be removed and unbound will
become the implicit default.

Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
[ rjw: Subject adjustment ]
Link: https://patch.msgid.link/20251030154739.262582-6-marco.crivellari@suse.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-03 18:45:42 +01:00
Marco Crivellari
ec4291f524 ACPI: OSL: Add WQ_PERCPU to alloc_workqueue() users
Currently, if a user enqueues a work item using schedule_delayed_work(), the
wq used is "system_wq" (a per-cpu wq), while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work(), which uses system_wq, and queue_work(), which again uses
WORK_CPU_UNBOUND.
This lack of consistency cannot be addressed without refactoring the API.

alloc_workqueue() treats all queues as per-CPU by default, while unbound
workqueues must opt-in via WQ_UNBOUND.

This default is suboptimal: most workloads benefit from unbound queues,
allowing the scheduler to place worker threads where they’re needed and
reducing noise when CPUs are isolated.

This change adds a new WQ_PERCPU flag to explicitly request
alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified.

With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND
must now use WQ_PERCPU.

Once migration is complete, WQ_UNBOUND can be removed and unbound will
become the implicit default.

Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
[ rjw: Subject adjustment ]
Link: https://patch.msgid.link/20251030154739.262582-5-marco.crivellari@suse.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-03 18:45:42 +01:00
Marco Crivellari
87c21e2406 ACPI: EC: Add WQ_PERCPU to alloc_workqueue() users
Currently, if a user enqueues a work item using schedule_delayed_work(), the
wq used is "system_wq" (a per-cpu wq), while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work(), which uses system_wq, and queue_work(), which again uses
WORK_CPU_UNBOUND.
This lack of consistency cannot be addressed without refactoring the API.

alloc_workqueue() treats all queues as per-CPU by default, while unbound
workqueues must opt-in via WQ_UNBOUND.

This default is suboptimal: most workloads benefit from unbound queues,
allowing the scheduler to place worker threads where they’re needed and
reducing noise when CPUs are isolated.

This change adds a new WQ_PERCPU flag to explicitly request
alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified.

With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND
must now use WQ_PERCPU.

Once migration is complete, WQ_UNBOUND can be removed and unbound will
become the implicit default.

Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
[ rjw: Subject adjustment ]
Link: https://patch.msgid.link/20251030154739.262582-4-marco.crivellari@suse.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-03 18:45:42 +01:00
Marco Crivellari
6447ece47c ACPI: OSL: replace use of system_wq with system_percpu_wq
Currently, if a user enqueues a work item using schedule_delayed_work(), the
wq used is "system_wq" (a per-cpu wq), while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work(), which uses system_wq, and queue_work(), which again uses
WORK_CPU_UNBOUND.

This lack of consistency cannot be addressed without refactoring the API.

system_wq should be the per-cpu workqueue, yet in this name nothing makes
that clear, so replace system_wq with system_percpu_wq.

The old wq (system_wq) will be kept for a few release cycles.

Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Link: https://patch.msgid.link/20251030154739.262582-3-marco.crivellari@suse.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-03 18:45:42 +01:00
Marco Crivellari
0327c504e2 ACPI: scan: replace use of system_unbound_wq with system_dfl_wq
Currently, if a user enqueues a work item using schedule_delayed_work(), the
wq used is "system_wq" (a per-cpu wq), while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work(), which uses system_wq, and queue_work(), which again uses
WORK_CPU_UNBOUND.

This lack of consistency cannot be addressed without refactoring the API.

system_unbound_wq should be the default workqueue so as not to enforce
locality constraints for random work whenever it's not required.

Adding system_dfl_wq to encourage its use when unbound work should be used.

The old system_unbound_wq will be kept for a few release cycles.

Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Link: https://patch.msgid.link/20251030154739.262582-2-marco.crivellari@suse.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-03 18:45:42 +01:00
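A minimal sketch of the two renames covered by the entries above; my_work is a placeholder work item:

    /* per-CPU queueing: system_wq keeps working for now, but the new
     * name is explicit about the per-CPU behaviour */
    queue_work(system_percpu_wq, &my_work);

    /* unbound queueing: system_unbound_wq becomes system_dfl_wq */
    queue_work(system_dfl_wq, &my_work);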
David Howells
4141211903 crypto: arm64/sha3 - Rename conflicting function
Rename the arm64 sha3_update() function to have an "arm64_" prefix to
avoid a name conflict with the upcoming SHA-3 library.

Note: this code will be superseded later.  This commit simply keeps the
kernel building for the initial introduction of the library.

[EB: dropped unnecessary rename of sha3_finup(), and improved commit
     message]

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251026055032.1413733-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-03 09:10:58 -08:00
David Howells
863ee5a3aa crypto: s390/sha3 - Rename conflicting functions
Rename the s390 sha3_*_init() functions to have an "s390_" prefix to
avoid a name conflict with the upcoming SHA-3 library functions.

Note: this code will be superseded later.  This commit simply keeps the
kernel building for the initial introduction of the library.

[EB: dropped unnecessary rename of import and export functions, and
     improved commit message]

Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251026055032.1413733-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-03 09:10:58 -08:00
Eric Biggers
0e253e250e crypto: x86/aes-gcm-vaes-avx2 - initialize full %rax return register
Update aes_gcm_dec_final_vaes_avx2() to be consistent with
aes_gcm_dec_final_aesni() and aes_gcm_dec_final_vaes_avx512() by
initializing the full %rax return register instead of just %al.
Technically this is unnecessary, since these functions return bool.  But
I think it's worth being extra careful with the result of the tag
comparison and also keeping the different implementations consistent.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251102015256.171536-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-03 09:07:57 -08:00
Eric Biggers
933ecf5912 random: remove unused get_random_var_wait functions
None of these functions are used, so remove them.

This renders the two bugs moot:

- get_random_u64_wait() used the wrong pointer type, making it provide
  only 32 bits.

- The '#undef' directive used the wrong identifier, leaving the helper
  macro defined.

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2025-11-02 13:40:13 +01:00
Rafael J. Wysocki
1cf9c4f115 Merge back system sleep material for 6.19 2025-10-31 11:33:01 +01:00
Srinivas Pandruvada
39f421f2e3 powercap: intel_rapl: Add support for Wildcat Lake platform
Add Wildcat Lake to the list of supported processors for RAPL.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/20251023174532.1882008-1-srinivas.pandruvada@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-10-30 20:15:02 +01:00
Kuppuswamy Sathyanarayanan
790e826be8 cpufreq: intel_pstate: Add Diamond Rapids OOB mode support
Prevent intel_pstate from loading when Out-of-Band (OOB) P-states mode
is enabled.

The OOB identification mechanism for Diamond Rapids servers is the same
as for prior generation CPUs such as Granite Rapids. Add the Diamond
Rapids CPU model to intel_pstate_cpu_oob_ids[] to ensure correct OOB
handling.

Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: https://patch.msgid.link/20251022215425.3566218-1-sathyanarayanan.kuppuswamy@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-10-30 20:13:14 +01:00
Tejun Heo
8e4ec90701 freezer: Clarify that only cgroup1 freezer uses PM freezer
cgroup1 freezer piggybacks on the PM freezer, which inadvertently allowed
userspace to produce uninterruptible tasks at will. To avoid the issue,
cgroup2 freezer switched to a separate job control based mechanism. While
this happened a long time ago, the code and comment haven't been updated
making it confusing to people who aren't familiar with the history.

Rename cgroup_freezing() to cgroup1_freezing() and update comments on top of
freezing() and frozen() to clarify that cgroup2 freezer isn't covered by the
PM freezer mechanism.

Signed-off-by: Tejun Heo <tj@kernel.org>
Suggested-by: Qu Wenruo <wqu@suse.com>
Link: https://patch.msgid.link/aPZ3q6Hm865NicBC@slm.duckdns.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-10-30 20:10:27 +01:00
Xueqin Luo
ea358066de PM: hibernate: add sysfs interface for hibernate_compression_threads
Add a sysfs attribute `/sys/power/hibernate_compression_threads` to
allow runtime configuration of the number of threads used for
compressing and decompressing hibernation images.

The new sysfs interface enables dynamic adjustment at runtime:

    # cat /sys/power/hibernate_compression_threads
    3
    # echo 4 > /sys/power/hibernate_compression_threads

This change provides greater flexibility for debugging and performance
tuning of hibernation without requiring a reboot.

Signed-off-by: Xueqin Luo <luoxueqin@kylinos.cn>
Link: https://patch.msgid.link/c68c62f97fabf32507b8794ad8c16cd22ee656ac.1761046167.git.luoxueqin@kylinos.cn
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-10-30 20:07:00 +01:00
Xueqin Luo
090bf5a0f4 PM: hibernate: make compression threads configurable
The number of compression/decompression threads has a direct impact on
hibernate image generation and resume latency. Using more threads can
reduce overall resume time, but on systems with fewer CPU cores it may
also introduce contention and reduce efficiency.

Performance was evaluated on an 8-core ARM system, averaged over 10 runs:

    Threads  Hibernate(s)  Resume(s)
    --------------------------------
       3         12.14       18.86
       4         12.28       17.48
       5         11.09       16.77
       6         11.08       16.44

With 5–6 threads, resume latency improves by approximately 12% compared
to the default 3-thread configuration, with negligible impact on
hibernate time.

Introduce a new kernel parameter `hibernate_compression_threads=` that
allows users and integrators to tune the number of
compression/decompression threads at boot. This provides a way to
balance performance and CPU utilization across a wide range of hardware
without recompiling the kernel.
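
For example (assuming the parameter takes a plain integer thread count, as
described above), the number of threads could be raised on the kernel
command line with:

    hibernate_compression_threads=6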

Signed-off-by: Xueqin Luo <luoxueqin@kylinos.cn>
Link: https://patch.msgid.link/f24b3ca6416e230a515a154ed4c121d72a7e05a6.1761046167.git.luoxueqin@kylinos.cn
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-10-30 20:07:00 +01:00
Xueqin Luo
e114e2eb7e PM: hibernate: dynamically allocate crc->unc_len/unc for configurable threads
Convert crc->unc_len and crc->unc from fixed-size arrays to dynamically
allocated arrays, sized according to the actual number of threads selected
at runtime. This removes the fixed limit imposed by CMP_THREADS.

Signed-off-by: Xueqin Luo <luoxueqin@kylinos.cn>
Link: https://patch.msgid.link/b5db63bb95729482d2649b12d3a11cb7547b7fcc.1761046167.git.luoxueqin@kylinos.cn
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-10-30 20:07:00 +01:00
Marco Crivellari
aba5f969f8 random: replace use of system_unbound_wq with system_dfl_wq
system_unbound_wq has been renamed to system_dfl_wq in 128ea9f6cc
("workqueue: Add system_percpu_wq and system_dfl_wq"), so update
random.c's usage of system_unbound_wq to reflect that change. The
old system_unbound_wq is slated for removal in the next few cycles.

Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2025-10-30 18:40:12 +01:00
Arnd Bergmann
5d49f1a5bd random: use offstack cpumask when necessary
The entropy generation function keeps a local cpu mask on the stack,
which can trigger warnings in configurations with a large number of
CPUs:

    drivers/char/random.c:1292:20: error: stack frame size (1288)
    exceeds limit (1280) in 'try_to_generate_entropy' [-Werror,-Wframe-larger-than]

Use the cpumask interface to dynamically allocate it in those
configurations.
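
A minimal sketch of the offstack pattern (the variable name is
illustrative; the actual patch operates on the mask used by the
jitter-entropy timer code):

    cpumask_var_t timer_cpus;

    /* With CONFIG_CPUMASK_OFFSTACK, cpumask_var_t is a pointer and this
     * allocates from the heap; otherwise the mask stays on the stack and
     * the call is effectively a no-op that always succeeds. */
    if (!alloc_cpumask_var(&timer_cpus, GFP_KERNEL))
        return;

    /* ... use timer_cpus ... */

    free_cpumask_var(timer_cpus);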

Fixes: 1c21fe00ed ("random: spread out jitter callback to different CPUs")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2025-10-30 18:35:26 +01:00
Markus Theil
3c0c81de52 prandom: remove next_pseudo_random32
next_pseudo_random32 implements a LCG with known bad statistical
properties and was only used in two pieces of testing code.

With no remaining users now, remove it.

Signed-off-by: Markus Theil <theil.markus@gmail.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2025-10-30 18:35:26 +01:00
Markus Theil
8c0cf6542e media: vivid: use prandom
This is part of a prandom cleanup, which removes
next_pseudo_random32 and replaces it with the standard PRNG.

Signed-off-by: Markus Theil <theil.markus@gmail.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2025-10-30 18:35:26 +01:00
Thorsten Blum
a6a4d97f0d random: add missing words in function comments
s/good as/as good as/

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2025-10-30 18:35:26 +01:00
Eric Biggers
fa3ca9bfe3 crypto: blake2b - Reimplement using library API
Replace blake2b_generic.c with a new file blake2b.c which implements the
BLAKE2b crypto_shash algorithms on top of the BLAKE2b library API.

Change the driver name suffix from "-generic" to "-lib" to reflect that
these algorithms now just use the (possibly arch-optimized) library.

This closely mirrors crypto/{md5,sha1,sha256,sha512}.c.

Remove include/crypto/internal/blake2b.h since it is no longer used.
Likewise, remove struct blake2b_state from include/crypto/blake2b.h.

Omit support for import_core and export_core, since there are no legacy
drivers that need these for these algorithms.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251018043106.375964-10-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-10-29 22:04:24 -07:00
Eric Biggers
ba6617bd47 lib/crypto: arm/blake2b: Migrate optimized code into library
Migrate the arm-optimized BLAKE2b code from arch/arm/crypto/ to
lib/crypto/arm/.  This makes the BLAKE2b library able to use it, and it
also simplifies the code because it's easier to integrate with the
library than crypto_shash.

This temporarily makes the arm-optimized BLAKE2b code unavailable via
crypto_shash.  A later commit reimplements the blake2b-* crypto_shash
algorithms on top of the BLAKE2b library API, making it available again.

Note that as per the lib/crypto/ convention, the optimized code is now
enabled by default.  So, this also fixes the longstanding issue where
the optimized BLAKE2b code was not enabled by default.

To see the diff from arch/arm/crypto/blake2b-neon-glue.c to
lib/crypto/arm/blake2b.h, view this commit with 'git show -M10'.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251018043106.375964-8-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-10-29 22:04:24 -07:00
Eric Biggers
23a16c9533 lib/crypto: blake2b: Add BLAKE2b library functions
Add a library API for BLAKE2b, closely modeled after the BLAKE2s API.

This will allow in-kernel users such as btrfs to use BLAKE2b without
going through the generic crypto layer.  In addition, as usual the
BLAKE2b crypto_shash algorithms will be reimplemented on top of this.

Note: to create lib/crypto/blake2b.c I made a copy of
lib/crypto/blake2s.c and made the updates from BLAKE2s => BLAKE2b.  This
way, the BLAKE2s and BLAKE2b code is kept consistent.  Therefore, it
borrows the SPDX-License-Identifier and Copyright from
lib/crypto/blake2s.c rather than crypto/blake2b_generic.c.

The library API uses 'struct blake2b_ctx', consistent with other
lib/crypto/ APIs.  The existing 'struct blake2b_state' will be removed
once the blake2b crypto_shash algorithms are updated to stop using it.
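
Assuming the new API mirrors the BLAKE2s one as described above, an unkeyed
hash computation would look roughly like this (function names inferred from
the BLAKE2s counterparts, not quoted from the patch):

    struct blake2b_ctx ctx;
    u8 digest[BLAKE2B_HASH_SIZE];

    blake2b_init(&ctx, sizeof(digest));
    blake2b_update(&ctx, data, data_len);
    blake2b_final(&ctx, digest);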

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251018043106.375964-7-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-10-29 22:04:24 -07:00
Eric Biggers
c99d307060 byteorder: Add le64_to_cpu_array() and cpu_to_le64_array()
Add le64_to_cpu_array() and cpu_to_le64_array().  These mirror the
corresponding 32-bit functions.

These will be used by the BLAKE2b code.
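
The 32-bit originals are simple in-place loops, so the 64-bit variant can
be expected to take roughly this shape (a sketch, not the committed code):

    static inline void le64_to_cpu_array(u64 *buf, unsigned int words)
    {
        while (words--) {
            __le64_to_cpus(buf);
            buf++;
        }
    }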

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251018043106.375964-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-10-29 22:04:24 -07:00
Eric Biggers
b95d4471cb lib/crypto: blake2s: Document the BLAKE2s library API
Add kerneldoc for the BLAKE2s library API.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251018043106.375964-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-10-29 22:04:24 -07:00
Eric Biggers
5385bcbffe lib/crypto: blake2s: Drop excessive const & rename block => data
A couple more small cleanups to the BLAKE2s code before these things get
propagated into the BLAKE2b code:

- Drop 'const' from some non-pointer function parameters.  It was a bit
  excessive and not conventional.

- Rename 'block' argument of blake2s_compress*() to 'data'.  This is for
  consistency with the SHA-* code, and also to avoid the implication
  that it points to a singular "block".

No functional changes.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251018043106.375964-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-10-29 22:04:24 -07:00
Eric Biggers
5e0ec8e46d lib/crypto: blake2s: Rename blake2s_state to blake2s_ctx
For consistency with the SHA-1, SHA-2, SHA-3 (in development), and MD5
library APIs, rename blake2s_state to blake2s_ctx.

As a refresher, the ctx name:

- Is a bit shorter.
- Avoids confusion with the compression function state, which is also
  often called the state (but is just part of the full context).
- Is consistent with OpenSSL.

Not a big deal, of course.  But consistency is nice.  With a BLAKE2b
library API about to be added, this is a convenient time to update this.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251018043106.375964-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-10-29 22:04:24 -07:00
Eric Biggers
50b8e36994 lib/crypto: blake2s: Adjust parameter order of blake2s()
Reorder the parameters of blake2s() from (out, in, key, outlen, inlen,
keylen) to (key, keylen, in, inlen, out, outlen).

This aligns BLAKE2s with the common conventions of pairing buffers and
their lengths, and having outputs follow inputs.  This is widely used
elsewhere in lib/crypto/ and crypto/, and even elsewhere in the BLAKE2s
code itself such as blake2s_init_key() and blake2s_final().  So
blake2s() was a bit of an exception.

Notably, this results in the same order as hmac_*_usingrawkey().

Note that since the type signature changed, it's not possible for a
blake2s() call site to be silently missed.
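
A call site after the change reads as follows (buffer names are
illustrative):

    u8 mac[BLAKE2S_HASH_SIZE];

    /* new order: key/keylen, in/inlen, out/outlen */
    blake2s(key, key_len, data, data_len, mac, sizeof(mac));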

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251018043106.375964-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-10-29 22:04:24 -07:00
Eric Biggers
04cadb4fe0 lib/crypto: Add FIPS self-tests for SHA-1 and SHA-2
Add FIPS cryptographic algorithm self-tests for all SHA-1 and SHA-2
algorithms.  Following the "Implementation Guidance for FIPS 140-3"
document, to achieve this it's sufficient to just test a single test
vector for each of HMAC-SHA1, HMAC-SHA256, and HMAC-SHA512.

Just run these tests in the initcalls, following the example of e.g.
crypto/kdf_sp800108.c.  Note that this should meet the FIPS self-test
requirement even in the built-in case, given that the initcalls run
before userspace, storage, network, etc. are accessible.

This does not fix a regression, seeing as lib/ has had SHA-1 support
since 2005 and SHA-256 support since 2018.  Neither ever had FIPS
self-tests.  Moreover, fips=1 support has always been an unfinished
feature upstream.  However, with lib/ now being used more widely, it's
now seeing more scrutiny and people seem to want these now [1][2].

[1] https://lore.kernel.org/r/3226361.1758126043@warthog.procyon.org.uk/
[2] https://lore.kernel.org/r/f31dbb22-0add-481c-aee0-e337a7731f8e@oracle.com/

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251011001047.51886-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-10-29 22:04:24 -07:00
Swaraj Gaikwad
cb908f8b0a Documentation: intel_pstate: fix duplicate hyperlink target errors
Fix reST warnings in
Documentation/admin-guide/pm/intel_pstate.rst caused by missing explicit
hyperlink labels for section titles.

Before this change, the following errors were printed during
`make htmldocs`:

  Documentation/admin-guide/pm/intel_pstate.rst:401:
    ERROR: Indirect hyperlink target (id="id6") refers to target
    "passive mode", which is a duplicate, and cannot be used as a
    unique reference.
  Documentation/admin-guide/pm/intel_pstate.rst:517:
    ERROR: Indirect hyperlink target (id="id9") refers to target
    "active mode", which is a duplicate, and cannot be used as a
    unique reference.
  Documentation/admin-guide/pm/intel_pstate.rst:611:
    ERROR: Indirect hyperlink target (id="id15") refers to target
    "global attributes", which is a duplicate, and cannot be used as
    a unique reference.
  ERROR: Duplicate target name, cannot be used as a unique reference:
  "passive mode", "active mode", "global attributes".

These errors occurred because the sections "Active Mode",
"Active Mode With HWP", "Passive Mode", and "Global Attributes"
did not define explicit hyperlink labels. As a result, Sphinx
auto-generated duplicate anchors when the same titles appeared
multiple times within the document.

Because of this, the generated HTML documentation contained
broken references such as:

  `active mode <Active Mode_>`_
  `passive mode <Passive Mode_>`_
  `global attributes <Global Attributes_>`_

This patch adds explicit hyperlink labels for the affected sections,
ensuring all references are unique and correctly resolved.

After applying this patch, `make htmldocs` completes without
any warnings, and all hyperlinks in intel_pstate.html render properly.

Signed-off-by: Swaraj Gaikwad <swarajgaikwad1925@gmail.com>
Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
[ rjw: Subject adjustment ]
Link: https://patch.msgid.link/20251029134737.42229-1-swarajgaikwad1925@gmail.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-10-29 20:04:50 +01:00
Malaya Kumar Rout
4e48e7baa3 PM: runtime: fix typos in runtime.c comments
Fix several typos in comments:
- "timesptamp" -> "timestamp"
- "involed" -> "involved"
- "nonero" -> "nonzero"

Fix typos in comments to improve code documentation clarity.

Signed-off-by: Malaya Kumar Rout <mrout@redhat.com>
Link: https://patch.msgid.link/20251026170527.262003-1-mrout@redhat.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-10-29 19:58:58 +01:00
Peng Fan
65df3a9629 PM: EM: Add to em_pd_list only when no failure
When em_create_perf_table() returns a failure, pd is freed, so dev->em_pd
is not valid. Accessing dev->em_pd->node then triggers a kernel panic in
em_dev_register_pd_no_update(). Return early if 'ret' is non-zero to avoid
this.

Kernel dump:
cpu cpu0: EM: invalid power: 0
Unable to handle kernel NULL pointer dereference at virtual address
0000000000000008
Mem abort info:
pc : em_dev_register_pd_no_update+0xb4/0x79c
lr : em_dev_register_pd_no_update+0x9c/0x79c
Call trace:
 em_dev_register_pd_no_update+0xb4/0x79c (P)
 em_dev_register_perf_domain+0x18/0x58
 scmi_cpufreq_register_em+0x84/0xb8
 cpufreq_online+0x48c/0xb74
 cpufreq_add_dev+0x80/0x98
 subsys_interface_register+0x100/0x11c
 cpufreq_register_driver+0x158/0x278
 scmi_cpufreq_probe+0x1f8/0x2e0
 scmi_dev_probe+0x28/0x3c
 really_probe+0xbc/0x29c
 __driver_probe_device+0x78/0x12c
 driver_probe_device+0x3c/0x15c
 __device_attach_driver+0xb8/0x134
 bus_for_each_drv+0x84/0xe4

Fixes: cbe5aeedec ("PM: EM: Assign a unique ID when creating a performance domain")
Signed-off-by: Peng Fan <peng.fan@nxp.com>
Reviewed-by: Changwoo Min <changwoo@igalia.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://patch.msgid.link/20251028-fix-energy-v1-1-ab854fd6a97c@nxp.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-10-29 13:37:00 +01:00
Jie Zhan
1971b18785 cpufreq: CPPC: Don't warn if FIE init fails to read counters
During the CPPC FIE initialization, reading perf counters on offline cpus
should be expected to fail.  Don't warn in this case.

Also, change the error log level to debug since FIE is optional.

Co-developed-by: Bowen Yu <yubowen8@huawei.com>
Signed-off-by: Bowen Yu <yubowen8@huawei.com> # Changing loglevel to debug
Signed-off-by: Jie Zhan <zhanjie9@hisilicon.com>
[ Viresh: Added back the dropped comment. ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-10-28 10:40:47 +05:30
Miaoqian Lin
9600156bb9 cpufreq: nforce2: fix reference count leak in nforce2
There are two reference count leaks in this driver:

1. In nforce2_fsb_read(): pci_get_subsys() increases the reference count
   of the PCI device, but pci_dev_put() is never called to release it,
   thus leaking the reference.

2. In nforce2_detect_chipset(): pci_get_subsys() gets a reference to the
   nforce2_dev which is stored in a global variable, but the reference
   is never released when the module is unloaded.

Fix both by:
- Adding pci_dev_put(nforce2_sub5) in nforce2_fsb_read() after reading
  the configuration.
- Adding pci_dev_put(nforce2_dev) in nforce2_exit() to release the
  global device reference.

Found via static analysis.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-10-28 10:28:13 +05:30
Armin Wolf
a5c2fcd82e ACPI: fan: Add support for Microsoft fan extensions
Microsoft has designed a set of extensions for the ACPI fan device
allowing the OS to specify a set of fan speed trip points. The
platform firmware will then notify the ACPI fan device when one
of the trip points is triggered.

Unfortunately, some device manufacturers (like HP) blindly assume
that the OS will use said extensions and thus only update the values
returned by the _FST control method when receiving such a
notification. As a result, the ACPI fan driver is currently unusable
on such machines, always reporting a constant value.

Fix this by adding support for the Microsoft extensions.

During probe and when resuming from suspend, the driver will attempt to
trigger an initial notification that will update the values returned by
_FST.

Said trip points will be updated each time a notification is received
from the platform firmware to ensure that the values returned by the
_FST control method are updated.

Link: https://learn.microsoft.com/en-us/windows-hardware/design/device-experiences/design-guide
Closes: https://github.com/lm-sensors/lm-sensors/issues/506
Signed-off-by: Armin Wolf <W_Armin@gmx.de>
[ rjw: Edits of the new code comments ]
Link: https://patch.msgid.link/20251024183824.5656-4-W_Armin@gmx.de
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-10-27 20:56:01 +01:00
Armin Wolf
3d4ca76369 ACPI: fan: Add hwmon notification support
The platform firmware can notify the ACPI fan device that the fan
speed has changed. Relay this notification to the hwmon device if
present so that userspace applications can react to it.

Signed-off-by: Armin Wolf <W_Armin@gmx.de>
Link: https://patch.msgid.link/20251024183824.5656-3-W_Armin@gmx.de
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-10-27 20:56:01 +01:00
Armin Wolf
0670b9ad4d ACPI: fan: Add basic notification support
The ACPI specification states that the platform firmware can notify
the ACPI fan device that the fan speed has changed and that the _FST
control method should be reevaluated. Add support for this mechanism
to prepare for future changes.

Signed-off-by: Armin Wolf <W_Armin@gmx.de>
Link: https://patch.msgid.link/20251024183824.5656-2-W_Armin@gmx.de
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-10-27 20:56:01 +01:00
Rafael J. Wysocki
58ca21d591 ACPI: TAD: Improve runtime PM using guard macros
Use guard pm_runtime_active_try to simplify runtime PM cleanup and
implement runtime resume error handling in multiple places.

Also use guard pm_runtime_noresume to simplify acpi_tad_remove().

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/13881356.uLZWGnKmhe@rafael.j.wysocki
2025-10-27 20:32:13 +01:00
Rafael J. Wysocki
f9f5e22b75 ACPI: TAD: Rearrange runtime PM operations in acpi_tad_remove()
It is not necessary to resume the device upfront in acpi_tad_remove()
because both acpi_tad_disable_timer() and acpi_tad_clear_status()
attempt to resume it, but it is better to prevent it from suspending
between these calls by incrementing its runtime PM usage counter.

Accordingly, replace the pm_runtime_get_sync() call in acpi_tad_remove()
with a pm_runtime_get_noresume() one and put the latter right before the
first invocation of acpi_tad_disable_timer().

In addition, use pm_runtime_put_noidle() to drop the device's runtime
PM usage counter after using pm_runtime_get_noresume() to bump it up
to follow a common pattern and use pm_runtime_suspend() for suspending
the device afterward.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/5031965.GXAFRqVoOG@rafael.j.wysocki
2025-10-27 20:32:13 +01:00
Siyuan Huang
040beccb03 rust: acpi: replace core::mem::zeroed with pin_init::zeroed
All types in `bindings` implement `Zeroable` if they can, so use
`pin_init::zeroed` instead of relying on `unsafe` code.

If this ends up not compiling in the future, something in bindgen or on
the C side changed and is most likely incorrect.

Link: https://github.com/Rust-for-Linux/linux/issues/1189
Suggested-by: Benno Lossin <lossin@kernel.org>
Signed-off-by: Siyuan Huang <huangsiyuan@kylinos.cn>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Reviewed-by: Benno Lossin <lossin@kernel.org>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Reviewed-by: Kunwu Chan <chentao@kylinos.cn>
Link: https://patch.msgid.link/20251020031204.78917-1-huangsiyuan@kylinos.cn
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-10-27 20:27:05 +01:00
Rafael J. Wysocki
86bfd21a0b ACPI: battery: Drop redundant locking
All of the evaluations of objects in the ACPI namespace are carried out
under the namespace lock and interpreter lock in ACPICA, so it is not
necessary to put any additional locks around them for synchronization.

However, the ACPI battery driver does just that, so remove the
redundant locking around ACPI object evaluation from it.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/2344462.iZASKD2KPV@rafael.j.wysocki
2025-10-27 20:19:52 +01:00
Aboorva Devarajan
07d8157012 cpuidle: menu: Use residency threshold in polling state override decisions
On virtualized PowerPC (pseries) systems, where only one polling state
(Snooze) and one deep state (CEDE) are available, selecting CEDE when
the predicted idle duration is less than the target residency of the CEDE
state can hurt performance. In such cases, the entry/exit overhead of
CEDE outweighs the power savings, leading to unnecessary state
transitions and higher latency.

Menu governor currently contains a special-case rule that prioritizes
the first non-polling state over polling, even when its target residency
is much longer than the predicted idle duration. On PowerPC/pseries,
where the gap between the polling state (Snooze) and the first non-polling
state (CEDE) is large, this behavior causes performance regressions.

Refine that special case by adding an extra requirement: the first
non-polling state can only be chosen if its target residency is below
the defined RESIDENCY_THRESHOLD_NS. If this condition is not satisfied,
polling is allowed instead, avoiding suboptimal non-polling state
entries.
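
Schematically, the refined rule only lets the first non-polling state win
the special case when it is cheap enough to enter (variable names are
assumed and this only illustrates the condition, not the literal hunk;
target_residency_ns is the cpuidle state field, RESIDENCY_THRESHOLD_NS the
threshold named above):

    if (s->target_residency_ns < RESIDENCY_THRESHOLD_NS) {
        /* first non-polling state is cheap: keep the old preference */
        idx = i;
    } else {
        /* too expensive for the predicted idle duration: allow polling */
    }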

This change is limited to the single special-case rule for the first
non-polling state. The general non-polling state selection logic in the
menu governor remains unchanged.

Performance improvement observed with pgbench on PowerPC (pseries)
system:
+---------------------------+------------+------------+------------+
| Metric                    | Baseline   | Patched    | Change (%) |
+---------------------------+------------+------------+------------+
| Transactions/sec (TPS)    | 495,210    | 536,982    | +8.45%     |
| Avg latency (ms)          | 0.163      | 0.150      | -7.98%     |
+---------------------------+------------+------------+------------+

CPUIdle state usage:
+--------------+--------------+-------------+
| Metric       | Baseline     | Patched     |
+--------------+--------------+-------------+
| Total usage  | 12,735,820   | 13,918,442  |
| Above usage  | 11,401,520   | 1,598,210   |
| Below usage  | 20,145       | 702,395     |
+--------------+--------------+-------------+

Above/Total and Below/Total usage percentages:
+------------------------+-----------+---------+
| Metric                 | Baseline  | Patched |
+------------------------+-----------+---------+
| Above % (Above/Total)  | 89.56%    | 11.49%  |
| Below % (Below/Total)  | 0.16%     | 5.05%   |
| Total cpuidle miss (%) | 89.72%    | 16.54%  |
+------------------------+-----------+---------+

The results indicate that restricting CEDE selection to cases where
its residency matches the predicted idle time reduces mispredictions,
lowers unnecessary state transitions, and improves overall throughput.

Reviewed-by: Christian Loehle <christian.loehle@arm.com>
Signed-off-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
[ rjw: Changelog edits, rebase ]
Link: https://patch.msgid.link/20251006013954.17972-1-aboorvad@linux.ibm.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-10-27 14:53:46 +01:00
Eric Biggers
05794985b1 crypto: x86/aes-gcm - optimize long AAD processing with AVX512
Improve the performance of aes_gcm_aad_update_vaes_avx512() on large AAD
(additional authenticated data) lengths by 4-8 times by making it use up
to 512-bit vectors and a 4-vector-wide loop.  Previously, it used only
256-bit vectors and a 1-vector-wide loop.

Originally, I assumed that the case of large AADLEN was unimportant.
Later, when reviewing the users of BoringSSL's AES-GCM code, I found
that some callers use BoringSSL's AES-GCM API to just compute GMAC,
authenticating lots of data but not en/decrypting any.  Thus, I included
a similar optimization in the BoringSSL port of this code.  I believe
it's wise to include this optimization in the kernel port too for
similar reasons, and to align it more closely with the BoringSSL port.

Another reason this function originally used 256-bit vectors was so that
separate *_avx10_256 and *_avx10_512 versions of it wouldn't be needed.
However, that's no longer applicable.

To avoid a slight performance regression in the common case of AADLEN <=
16, also add a fast path for that case which uses 128-bit vectors.  In
fact, this case actually gets slightly faster too, since it saves a
couple instructions over the original 256-bit code.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251002023117.37504-9-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-10-26 20:37:41 -07:00
Eric Biggers
5ab1ff2e0f crypto: x86/aes-gcm - optimize AVX512 precomputation of H^2 from H^1
Squaring in GF(2^128) requires fewer instructions than a generic
multiplication in GF(2^128).  Take advantage of this when computing H^2
from H^1 in aes_gcm_precompute_vaes_avx512().

Note that aes_gcm_precompute_vaes_avx2() already uses this optimization.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251002023117.37504-8-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-10-26 20:37:41 -07:00
Eric Biggers
e0abd0053f crypto: x86/aes-gcm - revise some comments in AVX512 code
- Fix some references to field names in struct aes_gcm_key_vaes_avx512.

- Remove the mention of the counter having to start at 2.  The assembly
  code doesn't actually assume that it does.

Note that these changes improve consistency with aes-gcm-vaes-avx2.S.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251002023117.37504-7-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-10-26 20:37:41 -07:00
Eric Biggers
5213aefa9e crypto: x86/aes-gcm - reorder AVX512 precompute and aad_update functions
Now that the _aes_gcm_precompute macro is instantiated only once,
replace it directly with a function definition.

Also, move aes_gcm_aad_update_vaes_avx512() to a different location in
the file so that it's consistent with aes-gcm-vaes-avx2.S and also the
BoringSSL port of this code.

No functional changes.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251002023117.37504-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-10-26 20:37:41 -07:00
Eric Biggers
4b582e0fb3 crypto: x86/aes-gcm - clean up AVX512 code to assume 512-bit vectors
aes-gcm-vaes-avx512.S (originally aes-gcm-avx10-x86_64.S) was designed
to support multiple maximum vector lengths, while still utilizing AVX512
/ AVX10 features such as the increased number of vector registers.
However, the support for multiple maximum vector lengths turned out to
not be useful.  Support for maximum vector lengths other than 512 bits
was removed from the AVX10 specification, which leaves "avoiding
overly-eager downclocking" as the only remaining use case for limiting
AVX512 / AVX10 code to 256-bit vectors.  But this issue has gone away in
new CPUs, and the separate VAES+AVX2 code which I ended up having to
write anyway provides nearly as good 256-bit support.

Therefore, clean up this code to not be written in terms of a generic
vector length, but rather just assume 512-bit vectors.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251002023117.37504-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-10-26 20:37:41 -07:00
Eric Biggers
12beec21c5 crypto: x86/aes-gcm - rename avx10 and avx10_512 to avx512
With the "avx10_256" code removed and the AVX10 specification having
been changed to basically just be a re-packaged AVX512, the "avx10_512"
name no longer makes sense.  Replace it with "avx512".

While doing this, also add the "vaes_" prefix in places that didn't
already have it.  The result is that the two VAES optimized
implementations are consistently called vaes_avx2 and vaes_avx512.
(Also drop the "-x86_64" part of the assembly filename, to keep it from
getting too long.  There's no 32-bit version of this code, and the fact
that it's 64-bit is unremarkable; it's the norm for new code.)

Note: although aes_gcm_aad_update_vaes_avx512() (previously called
aes_gcm_aad_update_vaes_avx10()) uses at most 256-bit vectors, it still
depends on the AVX512 CPU feature.  So its new name is still accurate.
Also, a later commit will make it sometimes use 512-bit vectors anyway.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251002023117.37504-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-10-26 20:37:40 -07:00
Eric Biggers
f65e908606 crypto: x86/aes-gcm - remove VAES+AVX10/256 optimized code
Remove the VAES+AVX10/256 optimized implementation of AES-GCM.

It's no longer expected to be useful for future CPUs, since Intel
changed the AVX10 specification to require 512-bit vectors.

In addition, it's no longer very useful to serve as the 256-bit fallback
for older Intel CPUs (Ice Lake and Tiger Lake) that downclock too
eagerly when 512-bit vectors are used.  This is because I ended up
writing another 256-bit implementation anyway, using VAES+AVX2.  The
VAES+AVX2 implementation is almost as fast as the VAES+AVX10/256 one, as
shown by the following tables.  So, let's just use it instead.

Table 1: AES-256-GCM encryption throughput change,
         CPU vs. message length in bytes:

                      | 16384 |  4096 |  4095 |  1420 |   512 |   500 |
----------------------+-------+-------+-------+-------+-------+-------+
Intel Ice Lake Server |   -2% |   -1% |    0% |   -2% |   -2% |    3% |

                      |   300 |   200 |    64 |    63 |    16 |
----------------------+-------+-------+-------+-------+-------+
Intel Ice Lake Server |    1% |    0% |    4% |    2% |   -6% |

Table 2: AES-256-GCM decryption throughput change,
         CPU vs. message length in bytes:

                      | 16384 |  4096 |  4095 |  1420 |   512 |   500 |
----------------------+-------+-------+-------+-------+-------+-------+
Intel Ice Lake Server |   -1% |   -1% |    1% |   -2% |    0% |    2% |

                      |   300 |   200 |    64 |    63 |    16 |
----------------------+-------+-------+-------+-------+-------+
Intel Ice Lake Server |   -1% |    4% |    1% |    0% |   -5% |

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251002023117.37504-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-10-26 20:37:40 -07:00
Eric Biggers
fae3b96ba6 crypto: x86/aes-gcm - add VAES+AVX2 optimized code
Add an implementation of AES-GCM that uses 256-bit vectors and the
following CPU features: Vector AES (VAES), Vector Carryless
Multiplication (VPCLMULQDQ), and AVX2.

It doesn't require AVX512.  So unlike the existing VAES+AVX512 code, it
works on CPUs that support VAES but not AVX512, specifically:

    - AMD Zen 3, both client and server
    - Intel Alder Lake, Raptor Lake, Meteor Lake, Arrow Lake, and Lunar
      Lake.  (These are client CPUs.)
    - Intel Sierra Forest.  (This is a server CPU.)

On these CPUs, this VAES+AVX2 code is much faster than the existing
AES-NI code.  The AES-NI code uses only 128-bit vectors.

These CPUs are widely deployed, making VAES+AVX2 code worthwhile even
though hopefully future x86_64 CPUs will uniformly support AVX512.

This implementation will also serve as the fallback 256-bit
implementation for older Intel CPUs (Ice Lake and Tiger Lake) that
support AVX512 but downclock too eagerly when 512-bit vectors are used.
Currently, the VAES+AVX10/256 implementation serves that purpose.  A
later commit will remove that and just use the VAES+AVX2 one.  (Note
that AES-XTS and AES-CTR already successfully use this approach.)

I originally wrote this AES-GCM implementation for BoringSSL.  It's been
in BoringSSL for a while now, including in Chromium.  This is a port of
it to the Linux kernel.  The main changes in the Linux version include:

- Port from "perlasm" to a standard .S file.
- Align all assembly functions with what aesni-intel_glue.c expects,
  including adding support for lengths not a multiple of 16 bytes.
- Rework the en/decryption of the final 1 to 127 bytes.

This commit increases AES-256-GCM throughput on AMD Milan (Zen 3) by up
to 74%, as shown by the following tables:

Table 1: AES-256-GCM encryption throughput change,
         CPU vs. message length in bytes:

                      | 16384 |  4096 |  4095 |  1420 |   512 |   500 |
----------------------+-------+-------+-------+-------+-------+-------+
AMD Milan (Zen 3)     |   67% |   59% |   61% |   39% |   23% |   27% |

                      |   300 |   200 |    64 |    63 |    16 |
----------------------+-------+-------+-------+-------+-------+
AMD Milan (Zen 3)     |   14% |   12% |    7% |    7% |    0% |

Table 2: AES-256-GCM decryption throughput change,
         CPU vs. message length in bytes:

                      | 16384 |  4096 |  4095 |  1420 |   512 |   500 |
----------------------+-------+-------+-------+-------+-------+-------+
AMD Milan (Zen 3)     |   74% |   65% |   65% |   44% |   23% |   26% |

                      |   300 |   200 |    64 |    63 |    16 |
----------------------+-------+-------+-------+-------+-------+
AMD Milan (Zen 3)     |   12% |   11% |    3% |    2% |   -3% |

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251002023117.37504-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-10-26 20:37:40 -07:00
Armin Wolf
2e00f7a4bb ACPI: fan: Workaround for 64-bit firmware bug
Some firmware implementations use the "Ones" ASL opcode to produce
an integer with all bits set in order to indicate missing speed or
power readings. This however only works when using 32-bit integers,
as the ACPI spec requires a 32-bit integer (0xFFFFFFFF) to be
returned for missing speed/power readings. With 64-bit integers the
"Ones" opcode produces a 64-bit integer with all bits set, violating
the ACPI spec regarding the placeholder value for missing readings.

Work around such buggy firmware implementations by also checking for
64-bit integers with all bits set when reading _FST.
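
Illustratively, the check for a missing reading now has to accept both
widths (the variable name and return value are assumed; 0xFFFFFFFF is the
placeholder the spec mandates, ~0ULL the value the buggy firmware
produces):

    if (fst_value == 0xFFFFFFFF || fst_value == ~0ULL)
        return -ENODATA;    /* no speed/power reading available */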

Signed-off-by: Armin Wolf <W_Armin@gmx.de>
[ rjw: Typo fix in the changelog ]
Link: https://patch.msgid.link/20251007234149.2769-3-W_Armin@gmx.de
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-10-24 10:29:52 +02:00
Tamir Duberstein
33ffb0aa8c rust: opp: simplify callers of to_c_str_array
Use `Option` combinators to make this a bit less noisy.

Wrap the `dev_pm_opp_set_config` operation in a closure and use type
ascription to leverage the compiler to check for use after free.

Signed-off-by: Tamir Duberstein <tamird@kernel.org>
Tested-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-10-23 20:51:17 +05:30
Rafael J. Wysocki
cea54f8e34 PM: runtime: docs: Update pm_runtime_allow/forbid() documentation
Drop confusing descriptions of pm_runtime_allow() and pm_runtime_forbid()
from Documentation/power/runtime_pm.rst and update the kerneldoc comments
of these functions to better explain their purpose.

Link: https://lore.kernel.org/linux-pm/08976178-298f-79d9-1d63-cff5a4e56cc3@linux.intel.com/
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Brian Norris <briannorris@chromium.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://patch.msgid.link/12780841.O9o76ZdvQC@rafael.j.wysocki
2025-10-23 16:13:33 +02:00
Aaron Kling
85976d3774 cpufreq: tegra186: add OPP support and set bandwidth
Add support to use OPP table from DT in Tegra186 cpufreq driver.
Tegra SoCs receive the frequency lookup table (LUT) from BPMP-FW.
Cross-check the OPPs present in DT against the LUT from BPMP-FW
and enable only those DT OPPs that are also present in the LUT.

The OPP table in DT has CPU Frequency to bandwidth mapping where
the bandwidth value is per MC channel. DRAM bandwidth depends on the
number of MC channels which can vary as per the boot configuration.
This per channel bandwidth from OPP table will be later converted by
MC driver to final bandwidth value by multiplying with number of
channels before being handled in the EMC driver.

If the OPP table is not present in DT, then use the LUT from BPMP-FW
directly as the CPU frequency table and do not do DRAM frequency
scaling, which is the same as the current behavior.

Signed-off-by: Aaron Kling <webgeek1234@gmail.com>
[ Viresh: Fix _free() definitions ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-10-23 12:10:11 +05:30
Hal Feng
6e7970cab5 cpufreq: dt-platdev: Add JH7110S SOC to the allowlist
Add the compatible strings for supporting the generic
cpufreq driver on the StarFive JH7110S SoC.

Signed-off-by: Hal Feng <hal.feng@starfivetech.com>
Reviewed-by: Heinrich Schuchardt <heinrich.schuchardt@canonical.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-10-23 12:10:11 +05:30
Shuhao Fu
2de5cb9606 cpufreq: s5pv210: fix refcount leak
In function `s5pv210_cpu_init`, a possible refcount inconsistency has
been identified, causing a resource leak.

Why it is a bug:
1. For every clk_get(), there should be a matching clk_put() on every
subsequent error handling path.
2. After calling `clk_get(dmc1_clk)`, the variable `dmc1_clk` is not
released if an error happens afterwards.

How it is fixed: For every failed path, an extra goto label is added to
ensure `dmc1_clk` is freed regardless.

Signed-off-by: Shuhao Fu <sfual@cse.ust.hk>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-10-23 12:10:11 +05:30
Viresh Kumar
173e02d674 OPP: Initialize scope-based pointers inline
Uninitialized pointers with `__free` attribute can cause undefined
behaviour as the memory allocated to the pointer is freed automatically
when the pointer goes out of scope.

The OPP core doesn't have any bugs related to this as of now, but it is
better to initialize pointers marked with `__free` attribute at
declaration to simplify the code and ensure proper scope-based cleanup.
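
The hazard and the preferred form can be illustrated generically with
kfree() as the cleanup handler (this patch touches OPP-specific pointers,
not this example):

    /* Uninitialized: undefined behaviour if any return is reached before
     * buf gets assigned, since the cleanup handler still runs. */
    char *buf __free(kfree);

    /* Initialized at declaration: the cleanup handler always sees either
     * NULL or a valid allocation. */
    char *buf2 __free(kfree) = kmalloc(len, GFP_KERNEL);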

Reported-by: Joe Perches <joe@perches.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-10-23 11:58:05 +05:30
Changwoo Min
a1b17c9ac8 PM: EM: Notify an event when the performance domain changes
Send an event to userspace when a performance domain is created or deleted,
or its energy model is updated.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://patch.msgid.link/20251020220914.320832-11-changwoo@igalia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-10-22 21:44:38 +02:00
Changwoo Min
b95a0c02ad PM: EM: Implement em_notify_pd_created/updated()
Implement two event notifications when a performance domain is created
(EM_CMD_PD_CREATED) and updated (EM_CMD_PD_UPDATED). The message format
of these two event notifications is the same as EM_CMD_GET_PD_TABLE --
containing the performance domain's ID and its energy model table.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://patch.msgid.link/20251020220914.320832-10-changwoo@igalia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-10-22 21:44:38 +02:00
Changwoo Min
b2b1bbcac7 PM: EM: Implement em_notify_pd_deleted()
Add the event notification infrastructure and implement the event
notification for when a performance domain is deleted (EM_CMD_PD_DELETED).

The event contains the ID of the performance domain (EM_A_PD_TABLE_PD_ID)
so the userspace can identify the changed performance domain for further
processing.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://patch.msgid.link/20251020220914.320832-9-changwoo@igalia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-10-22 21:44:37 +02:00
Changwoo Min
f2d2946eaa PM: EM: Implement em_nl_get_pd_table_doit()
When a userspace requests EM_CMD_GET_PD_TABLE with an ID of a performance
domain, the kernel reports back the energy model table of the specified
performance domain. The message format of the response is as follows:

EM_A_PD_TABLE_PD_ID (NLA_U32)
EM_A_PD_TABLE_PS (NLA_NESTED)*
    EM_A_PS_PERFORMANCE (NLA_U64)
    EM_A_PS_FREQUENCY (NLA_U64)
    EM_A_PS_POWER (NLA_U64)
    EM_A_PS_COST (NLA_U64)
    EM_A_PS_FLAGS (NLA_U64)

where EM_A_PD_TABLE_PS can be repeated as many times as there are
performance states (struct em_perf_state).

Signed-off-by: Changwoo Min <changwoo@igalia.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://patch.msgid.link/20251020220914.320832-8-changwoo@igalia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-10-22 21:44:37 +02:00
Changwoo Min
d8eef04531 PM: EM: Implement em_nl_get_pds_doit()
When a userspace requests EM_CMD_GET_PDS, the kernel responds with
information on all performance domains. The message format of the
response is as follows:

EM_A_PDS_PD (NLA_NESTED)*
    EM_A_PD_PD_ID (NLA_U32)
    EM_A_PD_FLAGS (NLA_U64)
    EM_A_PD_CPUS (NLA_STRING)

where EM_A_PDS_PD can be repeated as many times as there are performance
domains, and EM_A_PD_CPUS is a hexadecimal string representing a CPU
bitmask.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://patch.msgid.link/20251020220914.320832-7-changwoo@igalia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-10-22 21:44:37 +02:00
Changwoo Min
7928339cfe PM: EM: Add an iterator and accessor for the performance domain
Add an iterator function (for_each_em_perf_domain) that iterates all the
performance domains in the global list. A passed callback function (cb) is
called for each performance domain.

Additionally, add a lookup function (em_perf_domain_get_by_id) that
searches for a performance domain by matching the ID in the global list.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://patch.msgid.link/20251020220914.320832-6-changwoo@igalia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-10-22 21:44:37 +02:00
Changwoo Min
e4ed8d26c5 PM: EM: Add a skeleton code for netlink notification
Add a boilerplate code for netlink notification to register the new
protocol family. Also, initialize and register the netlink during booting.
The initialization is called at the postcore level, which is late enough
after the generic netlink is initialized.

Finally, update MAINTAINERS to include new files.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://patch.msgid.link/20251020220914.320832-5-changwoo@igalia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-10-22 21:44:37 +02:00
Changwoo Min
bd26631ccd PM: EM: Add em.yaml and autogen files
Add a generic netlink spec in YAML format and autogenerate boilerplate
code using ynl-regen.sh to introduce a generic netlink for the energy
model. It allows a userspace program to read the performance domain and
its energy model. It notifies the userspace program when a performance
domain is created or deleted or its energy model is updated through a
multicast interface.

Specifically, it supports two commands:
  - EM_CMD_GET_PDS: Get the list of information for all performance
    domains.
  - EM_CMD_GET_PD_TABLE: Get the energy model table of a performance
    domain.

Also, it supports three notification events:
  - EM_CMD_PD_CREATED: When a performance domain is created.
  - EM_CMD_PD_DELETED: When a performance domain is deleted.
  - EM_CMD_PD_UPDATED: When the energy model table of a performance domain
    is updated.

Finally, update MAINTAINERS to include new files.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://patch.msgid.link/20251020220914.320832-4-changwoo@igalia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-10-22 21:44:37 +02:00
Changwoo Min
ee50b8bb6b PM: EM: Expose the ID of a performance domain via debugfs
For ease of debugging, let's expose the assigned ID of a performance
domain through debugfs (e.g., /sys/kernel/debug/energy_model/cpu0/id).

Signed-off-by: Changwoo Min <changwoo@igalia.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://patch.msgid.link/20251020220914.320832-3-changwoo@igalia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-10-22 21:44:37 +02:00
Changwoo Min
cbe5aeedec PM: EM: Assign a unique ID when creating a performance domain
It is necessary to refer to a specific performance domain from
userspace, for example when the energy model of a particular performance
domain is updated.

To this end, assign a unique ID to each performance domain to address it,
and manage them in a global linked list to look up a specific one by
matching ID. IDA is used for ID assignment, and the mutex is used to
protect the global list from concurrent access.

Note that the mutex (em_pd_list_mutex) is not supposed to be held while
holding em_pd_mutex, to avoid an ABBA deadlock.
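
The rough shape of such an assignment, using the names mentioned in this
changelog where possible (a sketch with partly assumed names, not the
committed hunk):

    static DEFINE_IDA(em_pd_ida);
    static LIST_HEAD(em_pd_list);
    static DEFINE_MUTEX(em_pd_list_mutex);

    id = ida_alloc(&em_pd_ida, GFP_KERNEL);
    if (id < 0)
        return id;
    pd->id = id;

    mutex_lock(&em_pd_list_mutex);
    list_add_tail(&pd->node, &em_pd_list);
    mutex_unlock(&em_pd_list_mutex);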

Signed-off-by: Changwoo Min <changwoo@igalia.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://patch.msgid.link/20251020220914.320832-2-changwoo@igalia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-10-22 21:44:37 +02:00
Sakari Ailus
b889ed5abf ACPI: property: Rework acpi_graph_get_next_endpoint()
Rework the code obtaining the next endpoint in
acpi_graph_get_next_endpoint(). The resulting code removes unnecessary
conditionals and should be easier to follow.

Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Link: https://patch.msgid.link/20251001104320.1272752-4-sakari.ailus@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-10-22 16:57:00 +02:00
Sakari Ailus
5d010473cd ACPI: property: Use ACPI functions in acpi_graph_get_next_endpoint() only
Calling fwnode_get_next_child_node() in the ACPI implementation of the
fwnode property API is somewhat problematic, as the latter is used in the
implementation of the former. Instead of using
fwnode_get_next_child_node() in acpi_graph_get_next_endpoint(), call
acpi_get_next_subnode() directly.

Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20251001104320.1272752-3-sakari.ailus@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-10-22 16:57:00 +02:00
Sakari Ailus
159e851108 ACPI: property: Make acpi_get_next_subnode() static
acpi_get_next_subnode() is only used in drivers/acpi/property.c. Remove
its prototype from include/linux/acpi.h and make it static.

Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20251001104320.1272752-2-sakari.ailus@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-10-22 16:57:00 +02:00
Huisong Li
945661d581 ACPI: processor: idle: Relocate state flags initialization
Since acpi_processor_setup_cstates() is a more logical place for setting
idle state flags than acpi_processor_setup_cpuidle_cx(), move that code
from the latter to the former.

It also allows direct references to acpi_idle_driver in
acpi_processor_setup_cpuidle_cx() to be avoided.

No intentional functional impact.

Signed-off-by: Huisong Li <lihuisong@huawei.com>
[ rjw: Subject and changelog rewrite ]
Link: https://patch.msgid.link/20250929093754.3998136-5-lihuisong@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-10-22 16:47:40 +02:00
Tamir Duberstein
e6fdbe8fea rust: opp: fix broken rustdoc link
Correct the spelling of "CString" to make the link work.

Fixes: ce32e2d47c ("rust: opp: Add abstractions for the configuration options")
Signed-off-by: Tamir Duberstein <tamird@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-10-22 09:26:20 +05:30
Thorsten Blum
ace0471774 cpufreq: Replace deprecated strcpy() in cpufreq_unregister_governor()
strcpy() is deprecated; assign the NUL terminator directly instead.
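
Generically, the transformation looks like this (illustrative, not the
exact hunk in cpufreq_unregister_governor()):

    strcpy(name, "");   /* before: deprecated interface, copies only the NUL */
    name[0] = '\0';     /* after: assign the terminator directly */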

Link: https://github.com/KSPP/linux/issues/88
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
[ rjw: Subject tweaks ]
Link: https://patch.msgid.link/20251017153354.82009-2-thorsten.blum@linux.dev
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-10-20 21:25:36 +02:00
Rafael J. Wysocki
5313ec4a21 cpufreq: intel_pstate: Improve printing of debug messages
Some debug messages generated by intel_pstate on a given hybrid system
are only printed for some CPUs which is confusing, so modify the driver
to print them for all CPUs.  Also change those messages to avoid
printing local variable names in them.

Moreover, some debug messages printed by intel_pstate are quite hard
to understand without looking at the code printing them, so make them
somewhat clearer while at it.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/8609836.T7Z3S40VBb@rafael.j.wysocki
2025-10-20 21:23:36 +02:00
Rafael J. Wysocki
d852b6f67b cpufreq: intel_pstate: hybrid: Adjust energy model rules
Instead of using HWP-to-frequency scaling factors for computing cost
coefficients in the energy model used on hybrid systems, which is
fragile, rely on CPU type information that is easily accessible now and
the information on whether or not L3 cache is present for this purpose.

This also allows the cost coefficients for P-cores to be adjusted so
that they start to be populated somewhat earlier (that is, before
E-cores are loaded up to their full capacity).

In addition to the above, replace an inaccurate comment regarding the
reason why the freq value is added to the cost in hybrid_get_cost().

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Reviewed-by: Yaxiong Tian <tianyaxiong@kylinos.cn>
Link: https://patch.msgid.link/5932894.DvuYhMxLoT@rafael.j.wysocki
2025-10-20 21:22:21 +02:00
Rafael J. Wysocki
c17add7349 cpufreq: intel_pstate: Add and use hybrid_has_l3()
Introduce a function for checking whether or not a given CPU has L3
cache, called hybrid_has_l3(), and use it in hybrid_get_cost() for
computing cost coefficients associated with a given perf domain.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/13884343.uLZWGnKmhe@rafael.j.wysocki
2025-10-20 21:20:49 +02:00
Rafael J. Wysocki
528dde6619 cpufreq: intel_pstate: Add and use hybrid_get_cpu_type()
Introduce a function for identifying the type of a given CPU in a
hybrid system, called hybrid_get_cpu_type(), and use it for hybrid
scaling factor determination in hwp_get_cpu_scaling().

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/1954386.tdWV9SEqCh@rafael.j.wysocki
2025-10-20 21:20:49 +02:00
Zihuan Zhang
6db0f533d3 cpufreq: preserve freq_table_sorted across suspend/hibernate
During S3/S4 suspend and resume, cpufreq policies are not freed or
recreated; the freq_table and policy structure remain intact. However,
set_freq_table_sorted() currently resets policy->freq_table_sorted to
UNSORTED unconditionally, which is unnecessary since the table order
does not change across suspend/resume.

This patch adds a check to skip validation if policy->freq_table_sorted
is already ASCENDING or DESCENDING. This avoids unnecessary traversal
of the frequency table on S3/S4 resume or repeated online events,
reducing overhead while preserving correctness.
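A hedged sketch of the added check (constant names per include/linux/cpufreq.h,
surrounding code abridged):

	if (policy->freq_table_sorted == CPUFREQ_TABLE_SORTED_ASCENDING ||
	    policy->freq_table_sorted == CPUFREQ_TABLE_SORTED_DESCENDING)
		return 0;	/* table order already known, skip re-validation */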

Signed-off-by: Zihuan Zhang <zhangzihuan@kylinos.cn>
Link: https://patch.msgid.link/20251011072420.11495-1-zhangzihuan@kylinos.cn
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-10-20 21:01:35 +02:00
Rafael J. Wysocki
d3db87f89c PM: hibernate: Rework message printing in swsusp_save()
The messages printed by swsusp_save() are basically only useful for
debug, so printing them every time a hibernation image is created at
the "info" log level is not particularly useful.  Also printing a
message on a failing memory allocation is redundant.

Use pm_deferred_pr_dbg() for printing those messages so they will only
be printed when requested; the "deferred" variant is used because this
code runs in a deeply atomic context (one CPU with interrupts off, no
functional devices).  Also drop the useless message printed when a
memory allocation fails.

While at it, extend one of the messages in question so it is less
cryptic.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
[ rjw: Dropped a useless colon at the end of one of the messages ]
Link: https://patch.msgid.link/10750389.nUPlyArG6x@rafael.j.wysocki
2025-10-20 20:43:09 +02:00
Rafael J. Wysocki
32ece31db4 ACPI: PM: s2idle: Only retrieve constraints when needed
The evaluation of LPS0 _DSM Function 1 in lps0_device_attach() may be
useless if pm_debug_messages_on is never set.

For this reason, instead of evaluating it in lps0_device_attach(), do
that in a new .begin() callback for s2idle, acpi_s2idle_begin_lps0(),
only when pm_debug_messages_on is set at that point.

However, never attempt to evaluate LPS0 _DSM Function 1 more than once
to avoid recurring failures.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Link: https://patch.msgid.link/3027060.e9J7NaK4W3@rafael.j.wysocki
2025-10-20 20:39:33 +02:00
Rafael J. Wysocki
bfc09902de ACPI: PM: s2idle: Staticise LPS0 callback functions
The LPS0 callback functions in x86/s2idle.c can be made static, so do
that and remove their declarations from sleep.h.

While at it, add the _lps0 suffix to their names to indicate that
they are LPS0-specific.

No functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Link: https://patch.msgid.link/2254836.irdbgypaU6@rafael.j.wysocki
2025-10-20 20:39:33 +02:00
Rafael J. Wysocki
a00f3dea03 ACPI: PM: s2idle: Drop acpi_get_lps0_constraint()
Drop unused function acpi_get_lps0_constraint().

No functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Link: https://patch.msgid.link/5032801.GXAFRqVoOG@rafael.j.wysocki
2025-10-20 20:39:33 +02:00
Sergey Senozhatsky
a67818f745 PM: dpm_watchdog: add module param to backtrace all CPUs
Add a dpm_watchdog_all_cpu_backtrace module parameter which controls
whether a backtrace of all CPUs is dumped before the DPM watchdog
panics the system.

This is expected to help understand what might have caused the device
timeout.

Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Dhruva Gole <d-gole@ti.com>
Link: https://patch.msgid.link/20251007063551.3147937-1-senozhatsky@chromium.org
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-10-20 20:07:02 +02:00
Kaushlendra Kumar
5a151c2328 PM: sleep: Introduce CALL_PM_OP() macro to simplify code
Add CALL_PM_OP() macro to eliminate a repetitive code pattern in
power management generic operations.

Replace analogous driver PM callback invocation logic across all
pm_generic_*() functions with a single macro that handles the NULL
pointer checks and function calls.

This reduces code size while maintaining the same functionality and
improving code maintainability.
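A hedged sketch of the kind of macro involved (the macro body shown here is
assumed, not taken from the patch):

	#define CALL_PM_OP(dev, op)						\
		({								\
			const struct dev_pm_ops *pm = (dev)->driver ?		\
					(dev)->driver->pm : NULL;		\
			pm && pm->op ? pm->op(dev) : 0;				\
		})

	/* so that, for example, pm_generic_suspend() reduces to: */
	int pm_generic_suspend(struct device *dev)
	{
		return CALL_PM_OP(dev, suspend);
	}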

Signed-off-by: Kaushlendra Kumar <kaushlendra.kumar@intel.com>
Reviewed-by: Dhruva Gole <d-gole@ti.com>
Link: https://patch.msgid.link/20250919124437.3075016-1-kaushlendra.kumar@intel.com
[ rjw: Subject and changelog edits, adjust white space ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-10-20 19:54:25 +02:00
Malaya Kumar Rout
b57100a3d9 PM: console: Fix memory allocation error handling in pm_vt_switch_required()
The pm_vt_switch_required() function fails silently when memory
allocation fails, offering no indication to callers that the operation
was unsuccessful. This behavior prevents drivers from handling allocation
errors correctly or implementing retry mechanisms. By ensuring that
failures are reported back to the caller, drivers can make informed
decisions, improve robustness, and avoid unexpected behavior during
critical power management operations.

Change the function signature to return an integer error code and modify
the implementation to return -ENOMEM when kmalloc() fails. Update both
the function declaration and the inline stub in include/linux/pm.h to
maintain consistency across CONFIG_VT_CONSOLE_SLEEP configurations.

The function now returns:
 - 0 on success (including when updating existing entries)
 - -ENOMEM when memory allocation fails

This change improves error reporting without breaking existing callers,
as the current callers in drivers/video/fbdev/core/fbmem.c already
ignore the return value, making this a backward-compatible improvement.
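A hedged sketch of the new calling convention (the caller shown is purely
illustrative):

	/* previously: void pm_vt_switch_required(struct device *dev, bool required); */
	int ret = pm_vt_switch_required(dev, true);
	if (ret)	/* -ENOMEM if the tracking entry could not be allocated */
		dev_warn(dev, "VT switch requirement not recorded: %d\n", ret);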

Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Malaya Kumar Rout <mrout@redhat.com>
Reviewed-by: Dhruva Gole <d-gole@ti.com>
Link: https://patch.msgid.link/20251013193028.89570-1-mrout@redhat.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-10-18 14:38:23 +02:00
Kaushlendra Kumar
67434ce57c PM: sleep: Replace snprintf() with scnprintf() in show_trace_dev_match()
Replace snprintf() with scnprintf() in show_trace_dev_match() to simplify
buffer length handling. The scnprintf() function returns the number of
characters actually written (excluding the null terminator), which
eliminates the need for manual length checking and clamping.

This change removes the redundant size check since scnprintf() guarantees
that the return value will never exceed the buffer size, making the code
cleaner and less error-prone.
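A hedged sketch of the difference (buffer and format are illustrative):

	/* snprintf() returns the length that would have been written, so the
	 * result must be clamped before being used as a character count: */
	len = snprintf(buf, size, "%s\n", dev_name(dev));
	if (len >= size)
		len = size - 1;

	/* scnprintf() returns the number of characters actually stored
	 * (excluding the trailing NUL), so the clamp can simply be dropped: */
	len = scnprintf(buf, size, "%s\n", dev_name(dev));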

Signed-off-by: Kaushlendra Kumar <kaushlendra.kumar@intel.com>
Link: https://patch.msgid.link/20250922055231.3523680-1-kaushlendra.kumar@intel.com
[ rjw: Subject adjustment ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-10-13 21:19:12 +02:00
Marco Crivellari
c9ff363738 PM: WQ_UNBOUND added to pm_wq workqueue
Currently, if a user enqueues a work item using schedule_delayed_work(),
the workqueue used is "system_wq" (a per-CPU wq), while queue_delayed_work()
uses WORK_CPU_UNBOUND (used when a CPU is not specified). The same applies
to schedule_work(), which uses system_wq, and queue_work(), which again
uses WORK_CPU_UNBOUND.

This lack of consistency cannot be addressed without refactoring the API.

alloc_workqueue() treats all queues as per-CPU by default, while unbound
workqueues must opt-in via WQ_UNBOUND.

This default is suboptimal: most workloads benefit from unbound queues,
allowing the scheduler to place worker threads where they’re needed and
reducing noise when CPUs are isolated.

This change adds the WQ_UNBOUND flag to pm_wq to make it explicit that
this workqueue can be unbound and does not benefit from per-CPU work.

Once migration is complete, WQ_UNBOUND can be removed and unbound will
become the implicit default.
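A hedged sketch of the flag change (the exact flags used when pm_wq is
allocated are assumed):

	/* before: implicitly per-CPU */
	pm_wq = alloc_workqueue("pm", WQ_FREEZABLE, 0);

	/* after: explicitly unbound, letting the scheduler place the workers */
	pm_wq = alloc_workqueue("pm", WQ_FREEZABLE | WQ_UNBOUND, 0);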

Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-10-13 20:50:09 +02:00
268 changed files with 10347 additions and 5106 deletions

View File

@@ -454,3 +454,19 @@ Description:
disables it. Reads from the file return the current value.
The default is "1" if the build-time "SUSPEND_SKIP_SYNC" config
flag is unset, or "0" otherwise.
What: /sys/power/hibernate_compression_threads
Date: October 2025
Contact: <luoxueqin@kylinos.cn>
Description:
Controls the number of threads used for compression
and decompression of hibernation images.
The value can be adjusted at runtime to balance
performance and CPU utilization.
The change takes effect on the next hibernation or
resume operation.
Minimum value: 1
Default value: 3

View File

@@ -1907,6 +1907,16 @@
/sys/power/pm_test). Only available when CONFIG_PM_DEBUG
is set. Default value is 5.
hibernate_compression_threads=
[HIBERNATION]
Set the number of threads used for compressing or decompressing
hibernation images.
Format: <integer>
Default: 3
Minimum: 1
Example: hibernate_compression_threads=4
highmem=nn[KMG] [KNL,BOOT,EARLY] forces the highmem zone to have an exact
size of <nn>. This works even on boxes that have no
highmem otherwise. This also works to reduce highmem

View File

@@ -580,6 +580,15 @@ the given CPU as the upper limit for the exit latency of the idle states that
they are allowed to select for that CPU. They should never select any idle
states with exit latency beyond that limit.
While the above CPU QoS constraints apply to CPU idle time management, user
space may also request a CPU system wakeup latency QoS limit via the
`cpu_wakeup_latency` file. This QoS constraint is taken into account when
selecting a suitable idle state for the CPUs while entering the system-wide
suspend-to-idle sleep state, and it also applies to regular CPU idle time
management.
Note that, from the user space point of view, the `cpu_wakeup_latency` file is
handled in the same way as the `cpu_dma_latency` file, and the unit is also
microseconds.
Idle States Control Via Kernel Command Line
===========================================

View File

@@ -48,8 +48,9 @@ only way to pass early-configuration-time parameters to it is via the kernel
command line. However, its configuration can be adjusted via ``sysfs`` to a
great extent. In some configurations it even is possible to unregister it via
``sysfs`` which allows another ``CPUFreq`` scaling driver to be loaded and
registered (see `below <status_attr_>`_).
registered (see :ref:`below <status_attr>`).
.. _operation_modes:
Operation Modes
===============
@@ -62,6 +63,8 @@ a certain performance scaling algorithm. Which of them will be in effect
depends on what kernel command line options are used and on the capabilities of
the processor.
.. _active_mode:
Active Mode
-----------
@@ -94,6 +97,8 @@ Which of the P-state selection algorithms is used by default depends on the
Namely, if that option is set, the ``performance`` algorithm will be used by
default, and the other one will be used by default if it is not set.
.. _active_mode_hwp:
Active Mode With HWP
~~~~~~~~~~~~~~~~~~~~
@@ -123,7 +128,7 @@ Energy-Performance Bias (EPB) knob (otherwise), which means that the processor's
internal P-state selection logic is expected to focus entirely on performance.
This will override the EPP/EPB setting coming from the ``sysfs`` interface
(see `Energy vs Performance Hints`_ below). Moreover, any attempts to change
(see :ref:`energy_performance_hints` below). Moreover, any attempts to change
the EPP/EPB to a value different from 0 ("performance") via ``sysfs`` in this
configuration will be rejected.
@@ -192,6 +197,8 @@ This is the default P-state selection algorithm if the
:c:macro:`CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE` kernel configuration option
is not set.
.. _passive_mode:
Passive Mode
------------
@@ -289,12 +296,12 @@ Unlike ``_PSS`` objects in the ACPI tables, ``intel_pstate`` always exposes
the entire range of available P-states, including the whole turbo range, to the
``CPUFreq`` core and (in the passive mode) to generic scaling governors. This
generally causes turbo P-states to be set more often when ``intel_pstate`` is
used relative to ACPI-based CPU performance scaling (see `below <acpi-cpufreq_>`_
for more information).
used relative to ACPI-based CPU performance scaling (see
:ref:`below <acpi-cpufreq>` for more information).
Moreover, since ``intel_pstate`` always knows what the real turbo threshold is
(even if the Configurable TDP feature is enabled in the processor), its
``no_turbo`` attribute in ``sysfs`` (described `below <no_turbo_attr_>`_) should
``no_turbo`` attribute in ``sysfs`` (described :ref:`below <no_turbo_attr>`) should
work as expected in all cases (that is, if set to disable turbo P-states, it
always should prevent ``intel_pstate`` from using them).
@@ -307,12 +314,12 @@ pieces of information on it to be known, including:
* The minimum supported P-state.
* The maximum supported `non-turbo P-state <turbo_>`_.
* The maximum supported :ref:`non-turbo P-state <turbo>`.
* Whether or not turbo P-states are supported at all.
* The maximum supported `one-core turbo P-state <turbo_>`_ (if turbo P-states
are supported).
* The maximum supported :ref:`one-core turbo P-state <turbo>` (if turbo
P-states are supported).
* The scaling formula to translate the driver's internal representation
of P-states into frequencies and the other way around.
@@ -400,10 +407,10 @@ Energy-Aware Scheduling Support
If ``CONFIG_ENERGY_MODEL`` has been set during kernel configuration and
``intel_pstate`` runs on a hybrid processor without SMT, in addition to enabling
`CAS <CAS_>`_ it registers an Energy Model for the processor. This allows the
:ref:`CAS` it registers an Energy Model for the processor. This allows the
Energy-Aware Scheduling (EAS) support to be enabled in the CPU scheduler if
``schedutil`` is used as the ``CPUFreq`` governor which requires ``intel_pstate``
to operate in the `passive mode <Passive Mode_>`_.
to operate in the :ref:`passive mode <passive_mode>`.
The Energy Model registered by ``intel_pstate`` is artificial (that is, it is
based on abstract cost values and it does not include any real power numbers)
@@ -432,6 +439,8 @@ the ``energy_model`` directory in ``debugfs`` (typlically mounted on
User Space Interface in ``sysfs``
=================================
.. _global_attributes:
Global Attributes
-----------------
@@ -444,8 +453,8 @@ argument is passed to the kernel in the command line.
``max_perf_pct``
Maximum P-state the driver is allowed to set in percent of the
maximum supported performance level (the highest supported `turbo
P-state <turbo_>`_).
maximum supported performance level (the highest supported :ref:`turbo
P-state <turbo>`).
This attribute will not be exposed if the
``intel_pstate=per_cpu_perf_limits`` argument is present in the kernel
@@ -453,8 +462,8 @@ argument is passed to the kernel in the command line.
``min_perf_pct``
Minimum P-state the driver is allowed to set in percent of the
maximum supported performance level (the highest supported `turbo
P-state <turbo_>`_).
maximum supported performance level (the highest supported :ref:`turbo
P-state <turbo>`).
This attribute will not be exposed if the
``intel_pstate=per_cpu_perf_limits`` argument is present in the kernel
@@ -463,18 +472,18 @@ argument is passed to the kernel in the command line.
``num_pstates``
Number of P-states supported by the processor (between 0 and 255
inclusive) including both turbo and non-turbo P-states (see
`Turbo P-states Support`_).
:ref:`turbo`).
This attribute is present only if the value exposed by it is the same
for all of the CPUs in the system.
The value of this attribute is not affected by the ``no_turbo``
setting described `below <no_turbo_attr_>`_.
setting described :ref:`below <no_turbo_attr>`.
This attribute is read-only.
``turbo_pct``
Ratio of the `turbo range <turbo_>`_ size to the size of the entire
Ratio of the :ref:`turbo range <turbo>` size to the size of the entire
range of supported P-states, in percent.
This attribute is present only if the value exposed by it is the same
@@ -486,7 +495,7 @@ argument is passed to the kernel in the command line.
``no_turbo``
If set (equal to 1), the driver is not allowed to set any turbo P-states
(see `Turbo P-states Support`_). If unset (equal to 0, which is the
(see :ref:`turbo`). If unset (equal to 0, which is the
default), turbo P-states can be set by the driver.
[Note that ``intel_pstate`` does not support the general ``boost``
attribute (supported by some other scaling drivers) which is replaced
@@ -495,11 +504,11 @@ argument is passed to the kernel in the command line.
This attribute does not affect the maximum supported frequency value
supplied to the ``CPUFreq`` core and exposed via the policy interface,
but it affects the maximum possible value of per-policy P-state limits
(see `Interpretation of Policy Attributes`_ below for details).
(see :ref:`policy_attributes_interpretation` below for details).
``hwp_dynamic_boost``
This attribute is only present if ``intel_pstate`` works in the
`active mode with the HWP feature enabled <Active Mode With HWP_>`_ in
:ref:`active mode with the HWP feature enabled <active_mode_hwp>` in
the processor. If set (equal to 1), it causes the minimum P-state limit
to be increased dynamically for a short time whenever a task previously
waiting on I/O is selected to run on a given logical CPU (the purpose
@@ -514,12 +523,12 @@ argument is passed to the kernel in the command line.
Operation mode of the driver: "active", "passive" or "off".
"active"
The driver is functional and in the `active mode
<Active Mode_>`_.
The driver is functional and in the :ref:`active mode
<active_mode>`.
"passive"
The driver is functional and in the `passive mode
<Passive Mode_>`_.
The driver is functional and in the :ref:`passive mode
<passive_mode>`.
"off"
The driver is not functional (it is not registered as a scaling
@@ -547,13 +556,15 @@ argument is passed to the kernel in the command line.
attribute to "1" enables the energy-efficiency optimizations and setting
to "0" disables them.
.. _policy_attributes_interpretation:
Interpretation of Policy Attributes
-----------------------------------
The interpretation of some ``CPUFreq`` policy attributes described in
Documentation/admin-guide/pm/cpufreq.rst is special with ``intel_pstate``
as the current scaling driver and it generally depends on the driver's
`operation mode <Operation Modes_>`_.
:ref:`operation mode <operation_modes>`.
First of all, the values of the ``cpuinfo_max_freq``, ``cpuinfo_min_freq`` and
``scaling_cur_freq`` attributes are produced by applying a processor-specific
@@ -562,9 +573,10 @@ Also, the values of the ``scaling_max_freq`` and ``scaling_min_freq``
attributes are capped by the frequency corresponding to the maximum P-state that
the driver is allowed to set.
If the ``no_turbo`` `global attribute <no_turbo_attr_>`_ is set, the driver is
not allowed to use turbo P-states, so the maximum value of ``scaling_max_freq``
and ``scaling_min_freq`` is limited to the maximum non-turbo P-state frequency.
If the ``no_turbo`` :ref:`global attribute <no_turbo_attr>` is set, the driver
is not allowed to use turbo P-states, so the maximum value of
``scaling_max_freq`` and ``scaling_min_freq`` is limited to the maximum
non-turbo P-state frequency.
Accordingly, setting ``no_turbo`` causes ``scaling_max_freq`` and
``scaling_min_freq`` to go down to that value if they were above it before.
However, the old values of ``scaling_max_freq`` and ``scaling_min_freq`` will be
@@ -576,7 +588,7 @@ and ``scaling_min_freq`` corresponds to the maximum supported turbo P-state,
which also is the value of ``cpuinfo_max_freq`` in either case.
Next, the following policy attributes have special meaning if
``intel_pstate`` works in the `active mode <Active Mode_>`_:
``intel_pstate`` works in the :ref:`active mode <active_mode>`:
``scaling_available_governors``
List of P-state selection algorithms provided by ``intel_pstate``.
@@ -597,20 +609,22 @@ processor:
Shows the base frequency of the CPU. Any frequency above this will be
in the turbo frequency range.
The meaning of these attributes in the `passive mode <Passive Mode_>`_ is the
The meaning of these attributes in the :ref:`passive mode <passive_mode>` is the
same as for other scaling drivers.
Additionally, the value of the ``scaling_driver`` attribute for ``intel_pstate``
depends on the operation mode of the driver. Namely, it is either
"intel_pstate" (in the `active mode <Active Mode_>`_) or "intel_cpufreq" (in the
`passive mode <Passive Mode_>`_).
"intel_pstate" (in the :ref:`active mode <active_mode>`) or "intel_cpufreq"
(in the :ref:`passive mode <passive_mode>`).
.. _pstate_limits_coordination:
Coordination of P-State Limits
------------------------------
``intel_pstate`` allows P-state limits to be set in two ways: with the help of
the ``max_perf_pct`` and ``min_perf_pct`` `global attributes
<Global Attributes_>`_ or via the ``scaling_max_freq`` and ``scaling_min_freq``
the ``max_perf_pct`` and ``min_perf_pct`` :ref:`global attributes
<global_attributes>` or via the ``scaling_max_freq`` and ``scaling_min_freq``
``CPUFreq`` policy attributes. The coordination between those limits is based
on the following rules, regardless of the current operation mode of the driver:
@@ -632,17 +646,18 @@ on the following rules, regardless of the current operation mode of the driver:
3. The global and per-policy limits can be set independently.
In the `active mode with the HWP feature enabled <Active Mode With HWP_>`_, the
In the :ref:`active mode with the HWP feature enabled <active_mode_hwp>`, the
resulting effective values are written into hardware registers whenever the
limits change in order to request its internal P-state selection logic to always
set P-states within these limits. Otherwise, the limits are taken into account
by scaling governors (in the `passive mode <Passive Mode_>`_) and by the driver
every time before setting a new P-state for a CPU.
by scaling governors (in the :ref:`passive mode <passive_mode>`) and by the
driver every time before setting a new P-state for a CPU.
Additionally, if the ``intel_pstate=per_cpu_perf_limits`` command line argument
is passed to the kernel, ``max_perf_pct`` and ``min_perf_pct`` are not exposed
at all and the only way to set the limits is by using the policy attributes.
.. _energy_performance_hints:
Energy vs Performance Hints
---------------------------
@@ -702,9 +717,9 @@ output.
On those systems each ``_PSS`` object returns a list of P-states supported by
the corresponding CPU which basically is a subset of the P-states range that can
be used by ``intel_pstate`` on the same system, with one exception: the whole
`turbo range <turbo_>`_ is represented by one item in it (the topmost one). By
convention, the frequency returned by ``_PSS`` for that item is greater by 1 MHz
than the frequency of the highest non-turbo P-state listed by it, but the
:ref:`turbo range <turbo>` is represented by one item in it (the topmost one).
By convention, the frequency returned by ``_PSS`` for that item is greater by
1 MHz than the frequency of the highest non-turbo P-state listed by it, but the
corresponding P-state representation (following the hardware specification)
returned for it matches the maximum supported turbo P-state (or is the
special value 255 meaning essentially "go as high as you can get").
@@ -730,18 +745,18 @@ benefit from running at turbo frequencies will be given non-turbo P-states
instead.
One more issue related to that may appear on systems supporting the
`Configurable TDP feature <turbo_>`_ allowing the platform firmware to set the
turbo threshold. Namely, if that is not coordinated with the lists of P-states
returned by ``_PSS`` properly, there may be more than one item corresponding to
a turbo P-state in those lists and there may be a problem with avoiding the
turbo range (if desirable or necessary). Usually, to avoid using turbo
P-states overall, ``acpi-cpufreq`` simply avoids using the topmost state listed
by ``_PSS``, but that is not sufficient when there are other turbo P-states in
the list returned by it.
:ref:`Configurable TDP feature <turbo>` allowing the platform firmware to set
the turbo threshold. Namely, if that is not coordinated with the lists of
P-states returned by ``_PSS`` properly, there may be more than one item
corresponding to a turbo P-state in those lists and there may be a problem with
avoiding the turbo range (if desirable or necessary). Usually, to avoid using
turbo P-states overall, ``acpi-cpufreq`` simply avoids using the topmost state
listed by ``_PSS``, but that is not sufficient when there are other turbo
P-states in the list returned by it.
Apart from the above, ``acpi-cpufreq`` works like ``intel_pstate`` in the
`passive mode <Passive Mode_>`_, except that the number of P-states it can set
is limited to the ones listed by the ACPI ``_PSS`` objects.
:ref:`passive mode <passive_mode>`, except that the number of P-states it can
set is limited to the ones listed by the ACPI ``_PSS`` objects.
Kernel Command Line Options for ``intel_pstate``
@@ -756,11 +771,11 @@ of them have to be prepended with the ``intel_pstate=`` prefix.
processor is supported by it.
``active``
Register ``intel_pstate`` in the `active mode <Active Mode_>`_ to start
with.
Register ``intel_pstate`` in the :ref:`active mode <active_mode>` to
start with.
``passive``
Register ``intel_pstate`` in the `passive mode <Passive Mode_>`_ to
Register ``intel_pstate`` in the :ref:`passive mode <passive_mode>` to
start with.
``force``
@@ -793,12 +808,12 @@ of them have to be prepended with the ``intel_pstate=`` prefix.
and this option has no effect.
``per_cpu_perf_limits``
Use per-logical-CPU P-State limits (see `Coordination of P-state
Limits`_ for details).
Use per-logical-CPU P-State limits (see
:ref:`pstate_limits_coordination` for details).
``no_cas``
Do not enable `capacity-aware scheduling <CAS_>`_ which is enabled by
default on hybrid systems without SMT.
Do not enable :ref:`capacity-aware scheduling <CAS>` which is enabled
by default on hybrid systems without SMT.
Diagnostics and Tuning
======================
@@ -810,7 +825,7 @@ There are two static trace events that can be used for ``intel_pstate``
diagnostics. One of them is the ``cpu_frequency`` trace event generally used
by ``CPUFreq``, and the other one is the ``pstate_sample`` trace event specific
to ``intel_pstate``. Both of them are triggered by ``intel_pstate`` only if
it works in the `active mode <Active Mode_>`_.
it works in the :ref:`active mode <active_mode>`.
The following sequence of shell commands can be used to enable them and see
their output (if the kernel is generally configured to support event tracing)::
@@ -822,7 +837,7 @@ their output (if the kernel is generally configured to support event tracing)::
gnome-terminal--4510 [001] ..s. 1177.680733: pstate_sample: core_busy=107 scaled=94 from=26 to=26 mperf=1143818 aperf=1230607 tsc=29838618 freq=2474476
cat-5235 [002] ..s. 1177.681723: cpu_frequency: state=2900000 cpu_id=2
If ``intel_pstate`` works in the `passive mode <Passive Mode_>`_, the
If ``intel_pstate`` works in the :ref:`passive mode <passive_mode>`, the
``cpu_frequency`` trace event will be triggered either by the ``schedutil``
scaling governor (for the policies it is attached to), or by the ``CPUFreq``
core (for the policies with other scaling governors).

View File

@@ -6,3 +6,4 @@ Thermal Subsystem
:maxdepth: 1
intel_powerclamp
intel_thermal_throttle

View File

@@ -0,0 +1,91 @@
.. SPDX-License-Identifier: GPL-2.0
.. include:: <isonum.txt>
=======================================
Intel thermal throttle events reporting
=======================================
:Author: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Introduction
------------
Intel processors have built in automatic and adaptive thermal monitoring
mechanisms that force the processor to reduce its power consumption in order
to operate within predetermined temperature limits.
Refer to section "THERMAL MONITORING AND PROTECTION" in the "Intel® 64 and
IA-32 Architectures Software Developers Manual Volume 3 (3A, 3B, 3C, & 3D):
System Programming Guide" for more details.
In general, there are two mechanisms to control the core temperature of the
processor. They are called "Thermal Monitor 1 (TM1) and Thermal Monitor 2 (TM2)".
The status of the temperature sensor that triggers the thermal monitor (TM1/TM2)
is indicated through the "thermal status flag" and "thermal status log flag" in
MSR_IA32_THERM_STATUS for core level and MSR_IA32_PACKAGE_THERM_STATUS for
package level.
Thermal Status flag, bit 0 — When set, indicates that the processor core
temperature is currently at the trip temperature of the thermal monitor and that
the processor power consumption is being reduced via either TM1 or TM2, depending
on which is enabled. When clear, the flag indicates that the core temperature is
below the thermal monitor trip temperature. This flag is read only.
Thermal Status Log flag, bit 1 — When set, indicates that the thermal sensor has
tripped since the last power-up or reset or since the last time that software
cleared this flag. This flag is a sticky bit; once set it remains set until
cleared by software or until a power-up or reset of the processor. The default
state is clear.
It is possible that when a user reads MSR_IA32_THERM_STATUS or
MSR_IA32_PACKAGE_THERM_STATUS, TM1/TM2 is not active. In this case, the
"Thermal Status flag" will read "0" and the "Thermal Status Log flag" will be
set, indicating a previous TM1/TM2 activation. But since the log flag needs to
be cleared by software, it cannot show the number of TM1/TM2 activations.
Hence, Linux provides counters of how many times the "Thermal Status flag" was
set, and also reports how long the flag was active in milliseconds.
Using these counters, users can check whether performance was limited because
of thermal events. It is recommended to read from sysfs instead of reading the
MSRs directly, as the "Thermal Status Log flag" is reset by the driver to
implement rate control.
Sysfs Interface
---------------
Thermal throttling events are presented for each CPU under
"/sys/devices/system/cpu/cpuX/thermal_throttle/", where "X" is the CPU number.
All these counters are read-only and cannot be reset to 0, so they can
potentially overflow after reaching the maximum 64-bit unsigned integer value.
``core_throttle_count``
Shows the number of times the "Thermal Status flag" changed from 0 to 1 for
this CPU since the OS booted and the thermal vector was initialized. This is a
64-bit counter.
``package_throttle_count``
Shows the number of times the "Thermal Status flag" changed from 0 to 1 for the
package containing this CPU since the OS booted and the thermal vector was
initialized. Package status is broadcast to all CPUs; all CPUs in the package
increment this count. This is a 64-bit counter.
``core_throttle_max_time_ms``
Shows the maximum amount of time for which the "Thermal Status flag" has been
set to 1 for this CPU at the core level since the OS booted and the thermal
vector was initialized.
``package_throttle_max_time_ms``
Shows the maximum amount of time for which the "Thermal Status flag" has been
set to 1 for the package containing this CPU since the OS booted and the
thermal vector was initialized.
``core_throttle_total_time_ms``
Shows the cumulative time for which the "Thermal Status flag" has been set to 1
for this CPU at the core level since the OS booted and the thermal vector was
initialized.
``package_throttle_total_time_ms``
Shows the cumulative time for which the "Thermal Status flag" has been set to 1
for the package containing this CPU since the OS booted and the thermal vector
was initialized.

View File

@@ -27,3 +27,4 @@ for cryptographic use cases, as well as programming examples.
descore-readme
device_drivers/index
krb5
sha3

View File

@@ -0,0 +1,130 @@
.. SPDX-License-Identifier: GPL-2.0-or-later
==========================
SHA-3 Algorithm Collection
==========================
.. contents::
Overview
========
The SHA-3 family of algorithms, as specified in NIST FIPS-202 [1]_, contains six
algorithms based on the Keccak sponge function. The differences between them
are: the "rate" (how much of the state buffer gets updated with new data
between invocations of the Keccak function, analogous to the "block size"),
what domain separation suffix gets appended to the input data, and how much
output data is extracted at the end. The Keccak sponge function is designed
such that, for certain algorithms, arbitrary amounts of output can be obtained.
Four digest algorithms are provided:
- SHA3-224
- SHA3-256
- SHA3-384
- SHA3-512
Additionally, two Extendable-Output Functions (XOFs) are provided:
- SHAKE128
- SHAKE256
The SHA-3 library API supports all six of these algorithms. The four digest
algorithms are also supported by the crypto_shash and crypto_ahash APIs.
This document describes the SHA-3 library API.
Digests
=======
The following functions compute SHA-3 digests::
void sha3_224(const u8 *in, size_t in_len, u8 out[SHA3_224_DIGEST_SIZE]);
void sha3_256(const u8 *in, size_t in_len, u8 out[SHA3_256_DIGEST_SIZE]);
void sha3_384(const u8 *in, size_t in_len, u8 out[SHA3_384_DIGEST_SIZE]);
void sha3_512(const u8 *in, size_t in_len, u8 out[SHA3_512_DIGEST_SIZE]);
For users that need to pass in data incrementally, an incremental API is also
provided. The incremental API uses the following struct::
struct sha3_ctx { ... };
Initialization is done with one of::
void sha3_224_init(struct sha3_ctx *ctx);
void sha3_256_init(struct sha3_ctx *ctx);
void sha3_384_init(struct sha3_ctx *ctx);
void sha3_512_init(struct sha3_ctx *ctx);
Input data is then added with any number of calls to::
void sha3_update(struct sha3_ctx *ctx, const u8 *in, size_t in_len);
Finally, the digest is generated using::
void sha3_final(struct sha3_ctx *ctx, u8 *out);
which also zeroizes the context. The length of the digest is determined by the
initialization function that was called.
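For example, a caller hashing a message that arrives in two pieces could use
the incremental API as follows (a minimal sketch; buffer names are
illustrative)::

    struct sha3_ctx ctx;
    u8 digest[SHA3_256_DIGEST_SIZE];

    sha3_256_init(&ctx);
    sha3_update(&ctx, part1, part1_len);
    sha3_update(&ctx, part2, part2_len);
    sha3_final(&ctx, digest);    /* 32-byte digest; ctx is zeroized */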
Extendable-Output Functions
===========================
The following functions compute the SHA-3 extendable-output functions (XOFs)::
void shake128(const u8 *in, size_t in_len, u8 *out, size_t out_len);
void shake256(const u8 *in, size_t in_len, u8 *out, size_t out_len);
For users that need to provide the input data incrementally and/or receive the
output data incrementally, an incremental API is also provided. The incremental
API uses the following struct::
struct shake_ctx { ... };
Initialization is done with one of::
void shake128_init(struct shake_ctx *ctx);
void shake256_init(struct shake_ctx *ctx);
Input data is then added with any number of calls to::
void shake_update(struct shake_ctx *ctx, const u8 *in, size_t in_len);
Finally, the output data is extracted with any number of calls to::
void shake_squeeze(struct shake_ctx *ctx, u8 *out, size_t out_len);
passing it the amount of output to be extracted in each call. Note that
performing multiple squeezes, with the output laid out consecutively in a
buffer, produces exactly the same output as a single squeeze of the combined
length into the same buffer.
More input data cannot be added after squeezing has started.
Once all the desired output has been extracted, zeroize the context::
void shake_zeroize_ctx(struct shake_ctx *ctx);
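For example, 64 bytes of SHAKE128 output can be extracted in two 32-byte
squeezes (a minimal sketch; buffer names are illustrative)::

    struct shake_ctx ctx;
    u8 out[64];

    shake128_init(&ctx);
    shake_update(&ctx, msg, msg_len);
    shake_squeeze(&ctx, out, 32);
    shake_squeeze(&ctx, out + 32, 32);  /* same bytes as one 64-byte squeeze */
    shake_zeroize_ctx(&ctx);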
Testing
=======
To test the SHA-3 code, use sha3_kunit (CONFIG_CRYPTO_LIB_SHA3_KUNIT_TEST).
Since the SHA-3 algorithms are FIPS-approved, when the kernel is booted in FIPS
mode the SHA-3 library also performs a simple self-test. This is purely to meet
a FIPS requirement. Normal testing done by kernel developers and integrators
should use the much more comprehensive KUnit test suite instead.
References
==========
.. [1] https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.202.pdf
API Function Reference
======================
.. kernel-doc:: include/crypto/sha3.h

View File

@@ -0,0 +1,87 @@
# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
%YAML 1.2
---
$id: http://devicetree.org/schemas/thermal/fsl,imx91-tmu.yaml#
$schema: http://devicetree.org/meta-schemas/core.yaml#
title: NXP i.MX91 Thermal
maintainers:
- Pengfei Li <pengfei.li_1@nxp.com>
description:
i.MX91 features a new temperature sensor. It includes programmable
temperature threshold comparators for both normal and privileged
accesses and allows a programmable measurement frequency for the
Periodic One-Shot Measurement mode. Additionally, it provides
status registers for indicating the end of measurement and threshold
violation events.
properties:
compatible:
items:
- const: fsl,imx91-tmu
reg:
maxItems: 1
clocks:
maxItems: 1
interrupts:
items:
- description: Comparator 1 irq
- description: Comparator 2 irq
- description: Data ready irq
interrupt-names:
items:
- const: thr1
- const: thr2
- const: ready
nvmem-cells:
items:
- description: Phandle to the trim control 1 provided by ocotp
- description: Phandle to the trim control 2 provided by ocotp
nvmem-cell-names:
items:
- const: trim1
- const: trim2
"#thermal-sensor-cells":
const: 0
required:
- compatible
- reg
- clocks
- interrupts
- interrupt-names
allOf:
- $ref: thermal-sensor.yaml
unevaluatedProperties: false
examples:
- |
#include <dt-bindings/interrupt-controller/arm-gic.h>
#include <dt-bindings/clock/imx93-clock.h>
thermal-sensor@44482000 {
compatible = "fsl,imx91-tmu";
reg = <0x44482000 0x1000>;
#thermal-sensor-cells = <0>;
clocks = <&clk IMX93_CLK_TMC_GATE>;
interrupt-parent = <&gic>;
interrupts = <GIC_SPI 83 IRQ_TYPE_LEVEL_HIGH>,
<GIC_SPI 84 IRQ_TYPE_LEVEL_HIGH>,
<GIC_SPI 85 IRQ_TYPE_LEVEL_HIGH>;
interrupt-names = "thr1", "thr2", "ready";
nvmem-cells = <&tmu_trim1>, <&tmu_trim2>;
nvmem-cell-names = "trim1", "trim2";
};
...

View File

@@ -36,10 +36,15 @@ properties:
- qcom,msm8974-tsens
- const: qcom,tsens-v0_1
- description:
v1 of TSENS without RPM which requires to be explicitly reset
and enabled in the driver.
enum:
- qcom,ipq5018-tsens
- description: v1 of TSENS
items:
- enum:
- qcom,ipq5018-tsens
- qcom,msm8937-tsens
- qcom,msm8956-tsens
- qcom,msm8976-tsens
@@ -50,11 +55,13 @@ properties:
items:
- enum:
- qcom,glymur-tsens
- qcom,kaanapali-tsens
- qcom,milos-tsens
- qcom,msm8953-tsens
- qcom,msm8996-tsens
- qcom,msm8998-tsens
- qcom,qcm2290-tsens
- qcom,qcs8300-tsens
- qcom,qcs615-tsens
- qcom,sa8255p-tsens
- qcom,sa8775p-tsens

View File

@@ -16,7 +16,11 @@ description:
properties:
compatible:
const: renesas,r9a09g047-tsu
oneOf:
- const: renesas,r9a09g047-tsu # RZ/G3E
- items:
- const: renesas,r9a09g057-tsu # RZ/V2H
- const: renesas,r9a09g047-tsu # RZ/G3E
reg:
maxItems: 1

View File

@@ -409,3 +409,26 @@ based on the processor generation.
Limit 1 from being exhausted.
4 Unknown: Can't classify.
On processors starting from Panther Lake, additional hints are provided.
The hardware analyzes workload residencies over an extended period to
determine whether the workload classification tends toward idle/battery
life states or sustained/performance states. Based on this long-term
analysis, it classifies:
Power Classification: If the workload exhibits more idle or battery life
residencies, it is classified as "power".
Performance Classification: If the workload exhibits more sustained or
performance residencies, it is classified as "performance".
This approach enables applications to ignore short-term workload
fluctuations and instead respond to longer-term power vs. performance
trends.
Residency thresholds for this classification are CPU generation-specific.
Classification is reported via bit 4 of the workload_type_index:
Bit 4 = 1: Power classification
Bit 4 = 0: Performance classification

View File

@@ -450,9 +450,7 @@ API, but the filenames mode still does.
- CONFIG_CRYPTO_HCTR2
- Recommended:
- arm64: CONFIG_CRYPTO_AES_ARM64_CE_BLK
- arm64: CONFIG_CRYPTO_POLYVAL_ARM64_CE
- x86: CONFIG_CRYPTO_AES_NI_INTEL
- x86: CONFIG_CRYPTO_POLYVAL_CLMUL_NI
- Adiantum
- Mandatory:

View File

@@ -0,0 +1,113 @@
# SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)
name: em
doc: |
Energy model netlink interface to notify its changes.
protocol: genetlink
uapi-header: linux/energy_model.h
attribute-sets:
-
name: pds
attributes:
-
name: pd
type: nest
nested-attributes: pd
multi-attr: true
-
name: pd
attributes:
-
name: pad
type: pad
-
name: pd-id
type: u32
-
name: flags
type: u64
-
name: cpus
type: string
-
name: pd-table
attributes:
-
name: pd-id
type: u32
-
name: ps
type: nest
nested-attributes: ps
multi-attr: true
-
name: ps
attributes:
-
name: pad
type: pad
-
name: performance
type: u64
-
name: frequency
type: u64
-
name: power
type: u64
-
name: cost
type: u64
-
name: flags
type: u64
operations:
list:
-
name: get-pds
attribute-set: pds
doc: Get the list of information for all performance domains.
do:
reply:
attributes:
- pd
-
name: get-pd-table
attribute-set: pd-table
doc: Get the energy model table of a performance domain.
do:
request:
attributes:
- pd-id
reply:
attributes:
- pd-id
- ps
-
name: pd-created
doc: A performance domain is created.
notify: get-pd-table
mcgrp: event
-
name: pd-updated
doc: A performance domain is updated.
notify: get-pd-table
mcgrp: event
-
name: pd-deleted
doc: A performance domain is deleted.
attribute-set: pd-table
event:
attributes:
- pd-id
mcgrp: event
mcast-groups:
list:
-
name: event

View File

@@ -19,6 +19,7 @@ Power Management
power_supply_class
runtime_pm
s2ram
shutdown-debugging
suspend-and-cpuhotplug
suspend-and-interrupts
swsusp-and-swap-files

View File

@@ -55,7 +55,8 @@ int cpu_latency_qos_request_active(handle):
From user space:
The infrastructure exposes one device node, /dev/cpu_dma_latency, for the CPU
The infrastructure exposes two separate device nodes, /dev/cpu_dma_latency for
the CPU latency QoS and /dev/cpu_wakeup_latency for the CPU system wakeup
latency QoS.
Only processes can register a PM QoS request. To provide for automatic
@@ -63,15 +64,15 @@ cleanup of a process, the interface requires the process to register its
parameter requests as follows.
To register the default PM QoS target for the CPU latency QoS, the process must
open /dev/cpu_dma_latency.
open /dev/cpu_dma_latency. To register a CPU system wakeup QoS limit, the
process must open /dev/cpu_wakeup_latency.
As long as the device node is held open that process has a registered
request on the parameter.
To change the requested target value, the process needs to write an s32 value to
the open device node. Alternatively, it can write a hex string for the value
using the 10 char long format e.g. "0x12345678". This translates to a
cpu_latency_qos_update_request() call.
using the 10 char long format e.g. "0x12345678".
To remove the user mode request for a target value simply close the device
node.
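For example, a process can request a 20 microsecond CPU latency limit and keep
it in effect for as long as it holds the device node open (a minimal user-space
sketch)::

    #include <fcntl.h>
    #include <stdint.h>
    #include <unistd.h>

    int main(void)
    {
            int32_t latency_us = 20;
            int fd = open("/dev/cpu_dma_latency", O_WRONLY);

            if (fd < 0)
                    return 1;
            if (write(fd, &latency_us, sizeof(latency_us)) != sizeof(latency_us)) {
                    close(fd);
                    return 1;
            }
            pause();        /* the request stays active until fd is closed */
            close(fd);
            return 0;
    }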

View File

@@ -480,16 +480,6 @@ drivers/base/power/runtime.c and include/linux/pm_runtime.h:
`bool pm_runtime_status_suspended(struct device *dev);`
- return true if the device's runtime PM status is 'suspended'
`void pm_runtime_allow(struct device *dev);`
- set the power.runtime_auto flag for the device and decrease its usage
counter (used by the /sys/devices/.../power/control interface to
effectively allow the device to be power managed at run time)
`void pm_runtime_forbid(struct device *dev);`
- unset the power.runtime_auto flag for the device and increase its usage
counter (used by the /sys/devices/.../power/control interface to
effectively prevent the device from being power managed at run time)
`void pm_runtime_no_callbacks(struct device *dev);`
- set the power.no_callbacks flag for the device and remove the runtime
PM attributes from /sys/devices/.../power (or prevent them from being

View File

@@ -0,0 +1,53 @@
.. SPDX-License-Identifier: GPL-2.0
Debugging Kernel Shutdown Hangs with pstore
+++++++++++++++++++++++++++++++++++++++++++
Overview
========
If the system hangs while shutting down, the kernel logs may need to be
retrieved to debug the issue.
On systems that have a UART available, it is best to configure the kernel to use
this UART for kernel console output.
If a UART isn't available, the ``pstore`` subsystem provides a mechanism to
persist this data across a system reset, allowing it to be retrieved on the next
boot.
Kernel Configuration
====================
To enable ``pstore`` and enable saving kernel ring buffer logs, set the
following kernel configuration options:
* ``CONFIG_PSTORE=y``
* ``CONFIG_PSTORE_CONSOLE=y``
Additionally, enable a backend to store the data. Depending upon your platform,
some potential options include:
* ``CONFIG_EFI_VARS_PSTORE=y``
* ``CONFIG_PSTORE_RAM=y``
* ``CONFIG_CHROMEOS_PSTORE=y``
* ``CONFIG_PSTORE_BLK=y``
Kernel Command-line Parameters
==============================
Add these parameters to your kernel command line:
* ``printk.always_kmsg_dump=Y``
* Forces the kernel to dump the entire message buffer to pstore during
shutdown
* ``efi_pstore.pstore_disable=N``
* For EFI-based systems, ensures the EFI backend is active
Userspace Interaction and Log Retrieval
=======================================
On the next boot after a hang, pstore logs will be available in the pstore
filesystem (``/sys/fs/pstore``) and can be retrieved by userspace.
On systemd systems, the ``systemd-pstore`` service will help do the following:
#. Locate pstore data in ``/sys/fs/pstore``
#. Read and save it to ``/var/lib/systemd/pstore``
#. Clear pstore data for the next event

View File

@@ -9188,6 +9188,9 @@ S: Maintained
F: kernel/power/energy_model.c
F: include/linux/energy_model.h
F: Documentation/power/energy-model.rst
F: Documentation/netlink/specs/em.yaml
F: include/uapi/linux/energy_model.h
F: kernel/power/em_netlink*.*
EPAPR HYPERVISOR BYTE CHANNEL DEVICE DRIVER
M: Laurentiu Tudor <laurentiu.tudor@nxp.com>

View File

@@ -33,22 +33,6 @@ config CRYPTO_NHPOLY1305_NEON
Architecture: arm using:
- NEON (Advanced SIMD) extensions
config CRYPTO_BLAKE2B_NEON
tristate "Hash functions: BLAKE2b (NEON)"
depends on KERNEL_MODE_NEON
select CRYPTO_BLAKE2B
help
BLAKE2b cryptographic hash function (RFC 7693)
Architecture: arm using
- NEON (Advanced SIMD) extensions
BLAKE2b digest algorithm optimized with ARM NEON instructions.
On ARM processors that have NEON support but not the ARMv8
Crypto Extensions, typically this BLAKE2b implementation is
much faster than the SHA-2 family and slightly faster than
SHA-1.
config CRYPTO_AES_ARM
tristate "Ciphers: AES"
select CRYPTO_ALGAPI

View File

@@ -5,7 +5,6 @@
obj-$(CONFIG_CRYPTO_AES_ARM) += aes-arm.o
obj-$(CONFIG_CRYPTO_AES_ARM_BS) += aes-arm-bs.o
obj-$(CONFIG_CRYPTO_BLAKE2B_NEON) += blake2b-neon.o
obj-$(CONFIG_CRYPTO_NHPOLY1305_NEON) += nhpoly1305-neon.o
obj-$(CONFIG_CRYPTO_AES_ARM_CE) += aes-arm-ce.o
@@ -13,7 +12,6 @@ obj-$(CONFIG_CRYPTO_GHASH_ARM_CE) += ghash-arm-ce.o
aes-arm-y := aes-cipher-core.o aes-cipher-glue.o
aes-arm-bs-y := aes-neonbs-core.o aes-neonbs-glue.o
blake2b-neon-y := blake2b-neon-core.o blake2b-neon-glue.o
aes-arm-ce-y := aes-ce-core.o aes-ce-glue.o
ghash-arm-ce-y := ghash-ce-core.o ghash-ce-glue.o
nhpoly1305-neon-y := nh-neon-core.o nhpoly1305-neon-glue.o

View File

@@ -1,104 +0,0 @@
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* BLAKE2b digest algorithm, NEON accelerated
*
* Copyright 2020 Google LLC
*/
#include <crypto/internal/blake2b.h>
#include <crypto/internal/hash.h>
#include <linux/module.h>
#include <linux/sizes.h>
#include <asm/neon.h>
#include <asm/simd.h>
asmlinkage void blake2b_compress_neon(struct blake2b_state *state,
const u8 *block, size_t nblocks, u32 inc);
static void blake2b_compress_arch(struct blake2b_state *state,
const u8 *block, size_t nblocks, u32 inc)
{
do {
const size_t blocks = min_t(size_t, nblocks,
SZ_4K / BLAKE2B_BLOCK_SIZE);
kernel_neon_begin();
blake2b_compress_neon(state, block, blocks, inc);
kernel_neon_end();
nblocks -= blocks;
block += blocks * BLAKE2B_BLOCK_SIZE;
} while (nblocks);
}
static int crypto_blake2b_update_neon(struct shash_desc *desc,
const u8 *in, unsigned int inlen)
{
return crypto_blake2b_update_bo(desc, in, inlen, blake2b_compress_arch);
}
static int crypto_blake2b_finup_neon(struct shash_desc *desc, const u8 *in,
unsigned int inlen, u8 *out)
{
return crypto_blake2b_finup(desc, in, inlen, out,
blake2b_compress_arch);
}
#define BLAKE2B_ALG(name, driver_name, digest_size) \
{ \
.base.cra_name = name, \
.base.cra_driver_name = driver_name, \
.base.cra_priority = 200, \
.base.cra_flags = CRYPTO_ALG_OPTIONAL_KEY | \
CRYPTO_AHASH_ALG_BLOCK_ONLY | \
CRYPTO_AHASH_ALG_FINAL_NONZERO, \
.base.cra_blocksize = BLAKE2B_BLOCK_SIZE, \
.base.cra_ctxsize = sizeof(struct blake2b_tfm_ctx), \
.base.cra_module = THIS_MODULE, \
.digestsize = digest_size, \
.setkey = crypto_blake2b_setkey, \
.init = crypto_blake2b_init, \
.update = crypto_blake2b_update_neon, \
.finup = crypto_blake2b_finup_neon, \
.descsize = sizeof(struct blake2b_state), \
.statesize = BLAKE2B_STATE_SIZE, \
}
static struct shash_alg blake2b_neon_algs[] = {
BLAKE2B_ALG("blake2b-160", "blake2b-160-neon", BLAKE2B_160_HASH_SIZE),
BLAKE2B_ALG("blake2b-256", "blake2b-256-neon", BLAKE2B_256_HASH_SIZE),
BLAKE2B_ALG("blake2b-384", "blake2b-384-neon", BLAKE2B_384_HASH_SIZE),
BLAKE2B_ALG("blake2b-512", "blake2b-512-neon", BLAKE2B_512_HASH_SIZE),
};
static int __init blake2b_neon_mod_init(void)
{
if (!(elf_hwcap & HWCAP_NEON))
return -ENODEV;
return crypto_register_shashes(blake2b_neon_algs,
ARRAY_SIZE(blake2b_neon_algs));
}
static void __exit blake2b_neon_mod_exit(void)
{
crypto_unregister_shashes(blake2b_neon_algs,
ARRAY_SIZE(blake2b_neon_algs));
}
module_init(blake2b_neon_mod_init);
module_exit(blake2b_neon_mod_exit);
MODULE_DESCRIPTION("BLAKE2b digest algorithm, NEON accelerated");
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Eric Biggers <ebiggers@google.com>");
MODULE_ALIAS_CRYPTO("blake2b-160");
MODULE_ALIAS_CRYPTO("blake2b-160-neon");
MODULE_ALIAS_CRYPTO("blake2b-256");
MODULE_ALIAS_CRYPTO("blake2b-256-neon");
MODULE_ALIAS_CRYPTO("blake2b-384");
MODULE_ALIAS_CRYPTO("blake2b-384-neon");
MODULE_ALIAS_CRYPTO("blake2b-512");
MODULE_ALIAS_CRYPTO("blake2b-512-neon");

View File

@@ -2,14 +2,21 @@
#ifndef _ASM_SIMD_H
#define _ASM_SIMD_H
#include <linux/cleanup.h>
#include <linux/compiler_attributes.h>
#include <linux/preempt.h>
#include <linux/types.h>
#include <asm/neon.h>
static __must_check inline bool may_use_simd(void)
{
return IS_ENABLED(CONFIG_KERNEL_MODE_NEON) && !in_hardirq()
&& !irqs_disabled();
}
DEFINE_LOCK_GUARD_0(ksimd, kernel_neon_begin(), kernel_neon_end())
#define scoped_ksimd() scoped_guard(ksimd)
#endif /* _ASM_SIMD_H */
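A hedged usage sketch of the guard defined above (the callee is hypothetical);
the conversions in the following diffs use the same pattern:

	scoped_ksimd() {
		/* NEON usable here; kernel-mode NEON begin/end are implied */
		my_neon_transform(dst, src, len);
	}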

View File

@@ -1783,10 +1783,10 @@ CONFIG_CRYPTO_CHACHA20=m
CONFIG_CRYPTO_BENCHMARK=m
CONFIG_CRYPTO_ECHAINIV=y
CONFIG_CRYPTO_MICHAEL_MIC=m
CONFIG_CRYPTO_SHA3=m
CONFIG_CRYPTO_ANSI_CPRNG=y
CONFIG_CRYPTO_USER_API_RNG=m
CONFIG_CRYPTO_GHASH_ARM64_CE=y
CONFIG_CRYPTO_SHA3_ARM64=m
CONFIG_CRYPTO_SM3_ARM64_CE=m
CONFIG_CRYPTO_AES_ARM64_CE_BLK=y
CONFIG_CRYPTO_AES_ARM64_BS=m

View File

@@ -25,17 +25,6 @@ config CRYPTO_NHPOLY1305_NEON
Architecture: arm64 using:
- NEON (Advanced SIMD) extensions
config CRYPTO_SHA3_ARM64
tristate "Hash functions: SHA-3 (ARMv8.2 Crypto Extensions)"
depends on KERNEL_MODE_NEON
select CRYPTO_HASH
select CRYPTO_SHA3
help
SHA-3 secure hash algorithms (FIPS 202)
Architecture: arm64 using:
- ARMv8.2 Crypto Extensions
config CRYPTO_SM3_NEON
tristate "Hash functions: SM3 (NEON)"
depends on KERNEL_MODE_NEON
@@ -58,16 +47,6 @@ config CRYPTO_SM3_ARM64_CE
Architecture: arm64 using:
- ARMv8.2 Crypto Extensions
config CRYPTO_POLYVAL_ARM64_CE
tristate "Hash functions: POLYVAL (ARMv8 Crypto Extensions)"
depends on KERNEL_MODE_NEON
select CRYPTO_POLYVAL
help
POLYVAL hash function for HCTR2
Architecture: arm64 using:
- ARMv8 Crypto Extensions
config CRYPTO_AES_ARM64
tristate "Ciphers: AES, modes: ECB, CBC, CTR, CTS, XCTR, XTS"
select CRYPTO_AES

View File

@@ -5,9 +5,6 @@
# Copyright (C) 2014 Linaro Ltd <ard.biesheuvel@linaro.org>
#
obj-$(CONFIG_CRYPTO_SHA3_ARM64) += sha3-ce.o
sha3-ce-y := sha3-ce-glue.o sha3-ce-core.o
obj-$(CONFIG_CRYPTO_SM3_NEON) += sm3-neon.o
sm3-neon-y := sm3-neon-glue.o sm3-neon-core.o
@@ -32,9 +29,6 @@ sm4-neon-y := sm4-neon-glue.o sm4-neon-core.o
obj-$(CONFIG_CRYPTO_GHASH_ARM64_CE) += ghash-ce.o
ghash-ce-y := ghash-ce-glue.o ghash-ce-core.o
obj-$(CONFIG_CRYPTO_POLYVAL_ARM64_CE) += polyval-ce.o
polyval-ce-y := polyval-ce-glue.o polyval-ce-core.o
obj-$(CONFIG_CRYPTO_AES_ARM64_CE) += aes-ce-cipher.o
aes-ce-cipher-y := aes-ce-core.o aes-ce-glue.o

View File

@@ -8,7 +8,6 @@
* Author: Ard Biesheuvel <ardb@kernel.org>
*/
#include <asm/neon.h>
#include <linux/unaligned.h>
#include <crypto/aes.h>
#include <crypto/scatterwalk.h>
@@ -16,6 +15,8 @@
#include <crypto/internal/skcipher.h>
#include <linux/module.h>
#include <asm/simd.h>
#include "aes-ce-setkey.h"
MODULE_IMPORT_NS("CRYPTO_INTERNAL");
@@ -114,11 +115,8 @@ static u32 ce_aes_ccm_auth_data(u8 mac[], u8 const in[], u32 abytes,
in += adv;
abytes -= adv;
if (unlikely(rem)) {
kernel_neon_end();
kernel_neon_begin();
if (unlikely(rem))
macp = 0;
}
} else {
u32 l = min(AES_BLOCK_SIZE - macp, abytes);
@@ -187,40 +185,38 @@ static int ccm_encrypt(struct aead_request *req)
if (unlikely(err))
return err;
kernel_neon_begin();
scoped_ksimd() {
if (req->assoclen)
ccm_calculate_auth_mac(req, mac);
if (req->assoclen)
ccm_calculate_auth_mac(req, mac);
do {
u32 tail = walk.nbytes % AES_BLOCK_SIZE;
const u8 *src = walk.src.virt.addr;
u8 *dst = walk.dst.virt.addr;
u8 buf[AES_BLOCK_SIZE];
u8 *final_iv = NULL;
do {
u32 tail = walk.nbytes % AES_BLOCK_SIZE;
const u8 *src = walk.src.virt.addr;
u8 *dst = walk.dst.virt.addr;
u8 buf[AES_BLOCK_SIZE];
u8 *final_iv = NULL;
if (walk.nbytes == walk.total) {
tail = 0;
final_iv = orig_iv;
}
if (walk.nbytes == walk.total) {
tail = 0;
final_iv = orig_iv;
}
if (unlikely(walk.nbytes < AES_BLOCK_SIZE))
src = dst = memcpy(&buf[sizeof(buf) - walk.nbytes],
src, walk.nbytes);
if (unlikely(walk.nbytes < AES_BLOCK_SIZE))
src = dst = memcpy(&buf[sizeof(buf) - walk.nbytes],
src, walk.nbytes);
ce_aes_ccm_encrypt(dst, src, walk.nbytes - tail,
ctx->key_enc, num_rounds(ctx),
mac, walk.iv, final_iv);
ce_aes_ccm_encrypt(dst, src, walk.nbytes - tail,
ctx->key_enc, num_rounds(ctx),
mac, walk.iv, final_iv);
if (unlikely(walk.nbytes < AES_BLOCK_SIZE))
memcpy(walk.dst.virt.addr, dst, walk.nbytes);
if (unlikely(walk.nbytes < AES_BLOCK_SIZE))
memcpy(walk.dst.virt.addr, dst, walk.nbytes);
if (walk.nbytes) {
err = skcipher_walk_done(&walk, tail);
}
} while (walk.nbytes);
kernel_neon_end();
if (walk.nbytes) {
err = skcipher_walk_done(&walk, tail);
}
} while (walk.nbytes);
}
if (unlikely(err))
return err;
@@ -254,40 +250,38 @@ static int ccm_decrypt(struct aead_request *req)
if (unlikely(err))
return err;
kernel_neon_begin();
scoped_ksimd() {
if (req->assoclen)
ccm_calculate_auth_mac(req, mac);
if (req->assoclen)
ccm_calculate_auth_mac(req, mac);
do {
u32 tail = walk.nbytes % AES_BLOCK_SIZE;
const u8 *src = walk.src.virt.addr;
u8 *dst = walk.dst.virt.addr;
u8 buf[AES_BLOCK_SIZE];
u8 *final_iv = NULL;
do {
u32 tail = walk.nbytes % AES_BLOCK_SIZE;
const u8 *src = walk.src.virt.addr;
u8 *dst = walk.dst.virt.addr;
u8 buf[AES_BLOCK_SIZE];
u8 *final_iv = NULL;
if (walk.nbytes == walk.total) {
tail = 0;
final_iv = orig_iv;
}
if (walk.nbytes == walk.total) {
tail = 0;
final_iv = orig_iv;
}
if (unlikely(walk.nbytes < AES_BLOCK_SIZE))
src = dst = memcpy(&buf[sizeof(buf) - walk.nbytes],
src, walk.nbytes);
if (unlikely(walk.nbytes < AES_BLOCK_SIZE))
src = dst = memcpy(&buf[sizeof(buf) - walk.nbytes],
src, walk.nbytes);
ce_aes_ccm_decrypt(dst, src, walk.nbytes - tail,
ctx->key_enc, num_rounds(ctx),
mac, walk.iv, final_iv);
ce_aes_ccm_decrypt(dst, src, walk.nbytes - tail,
ctx->key_enc, num_rounds(ctx),
mac, walk.iv, final_iv);
if (unlikely(walk.nbytes < AES_BLOCK_SIZE))
memcpy(walk.dst.virt.addr, dst, walk.nbytes);
if (unlikely(walk.nbytes < AES_BLOCK_SIZE))
memcpy(walk.dst.virt.addr, dst, walk.nbytes);
if (walk.nbytes) {
err = skcipher_walk_done(&walk, tail);
}
} while (walk.nbytes);
kernel_neon_end();
if (walk.nbytes) {
err = skcipher_walk_done(&walk, tail);
}
} while (walk.nbytes);
}
if (unlikely(err))
return err;

View File

@@ -52,9 +52,8 @@ static void aes_cipher_encrypt(struct crypto_tfm *tfm, u8 dst[], u8 const src[])
return;
}
kernel_neon_begin();
__aes_ce_encrypt(ctx->key_enc, dst, src, num_rounds(ctx));
kernel_neon_end();
scoped_ksimd()
__aes_ce_encrypt(ctx->key_enc, dst, src, num_rounds(ctx));
}
static void aes_cipher_decrypt(struct crypto_tfm *tfm, u8 dst[], u8 const src[])
@@ -66,9 +65,8 @@ static void aes_cipher_decrypt(struct crypto_tfm *tfm, u8 dst[], u8 const src[])
return;
}
kernel_neon_begin();
__aes_ce_decrypt(ctx->key_dec, dst, src, num_rounds(ctx));
kernel_neon_end();
scoped_ksimd()
__aes_ce_decrypt(ctx->key_dec, dst, src, num_rounds(ctx));
}
int ce_aes_expandkey(struct crypto_aes_ctx *ctx, const u8 *in_key,
@@ -94,47 +92,48 @@ int ce_aes_expandkey(struct crypto_aes_ctx *ctx, const u8 *in_key,
for (i = 0; i < kwords; i++)
ctx->key_enc[i] = get_unaligned_le32(in_key + i * sizeof(u32));
kernel_neon_begin();
for (i = 0; i < sizeof(rcon); i++) {
u32 *rki = ctx->key_enc + (i * kwords);
u32 *rko = rki + kwords;
scoped_ksimd() {
for (i = 0; i < sizeof(rcon); i++) {
u32 *rki = ctx->key_enc + (i * kwords);
u32 *rko = rki + kwords;
rko[0] = ror32(__aes_ce_sub(rki[kwords - 1]), 8) ^ rcon[i] ^ rki[0];
rko[1] = rko[0] ^ rki[1];
rko[2] = rko[1] ^ rki[2];
rko[3] = rko[2] ^ rki[3];
rko[0] = ror32(__aes_ce_sub(rki[kwords - 1]), 8) ^
rcon[i] ^ rki[0];
rko[1] = rko[0] ^ rki[1];
rko[2] = rko[1] ^ rki[2];
rko[3] = rko[2] ^ rki[3];
if (key_len == AES_KEYSIZE_192) {
if (i >= 7)
break;
rko[4] = rko[3] ^ rki[4];
rko[5] = rko[4] ^ rki[5];
} else if (key_len == AES_KEYSIZE_256) {
if (i >= 6)
break;
rko[4] = __aes_ce_sub(rko[3]) ^ rki[4];
rko[5] = rko[4] ^ rki[5];
rko[6] = rko[5] ^ rki[6];
rko[7] = rko[6] ^ rki[7];
if (key_len == AES_KEYSIZE_192) {
if (i >= 7)
break;
rko[4] = rko[3] ^ rki[4];
rko[5] = rko[4] ^ rki[5];
} else if (key_len == AES_KEYSIZE_256) {
if (i >= 6)
break;
rko[4] = __aes_ce_sub(rko[3]) ^ rki[4];
rko[5] = rko[4] ^ rki[5];
rko[6] = rko[5] ^ rki[6];
rko[7] = rko[6] ^ rki[7];
}
}
/*
* Generate the decryption keys for the Equivalent Inverse
* Cipher. This involves reversing the order of the round
* keys, and applying the Inverse Mix Columns transformation on
* all but the first and the last one.
*/
key_enc = (struct aes_block *)ctx->key_enc;
key_dec = (struct aes_block *)ctx->key_dec;
j = num_rounds(ctx);
key_dec[0] = key_enc[j];
for (i = 1, j--; j > 0; i++, j--)
__aes_ce_invert(key_dec + i, key_enc + j);
key_dec[i] = key_enc[0];
}
/*
* Generate the decryption keys for the Equivalent Inverse Cipher.
* This involves reversing the order of the round keys, and applying
* the Inverse Mix Columns transformation on all but the first and
* the last one.
*/
key_enc = (struct aes_block *)ctx->key_enc;
key_dec = (struct aes_block *)ctx->key_dec;
j = num_rounds(ctx);
key_dec[0] = key_enc[j];
for (i = 1, j--; j > 0; i++, j--)
__aes_ce_invert(key_dec + i, key_enc + j);
key_dec[i] = key_enc[0];
kernel_neon_end();
return 0;
}
EXPORT_SYMBOL(ce_aes_expandkey);


@@ -5,8 +5,6 @@
* Copyright (C) 2013 - 2017 Linaro Ltd <ard.biesheuvel@linaro.org>
*/
#include <asm/hwcap.h>
#include <asm/neon.h>
#include <crypto/aes.h>
#include <crypto/ctr.h>
#include <crypto/internal/hash.h>
@@ -20,6 +18,9 @@
#include <linux/module.h>
#include <linux/string.h>
#include <asm/hwcap.h>
#include <asm/simd.h>
#include "aes-ce-setkey.h"
#ifdef USE_V8_CRYPTO_EXTENSIONS
@@ -186,10 +187,9 @@ static int __maybe_unused ecb_encrypt(struct skcipher_request *req)
err = skcipher_walk_virt(&walk, req, false);
while ((blocks = (walk.nbytes / AES_BLOCK_SIZE))) {
kernel_neon_begin();
aes_ecb_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->key_enc, rounds, blocks);
kernel_neon_end();
scoped_ksimd()
aes_ecb_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->key_enc, rounds, blocks);
err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE);
}
return err;
@@ -206,10 +206,9 @@ static int __maybe_unused ecb_decrypt(struct skcipher_request *req)
err = skcipher_walk_virt(&walk, req, false);
while ((blocks = (walk.nbytes / AES_BLOCK_SIZE))) {
kernel_neon_begin();
aes_ecb_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->key_dec, rounds, blocks);
kernel_neon_end();
scoped_ksimd()
aes_ecb_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->key_dec, rounds, blocks);
err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE);
}
return err;
@@ -224,10 +223,9 @@ static int cbc_encrypt_walk(struct skcipher_request *req,
unsigned int blocks;
while ((blocks = (walk->nbytes / AES_BLOCK_SIZE))) {
kernel_neon_begin();
aes_cbc_encrypt(walk->dst.virt.addr, walk->src.virt.addr,
ctx->key_enc, rounds, blocks, walk->iv);
kernel_neon_end();
scoped_ksimd()
aes_cbc_encrypt(walk->dst.virt.addr, walk->src.virt.addr,
ctx->key_enc, rounds, blocks, walk->iv);
err = skcipher_walk_done(walk, walk->nbytes % AES_BLOCK_SIZE);
}
return err;
@@ -253,10 +251,9 @@ static int cbc_decrypt_walk(struct skcipher_request *req,
unsigned int blocks;
while ((blocks = (walk->nbytes / AES_BLOCK_SIZE))) {
kernel_neon_begin();
aes_cbc_decrypt(walk->dst.virt.addr, walk->src.virt.addr,
ctx->key_dec, rounds, blocks, walk->iv);
kernel_neon_end();
scoped_ksimd()
aes_cbc_decrypt(walk->dst.virt.addr, walk->src.virt.addr,
ctx->key_dec, rounds, blocks, walk->iv);
err = skcipher_walk_done(walk, walk->nbytes % AES_BLOCK_SIZE);
}
return err;
@@ -322,10 +319,9 @@ static int cts_cbc_encrypt(struct skcipher_request *req)
if (err)
return err;
kernel_neon_begin();
aes_cbc_cts_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->key_enc, rounds, walk.nbytes, walk.iv);
kernel_neon_end();
scoped_ksimd()
aes_cbc_cts_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->key_enc, rounds, walk.nbytes, walk.iv);
return skcipher_walk_done(&walk, 0);
}
@@ -379,10 +375,9 @@ static int cts_cbc_decrypt(struct skcipher_request *req)
if (err)
return err;
kernel_neon_begin();
aes_cbc_cts_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->key_dec, rounds, walk.nbytes, walk.iv);
kernel_neon_end();
scoped_ksimd()
aes_cbc_cts_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->key_dec, rounds, walk.nbytes, walk.iv);
return skcipher_walk_done(&walk, 0);
}
@@ -399,11 +394,11 @@ static int __maybe_unused essiv_cbc_encrypt(struct skcipher_request *req)
blocks = walk.nbytes / AES_BLOCK_SIZE;
if (blocks) {
kernel_neon_begin();
aes_essiv_cbc_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->key1.key_enc, rounds, blocks,
req->iv, ctx->key2.key_enc);
kernel_neon_end();
scoped_ksimd()
aes_essiv_cbc_encrypt(walk.dst.virt.addr,
walk.src.virt.addr,
ctx->key1.key_enc, rounds, blocks,
req->iv, ctx->key2.key_enc);
err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE);
}
return err ?: cbc_encrypt_walk(req, &walk);
@@ -421,11 +416,11 @@ static int __maybe_unused essiv_cbc_decrypt(struct skcipher_request *req)
blocks = walk.nbytes / AES_BLOCK_SIZE;
if (blocks) {
kernel_neon_begin();
aes_essiv_cbc_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->key1.key_dec, rounds, blocks,
req->iv, ctx->key2.key_enc);
kernel_neon_end();
scoped_ksimd()
aes_essiv_cbc_decrypt(walk.dst.virt.addr,
walk.src.virt.addr,
ctx->key1.key_dec, rounds, blocks,
req->iv, ctx->key2.key_enc);
err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE);
}
return err ?: cbc_decrypt_walk(req, &walk);
@@ -461,10 +456,9 @@ static int __maybe_unused xctr_encrypt(struct skcipher_request *req)
else if (nbytes < walk.total)
nbytes &= ~(AES_BLOCK_SIZE - 1);
kernel_neon_begin();
aes_xctr_encrypt(dst, src, ctx->key_enc, rounds, nbytes,
walk.iv, byte_ctr);
kernel_neon_end();
scoped_ksimd()
aes_xctr_encrypt(dst, src, ctx->key_enc, rounds, nbytes,
walk.iv, byte_ctr);
if (unlikely(nbytes < AES_BLOCK_SIZE))
memcpy(walk.dst.virt.addr,
@@ -506,10 +500,9 @@ static int __maybe_unused ctr_encrypt(struct skcipher_request *req)
else if (nbytes < walk.total)
nbytes &= ~(AES_BLOCK_SIZE - 1);
kernel_neon_begin();
aes_ctr_encrypt(dst, src, ctx->key_enc, rounds, nbytes,
walk.iv);
kernel_neon_end();
scoped_ksimd()
aes_ctr_encrypt(dst, src, ctx->key_enc, rounds, nbytes,
walk.iv);
if (unlikely(nbytes < AES_BLOCK_SIZE))
memcpy(walk.dst.virt.addr,
@@ -562,11 +555,10 @@ static int __maybe_unused xts_encrypt(struct skcipher_request *req)
if (walk.nbytes < walk.total)
nbytes &= ~(AES_BLOCK_SIZE - 1);
kernel_neon_begin();
aes_xts_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->key1.key_enc, rounds, nbytes,
ctx->key2.key_enc, walk.iv, first);
kernel_neon_end();
scoped_ksimd()
aes_xts_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->key1.key_enc, rounds, nbytes,
ctx->key2.key_enc, walk.iv, first);
err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
}
@@ -584,11 +576,10 @@ static int __maybe_unused xts_encrypt(struct skcipher_request *req)
if (err)
return err;
kernel_neon_begin();
aes_xts_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->key1.key_enc, rounds, walk.nbytes,
ctx->key2.key_enc, walk.iv, first);
kernel_neon_end();
scoped_ksimd()
aes_xts_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->key1.key_enc, rounds, walk.nbytes,
ctx->key2.key_enc, walk.iv, first);
return skcipher_walk_done(&walk, 0);
}
@@ -634,11 +625,10 @@ static int __maybe_unused xts_decrypt(struct skcipher_request *req)
if (walk.nbytes < walk.total)
nbytes &= ~(AES_BLOCK_SIZE - 1);
kernel_neon_begin();
aes_xts_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->key1.key_dec, rounds, nbytes,
ctx->key2.key_enc, walk.iv, first);
kernel_neon_end();
scoped_ksimd()
aes_xts_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->key1.key_dec, rounds, nbytes,
ctx->key2.key_enc, walk.iv, first);
err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
}
@@ -657,11 +647,10 @@ static int __maybe_unused xts_decrypt(struct skcipher_request *req)
return err;
kernel_neon_begin();
aes_xts_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->key1.key_dec, rounds, walk.nbytes,
ctx->key2.key_enc, walk.iv, first);
kernel_neon_end();
scoped_ksimd()
aes_xts_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->key1.key_dec, rounds, walk.nbytes,
ctx->key2.key_enc, walk.iv, first);
return skcipher_walk_done(&walk, 0);
}
@@ -808,10 +797,9 @@ static int cmac_setkey(struct crypto_shash *tfm, const u8 *in_key,
return err;
/* encrypt the zero vector */
kernel_neon_begin();
aes_ecb_encrypt(ctx->consts, (u8[AES_BLOCK_SIZE]){}, ctx->key.key_enc,
rounds, 1);
kernel_neon_end();
scoped_ksimd()
aes_ecb_encrypt(ctx->consts, (u8[AES_BLOCK_SIZE]){},
ctx->key.key_enc, rounds, 1);
cmac_gf128_mul_by_x(consts, consts);
cmac_gf128_mul_by_x(consts + 1, consts);
@@ -837,10 +825,10 @@ static int xcbc_setkey(struct crypto_shash *tfm, const u8 *in_key,
if (err)
return err;
kernel_neon_begin();
aes_ecb_encrypt(key, ks[0], ctx->key.key_enc, rounds, 1);
aes_ecb_encrypt(ctx->consts, ks[1], ctx->key.key_enc, rounds, 2);
kernel_neon_end();
scoped_ksimd() {
aes_ecb_encrypt(key, ks[0], ctx->key.key_enc, rounds, 1);
aes_ecb_encrypt(ctx->consts, ks[1], ctx->key.key_enc, rounds, 2);
}
return cbcmac_setkey(tfm, key, sizeof(key));
}
@@ -860,10 +848,9 @@ static void mac_do_update(struct crypto_aes_ctx *ctx, u8 const in[], int blocks,
int rem;
do {
kernel_neon_begin();
rem = aes_mac_update(in, ctx->key_enc, rounds, blocks,
dg, enc_before, !enc_before);
kernel_neon_end();
scoped_ksimd()
rem = aes_mac_update(in, ctx->key_enc, rounds, blocks,
dg, enc_before, !enc_before);
in += (blocks - rem) * AES_BLOCK_SIZE;
blocks = rem;
} while (blocks);


@@ -85,9 +85,8 @@ static int aesbs_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
ctx->rounds = 6 + key_len / 4;
kernel_neon_begin();
aesbs_convert_key(ctx->rk, rk.key_enc, ctx->rounds);
kernel_neon_end();
scoped_ksimd()
aesbs_convert_key(ctx->rk, rk.key_enc, ctx->rounds);
return 0;
}
@@ -110,10 +109,9 @@ static int __ecb_crypt(struct skcipher_request *req,
blocks = round_down(blocks,
walk.stride / AES_BLOCK_SIZE);
kernel_neon_begin();
fn(walk.dst.virt.addr, walk.src.virt.addr, ctx->rk,
ctx->rounds, blocks);
kernel_neon_end();
scoped_ksimd()
fn(walk.dst.virt.addr, walk.src.virt.addr, ctx->rk,
ctx->rounds, blocks);
err = skcipher_walk_done(&walk,
walk.nbytes - blocks * AES_BLOCK_SIZE);
}
@@ -146,9 +144,8 @@ static int aesbs_cbc_ctr_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
memcpy(ctx->enc, rk.key_enc, sizeof(ctx->enc));
kernel_neon_begin();
aesbs_convert_key(ctx->key.rk, rk.key_enc, ctx->key.rounds);
kernel_neon_end();
scoped_ksimd()
aesbs_convert_key(ctx->key.rk, rk.key_enc, ctx->key.rounds);
memzero_explicit(&rk, sizeof(rk));
return 0;
@@ -167,11 +164,11 @@ static int cbc_encrypt(struct skcipher_request *req)
unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE;
/* fall back to the non-bitsliced NEON implementation */
kernel_neon_begin();
neon_aes_cbc_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->enc, ctx->key.rounds, blocks,
walk.iv);
kernel_neon_end();
scoped_ksimd()
neon_aes_cbc_encrypt(walk.dst.virt.addr,
walk.src.virt.addr,
ctx->enc, ctx->key.rounds, blocks,
walk.iv);
err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE);
}
return err;
@@ -193,11 +190,10 @@ static int cbc_decrypt(struct skcipher_request *req)
blocks = round_down(blocks,
walk.stride / AES_BLOCK_SIZE);
kernel_neon_begin();
aesbs_cbc_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->key.rk, ctx->key.rounds, blocks,
walk.iv);
kernel_neon_end();
scoped_ksimd()
aesbs_cbc_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->key.rk, ctx->key.rounds, blocks,
walk.iv);
err = skcipher_walk_done(&walk,
walk.nbytes - blocks * AES_BLOCK_SIZE);
}
@@ -220,30 +216,32 @@ static int ctr_encrypt(struct skcipher_request *req)
const u8 *src = walk.src.virt.addr;
u8 *dst = walk.dst.virt.addr;
kernel_neon_begin();
if (blocks >= 8) {
aesbs_ctr_encrypt(dst, src, ctx->key.rk, ctx->key.rounds,
blocks, walk.iv);
dst += blocks * AES_BLOCK_SIZE;
src += blocks * AES_BLOCK_SIZE;
scoped_ksimd() {
if (blocks >= 8) {
aesbs_ctr_encrypt(dst, src, ctx->key.rk,
ctx->key.rounds, blocks,
walk.iv);
dst += blocks * AES_BLOCK_SIZE;
src += blocks * AES_BLOCK_SIZE;
}
if (nbytes && walk.nbytes == walk.total) {
u8 buf[AES_BLOCK_SIZE];
u8 *d = dst;
if (unlikely(nbytes < AES_BLOCK_SIZE))
src = dst = memcpy(buf + sizeof(buf) -
nbytes, src, nbytes);
neon_aes_ctr_encrypt(dst, src, ctx->enc,
ctx->key.rounds, nbytes,
walk.iv);
if (unlikely(nbytes < AES_BLOCK_SIZE))
memcpy(d, dst, nbytes);
nbytes = 0;
}
}
if (nbytes && walk.nbytes == walk.total) {
u8 buf[AES_BLOCK_SIZE];
u8 *d = dst;
if (unlikely(nbytes < AES_BLOCK_SIZE))
src = dst = memcpy(buf + sizeof(buf) - nbytes,
src, nbytes);
neon_aes_ctr_encrypt(dst, src, ctx->enc, ctx->key.rounds,
nbytes, walk.iv);
if (unlikely(nbytes < AES_BLOCK_SIZE))
memcpy(d, dst, nbytes);
nbytes = 0;
}
kernel_neon_end();
err = skcipher_walk_done(&walk, nbytes);
}
return err;
@@ -320,33 +318,33 @@ static int __xts_crypt(struct skcipher_request *req, bool encrypt,
in = walk.src.virt.addr;
nbytes = walk.nbytes;
kernel_neon_begin();
if (blocks >= 8) {
if (first == 1)
neon_aes_ecb_encrypt(walk.iv, walk.iv,
ctx->twkey,
ctx->key.rounds, 1);
first = 2;
scoped_ksimd() {
if (blocks >= 8) {
if (first == 1)
neon_aes_ecb_encrypt(walk.iv, walk.iv,
ctx->twkey,
ctx->key.rounds, 1);
first = 2;
fn(out, in, ctx->key.rk, ctx->key.rounds, blocks,
walk.iv);
fn(out, in, ctx->key.rk, ctx->key.rounds, blocks,
walk.iv);
out += blocks * AES_BLOCK_SIZE;
in += blocks * AES_BLOCK_SIZE;
nbytes -= blocks * AES_BLOCK_SIZE;
out += blocks * AES_BLOCK_SIZE;
in += blocks * AES_BLOCK_SIZE;
nbytes -= blocks * AES_BLOCK_SIZE;
}
if (walk.nbytes == walk.total && nbytes > 0) {
if (encrypt)
neon_aes_xts_encrypt(out, in, ctx->cts.key_enc,
ctx->key.rounds, nbytes,
ctx->twkey, walk.iv, first);
else
neon_aes_xts_decrypt(out, in, ctx->cts.key_dec,
ctx->key.rounds, nbytes,
ctx->twkey, walk.iv, first);
nbytes = first = 0;
}
}
if (walk.nbytes == walk.total && nbytes > 0) {
if (encrypt)
neon_aes_xts_encrypt(out, in, ctx->cts.key_enc,
ctx->key.rounds, nbytes,
ctx->twkey, walk.iv, first);
else
neon_aes_xts_decrypt(out, in, ctx->cts.key_dec,
ctx->key.rounds, nbytes,
ctx->twkey, walk.iv, first);
nbytes = first = 0;
}
kernel_neon_end();
err = skcipher_walk_done(&walk, nbytes);
}
@@ -369,14 +367,16 @@ static int __xts_crypt(struct skcipher_request *req, bool encrypt,
in = walk.src.virt.addr;
nbytes = walk.nbytes;
kernel_neon_begin();
if (encrypt)
neon_aes_xts_encrypt(out, in, ctx->cts.key_enc, ctx->key.rounds,
nbytes, ctx->twkey, walk.iv, first);
else
neon_aes_xts_decrypt(out, in, ctx->cts.key_dec, ctx->key.rounds,
nbytes, ctx->twkey, walk.iv, first);
kernel_neon_end();
scoped_ksimd() {
if (encrypt)
neon_aes_xts_encrypt(out, in, ctx->cts.key_enc,
ctx->key.rounds, nbytes, ctx->twkey,
walk.iv, first);
else
neon_aes_xts_decrypt(out, in, ctx->cts.key_dec,
ctx->key.rounds, nbytes, ctx->twkey,
walk.iv, first);
}
return skcipher_walk_done(&walk, 0);
}


@@ -5,7 +5,6 @@
* Copyright (C) 2014 - 2018 Linaro Ltd. <ard.biesheuvel@linaro.org>
*/
#include <asm/neon.h>
#include <crypto/aes.h>
#include <crypto/b128ops.h>
#include <crypto/gcm.h>
@@ -22,6 +21,8 @@
#include <linux/string.h>
#include <linux/unaligned.h>
#include <asm/simd.h>
MODULE_DESCRIPTION("GHASH and AES-GCM using ARMv8 Crypto Extensions");
MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
MODULE_LICENSE("GPL v2");
@@ -74,9 +75,8 @@ void ghash_do_simd_update(int blocks, u64 dg[], const char *src,
u64 const h[][2],
const char *head))
{
kernel_neon_begin();
simd_update(blocks, dg, src, key->h, head);
kernel_neon_end();
scoped_ksimd()
simd_update(blocks, dg, src, key->h, head);
}
/* avoid hogging the CPU for too long */
@@ -329,11 +329,10 @@ static int gcm_encrypt(struct aead_request *req, char *iv, int assoclen)
tag = NULL;
}
kernel_neon_begin();
pmull_gcm_encrypt(nbytes, dst, src, ctx->ghash_key.h,
dg, iv, ctx->aes_key.key_enc, nrounds,
tag);
kernel_neon_end();
scoped_ksimd()
pmull_gcm_encrypt(nbytes, dst, src, ctx->ghash_key.h,
dg, iv, ctx->aes_key.key_enc, nrounds,
tag);
if (unlikely(!nbytes))
break;
@@ -399,11 +398,11 @@ static int gcm_decrypt(struct aead_request *req, char *iv, int assoclen)
tag = NULL;
}
kernel_neon_begin();
ret = pmull_gcm_decrypt(nbytes, dst, src, ctx->ghash_key.h,
dg, iv, ctx->aes_key.key_enc,
nrounds, tag, otag, authsize);
kernel_neon_end();
scoped_ksimd()
ret = pmull_gcm_decrypt(nbytes, dst, src,
ctx->ghash_key.h,
dg, iv, ctx->aes_key.key_enc,
nrounds, tag, otag, authsize);
if (unlikely(!nbytes))
break;


@@ -25,9 +25,8 @@ static int nhpoly1305_neon_update(struct shash_desc *desc,
do {
unsigned int n = min_t(unsigned int, srclen, SZ_4K);
kernel_neon_begin();
crypto_nhpoly1305_update_helper(desc, src, n, nh_neon);
kernel_neon_end();
scoped_ksimd()
crypto_nhpoly1305_update_helper(desc, src, n, nh_neon);
src += n;
srclen -= n;
} while (srclen);


@@ -1,158 +0,0 @@
// SPDX-License-Identifier: GPL-2.0-only
/*
* Glue code for POLYVAL using ARMv8 Crypto Extensions
*
* Copyright (c) 2007 Nokia Siemens Networks - Mikko Herranen <mh1@iki.fi>
* Copyright (c) 2009 Intel Corp.
* Author: Huang Ying <ying.huang@intel.com>
* Copyright 2021 Google LLC
*/
/*
* Glue code based on ghash-clmulni-intel_glue.c.
*
* This implementation of POLYVAL uses montgomery multiplication accelerated by
* ARMv8 Crypto Extensions instructions to implement the finite field operations.
*/
#include <asm/neon.h>
#include <crypto/internal/hash.h>
#include <crypto/polyval.h>
#include <crypto/utils.h>
#include <linux/cpufeature.h>
#include <linux/errno.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/string.h>
#define NUM_KEY_POWERS 8
struct polyval_tfm_ctx {
/*
* These powers must be in the order h^8, ..., h^1.
*/
u8 key_powers[NUM_KEY_POWERS][POLYVAL_BLOCK_SIZE];
};
struct polyval_desc_ctx {
u8 buffer[POLYVAL_BLOCK_SIZE];
};
asmlinkage void pmull_polyval_update(const struct polyval_tfm_ctx *keys,
const u8 *in, size_t nblocks, u8 *accumulator);
asmlinkage void pmull_polyval_mul(u8 *op1, const u8 *op2);
static void internal_polyval_update(const struct polyval_tfm_ctx *keys,
const u8 *in, size_t nblocks, u8 *accumulator)
{
kernel_neon_begin();
pmull_polyval_update(keys, in, nblocks, accumulator);
kernel_neon_end();
}
static void internal_polyval_mul(u8 *op1, const u8 *op2)
{
kernel_neon_begin();
pmull_polyval_mul(op1, op2);
kernel_neon_end();
}
static int polyval_arm64_setkey(struct crypto_shash *tfm,
const u8 *key, unsigned int keylen)
{
struct polyval_tfm_ctx *tctx = crypto_shash_ctx(tfm);
int i;
if (keylen != POLYVAL_BLOCK_SIZE)
return -EINVAL;
memcpy(tctx->key_powers[NUM_KEY_POWERS-1], key, POLYVAL_BLOCK_SIZE);
for (i = NUM_KEY_POWERS-2; i >= 0; i--) {
memcpy(tctx->key_powers[i], key, POLYVAL_BLOCK_SIZE);
internal_polyval_mul(tctx->key_powers[i],
tctx->key_powers[i+1]);
}
return 0;
}
static int polyval_arm64_init(struct shash_desc *desc)
{
struct polyval_desc_ctx *dctx = shash_desc_ctx(desc);
memset(dctx, 0, sizeof(*dctx));
return 0;
}
static int polyval_arm64_update(struct shash_desc *desc,
const u8 *src, unsigned int srclen)
{
struct polyval_desc_ctx *dctx = shash_desc_ctx(desc);
const struct polyval_tfm_ctx *tctx = crypto_shash_ctx(desc->tfm);
unsigned int nblocks;
do {
/* allow rescheduling every 4K bytes */
nblocks = min(srclen, 4096U) / POLYVAL_BLOCK_SIZE;
internal_polyval_update(tctx, src, nblocks, dctx->buffer);
srclen -= nblocks * POLYVAL_BLOCK_SIZE;
src += nblocks * POLYVAL_BLOCK_SIZE;
} while (srclen >= POLYVAL_BLOCK_SIZE);
return srclen;
}
static int polyval_arm64_finup(struct shash_desc *desc, const u8 *src,
unsigned int len, u8 *dst)
{
struct polyval_desc_ctx *dctx = shash_desc_ctx(desc);
const struct polyval_tfm_ctx *tctx = crypto_shash_ctx(desc->tfm);
if (len) {
crypto_xor(dctx->buffer, src, len);
internal_polyval_mul(dctx->buffer,
tctx->key_powers[NUM_KEY_POWERS-1]);
}
memcpy(dst, dctx->buffer, POLYVAL_BLOCK_SIZE);
return 0;
}
static struct shash_alg polyval_alg = {
.digestsize = POLYVAL_DIGEST_SIZE,
.init = polyval_arm64_init,
.update = polyval_arm64_update,
.finup = polyval_arm64_finup,
.setkey = polyval_arm64_setkey,
.descsize = sizeof(struct polyval_desc_ctx),
.base = {
.cra_name = "polyval",
.cra_driver_name = "polyval-ce",
.cra_priority = 200,
.cra_flags = CRYPTO_AHASH_ALG_BLOCK_ONLY,
.cra_blocksize = POLYVAL_BLOCK_SIZE,
.cra_ctxsize = sizeof(struct polyval_tfm_ctx),
.cra_module = THIS_MODULE,
},
};
static int __init polyval_ce_mod_init(void)
{
return crypto_register_shash(&polyval_alg);
}
static void __exit polyval_ce_mod_exit(void)
{
crypto_unregister_shash(&polyval_alg);
}
module_cpu_feature_match(PMULL, polyval_ce_mod_init)
module_exit(polyval_ce_mod_exit);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("POLYVAL hash function accelerated by ARMv8 Crypto Extensions");
MODULE_ALIAS_CRYPTO("polyval");
MODULE_ALIAS_CRYPTO("polyval-ce");


@@ -1,151 +0,0 @@
// SPDX-License-Identifier: GPL-2.0
/*
* sha3-ce-glue.c - core SHA-3 transform using v8.2 Crypto Extensions
*
* Copyright (C) 2018 Linaro Ltd <ard.biesheuvel@linaro.org>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
#include <asm/hwcap.h>
#include <asm/neon.h>
#include <asm/simd.h>
#include <crypto/internal/hash.h>
#include <crypto/sha3.h>
#include <linux/cpufeature.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/string.h>
#include <linux/unaligned.h>
MODULE_DESCRIPTION("SHA3 secure hash using ARMv8 Crypto Extensions");
MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
MODULE_LICENSE("GPL v2");
MODULE_ALIAS_CRYPTO("sha3-224");
MODULE_ALIAS_CRYPTO("sha3-256");
MODULE_ALIAS_CRYPTO("sha3-384");
MODULE_ALIAS_CRYPTO("sha3-512");
asmlinkage int sha3_ce_transform(u64 *st, const u8 *data, int blocks,
int md_len);
static int sha3_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
struct sha3_state *sctx = shash_desc_ctx(desc);
struct crypto_shash *tfm = desc->tfm;
unsigned int bs, ds;
int blocks;
ds = crypto_shash_digestsize(tfm);
bs = crypto_shash_blocksize(tfm);
blocks = len / bs;
len -= blocks * bs;
do {
int rem;
kernel_neon_begin();
rem = sha3_ce_transform(sctx->st, data, blocks, ds);
kernel_neon_end();
data += (blocks - rem) * bs;
blocks = rem;
} while (blocks);
return len;
}
static int sha3_finup(struct shash_desc *desc, const u8 *src, unsigned int len,
u8 *out)
{
struct sha3_state *sctx = shash_desc_ctx(desc);
struct crypto_shash *tfm = desc->tfm;
__le64 *digest = (__le64 *)out;
u8 block[SHA3_224_BLOCK_SIZE];
unsigned int bs, ds;
int i;
ds = crypto_shash_digestsize(tfm);
bs = crypto_shash_blocksize(tfm);
memcpy(block, src, len);
block[len++] = 0x06;
memset(block + len, 0, bs - len);
block[bs - 1] |= 0x80;
kernel_neon_begin();
sha3_ce_transform(sctx->st, block, 1, ds);
kernel_neon_end();
memzero_explicit(block , sizeof(block));
for (i = 0; i < ds / 8; i++)
put_unaligned_le64(sctx->st[i], digest++);
if (ds & 4)
put_unaligned_le32(sctx->st[i], (__le32 *)digest);
return 0;
}
static struct shash_alg algs[] = { {
.digestsize = SHA3_224_DIGEST_SIZE,
.init = crypto_sha3_init,
.update = sha3_update,
.finup = sha3_finup,
.descsize = SHA3_STATE_SIZE,
.base.cra_name = "sha3-224",
.base.cra_driver_name = "sha3-224-ce",
.base.cra_flags = CRYPTO_AHASH_ALG_BLOCK_ONLY,
.base.cra_blocksize = SHA3_224_BLOCK_SIZE,
.base.cra_module = THIS_MODULE,
.base.cra_priority = 200,
}, {
.digestsize = SHA3_256_DIGEST_SIZE,
.init = crypto_sha3_init,
.update = sha3_update,
.finup = sha3_finup,
.descsize = SHA3_STATE_SIZE,
.base.cra_name = "sha3-256",
.base.cra_driver_name = "sha3-256-ce",
.base.cra_flags = CRYPTO_AHASH_ALG_BLOCK_ONLY,
.base.cra_blocksize = SHA3_256_BLOCK_SIZE,
.base.cra_module = THIS_MODULE,
.base.cra_priority = 200,
}, {
.digestsize = SHA3_384_DIGEST_SIZE,
.init = crypto_sha3_init,
.update = sha3_update,
.finup = sha3_finup,
.descsize = SHA3_STATE_SIZE,
.base.cra_name = "sha3-384",
.base.cra_driver_name = "sha3-384-ce",
.base.cra_flags = CRYPTO_AHASH_ALG_BLOCK_ONLY,
.base.cra_blocksize = SHA3_384_BLOCK_SIZE,
.base.cra_module = THIS_MODULE,
.base.cra_priority = 200,
}, {
.digestsize = SHA3_512_DIGEST_SIZE,
.init = crypto_sha3_init,
.update = sha3_update,
.finup = sha3_finup,
.descsize = SHA3_STATE_SIZE,
.base.cra_name = "sha3-512",
.base.cra_driver_name = "sha3-512-ce",
.base.cra_flags = CRYPTO_AHASH_ALG_BLOCK_ONLY,
.base.cra_blocksize = SHA3_512_BLOCK_SIZE,
.base.cra_module = THIS_MODULE,
.base.cra_priority = 200,
} };
static int __init sha3_neon_mod_init(void)
{
return crypto_register_shashes(algs, ARRAY_SIZE(algs));
}
static void __exit sha3_neon_mod_fini(void)
{
crypto_unregister_shashes(algs, ARRAY_SIZE(algs));
}
module_cpu_feature_match(SHA3, sha3_neon_mod_init);
module_exit(sha3_neon_mod_fini);


@@ -5,7 +5,6 @@
* Copyright (C) 2018 Linaro Ltd <ard.biesheuvel@linaro.org>
*/
#include <asm/neon.h>
#include <crypto/internal/hash.h>
#include <crypto/sm3.h>
#include <crypto/sm3_base.h>
@@ -13,6 +12,8 @@
#include <linux/kernel.h>
#include <linux/module.h>
#include <asm/simd.h>
MODULE_DESCRIPTION("SM3 secure hash using ARMv8 Crypto Extensions");
MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
MODULE_LICENSE("GPL v2");
@@ -25,18 +26,18 @@ static int sm3_ce_update(struct shash_desc *desc, const u8 *data,
{
int remain;
kernel_neon_begin();
remain = sm3_base_do_update_blocks(desc, data, len, sm3_ce_transform);
kernel_neon_end();
scoped_ksimd() {
remain = sm3_base_do_update_blocks(desc, data, len, sm3_ce_transform);
}
return remain;
}
static int sm3_ce_finup(struct shash_desc *desc, const u8 *data,
unsigned int len, u8 *out)
{
kernel_neon_begin();
sm3_base_do_finup(desc, data, len, sm3_ce_transform);
kernel_neon_end();
scoped_ksimd() {
sm3_base_do_finup(desc, data, len, sm3_ce_transform);
}
return sm3_base_finish(desc, out);
}


@@ -5,7 +5,7 @@
* Copyright (C) 2022 Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
*/
#include <asm/neon.h>
#include <asm/simd.h>
#include <crypto/internal/hash.h>
#include <crypto/sm3.h>
#include <crypto/sm3_base.h>
@@ -20,20 +20,16 @@ asmlinkage void sm3_neon_transform(struct sm3_state *sst, u8 const *src,
static int sm3_neon_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
int remain;
kernel_neon_begin();
remain = sm3_base_do_update_blocks(desc, data, len, sm3_neon_transform);
kernel_neon_end();
return remain;
scoped_ksimd()
return sm3_base_do_update_blocks(desc, data, len,
sm3_neon_transform);
}
static int sm3_neon_finup(struct shash_desc *desc, const u8 *data,
unsigned int len, u8 *out)
{
kernel_neon_begin();
sm3_base_do_finup(desc, data, len, sm3_neon_transform);
kernel_neon_end();
scoped_ksimd()
sm3_base_do_finup(desc, data, len, sm3_neon_transform);
return sm3_base_finish(desc, out);
}


@@ -11,7 +11,7 @@
#include <linux/crypto.h>
#include <linux/kernel.h>
#include <linux/cpufeature.h>
#include <asm/neon.h>
#include <asm/simd.h>
#include <crypto/scatterwalk.h>
#include <crypto/internal/aead.h>
#include <crypto/internal/skcipher.h>
@@ -35,10 +35,9 @@ static int ccm_setkey(struct crypto_aead *tfm, const u8 *key,
if (key_len != SM4_KEY_SIZE)
return -EINVAL;
kernel_neon_begin();
sm4_ce_expand_key(key, ctx->rkey_enc, ctx->rkey_dec,
crypto_sm4_fk, crypto_sm4_ck);
kernel_neon_end();
scoped_ksimd()
sm4_ce_expand_key(key, ctx->rkey_enc, ctx->rkey_dec,
crypto_sm4_fk, crypto_sm4_ck);
return 0;
}
@@ -167,39 +166,23 @@ static int ccm_crypt(struct aead_request *req, struct skcipher_walk *walk,
memcpy(ctr0, walk->iv, SM4_BLOCK_SIZE);
crypto_inc(walk->iv, SM4_BLOCK_SIZE);
kernel_neon_begin();
scoped_ksimd() {
if (req->assoclen)
ccm_calculate_auth_mac(req, mac);
if (req->assoclen)
ccm_calculate_auth_mac(req, mac);
while (walk->nbytes) {
unsigned int tail = walk->nbytes % SM4_BLOCK_SIZE;
while (walk->nbytes && walk->nbytes != walk->total) {
unsigned int tail = walk->nbytes % SM4_BLOCK_SIZE;
if (walk->nbytes == walk->total)
tail = 0;
sm4_ce_ccm_crypt(rkey_enc, walk->dst.virt.addr,
walk->src.virt.addr, walk->iv,
walk->nbytes - tail, mac);
kernel_neon_end();
err = skcipher_walk_done(walk, tail);
kernel_neon_begin();
}
if (walk->nbytes) {
sm4_ce_ccm_crypt(rkey_enc, walk->dst.virt.addr,
walk->src.virt.addr, walk->iv,
walk->nbytes, mac);
sm4_ce_ccm_crypt(rkey_enc, walk->dst.virt.addr,
walk->src.virt.addr, walk->iv,
walk->nbytes - tail, mac);
err = skcipher_walk_done(walk, tail);
}
sm4_ce_ccm_final(rkey_enc, ctr0, mac);
kernel_neon_end();
err = skcipher_walk_done(walk, 0);
} else {
sm4_ce_ccm_final(rkey_enc, ctr0, mac);
kernel_neon_end();
}
return err;


@@ -32,9 +32,8 @@ static void sm4_ce_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
if (!crypto_simd_usable()) {
sm4_crypt_block(ctx->rkey_enc, out, in);
} else {
kernel_neon_begin();
sm4_ce_do_crypt(ctx->rkey_enc, out, in);
kernel_neon_end();
scoped_ksimd()
sm4_ce_do_crypt(ctx->rkey_enc, out, in);
}
}
@@ -45,9 +44,8 @@ static void sm4_ce_decrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
if (!crypto_simd_usable()) {
sm4_crypt_block(ctx->rkey_dec, out, in);
} else {
kernel_neon_begin();
sm4_ce_do_crypt(ctx->rkey_dec, out, in);
kernel_neon_end();
scoped_ksimd()
sm4_ce_do_crypt(ctx->rkey_dec, out, in);
}
}


@@ -11,7 +11,7 @@
#include <linux/crypto.h>
#include <linux/kernel.h>
#include <linux/cpufeature.h>
#include <asm/neon.h>
#include <asm/simd.h>
#include <crypto/b128ops.h>
#include <crypto/scatterwalk.h>
#include <crypto/internal/aead.h>
@@ -48,13 +48,11 @@ static int gcm_setkey(struct crypto_aead *tfm, const u8 *key,
if (key_len != SM4_KEY_SIZE)
return -EINVAL;
kernel_neon_begin();
sm4_ce_expand_key(key, ctx->key.rkey_enc, ctx->key.rkey_dec,
crypto_sm4_fk, crypto_sm4_ck);
sm4_ce_pmull_ghash_setup(ctx->key.rkey_enc, ctx->ghash_table);
kernel_neon_end();
scoped_ksimd() {
sm4_ce_expand_key(key, ctx->key.rkey_enc, ctx->key.rkey_dec,
crypto_sm4_fk, crypto_sm4_ck);
sm4_ce_pmull_ghash_setup(ctx->key.rkey_enc, ctx->ghash_table);
}
return 0;
}
@@ -149,44 +147,28 @@ static int gcm_crypt(struct aead_request *req, struct skcipher_walk *walk,
memcpy(iv, req->iv, GCM_IV_SIZE);
put_unaligned_be32(2, iv + GCM_IV_SIZE);
kernel_neon_begin();
scoped_ksimd() {
if (req->assoclen)
gcm_calculate_auth_mac(req, ghash);
if (req->assoclen)
gcm_calculate_auth_mac(req, ghash);
do {
unsigned int tail = walk->nbytes % SM4_BLOCK_SIZE;
const u8 *src = walk->src.virt.addr;
u8 *dst = walk->dst.virt.addr;
const u8 *l = NULL;
while (walk->nbytes) {
unsigned int tail = walk->nbytes % SM4_BLOCK_SIZE;
const u8 *src = walk->src.virt.addr;
u8 *dst = walk->dst.virt.addr;
if (walk->nbytes == walk->total) {
l = (const u8 *)&lengths;
tail = 0;
}
if (walk->nbytes == walk->total) {
sm4_ce_pmull_gcm_crypt(ctx->key.rkey_enc, dst, src, iv,
walk->nbytes, ghash,
ctx->ghash_table,
(const u8 *)&lengths);
walk->nbytes - tail, ghash,
ctx->ghash_table, l);
kernel_neon_end();
return skcipher_walk_done(walk, 0);
}
sm4_ce_pmull_gcm_crypt(ctx->key.rkey_enc, dst, src, iv,
walk->nbytes - tail, ghash,
ctx->ghash_table, NULL);
kernel_neon_end();
err = skcipher_walk_done(walk, tail);
kernel_neon_begin();
err = skcipher_walk_done(walk, tail);
} while (walk->nbytes);
}
sm4_ce_pmull_gcm_crypt(ctx->key.rkey_enc, NULL, NULL, iv,
walk->nbytes, ghash, ctx->ghash_table,
(const u8 *)&lengths);
kernel_neon_end();
return err;
}


@@ -8,7 +8,7 @@
* Copyright (C) 2022 Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
*/
#include <asm/neon.h>
#include <asm/simd.h>
#include <crypto/b128ops.h>
#include <crypto/internal/hash.h>
#include <crypto/internal/skcipher.h>
@@ -74,10 +74,9 @@ static int sm4_setkey(struct crypto_skcipher *tfm, const u8 *key,
if (key_len != SM4_KEY_SIZE)
return -EINVAL;
kernel_neon_begin();
sm4_ce_expand_key(key, ctx->rkey_enc, ctx->rkey_dec,
crypto_sm4_fk, crypto_sm4_ck);
kernel_neon_end();
scoped_ksimd()
sm4_ce_expand_key(key, ctx->rkey_enc, ctx->rkey_dec,
crypto_sm4_fk, crypto_sm4_ck);
return 0;
}
@@ -94,12 +93,12 @@ static int sm4_xts_setkey(struct crypto_skcipher *tfm, const u8 *key,
if (ret)
return ret;
kernel_neon_begin();
sm4_ce_expand_key(key, ctx->key1.rkey_enc,
ctx->key1.rkey_dec, crypto_sm4_fk, crypto_sm4_ck);
sm4_ce_expand_key(&key[SM4_KEY_SIZE], ctx->key2.rkey_enc,
ctx->key2.rkey_dec, crypto_sm4_fk, crypto_sm4_ck);
kernel_neon_end();
scoped_ksimd() {
sm4_ce_expand_key(key, ctx->key1.rkey_enc,
ctx->key1.rkey_dec, crypto_sm4_fk, crypto_sm4_ck);
sm4_ce_expand_key(&key[SM4_KEY_SIZE], ctx->key2.rkey_enc,
ctx->key2.rkey_dec, crypto_sm4_fk, crypto_sm4_ck);
}
return 0;
}
@@ -117,16 +116,14 @@ static int sm4_ecb_do_crypt(struct skcipher_request *req, const u32 *rkey)
u8 *dst = walk.dst.virt.addr;
unsigned int nblks;
kernel_neon_begin();
nblks = BYTES2BLKS(nbytes);
if (nblks) {
sm4_ce_crypt(rkey, dst, src, nblks);
nbytes -= nblks * SM4_BLOCK_SIZE;
scoped_ksimd() {
nblks = BYTES2BLKS(nbytes);
if (nblks) {
sm4_ce_crypt(rkey, dst, src, nblks);
nbytes -= nblks * SM4_BLOCK_SIZE;
}
}
kernel_neon_end();
err = skcipher_walk_done(&walk, nbytes);
}
@@ -167,16 +164,14 @@ static int sm4_cbc_crypt(struct skcipher_request *req,
nblocks = nbytes / SM4_BLOCK_SIZE;
if (nblocks) {
kernel_neon_begin();
if (encrypt)
sm4_ce_cbc_enc(ctx->rkey_enc, dst, src,
walk.iv, nblocks);
else
sm4_ce_cbc_dec(ctx->rkey_dec, dst, src,
walk.iv, nblocks);
kernel_neon_end();
scoped_ksimd() {
if (encrypt)
sm4_ce_cbc_enc(ctx->rkey_enc, dst, src,
walk.iv, nblocks);
else
sm4_ce_cbc_dec(ctx->rkey_dec, dst, src,
walk.iv, nblocks);
}
}
err = skcipher_walk_done(&walk, nbytes % SM4_BLOCK_SIZE);
@@ -249,16 +244,14 @@ static int sm4_cbc_cts_crypt(struct skcipher_request *req, bool encrypt)
if (err)
return err;
kernel_neon_begin();
if (encrypt)
sm4_ce_cbc_cts_enc(ctx->rkey_enc, walk.dst.virt.addr,
walk.src.virt.addr, walk.iv, walk.nbytes);
else
sm4_ce_cbc_cts_dec(ctx->rkey_dec, walk.dst.virt.addr,
walk.src.virt.addr, walk.iv, walk.nbytes);
kernel_neon_end();
scoped_ksimd() {
if (encrypt)
sm4_ce_cbc_cts_enc(ctx->rkey_enc, walk.dst.virt.addr,
walk.src.virt.addr, walk.iv, walk.nbytes);
else
sm4_ce_cbc_cts_dec(ctx->rkey_dec, walk.dst.virt.addr,
walk.src.virt.addr, walk.iv, walk.nbytes);
}
return skcipher_walk_done(&walk, 0);
}
@@ -288,28 +281,26 @@ static int sm4_ctr_crypt(struct skcipher_request *req)
u8 *dst = walk.dst.virt.addr;
unsigned int nblks;
kernel_neon_begin();
scoped_ksimd() {
nblks = BYTES2BLKS(nbytes);
if (nblks) {
sm4_ce_ctr_enc(ctx->rkey_enc, dst, src, walk.iv, nblks);
dst += nblks * SM4_BLOCK_SIZE;
src += nblks * SM4_BLOCK_SIZE;
nbytes -= nblks * SM4_BLOCK_SIZE;
}
nblks = BYTES2BLKS(nbytes);
if (nblks) {
sm4_ce_ctr_enc(ctx->rkey_enc, dst, src, walk.iv, nblks);
dst += nblks * SM4_BLOCK_SIZE;
src += nblks * SM4_BLOCK_SIZE;
nbytes -= nblks * SM4_BLOCK_SIZE;
/* tail */
if (walk.nbytes == walk.total && nbytes > 0) {
u8 keystream[SM4_BLOCK_SIZE];
sm4_ce_crypt_block(ctx->rkey_enc, keystream, walk.iv);
crypto_inc(walk.iv, SM4_BLOCK_SIZE);
crypto_xor_cpy(dst, src, keystream, nbytes);
nbytes = 0;
}
}
/* tail */
if (walk.nbytes == walk.total && nbytes > 0) {
u8 keystream[SM4_BLOCK_SIZE];
sm4_ce_crypt_block(ctx->rkey_enc, keystream, walk.iv);
crypto_inc(walk.iv, SM4_BLOCK_SIZE);
crypto_xor_cpy(dst, src, keystream, nbytes);
nbytes = 0;
}
kernel_neon_end();
err = skcipher_walk_done(&walk, nbytes);
}
@@ -359,18 +350,16 @@ static int sm4_xts_crypt(struct skcipher_request *req, bool encrypt)
if (nbytes < walk.total)
nbytes &= ~(SM4_BLOCK_SIZE - 1);
kernel_neon_begin();
if (encrypt)
sm4_ce_xts_enc(ctx->key1.rkey_enc, walk.dst.virt.addr,
walk.src.virt.addr, walk.iv, nbytes,
rkey2_enc);
else
sm4_ce_xts_dec(ctx->key1.rkey_dec, walk.dst.virt.addr,
walk.src.virt.addr, walk.iv, nbytes,
rkey2_enc);
kernel_neon_end();
scoped_ksimd() {
if (encrypt)
sm4_ce_xts_enc(ctx->key1.rkey_enc, walk.dst.virt.addr,
walk.src.virt.addr, walk.iv, nbytes,
rkey2_enc);
else
sm4_ce_xts_dec(ctx->key1.rkey_dec, walk.dst.virt.addr,
walk.src.virt.addr, walk.iv, nbytes,
rkey2_enc);
}
rkey2_enc = NULL;
@@ -395,18 +384,16 @@ static int sm4_xts_crypt(struct skcipher_request *req, bool encrypt)
if (err)
return err;
kernel_neon_begin();
if (encrypt)
sm4_ce_xts_enc(ctx->key1.rkey_enc, walk.dst.virt.addr,
walk.src.virt.addr, walk.iv, walk.nbytes,
rkey2_enc);
else
sm4_ce_xts_dec(ctx->key1.rkey_dec, walk.dst.virt.addr,
walk.src.virt.addr, walk.iv, walk.nbytes,
rkey2_enc);
kernel_neon_end();
scoped_ksimd() {
if (encrypt)
sm4_ce_xts_enc(ctx->key1.rkey_enc, walk.dst.virt.addr,
walk.src.virt.addr, walk.iv, walk.nbytes,
rkey2_enc);
else
sm4_ce_xts_dec(ctx->key1.rkey_dec, walk.dst.virt.addr,
walk.src.virt.addr, walk.iv, walk.nbytes,
rkey2_enc);
}
return skcipher_walk_done(&walk, 0);
}
@@ -510,11 +497,9 @@ static int sm4_cbcmac_setkey(struct crypto_shash *tfm, const u8 *key,
if (key_len != SM4_KEY_SIZE)
return -EINVAL;
kernel_neon_begin();
sm4_ce_expand_key(key, ctx->key.rkey_enc, ctx->key.rkey_dec,
crypto_sm4_fk, crypto_sm4_ck);
kernel_neon_end();
scoped_ksimd()
sm4_ce_expand_key(key, ctx->key.rkey_enc, ctx->key.rkey_dec,
crypto_sm4_fk, crypto_sm4_ck);
return 0;
}
@@ -530,15 +515,13 @@ static int sm4_cmac_setkey(struct crypto_shash *tfm, const u8 *key,
memset(consts, 0, SM4_BLOCK_SIZE);
kernel_neon_begin();
scoped_ksimd() {
sm4_ce_expand_key(key, ctx->key.rkey_enc, ctx->key.rkey_dec,
crypto_sm4_fk, crypto_sm4_ck);
sm4_ce_expand_key(key, ctx->key.rkey_enc, ctx->key.rkey_dec,
crypto_sm4_fk, crypto_sm4_ck);
/* encrypt the zero block */
sm4_ce_crypt_block(ctx->key.rkey_enc, (u8 *)consts, (const u8 *)consts);
kernel_neon_end();
/* encrypt the zero block */
sm4_ce_crypt_block(ctx->key.rkey_enc, (u8 *)consts, (const u8 *)consts);
}
/* gf(2^128) multiply zero-ciphertext with u and u^2 */
a = be64_to_cpu(consts[0].a);
@@ -568,18 +551,16 @@ static int sm4_xcbc_setkey(struct crypto_shash *tfm, const u8 *key,
if (key_len != SM4_KEY_SIZE)
return -EINVAL;
kernel_neon_begin();
scoped_ksimd() {
sm4_ce_expand_key(key, ctx->key.rkey_enc, ctx->key.rkey_dec,
crypto_sm4_fk, crypto_sm4_ck);
sm4_ce_expand_key(key, ctx->key.rkey_enc, ctx->key.rkey_dec,
crypto_sm4_fk, crypto_sm4_ck);
sm4_ce_crypt_block(ctx->key.rkey_enc, key2, ks[0]);
sm4_ce_crypt(ctx->key.rkey_enc, ctx->consts, ks[1], 2);
sm4_ce_crypt_block(ctx->key.rkey_enc, key2, ks[0]);
sm4_ce_crypt(ctx->key.rkey_enc, ctx->consts, ks[1], 2);
sm4_ce_expand_key(key2, ctx->key.rkey_enc, ctx->key.rkey_dec,
crypto_sm4_fk, crypto_sm4_ck);
kernel_neon_end();
sm4_ce_expand_key(key2, ctx->key.rkey_enc, ctx->key.rkey_dec,
crypto_sm4_fk, crypto_sm4_ck);
}
return 0;
}
@@ -600,10 +581,9 @@ static int sm4_mac_update(struct shash_desc *desc, const u8 *p,
unsigned int nblocks = len / SM4_BLOCK_SIZE;
len %= SM4_BLOCK_SIZE;
kernel_neon_begin();
sm4_ce_mac_update(tctx->key.rkey_enc, ctx->digest, p,
nblocks, false, true);
kernel_neon_end();
scoped_ksimd()
sm4_ce_mac_update(tctx->key.rkey_enc, ctx->digest, p,
nblocks, false, true);
return len;
}
@@ -619,10 +599,9 @@ static int sm4_cmac_finup(struct shash_desc *desc, const u8 *src,
ctx->digest[len] ^= 0x80;
consts += SM4_BLOCK_SIZE;
}
kernel_neon_begin();
sm4_ce_mac_update(tctx->key.rkey_enc, ctx->digest, consts, 1,
false, true);
kernel_neon_end();
scoped_ksimd()
sm4_ce_mac_update(tctx->key.rkey_enc, ctx->digest, consts, 1,
false, true);
memcpy(out, ctx->digest, SM4_BLOCK_SIZE);
return 0;
}
@@ -635,10 +614,9 @@ static int sm4_cbcmac_finup(struct shash_desc *desc, const u8 *src,
if (len) {
crypto_xor(ctx->digest, src, len);
kernel_neon_begin();
sm4_ce_crypt_block(tctx->key.rkey_enc, ctx->digest,
ctx->digest);
kernel_neon_end();
scoped_ksimd()
sm4_ce_crypt_block(tctx->key.rkey_enc, ctx->digest,
ctx->digest);
}
memcpy(out, ctx->digest, SM4_BLOCK_SIZE);
return 0;


@@ -48,11 +48,8 @@ static int sm4_ecb_do_crypt(struct skcipher_request *req, const u32 *rkey)
nblocks = nbytes / SM4_BLOCK_SIZE;
if (nblocks) {
kernel_neon_begin();
sm4_neon_crypt(rkey, dst, src, nblocks);
kernel_neon_end();
scoped_ksimd()
sm4_neon_crypt(rkey, dst, src, nblocks);
}
err = skcipher_walk_done(&walk, nbytes % SM4_BLOCK_SIZE);
@@ -126,12 +123,9 @@ static int sm4_cbc_decrypt(struct skcipher_request *req)
nblocks = nbytes / SM4_BLOCK_SIZE;
if (nblocks) {
kernel_neon_begin();
sm4_neon_cbc_dec(ctx->rkey_dec, dst, src,
walk.iv, nblocks);
kernel_neon_end();
scoped_ksimd()
sm4_neon_cbc_dec(ctx->rkey_dec, dst, src,
walk.iv, nblocks);
}
err = skcipher_walk_done(&walk, nbytes % SM4_BLOCK_SIZE);
@@ -157,12 +151,9 @@ static int sm4_ctr_crypt(struct skcipher_request *req)
nblocks = nbytes / SM4_BLOCK_SIZE;
if (nblocks) {
kernel_neon_begin();
sm4_neon_ctr_crypt(ctx->rkey_enc, dst, src,
walk.iv, nblocks);
kernel_neon_end();
scoped_ksimd()
sm4_neon_ctr_crypt(ctx->rkey_enc, dst, src,
walk.iv, nblocks);
dst += nblocks * SM4_BLOCK_SIZE;
src += nblocks * SM4_BLOCK_SIZE;


@@ -6,10 +6,22 @@
#ifndef __ASM_FPU_H
#define __ASM_FPU_H
#include <linux/preempt.h>
#include <asm/neon.h>
#define kernel_fpu_available() cpu_has_neon()
#define kernel_fpu_begin() kernel_neon_begin()
#define kernel_fpu_end() kernel_neon_end()
static inline void kernel_fpu_begin(void)
{
BUG_ON(!in_task());
preempt_disable();
kernel_neon_begin(NULL);
}
static inline void kernel_fpu_end(void)
{
kernel_neon_end(NULL);
preempt_enable();
}
#endif /* ! __ASM_FPU_H */
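The generic kernel-mode FPU wrappers above now pass a NULL save buffer to kernel_neon_begin(); that is only safe because the section is restricted to task context and runs with preemption disabled, so the kernel-mode FPSIMD state never needs to be preserved across a context switch. A hedged sketch of a generic caller, with fpu_xor_block() as a hypothetical SIMD routine:

#include <asm/fpu.h>

/* fpu_xor_block() is a hypothetical SIMD routine, not an existing symbol. */
void fpu_xor_block(unsigned long *p1, const unsigned long *p2, size_t n);

static void xor_with_fpu(unsigned long *p1, const unsigned long *p2, size_t n)
{
        if (!kernel_fpu_available())
                return;                 /* caller falls back to scalar code */

        kernel_fpu_begin();             /* task context only, non-preemptible */
        fpu_xor_block(p1, p2, n);
        kernel_fpu_end();
}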


@@ -13,7 +13,7 @@
#define cpu_has_neon() system_supports_fpsimd()
void kernel_neon_begin(void);
void kernel_neon_end(void);
void kernel_neon_begin(struct user_fpsimd_state *);
void kernel_neon_end(struct user_fpsimd_state *);
#endif /* ! __ASM_NEON_H */


@@ -172,7 +172,12 @@ struct thread_struct {
unsigned long fault_code; /* ESR_EL1 value */
struct debug_info debug; /* debugging */
struct user_fpsimd_state kernel_fpsimd_state;
/*
* Set [cleared] by kernel_neon_begin() [kernel_neon_end()] to the
* address of a caller provided buffer that will be used to preserve a
* task's kernel mode FPSIMD state while it is scheduled out.
*/
struct user_fpsimd_state *kernel_fpsimd_state;
unsigned int kernel_fpsimd_cpu;
#ifdef CONFIG_ARM64_PTR_AUTH
struct ptrauth_keys_user keys_user;


@@ -6,12 +6,15 @@
#ifndef __ASM_SIMD_H
#define __ASM_SIMD_H
#include <linux/cleanup.h>
#include <linux/compiler.h>
#include <linux/irqflags.h>
#include <linux/percpu.h>
#include <linux/preempt.h>
#include <linux/types.h>
#include <asm/neon.h>
#ifdef CONFIG_KERNEL_MODE_NEON
/*
@@ -40,4 +43,11 @@ static __must_check inline bool may_use_simd(void) {
#endif /* ! CONFIG_KERNEL_MODE_NEON */
DEFINE_LOCK_GUARD_1(ksimd,
struct user_fpsimd_state,
kernel_neon_begin(_T->lock),
kernel_neon_end(_T->lock))
#define scoped_ksimd() scoped_guard(ksimd, &(struct user_fpsimd_state){})
#endif
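The guard above wraps kernel_neon_begin() and kernel_neon_end() around an anonymous struct user_fpsimd_state compound literal, so each scoped_ksimd() block gets its own on-stack save area and the end call runs on every scope exit, including early returns. A small sketch, assuming a hypothetical simd_copy() helper:

static int copy_with_simd(u8 *dst, const u8 *src, int len)
{
        scoped_ksimd() {
                if (len <= 0)
                        return -EINVAL; /* kernel_neon_end() still runs here */
                simd_copy(dst, src, len);
        }
        return 0;
}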


@@ -9,7 +9,7 @@
#include <linux/hardirq.h>
#include <asm-generic/xor.h>
#include <asm/hwcap.h>
#include <asm/neon.h>
#include <asm/simd.h>
#ifdef CONFIG_KERNEL_MODE_NEON
@@ -19,9 +19,8 @@ static void
xor_neon_2(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2)
{
kernel_neon_begin();
xor_block_inner_neon.do_2(bytes, p1, p2);
kernel_neon_end();
scoped_ksimd()
xor_block_inner_neon.do_2(bytes, p1, p2);
}
static void
@@ -29,9 +28,8 @@ xor_neon_3(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3)
{
kernel_neon_begin();
xor_block_inner_neon.do_3(bytes, p1, p2, p3);
kernel_neon_end();
scoped_ksimd()
xor_block_inner_neon.do_3(bytes, p1, p2, p3);
}
static void
@@ -40,9 +38,8 @@ xor_neon_4(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4)
{
kernel_neon_begin();
xor_block_inner_neon.do_4(bytes, p1, p2, p3, p4);
kernel_neon_end();
scoped_ksimd()
xor_block_inner_neon.do_4(bytes, p1, p2, p3, p4);
}
static void
@@ -52,9 +49,8 @@ xor_neon_5(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5)
{
kernel_neon_begin();
xor_block_inner_neon.do_5(bytes, p1, p2, p3, p4, p5);
kernel_neon_end();
scoped_ksimd()
xor_block_inner_neon.do_5(bytes, p1, p2, p3, p4, p5);
}
static struct xor_block_template xor_block_arm64 = {


@@ -1502,21 +1502,23 @@ static void fpsimd_load_kernel_state(struct task_struct *task)
* Elide the load if this CPU holds the most recent kernel mode
* FPSIMD context of the current task.
*/
if (last->st == &task->thread.kernel_fpsimd_state &&
if (last->st == task->thread.kernel_fpsimd_state &&
task->thread.kernel_fpsimd_cpu == smp_processor_id())
return;
fpsimd_load_state(&task->thread.kernel_fpsimd_state);
fpsimd_load_state(task->thread.kernel_fpsimd_state);
}
static void fpsimd_save_kernel_state(struct task_struct *task)
{
struct cpu_fp_state cpu_fp_state = {
.st = &task->thread.kernel_fpsimd_state,
.st = task->thread.kernel_fpsimd_state,
.to_save = FP_STATE_FPSIMD,
};
fpsimd_save_state(&task->thread.kernel_fpsimd_state);
BUG_ON(!cpu_fp_state.st);
fpsimd_save_state(task->thread.kernel_fpsimd_state);
fpsimd_bind_state_to_cpu(&cpu_fp_state);
task->thread.kernel_fpsimd_cpu = smp_processor_id();
@@ -1787,6 +1789,7 @@ void fpsimd_update_current_state(struct user_fpsimd_state const *state)
void fpsimd_flush_task_state(struct task_struct *t)
{
t->thread.fpsimd_cpu = NR_CPUS;
t->thread.kernel_fpsimd_state = NULL;
/*
* If we don't support fpsimd, bail out after we have
* reset the fpsimd_cpu for this task and clear the
@@ -1846,12 +1849,19 @@ void fpsimd_save_and_flush_cpu_state(void)
*
* The caller may freely use the FPSIMD registers until kernel_neon_end() is
* called.
*
* Unless called from non-preemptible task context, @state must point to a
* caller provided buffer that will be used to preserve the task's kernel mode
* FPSIMD context when it is scheduled out, or if it is interrupted by kernel
* mode FPSIMD occurring in softirq context. May be %NULL otherwise.
*/
void kernel_neon_begin(void)
void kernel_neon_begin(struct user_fpsimd_state *state)
{
if (WARN_ON(!system_supports_fpsimd()))
return;
WARN_ON((preemptible() || in_serving_softirq()) && !state);
BUG_ON(!may_use_simd());
get_cpu_fpsimd_context();
@@ -1859,7 +1869,7 @@ void kernel_neon_begin(void)
/* Save unsaved fpsimd state, if any: */
if (test_thread_flag(TIF_KERNEL_FPSTATE)) {
BUG_ON(IS_ENABLED(CONFIG_PREEMPT_RT) || !in_serving_softirq());
fpsimd_save_kernel_state(current);
fpsimd_save_state(state);
} else {
fpsimd_save_user_state();
@@ -1880,8 +1890,16 @@ void kernel_neon_begin(void)
* mode in task context. So in this case, setting the flag here
* is always appropriate.
*/
if (IS_ENABLED(CONFIG_PREEMPT_RT) || !in_serving_softirq())
if (IS_ENABLED(CONFIG_PREEMPT_RT) || !in_serving_softirq()) {
/*
* Record the caller provided buffer as the kernel mode
* FP/SIMD buffer for this task, so that the state can
* be preserved and restored on a context switch.
*/
WARN_ON(current->thread.kernel_fpsimd_state != NULL);
current->thread.kernel_fpsimd_state = state;
set_thread_flag(TIF_KERNEL_FPSTATE);
}
}
/* Invalidate any task state remaining in the fpsimd regs: */
@@ -1899,22 +1917,30 @@ EXPORT_SYMBOL_GPL(kernel_neon_begin);
*
* The caller must not use the FPSIMD registers after this function is called,
* unless kernel_neon_begin() is called again in the meantime.
*
* The value of @state must match the value passed to the preceding call to
* kernel_neon_begin().
*/
void kernel_neon_end(void)
void kernel_neon_end(struct user_fpsimd_state *state)
{
if (!system_supports_fpsimd())
return;
if (!test_thread_flag(TIF_KERNEL_FPSTATE))
return;
/*
* If we are returning from a nested use of kernel mode FPSIMD, restore
* the task context kernel mode FPSIMD state. This can only happen when
* running in softirq context on non-PREEMPT_RT.
*/
if (!IS_ENABLED(CONFIG_PREEMPT_RT) && in_serving_softirq() &&
test_thread_flag(TIF_KERNEL_FPSTATE))
fpsimd_load_kernel_state(current);
else
if (!IS_ENABLED(CONFIG_PREEMPT_RT) && in_serving_softirq()) {
fpsimd_load_state(state);
} else {
clear_thread_flag(TIF_KERNEL_FPSTATE);
WARN_ON(current->thread.kernel_fpsimd_state != state);
current->thread.kernel_fpsimd_state = NULL;
}
}
EXPORT_SYMBOL_GPL(kernel_neon_end);
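Callers that do not use the scoped_ksimd() wrapper must provide the buffer themselves and hand the same pointer to both calls, mirroring what the EFI wrappers below do with efi_fpsimd_state. A minimal sketch, with do_neon_work() again standing in for a real SIMD routine:

static void raw_simd_section(u8 *dst, const u8 *src, int len)
{
        struct user_fpsimd_state state; /* on-stack kernel-mode save area */

        kernel_neon_begin(&state);
        do_neon_work(dst, src, len);    /* hypothetical SIMD routine */
        kernel_neon_end(&state);
}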
@@ -1948,7 +1974,7 @@ void __efi_fpsimd_begin(void)
return;
if (may_use_simd()) {
kernel_neon_begin();
kernel_neon_begin(&efi_fpsimd_state);
} else {
WARN_ON(preemptible());
@@ -1999,7 +2025,7 @@ void __efi_fpsimd_end(void)
return;
if (!efi_fpsimd_state_used) {
kernel_neon_end();
kernel_neon_end(&efi_fpsimd_state);
} else {
if (system_supports_sve() && efi_sve_state_used) {
bool ffr = true;


@@ -796,6 +796,7 @@ CONFIG_CRYPTO_MD4=m
CONFIG_CRYPTO_MD5=y
CONFIG_CRYPTO_MICHAEL_MIC=m
CONFIG_CRYPTO_RMD160=m
CONFIG_CRYPTO_SHA3=m
CONFIG_CRYPTO_SM3_GENERIC=m
CONFIG_CRYPTO_WP512=m
CONFIG_CRYPTO_XCBC=m
@@ -809,8 +810,6 @@ CONFIG_CRYPTO_USER_API_HASH=m
CONFIG_CRYPTO_USER_API_SKCIPHER=m
CONFIG_CRYPTO_USER_API_RNG=m
CONFIG_CRYPTO_USER_API_AEAD=m
CONFIG_CRYPTO_SHA3_256_S390=m
CONFIG_CRYPTO_SHA3_512_S390=m
CONFIG_CRYPTO_GHASH_S390=m
CONFIG_CRYPTO_AES_S390=m
CONFIG_CRYPTO_DES_S390=m


@@ -780,6 +780,7 @@ CONFIG_CRYPTO_MD4=m
CONFIG_CRYPTO_MD5=y
CONFIG_CRYPTO_MICHAEL_MIC=m
CONFIG_CRYPTO_RMD160=m
CONFIG_CRYPTO_SHA3=m
CONFIG_CRYPTO_SM3_GENERIC=m
CONFIG_CRYPTO_WP512=m
CONFIG_CRYPTO_XCBC=m
@@ -794,8 +795,6 @@ CONFIG_CRYPTO_USER_API_HASH=m
CONFIG_CRYPTO_USER_API_SKCIPHER=m
CONFIG_CRYPTO_USER_API_RNG=m
CONFIG_CRYPTO_USER_API_AEAD=m
CONFIG_CRYPTO_SHA3_256_S390=m
CONFIG_CRYPTO_SHA3_512_S390=m
CONFIG_CRYPTO_GHASH_S390=m
CONFIG_CRYPTO_AES_S390=m
CONFIG_CRYPTO_DES_S390=m


@@ -2,26 +2,6 @@
menu "Accelerated Cryptographic Algorithms for CPU (s390)"
config CRYPTO_SHA3_256_S390
tristate "Hash functions: SHA3-224 and SHA3-256"
select CRYPTO_HASH
help
SHA3-224 and SHA3-256 secure hash algorithms (FIPS 202)
Architecture: s390
It is available as of z14.
config CRYPTO_SHA3_512_S390
tristate "Hash functions: SHA3-384 and SHA3-512"
select CRYPTO_HASH
help
SHA3-384 and SHA3-512 secure hash algorithms (FIPS 202)
Architecture: s390
It is available as of z14.
config CRYPTO_GHASH_S390
tristate "Hash functions: GHASH"
select CRYPTO_HASH


@@ -3,8 +3,6 @@
# Cryptographic API
#
obj-$(CONFIG_CRYPTO_SHA3_256_S390) += sha3_256_s390.o sha_common.o
obj-$(CONFIG_CRYPTO_SHA3_512_S390) += sha3_512_s390.o sha_common.o
obj-$(CONFIG_CRYPTO_DES_S390) += des_s390.o
obj-$(CONFIG_CRYPTO_AES_S390) += aes_s390.o
obj-$(CONFIG_CRYPTO_PAES_S390) += paes_s390.o


@@ -1,51 +0,0 @@
/* SPDX-License-Identifier: GPL-2.0+ */
/*
* Cryptographic API.
*
* s390 generic implementation of the SHA Secure Hash Algorithms.
*
* Copyright IBM Corp. 2007
* Author(s): Jan Glauber (jang@de.ibm.com)
*/
#ifndef _CRYPTO_ARCH_S390_SHA_H
#define _CRYPTO_ARCH_S390_SHA_H
#include <crypto/hash.h>
#include <crypto/sha2.h>
#include <crypto/sha3.h>
#include <linux/build_bug.h>
#include <linux/types.h>
/* must be big enough for the largest SHA variant */
#define CPACF_MAX_PARMBLOCK_SIZE SHA3_STATE_SIZE
#define SHA_MAX_BLOCK_SIZE SHA3_224_BLOCK_SIZE
struct s390_sha_ctx {
u64 count; /* message length in bytes */
union {
u32 state[CPACF_MAX_PARMBLOCK_SIZE / sizeof(u32)];
struct {
u64 state[SHA512_DIGEST_SIZE / sizeof(u64)];
u64 count_hi;
} sha512;
struct {
__le64 state[SHA3_STATE_SIZE / sizeof(u64)];
} sha3;
};
int func; /* KIMD function to use */
bool first_message_part;
};
struct shash_desc;
int s390_sha_update_blocks(struct shash_desc *desc, const u8 *data,
unsigned int len);
int s390_sha_finup(struct shash_desc *desc, const u8 *src, unsigned int len,
u8 *out);
static inline void __check_s390_sha_ctx_size(void)
{
BUILD_BUG_ON(S390_SHA_CTX_SIZE != sizeof(struct s390_sha_ctx));
}
#endif


@@ -1,157 +0,0 @@
// SPDX-License-Identifier: GPL-2.0+
/*
* Cryptographic API.
*
* s390 implementation of the SHA256 and SHA224 Secure Hash Algorithm.
*
* s390 Version:
* Copyright IBM Corp. 2019
* Author(s): Joerg Schmidbauer (jschmidb@de.ibm.com)
*/
#include <asm/cpacf.h>
#include <crypto/internal/hash.h>
#include <crypto/sha3.h>
#include <linux/cpufeature.h>
#include <linux/errno.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/string.h>
#include "sha.h"
static int sha3_256_init(struct shash_desc *desc)
{
struct s390_sha_ctx *sctx = shash_desc_ctx(desc);
sctx->first_message_part = test_facility(86);
if (!sctx->first_message_part)
memset(sctx->state, 0, sizeof(sctx->state));
sctx->count = 0;
sctx->func = CPACF_KIMD_SHA3_256;
return 0;
}
static int sha3_256_export(struct shash_desc *desc, void *out)
{
struct s390_sha_ctx *sctx = shash_desc_ctx(desc);
union {
u8 *u8;
u64 *u64;
} p = { .u8 = out };
int i;
if (sctx->first_message_part) {
memset(out, 0, SHA3_STATE_SIZE);
return 0;
}
for (i = 0; i < SHA3_STATE_SIZE / 8; i++)
put_unaligned(le64_to_cpu(sctx->sha3.state[i]), p.u64++);
return 0;
}
static int sha3_256_import(struct shash_desc *desc, const void *in)
{
struct s390_sha_ctx *sctx = shash_desc_ctx(desc);
union {
const u8 *u8;
const u64 *u64;
} p = { .u8 = in };
int i;
for (i = 0; i < SHA3_STATE_SIZE / 8; i++)
sctx->sha3.state[i] = cpu_to_le64(get_unaligned(p.u64++));
sctx->count = 0;
sctx->first_message_part = 0;
sctx->func = CPACF_KIMD_SHA3_256;
return 0;
}
static int sha3_224_import(struct shash_desc *desc, const void *in)
{
struct s390_sha_ctx *sctx = shash_desc_ctx(desc);
sha3_256_import(desc, in);
sctx->func = CPACF_KIMD_SHA3_224;
return 0;
}
static struct shash_alg sha3_256_alg = {
.digestsize = SHA3_256_DIGEST_SIZE, /* = 32 */
.init = sha3_256_init,
.update = s390_sha_update_blocks,
.finup = s390_sha_finup,
.export = sha3_256_export,
.import = sha3_256_import,
.descsize = S390_SHA_CTX_SIZE,
.statesize = SHA3_STATE_SIZE,
.base = {
.cra_name = "sha3-256",
.cra_driver_name = "sha3-256-s390",
.cra_priority = 300,
.cra_flags = CRYPTO_AHASH_ALG_BLOCK_ONLY,
.cra_blocksize = SHA3_256_BLOCK_SIZE,
.cra_module = THIS_MODULE,
}
};
static int sha3_224_init(struct shash_desc *desc)
{
struct s390_sha_ctx *sctx = shash_desc_ctx(desc);
sha3_256_init(desc);
sctx->func = CPACF_KIMD_SHA3_224;
return 0;
}
static struct shash_alg sha3_224_alg = {
.digestsize = SHA3_224_DIGEST_SIZE,
.init = sha3_224_init,
.update = s390_sha_update_blocks,
.finup = s390_sha_finup,
.export = sha3_256_export, /* same as for 256 */
.import = sha3_224_import, /* function code different! */
.descsize = S390_SHA_CTX_SIZE,
.statesize = SHA3_STATE_SIZE,
.base = {
.cra_name = "sha3-224",
.cra_driver_name = "sha3-224-s390",
.cra_priority = 300,
.cra_flags = CRYPTO_AHASH_ALG_BLOCK_ONLY,
.cra_blocksize = SHA3_224_BLOCK_SIZE,
.cra_module = THIS_MODULE,
}
};
static int __init sha3_256_s390_init(void)
{
int ret;
if (!cpacf_query_func(CPACF_KIMD, CPACF_KIMD_SHA3_256))
return -ENODEV;
ret = crypto_register_shash(&sha3_256_alg);
if (ret < 0)
goto out;
ret = crypto_register_shash(&sha3_224_alg);
if (ret < 0)
crypto_unregister_shash(&sha3_256_alg);
out:
return ret;
}
static void __exit sha3_256_s390_fini(void)
{
crypto_unregister_shash(&sha3_224_alg);
crypto_unregister_shash(&sha3_256_alg);
}
module_cpu_feature_match(S390_CPU_FEATURE_MSA, sha3_256_s390_init);
module_exit(sha3_256_s390_fini);
MODULE_ALIAS_CRYPTO("sha3-256");
MODULE_ALIAS_CRYPTO("sha3-224");
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("SHA3-256 and SHA3-224 Secure Hash Algorithm");


@@ -1,157 +0,0 @@
// SPDX-License-Identifier: GPL-2.0+
/*
* Cryptographic API.
*
* s390 implementation of the SHA512 and SHA384 Secure Hash Algorithm.
*
* Copyright IBM Corp. 2019
* Author(s): Joerg Schmidbauer (jschmidb@de.ibm.com)
*/
#include <asm/cpacf.h>
#include <crypto/internal/hash.h>
#include <crypto/sha3.h>
#include <linux/cpufeature.h>
#include <linux/errno.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/string.h>
#include "sha.h"
static int sha3_512_init(struct shash_desc *desc)
{
struct s390_sha_ctx *sctx = shash_desc_ctx(desc);
sctx->first_message_part = test_facility(86);
if (!sctx->first_message_part)
memset(sctx->state, 0, sizeof(sctx->state));
sctx->count = 0;
sctx->func = CPACF_KIMD_SHA3_512;
return 0;
}
static int sha3_512_export(struct shash_desc *desc, void *out)
{
struct s390_sha_ctx *sctx = shash_desc_ctx(desc);
union {
u8 *u8;
u64 *u64;
} p = { .u8 = out };
int i;
if (sctx->first_message_part) {
memset(out, 0, SHA3_STATE_SIZE);
return 0;
}
for (i = 0; i < SHA3_STATE_SIZE / 8; i++)
put_unaligned(le64_to_cpu(sctx->sha3.state[i]), p.u64++);
return 0;
}
static int sha3_512_import(struct shash_desc *desc, const void *in)
{
struct s390_sha_ctx *sctx = shash_desc_ctx(desc);
union {
const u8 *u8;
const u64 *u64;
} p = { .u8 = in };
int i;
for (i = 0; i < SHA3_STATE_SIZE / 8; i++)
sctx->sha3.state[i] = cpu_to_le64(get_unaligned(p.u64++));
sctx->count = 0;
sctx->first_message_part = 0;
sctx->func = CPACF_KIMD_SHA3_512;
return 0;
}
static int sha3_384_import(struct shash_desc *desc, const void *in)
{
struct s390_sha_ctx *sctx = shash_desc_ctx(desc);
sha3_512_import(desc, in);
sctx->func = CPACF_KIMD_SHA3_384;
return 0;
}
static struct shash_alg sha3_512_alg = {
.digestsize = SHA3_512_DIGEST_SIZE,
.init = sha3_512_init,
.update = s390_sha_update_blocks,
.finup = s390_sha_finup,
.export = sha3_512_export,
.import = sha3_512_import,
.descsize = S390_SHA_CTX_SIZE,
.statesize = SHA3_STATE_SIZE,
.base = {
.cra_name = "sha3-512",
.cra_driver_name = "sha3-512-s390",
.cra_priority = 300,
.cra_flags = CRYPTO_AHASH_ALG_BLOCK_ONLY,
.cra_blocksize = SHA3_512_BLOCK_SIZE,
.cra_module = THIS_MODULE,
}
};
MODULE_ALIAS_CRYPTO("sha3-512");
static int sha3_384_init(struct shash_desc *desc)
{
struct s390_sha_ctx *sctx = shash_desc_ctx(desc);
sha3_512_init(desc);
sctx->func = CPACF_KIMD_SHA3_384;
return 0;
}
static struct shash_alg sha3_384_alg = {
.digestsize = SHA3_384_DIGEST_SIZE,
.init = sha3_384_init,
.update = s390_sha_update_blocks,
.finup = s390_sha_finup,
.export = sha3_512_export, /* same as for 512 */
.import = sha3_384_import, /* function code different! */
.descsize = S390_SHA_CTX_SIZE,
.statesize = SHA3_STATE_SIZE,
.base = {
.cra_name = "sha3-384",
.cra_driver_name = "sha3-384-s390",
.cra_priority = 300,
.cra_flags = CRYPTO_AHASH_ALG_BLOCK_ONLY,
.cra_blocksize = SHA3_384_BLOCK_SIZE,
.cra_ctxsize = sizeof(struct s390_sha_ctx),
.cra_module = THIS_MODULE,
}
};
MODULE_ALIAS_CRYPTO("sha3-384");
static int __init init(void)
{
int ret;
if (!cpacf_query_func(CPACF_KIMD, CPACF_KIMD_SHA3_512))
return -ENODEV;
ret = crypto_register_shash(&sha3_512_alg);
if (ret < 0)
goto out;
ret = crypto_register_shash(&sha3_384_alg);
if (ret < 0)
crypto_unregister_shash(&sha3_512_alg);
out:
return ret;
}
static void __exit fini(void)
{
crypto_unregister_shash(&sha3_512_alg);
crypto_unregister_shash(&sha3_384_alg);
}
module_cpu_feature_match(S390_CPU_FEATURE_MSA, init);
module_exit(fini);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("SHA3-512 and SHA3-384 Secure Hash Algorithm");

@@ -1,117 +0,0 @@
// SPDX-License-Identifier: GPL-2.0+
/*
* Cryptographic API.
*
* s390 generic implementation of the SHA Secure Hash Algorithms.
*
* Copyright IBM Corp. 2007
* Author(s): Jan Glauber (jang@de.ibm.com)
*/
#include <crypto/internal/hash.h>
#include <linux/export.h>
#include <linux/module.h>
#include <asm/cpacf.h>
#include "sha.h"
int s390_sha_update_blocks(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
unsigned int bsize = crypto_shash_blocksize(desc->tfm);
struct s390_sha_ctx *ctx = shash_desc_ctx(desc);
unsigned int n;
int fc;
fc = ctx->func;
if (ctx->first_message_part)
fc |= CPACF_KIMD_NIP;
/* process as many blocks as possible */
n = (len / bsize) * bsize;
ctx->count += n;
switch (ctx->func) {
case CPACF_KLMD_SHA_512:
case CPACF_KLMD_SHA3_384:
if (ctx->count < n)
ctx->sha512.count_hi++;
break;
}
cpacf_kimd(fc, ctx->state, data, n);
ctx->first_message_part = 0;
return len - n;
}
EXPORT_SYMBOL_GPL(s390_sha_update_blocks);
static int s390_crypto_shash_parmsize(int func)
{
switch (func) {
case CPACF_KLMD_SHA_1:
return 20;
case CPACF_KLMD_SHA_256:
return 32;
case CPACF_KLMD_SHA_512:
return 64;
case CPACF_KLMD_SHA3_224:
case CPACF_KLMD_SHA3_256:
case CPACF_KLMD_SHA3_384:
case CPACF_KLMD_SHA3_512:
return 200;
default:
return -EINVAL;
}
}
int s390_sha_finup(struct shash_desc *desc, const u8 *src, unsigned int len,
u8 *out)
{
struct s390_sha_ctx *ctx = shash_desc_ctx(desc);
int mbl_offset, fc;
u64 bits;
ctx->count += len;
bits = ctx->count * 8;
mbl_offset = s390_crypto_shash_parmsize(ctx->func);
if (mbl_offset < 0)
return -EINVAL;
mbl_offset = mbl_offset / sizeof(u32);
/* set total msg bit length (mbl) in CPACF parmblock */
switch (ctx->func) {
case CPACF_KLMD_SHA_512:
/* The SHA512 parmblock has a 128-bit mbl field. */
if (ctx->count < len)
ctx->sha512.count_hi++;
ctx->sha512.count_hi <<= 3;
ctx->sha512.count_hi |= ctx->count >> 61;
mbl_offset += sizeof(u64) / sizeof(u32);
fallthrough;
case CPACF_KLMD_SHA_1:
case CPACF_KLMD_SHA_256:
memcpy(ctx->state + mbl_offset, &bits, sizeof(bits));
break;
case CPACF_KLMD_SHA3_224:
case CPACF_KLMD_SHA3_256:
case CPACF_KLMD_SHA3_384:
case CPACF_KLMD_SHA3_512:
break;
default:
return -EINVAL;
}
fc = ctx->func;
fc |= test_facility(86) ? CPACF_KLMD_DUFOP : 0;
if (ctx->first_message_part)
fc |= CPACF_KLMD_NIP;
cpacf_klmd(fc, ctx->state, src, len);
/* copy digest to out */
memcpy(out, ctx->state, crypto_shash_digestsize(desc->tfm));
return 0;
}
EXPORT_SYMBOL_GPL(s390_sha_finup);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("s390 SHA cipher common functions");

@@ -353,16 +353,6 @@ config CRYPTO_NHPOLY1305_AVX2
Architecture: x86_64 using:
- AVX2 (Advanced Vector Extensions 2)
config CRYPTO_POLYVAL_CLMUL_NI
tristate "Hash functions: POLYVAL (CLMUL-NI)"
depends on 64BIT
select CRYPTO_POLYVAL
help
POLYVAL hash function for HCTR2
Architecture: x86_64 using:
- CLMUL-NI (carry-less multiplication new instructions)
config CRYPTO_SM3_AVX_X86_64
tristate "Hash functions: SM3 (AVX)"
depends on 64BIT

@@ -46,15 +46,13 @@ obj-$(CONFIG_CRYPTO_AES_NI_INTEL) += aesni-intel.o
aesni-intel-y := aesni-intel_asm.o aesni-intel_glue.o
aesni-intel-$(CONFIG_64BIT) += aes-ctr-avx-x86_64.o \
aes-gcm-aesni-x86_64.o \
aes-xts-avx-x86_64.o \
aes-gcm-avx10-x86_64.o
aes-gcm-vaes-avx2.o \
aes-gcm-vaes-avx512.o \
aes-xts-avx-x86_64.o
obj-$(CONFIG_CRYPTO_GHASH_CLMUL_NI_INTEL) += ghash-clmulni-intel.o
ghash-clmulni-intel-y := ghash-clmulni-intel_asm.o ghash-clmulni-intel_glue.o
obj-$(CONFIG_CRYPTO_POLYVAL_CLMUL_NI) += polyval-clmulni.o
polyval-clmulni-y := polyval-clmulni_asm.o polyval-clmulni_glue.o
obj-$(CONFIG_CRYPTO_NHPOLY1305_SSE2) += nhpoly1305-sse2.o
nhpoly1305-sse2-y := nh-sse2-x86_64.o nhpoly1305-sse2-glue.o
obj-$(CONFIG_CRYPTO_NHPOLY1305_AVX2) += nhpoly1305-avx2.o

@@ -61,15 +61,15 @@
// for the *_aesni functions or AVX for the *_aesni_avx ones. (But it seems
// there are no CPUs that support AES-NI without also PCLMULQDQ and SSE4.1.)
//
// The design generally follows that of aes-gcm-avx10-x86_64.S, and that file is
// The design generally follows that of aes-gcm-vaes-avx512.S, and that file is
// more thoroughly commented. This file has the following notable changes:
//
// - The vector length is fixed at 128-bit, i.e. xmm registers. This means
// there is only one AES block (and GHASH block) per register.
//
// - Without AVX512 / AVX10, only 16 SIMD registers are available instead of
// 32. We work around this by being much more careful about using
// registers, relying heavily on loads to load values as they are needed.
// - Without AVX512, only 16 SIMD registers are available instead of 32. We
// work around this by being much more careful about using registers,
// relying heavily on loads to load values as they are needed.
//
// - Masking is not available either. We work around this by implementing
// partial block loads and stores using overlapping scalar loads and stores
@@ -90,8 +90,8 @@
// multiplication instead of schoolbook multiplication. This saves one
// pclmulqdq instruction per block, at the cost of one 64-bit load, one
// pshufd, and 0.25 pxors per block. (This is without the three-argument
// XOR support that would be provided by AVX512 / AVX10, which would be
// more beneficial to schoolbook than Karatsuba.)
// XOR support that would be provided by AVX512, which would be more
// beneficial to schoolbook than Karatsuba.)
//
// As a rough approximation, we can assume that Karatsuba multiplication is
// faster than schoolbook multiplication in this context if one pshufd and
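To make the trade-off above concrete, here is a hedged, portable C sketch of schoolbook versus Karatsuba carry-less multiplication; clmul_64x64() below is a bit-serial stand-in for a single pclmulqdq, and since addition in GF(2)[x] is XOR, the middle term can be recovered from three 64x64 products instead of four:

#include <linux/types.h>

struct clmul128 { u64 lo, hi; };

/* Bit-serial stand-in for pclmulqdq: 64x64 -> 128-bit carry-less multiply. */
static struct clmul128 clmul_64x64(u64 a, u64 b)
{
	struct clmul128 r = { 0, 0 };
	int i;

	for (i = 0; i < 64; i++) {
		if ((b >> i) & 1) {
			r.lo ^= a << i;
			if (i)
				r.hi ^= a >> (64 - i);
		}
	}
	return r;
}

/* Schoolbook 128x128: four 64x64 multiplies. */
static void clmul_128x128_schoolbook(u64 a_lo, u64 a_hi, u64 b_lo, u64 b_hi,
				     struct clmul128 *lo, struct clmul128 *mid,
				     struct clmul128 *hi)
{
	struct clmul128 m0 = clmul_64x64(a_lo, b_hi);
	struct clmul128 m1 = clmul_64x64(a_hi, b_lo);

	*lo = clmul_64x64(a_lo, b_lo);
	*hi = clmul_64x64(a_hi, b_hi);
	mid->lo = m0.lo ^ m1.lo;
	mid->hi = m0.hi ^ m1.hi;
}

/* Karatsuba 128x128: three multiplies; the middle term falls out via XOR. */
static void clmul_128x128_karatsuba(u64 a_lo, u64 a_hi, u64 b_lo, u64 b_hi,
				    struct clmul128 *lo, struct clmul128 *mid,
				    struct clmul128 *hi)
{
	struct clmul128 m = clmul_64x64(a_lo ^ a_hi, b_lo ^ b_hi);

	*lo = clmul_64x64(a_lo, b_lo);
	*hi = clmul_64x64(a_hi, b_hi);
	mid->lo = m.lo ^ lo->lo ^ hi->lo;
	mid->hi = m.hi ^ lo->hi ^ hi->hi;
}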

File diff suppressed because it is too large

@@ -874,8 +874,38 @@ struct aes_gcm_key_aesni {
#define AES_GCM_KEY_AESNI_SIZE \
(sizeof(struct aes_gcm_key_aesni) + (15 & ~(CRYPTO_MINALIGN - 1)))
/* Key struct used by the VAES + AVX10 implementations of AES-GCM */
struct aes_gcm_key_avx10 {
/* Key struct used by the VAES + AVX2 implementation of AES-GCM */
struct aes_gcm_key_vaes_avx2 {
/*
* Common part of the key. The assembly code prefers 16-byte alignment
* for the round keys; we get this by them being located at the start of
* the struct and the whole struct being 32-byte aligned.
*/
struct aes_gcm_key base;
/*
* Powers of the hash key H^8 through H^1. These are 128-bit values.
* They all have an extra factor of x^-1 and are byte-reversed.
* The assembly code prefers 32-byte alignment for this.
*/
u64 h_powers[8][2] __aligned(32);
/*
* Each entry in this array contains the two halves of an entry of
* h_powers XOR'd together, in the following order:
* H^8,H^6,H^7,H^5,H^4,H^2,H^3,H^1 i.e. indices 0,2,1,3,4,6,5,7.
* This is used for Karatsuba multiplication.
*/
u64 h_powers_xored[8];
};
#define AES_GCM_KEY_VAES_AVX2(key) \
container_of((key), struct aes_gcm_key_vaes_avx2, base)
#define AES_GCM_KEY_VAES_AVX2_SIZE \
(sizeof(struct aes_gcm_key_vaes_avx2) + (31 & ~(CRYPTO_MINALIGN - 1)))
/* Key struct used by the VAES + AVX512 implementation of AES-GCM */
struct aes_gcm_key_vaes_avx512 {
/*
* Common part of the key. The assembly code prefers 16-byte alignment
* for the round keys; we get this by them being located at the start of
@@ -895,10 +925,10 @@ struct aes_gcm_key_avx10 {
/* Three padding blocks required by the assembly code */
u64 padding[3][2];
};
#define AES_GCM_KEY_AVX10(key) \
container_of((key), struct aes_gcm_key_avx10, base)
#define AES_GCM_KEY_AVX10_SIZE \
(sizeof(struct aes_gcm_key_avx10) + (63 & ~(CRYPTO_MINALIGN - 1)))
#define AES_GCM_KEY_VAES_AVX512(key) \
container_of((key), struct aes_gcm_key_vaes_avx512, base)
#define AES_GCM_KEY_VAES_AVX512_SIZE \
(sizeof(struct aes_gcm_key_vaes_avx512) + (63 & ~(CRYPTO_MINALIGN - 1)))
/*
* These flags are passed to the AES-GCM helper functions to specify the
@@ -910,14 +940,16 @@ struct aes_gcm_key_avx10 {
#define FLAG_RFC4106 BIT(0)
#define FLAG_ENC BIT(1)
#define FLAG_AVX BIT(2)
#define FLAG_AVX10_256 BIT(3)
#define FLAG_AVX10_512 BIT(4)
#define FLAG_VAES_AVX2 BIT(3)
#define FLAG_VAES_AVX512 BIT(4)
static inline struct aes_gcm_key *
aes_gcm_key_get(struct crypto_aead *tfm, int flags)
{
if (flags & (FLAG_AVX10_256 | FLAG_AVX10_512))
if (flags & FLAG_VAES_AVX512)
return PTR_ALIGN(crypto_aead_ctx(tfm), 64);
else if (flags & FLAG_VAES_AVX2)
return PTR_ALIGN(crypto_aead_ctx(tfm), 32);
else
return PTR_ALIGN(crypto_aead_ctx(tfm), 16);
}
@@ -927,26 +959,16 @@ aes_gcm_precompute_aesni(struct aes_gcm_key_aesni *key);
asmlinkage void
aes_gcm_precompute_aesni_avx(struct aes_gcm_key_aesni *key);
asmlinkage void
aes_gcm_precompute_vaes_avx10_256(struct aes_gcm_key_avx10 *key);
aes_gcm_precompute_vaes_avx2(struct aes_gcm_key_vaes_avx2 *key);
asmlinkage void
aes_gcm_precompute_vaes_avx10_512(struct aes_gcm_key_avx10 *key);
aes_gcm_precompute_vaes_avx512(struct aes_gcm_key_vaes_avx512 *key);
static void aes_gcm_precompute(struct aes_gcm_key *key, int flags)
{
/*
* To make things a bit easier on the assembly side, the AVX10
* implementations use the same key format. Therefore, a single
* function using 256-bit vectors would suffice here. However, it's
* straightforward to provide a 512-bit one because of how the assembly
* code is structured, and it works nicely because the total size of the
* key powers is a multiple of 512 bits. So we take advantage of that.
*
* A similar situation applies to the AES-NI implementations.
*/
if (flags & FLAG_AVX10_512)
aes_gcm_precompute_vaes_avx10_512(AES_GCM_KEY_AVX10(key));
else if (flags & FLAG_AVX10_256)
aes_gcm_precompute_vaes_avx10_256(AES_GCM_KEY_AVX10(key));
if (flags & FLAG_VAES_AVX512)
aes_gcm_precompute_vaes_avx512(AES_GCM_KEY_VAES_AVX512(key));
else if (flags & FLAG_VAES_AVX2)
aes_gcm_precompute_vaes_avx2(AES_GCM_KEY_VAES_AVX2(key));
else if (flags & FLAG_AVX)
aes_gcm_precompute_aesni_avx(AES_GCM_KEY_AESNI(key));
else
@@ -960,15 +982,21 @@ asmlinkage void
aes_gcm_aad_update_aesni_avx(const struct aes_gcm_key_aesni *key,
u8 ghash_acc[16], const u8 *aad, int aadlen);
asmlinkage void
aes_gcm_aad_update_vaes_avx10(const struct aes_gcm_key_avx10 *key,
u8 ghash_acc[16], const u8 *aad, int aadlen);
aes_gcm_aad_update_vaes_avx2(const struct aes_gcm_key_vaes_avx2 *key,
u8 ghash_acc[16], const u8 *aad, int aadlen);
asmlinkage void
aes_gcm_aad_update_vaes_avx512(const struct aes_gcm_key_vaes_avx512 *key,
u8 ghash_acc[16], const u8 *aad, int aadlen);
static void aes_gcm_aad_update(const struct aes_gcm_key *key, u8 ghash_acc[16],
const u8 *aad, int aadlen, int flags)
{
if (flags & (FLAG_AVX10_256 | FLAG_AVX10_512))
aes_gcm_aad_update_vaes_avx10(AES_GCM_KEY_AVX10(key), ghash_acc,
aad, aadlen);
if (flags & FLAG_VAES_AVX512)
aes_gcm_aad_update_vaes_avx512(AES_GCM_KEY_VAES_AVX512(key),
ghash_acc, aad, aadlen);
else if (flags & FLAG_VAES_AVX2)
aes_gcm_aad_update_vaes_avx2(AES_GCM_KEY_VAES_AVX2(key),
ghash_acc, aad, aadlen);
else if (flags & FLAG_AVX)
aes_gcm_aad_update_aesni_avx(AES_GCM_KEY_AESNI(key), ghash_acc,
aad, aadlen);
@@ -986,13 +1014,13 @@ aes_gcm_enc_update_aesni_avx(const struct aes_gcm_key_aesni *key,
const u32 le_ctr[4], u8 ghash_acc[16],
const u8 *src, u8 *dst, int datalen);
asmlinkage void
aes_gcm_enc_update_vaes_avx10_256(const struct aes_gcm_key_avx10 *key,
const u32 le_ctr[4], u8 ghash_acc[16],
const u8 *src, u8 *dst, int datalen);
aes_gcm_enc_update_vaes_avx2(const struct aes_gcm_key_vaes_avx2 *key,
const u32 le_ctr[4], u8 ghash_acc[16],
const u8 *src, u8 *dst, int datalen);
asmlinkage void
aes_gcm_enc_update_vaes_avx10_512(const struct aes_gcm_key_avx10 *key,
const u32 le_ctr[4], u8 ghash_acc[16],
const u8 *src, u8 *dst, int datalen);
aes_gcm_enc_update_vaes_avx512(const struct aes_gcm_key_vaes_avx512 *key,
const u32 le_ctr[4], u8 ghash_acc[16],
const u8 *src, u8 *dst, int datalen);
asmlinkage void
aes_gcm_dec_update_aesni(const struct aes_gcm_key_aesni *key,
@@ -1003,13 +1031,13 @@ aes_gcm_dec_update_aesni_avx(const struct aes_gcm_key_aesni *key,
const u32 le_ctr[4], u8 ghash_acc[16],
const u8 *src, u8 *dst, int datalen);
asmlinkage void
aes_gcm_dec_update_vaes_avx10_256(const struct aes_gcm_key_avx10 *key,
const u32 le_ctr[4], u8 ghash_acc[16],
const u8 *src, u8 *dst, int datalen);
aes_gcm_dec_update_vaes_avx2(const struct aes_gcm_key_vaes_avx2 *key,
const u32 le_ctr[4], u8 ghash_acc[16],
const u8 *src, u8 *dst, int datalen);
asmlinkage void
aes_gcm_dec_update_vaes_avx10_512(const struct aes_gcm_key_avx10 *key,
const u32 le_ctr[4], u8 ghash_acc[16],
const u8 *src, u8 *dst, int datalen);
aes_gcm_dec_update_vaes_avx512(const struct aes_gcm_key_vaes_avx512 *key,
const u32 le_ctr[4], u8 ghash_acc[16],
const u8 *src, u8 *dst, int datalen);
/* __always_inline to optimize out the branches based on @flags */
static __always_inline void
@@ -1018,14 +1046,14 @@ aes_gcm_update(const struct aes_gcm_key *key,
const u8 *src, u8 *dst, int datalen, int flags)
{
if (flags & FLAG_ENC) {
if (flags & FLAG_AVX10_512)
aes_gcm_enc_update_vaes_avx10_512(AES_GCM_KEY_AVX10(key),
le_ctr, ghash_acc,
src, dst, datalen);
else if (flags & FLAG_AVX10_256)
aes_gcm_enc_update_vaes_avx10_256(AES_GCM_KEY_AVX10(key),
le_ctr, ghash_acc,
src, dst, datalen);
if (flags & FLAG_VAES_AVX512)
aes_gcm_enc_update_vaes_avx512(AES_GCM_KEY_VAES_AVX512(key),
le_ctr, ghash_acc,
src, dst, datalen);
else if (flags & FLAG_VAES_AVX2)
aes_gcm_enc_update_vaes_avx2(AES_GCM_KEY_VAES_AVX2(key),
le_ctr, ghash_acc,
src, dst, datalen);
else if (flags & FLAG_AVX)
aes_gcm_enc_update_aesni_avx(AES_GCM_KEY_AESNI(key),
le_ctr, ghash_acc,
@@ -1034,14 +1062,14 @@ aes_gcm_update(const struct aes_gcm_key *key,
aes_gcm_enc_update_aesni(AES_GCM_KEY_AESNI(key), le_ctr,
ghash_acc, src, dst, datalen);
} else {
if (flags & FLAG_AVX10_512)
aes_gcm_dec_update_vaes_avx10_512(AES_GCM_KEY_AVX10(key),
le_ctr, ghash_acc,
src, dst, datalen);
else if (flags & FLAG_AVX10_256)
aes_gcm_dec_update_vaes_avx10_256(AES_GCM_KEY_AVX10(key),
le_ctr, ghash_acc,
src, dst, datalen);
if (flags & FLAG_VAES_AVX512)
aes_gcm_dec_update_vaes_avx512(AES_GCM_KEY_VAES_AVX512(key),
le_ctr, ghash_acc,
src, dst, datalen);
else if (flags & FLAG_VAES_AVX2)
aes_gcm_dec_update_vaes_avx2(AES_GCM_KEY_VAES_AVX2(key),
le_ctr, ghash_acc,
src, dst, datalen);
else if (flags & FLAG_AVX)
aes_gcm_dec_update_aesni_avx(AES_GCM_KEY_AESNI(key),
le_ctr, ghash_acc,
@@ -1062,9 +1090,13 @@ aes_gcm_enc_final_aesni_avx(const struct aes_gcm_key_aesni *key,
const u32 le_ctr[4], u8 ghash_acc[16],
u64 total_aadlen, u64 total_datalen);
asmlinkage void
aes_gcm_enc_final_vaes_avx10(const struct aes_gcm_key_avx10 *key,
const u32 le_ctr[4], u8 ghash_acc[16],
u64 total_aadlen, u64 total_datalen);
aes_gcm_enc_final_vaes_avx2(const struct aes_gcm_key_vaes_avx2 *key,
const u32 le_ctr[4], u8 ghash_acc[16],
u64 total_aadlen, u64 total_datalen);
asmlinkage void
aes_gcm_enc_final_vaes_avx512(const struct aes_gcm_key_vaes_avx512 *key,
const u32 le_ctr[4], u8 ghash_acc[16],
u64 total_aadlen, u64 total_datalen);
/* __always_inline to optimize out the branches based on @flags */
static __always_inline void
@@ -1072,10 +1104,14 @@ aes_gcm_enc_final(const struct aes_gcm_key *key,
const u32 le_ctr[4], u8 ghash_acc[16],
u64 total_aadlen, u64 total_datalen, int flags)
{
if (flags & (FLAG_AVX10_256 | FLAG_AVX10_512))
aes_gcm_enc_final_vaes_avx10(AES_GCM_KEY_AVX10(key),
le_ctr, ghash_acc,
total_aadlen, total_datalen);
if (flags & FLAG_VAES_AVX512)
aes_gcm_enc_final_vaes_avx512(AES_GCM_KEY_VAES_AVX512(key),
le_ctr, ghash_acc,
total_aadlen, total_datalen);
else if (flags & FLAG_VAES_AVX2)
aes_gcm_enc_final_vaes_avx2(AES_GCM_KEY_VAES_AVX2(key),
le_ctr, ghash_acc,
total_aadlen, total_datalen);
else if (flags & FLAG_AVX)
aes_gcm_enc_final_aesni_avx(AES_GCM_KEY_AESNI(key),
le_ctr, ghash_acc,
@@ -1097,10 +1133,15 @@ aes_gcm_dec_final_aesni_avx(const struct aes_gcm_key_aesni *key,
u64 total_aadlen, u64 total_datalen,
const u8 tag[16], int taglen);
asmlinkage bool __must_check
aes_gcm_dec_final_vaes_avx10(const struct aes_gcm_key_avx10 *key,
const u32 le_ctr[4], const u8 ghash_acc[16],
u64 total_aadlen, u64 total_datalen,
const u8 tag[16], int taglen);
aes_gcm_dec_final_vaes_avx2(const struct aes_gcm_key_vaes_avx2 *key,
const u32 le_ctr[4], const u8 ghash_acc[16],
u64 total_aadlen, u64 total_datalen,
const u8 tag[16], int taglen);
asmlinkage bool __must_check
aes_gcm_dec_final_vaes_avx512(const struct aes_gcm_key_vaes_avx512 *key,
const u32 le_ctr[4], const u8 ghash_acc[16],
u64 total_aadlen, u64 total_datalen,
const u8 tag[16], int taglen);
/* __always_inline to optimize out the branches based on @flags */
static __always_inline bool __must_check
@@ -1108,11 +1149,16 @@ aes_gcm_dec_final(const struct aes_gcm_key *key, const u32 le_ctr[4],
u8 ghash_acc[16], u64 total_aadlen, u64 total_datalen,
u8 tag[16], int taglen, int flags)
{
if (flags & (FLAG_AVX10_256 | FLAG_AVX10_512))
return aes_gcm_dec_final_vaes_avx10(AES_GCM_KEY_AVX10(key),
le_ctr, ghash_acc,
total_aadlen, total_datalen,
tag, taglen);
if (flags & FLAG_VAES_AVX512)
return aes_gcm_dec_final_vaes_avx512(AES_GCM_KEY_VAES_AVX512(key),
le_ctr, ghash_acc,
total_aadlen, total_datalen,
tag, taglen);
else if (flags & FLAG_VAES_AVX2)
return aes_gcm_dec_final_vaes_avx2(AES_GCM_KEY_VAES_AVX2(key),
le_ctr, ghash_acc,
total_aadlen, total_datalen,
tag, taglen);
else if (flags & FLAG_AVX)
return aes_gcm_dec_final_aesni_avx(AES_GCM_KEY_AESNI(key),
le_ctr, ghash_acc,
@@ -1195,10 +1241,14 @@ static int gcm_setkey(struct crypto_aead *tfm, const u8 *raw_key,
BUILD_BUG_ON(offsetof(struct aes_gcm_key_aesni, h_powers) != 496);
BUILD_BUG_ON(offsetof(struct aes_gcm_key_aesni, h_powers_xored) != 624);
BUILD_BUG_ON(offsetof(struct aes_gcm_key_aesni, h_times_x64) != 688);
BUILD_BUG_ON(offsetof(struct aes_gcm_key_avx10, base.aes_key.key_enc) != 0);
BUILD_BUG_ON(offsetof(struct aes_gcm_key_avx10, base.aes_key.key_length) != 480);
BUILD_BUG_ON(offsetof(struct aes_gcm_key_avx10, h_powers) != 512);
BUILD_BUG_ON(offsetof(struct aes_gcm_key_avx10, padding) != 768);
BUILD_BUG_ON(offsetof(struct aes_gcm_key_vaes_avx2, base.aes_key.key_enc) != 0);
BUILD_BUG_ON(offsetof(struct aes_gcm_key_vaes_avx2, base.aes_key.key_length) != 480);
BUILD_BUG_ON(offsetof(struct aes_gcm_key_vaes_avx2, h_powers) != 512);
BUILD_BUG_ON(offsetof(struct aes_gcm_key_vaes_avx2, h_powers_xored) != 640);
BUILD_BUG_ON(offsetof(struct aes_gcm_key_vaes_avx512, base.aes_key.key_enc) != 0);
BUILD_BUG_ON(offsetof(struct aes_gcm_key_vaes_avx512, base.aes_key.key_length) != 480);
BUILD_BUG_ON(offsetof(struct aes_gcm_key_vaes_avx512, h_powers) != 512);
BUILD_BUG_ON(offsetof(struct aes_gcm_key_vaes_avx512, padding) != 768);
if (likely(crypto_simd_usable())) {
err = aes_check_keylen(keylen);
@@ -1231,8 +1281,9 @@ static int gcm_setkey(struct crypto_aead *tfm, const u8 *raw_key,
gf128mul_lle(&h, (const be128 *)x_to_the_minus1);
/* Compute the needed key powers */
if (flags & (FLAG_AVX10_256 | FLAG_AVX10_512)) {
struct aes_gcm_key_avx10 *k = AES_GCM_KEY_AVX10(key);
if (flags & FLAG_VAES_AVX512) {
struct aes_gcm_key_vaes_avx512 *k =
AES_GCM_KEY_VAES_AVX512(key);
for (i = ARRAY_SIZE(k->h_powers) - 1; i >= 0; i--) {
k->h_powers[i][0] = be64_to_cpu(h.b);
@@ -1240,6 +1291,22 @@ static int gcm_setkey(struct crypto_aead *tfm, const u8 *raw_key,
gf128mul_lle(&h, &h1);
}
memset(k->padding, 0, sizeof(k->padding));
} else if (flags & FLAG_VAES_AVX2) {
struct aes_gcm_key_vaes_avx2 *k =
AES_GCM_KEY_VAES_AVX2(key);
static const u8 indices[8] = { 0, 2, 1, 3, 4, 6, 5, 7 };
for (i = ARRAY_SIZE(k->h_powers) - 1; i >= 0; i--) {
k->h_powers[i][0] = be64_to_cpu(h.b);
k->h_powers[i][1] = be64_to_cpu(h.a);
gf128mul_lle(&h, &h1);
}
for (i = 0; i < ARRAY_SIZE(k->h_powers_xored); i++) {
int j = indices[i];
k->h_powers_xored[i] = k->h_powers[j][0] ^
k->h_powers[j][1];
}
} else {
struct aes_gcm_key_aesni *k = AES_GCM_KEY_AESNI(key);
@@ -1508,15 +1575,15 @@ DEFINE_GCM_ALGS(aesni_avx, FLAG_AVX,
"generic-gcm-aesni-avx", "rfc4106-gcm-aesni-avx",
AES_GCM_KEY_AESNI_SIZE, 500);
/* aes_gcm_algs_vaes_avx10_256 */
DEFINE_GCM_ALGS(vaes_avx10_256, FLAG_AVX10_256,
"generic-gcm-vaes-avx10_256", "rfc4106-gcm-vaes-avx10_256",
AES_GCM_KEY_AVX10_SIZE, 700);
/* aes_gcm_algs_vaes_avx2 */
DEFINE_GCM_ALGS(vaes_avx2, FLAG_VAES_AVX2,
"generic-gcm-vaes-avx2", "rfc4106-gcm-vaes-avx2",
AES_GCM_KEY_VAES_AVX2_SIZE, 600);
/* aes_gcm_algs_vaes_avx10_512 */
DEFINE_GCM_ALGS(vaes_avx10_512, FLAG_AVX10_512,
"generic-gcm-vaes-avx10_512", "rfc4106-gcm-vaes-avx10_512",
AES_GCM_KEY_AVX10_SIZE, 800);
/* aes_gcm_algs_vaes_avx512 */
DEFINE_GCM_ALGS(vaes_avx512, FLAG_VAES_AVX512,
"generic-gcm-vaes-avx512", "rfc4106-gcm-vaes-avx512",
AES_GCM_KEY_VAES_AVX512_SIZE, 800);
static int __init register_avx_algs(void)
{
@@ -1548,6 +1615,10 @@ static int __init register_avx_algs(void)
ARRAY_SIZE(skcipher_algs_vaes_avx2));
if (err)
return err;
err = crypto_register_aeads(aes_gcm_algs_vaes_avx2,
ARRAY_SIZE(aes_gcm_algs_vaes_avx2));
if (err)
return err;
if (!boot_cpu_has(X86_FEATURE_AVX512BW) ||
!boot_cpu_has(X86_FEATURE_AVX512VL) ||
@@ -1556,26 +1627,21 @@ static int __init register_avx_algs(void)
XFEATURE_MASK_AVX512, NULL))
return 0;
err = crypto_register_aeads(aes_gcm_algs_vaes_avx10_256,
ARRAY_SIZE(aes_gcm_algs_vaes_avx10_256));
if (err)
return err;
if (boot_cpu_has(X86_FEATURE_PREFER_YMM)) {
int i;
for (i = 0; i < ARRAY_SIZE(skcipher_algs_vaes_avx512); i++)
skcipher_algs_vaes_avx512[i].base.cra_priority = 1;
for (i = 0; i < ARRAY_SIZE(aes_gcm_algs_vaes_avx10_512); i++)
aes_gcm_algs_vaes_avx10_512[i].base.cra_priority = 1;
for (i = 0; i < ARRAY_SIZE(aes_gcm_algs_vaes_avx512); i++)
aes_gcm_algs_vaes_avx512[i].base.cra_priority = 1;
}
err = crypto_register_skciphers(skcipher_algs_vaes_avx512,
ARRAY_SIZE(skcipher_algs_vaes_avx512));
if (err)
return err;
err = crypto_register_aeads(aes_gcm_algs_vaes_avx10_512,
ARRAY_SIZE(aes_gcm_algs_vaes_avx10_512));
err = crypto_register_aeads(aes_gcm_algs_vaes_avx512,
ARRAY_SIZE(aes_gcm_algs_vaes_avx512));
if (err)
return err;
@@ -1595,8 +1661,8 @@ static void unregister_avx_algs(void)
unregister_aeads(aes_gcm_algs_aesni_avx);
unregister_skciphers(skcipher_algs_vaes_avx2);
unregister_skciphers(skcipher_algs_vaes_avx512);
unregister_aeads(aes_gcm_algs_vaes_avx10_256);
unregister_aeads(aes_gcm_algs_vaes_avx10_512);
unregister_aeads(aes_gcm_algs_vaes_avx2);
unregister_aeads(aes_gcm_algs_vaes_avx512);
}
#else /* CONFIG_X86_64 */
static struct aead_alg aes_gcm_algs_aesni[0];
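One design point in the glue code above deserves a spelled-out sketch: the dispatch helpers are __always_inline and take a constant @flags argument, so each outer entry point generated per implementation collapses to a single direct call once the compiler prunes the untaken branches. The names below are made up for illustration and the callees are bare prototypes; this is not the driver's code:

#include <linux/compiler.h>

#define EX_FLAG_IMPL_B	0x1
#define EX_FLAG_IMPL_C	0x2

/* Hypothetical per-implementation entry points. */
void ex_update_impl_a(const void *key, const void *src, void *dst, int len);
void ex_update_impl_b(const void *key, const void *src, void *dst, int len);
void ex_update_impl_c(const void *key, const void *src, void *dst, int len);

/* __always_inline + constant flags: dead branches are removed at compile time. */
static __always_inline void ex_update(const void *key, const void *src,
				      void *dst, int len, int flags)
{
	if (flags & EX_FLAG_IMPL_C)
		ex_update_impl_c(key, src, dst, len);
	else if (flags & EX_FLAG_IMPL_B)
		ex_update_impl_b(key, src, dst, len);
	else
		ex_update_impl_a(key, src, dst, len);
}

/* Each wrapper passes a literal flags value and so becomes one direct call. */
static void ex_update_b(const void *key, const void *src, void *dst, int len)
{
	ex_update(key, src, dst, len, EX_FLAG_IMPL_B);
}

static void ex_update_c(const void *key, const void *src, void *dst, int len)
{
	ex_update(key, src, dst, len, EX_FLAG_IMPL_C);
}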

@@ -1,180 +0,0 @@
// SPDX-License-Identifier: GPL-2.0-only
/*
* Glue code for POLYVAL using PCMULQDQ-NI
*
* Copyright (c) 2007 Nokia Siemens Networks - Mikko Herranen <mh1@iki.fi>
* Copyright (c) 2009 Intel Corp.
* Author: Huang Ying <ying.huang@intel.com>
* Copyright 2021 Google LLC
*/
/*
* Glue code based on ghash-clmulni-intel_glue.c.
*
* This implementation of POLYVAL uses montgomery multiplication
* accelerated by PCLMULQDQ-NI to implement the finite field
* operations.
*/
#include <asm/cpu_device_id.h>
#include <asm/fpu/api.h>
#include <crypto/internal/hash.h>
#include <crypto/polyval.h>
#include <crypto/utils.h>
#include <linux/errno.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/string.h>
#define POLYVAL_ALIGN 16
#define POLYVAL_ALIGN_ATTR __aligned(POLYVAL_ALIGN)
#define POLYVAL_ALIGN_EXTRA ((POLYVAL_ALIGN - 1) & ~(CRYPTO_MINALIGN - 1))
#define POLYVAL_CTX_SIZE (sizeof(struct polyval_tfm_ctx) + POLYVAL_ALIGN_EXTRA)
#define NUM_KEY_POWERS 8
struct polyval_tfm_ctx {
/*
* These powers must be in the order h^8, ..., h^1.
*/
u8 key_powers[NUM_KEY_POWERS][POLYVAL_BLOCK_SIZE] POLYVAL_ALIGN_ATTR;
};
struct polyval_desc_ctx {
u8 buffer[POLYVAL_BLOCK_SIZE];
};
asmlinkage void clmul_polyval_update(const struct polyval_tfm_ctx *keys,
const u8 *in, size_t nblocks, u8 *accumulator);
asmlinkage void clmul_polyval_mul(u8 *op1, const u8 *op2);
static inline struct polyval_tfm_ctx *polyval_tfm_ctx(struct crypto_shash *tfm)
{
return PTR_ALIGN(crypto_shash_ctx(tfm), POLYVAL_ALIGN);
}
static void internal_polyval_update(const struct polyval_tfm_ctx *keys,
const u8 *in, size_t nblocks, u8 *accumulator)
{
kernel_fpu_begin();
clmul_polyval_update(keys, in, nblocks, accumulator);
kernel_fpu_end();
}
static void internal_polyval_mul(u8 *op1, const u8 *op2)
{
kernel_fpu_begin();
clmul_polyval_mul(op1, op2);
kernel_fpu_end();
}
static int polyval_x86_setkey(struct crypto_shash *tfm,
const u8 *key, unsigned int keylen)
{
struct polyval_tfm_ctx *tctx = polyval_tfm_ctx(tfm);
int i;
if (keylen != POLYVAL_BLOCK_SIZE)
return -EINVAL;
memcpy(tctx->key_powers[NUM_KEY_POWERS-1], key, POLYVAL_BLOCK_SIZE);
for (i = NUM_KEY_POWERS-2; i >= 0; i--) {
memcpy(tctx->key_powers[i], key, POLYVAL_BLOCK_SIZE);
internal_polyval_mul(tctx->key_powers[i],
tctx->key_powers[i+1]);
}
return 0;
}
static int polyval_x86_init(struct shash_desc *desc)
{
struct polyval_desc_ctx *dctx = shash_desc_ctx(desc);
memset(dctx, 0, sizeof(*dctx));
return 0;
}
static int polyval_x86_update(struct shash_desc *desc,
const u8 *src, unsigned int srclen)
{
struct polyval_desc_ctx *dctx = shash_desc_ctx(desc);
const struct polyval_tfm_ctx *tctx = polyval_tfm_ctx(desc->tfm);
unsigned int nblocks;
do {
/* Allow rescheduling every 4K bytes. */
nblocks = min(srclen, 4096U) / POLYVAL_BLOCK_SIZE;
internal_polyval_update(tctx, src, nblocks, dctx->buffer);
srclen -= nblocks * POLYVAL_BLOCK_SIZE;
src += nblocks * POLYVAL_BLOCK_SIZE;
} while (srclen >= POLYVAL_BLOCK_SIZE);
return srclen;
}
static int polyval_x86_finup(struct shash_desc *desc, const u8 *src,
unsigned int len, u8 *dst)
{
struct polyval_desc_ctx *dctx = shash_desc_ctx(desc);
const struct polyval_tfm_ctx *tctx = polyval_tfm_ctx(desc->tfm);
if (len) {
crypto_xor(dctx->buffer, src, len);
internal_polyval_mul(dctx->buffer,
tctx->key_powers[NUM_KEY_POWERS-1]);
}
memcpy(dst, dctx->buffer, POLYVAL_BLOCK_SIZE);
return 0;
}
static struct shash_alg polyval_alg = {
.digestsize = POLYVAL_DIGEST_SIZE,
.init = polyval_x86_init,
.update = polyval_x86_update,
.finup = polyval_x86_finup,
.setkey = polyval_x86_setkey,
.descsize = sizeof(struct polyval_desc_ctx),
.base = {
.cra_name = "polyval",
.cra_driver_name = "polyval-clmulni",
.cra_priority = 200,
.cra_flags = CRYPTO_AHASH_ALG_BLOCK_ONLY,
.cra_blocksize = POLYVAL_BLOCK_SIZE,
.cra_ctxsize = POLYVAL_CTX_SIZE,
.cra_module = THIS_MODULE,
},
};
__maybe_unused static const struct x86_cpu_id pcmul_cpu_id[] = {
X86_MATCH_FEATURE(X86_FEATURE_PCLMULQDQ, NULL),
{}
};
MODULE_DEVICE_TABLE(x86cpu, pcmul_cpu_id);
static int __init polyval_clmulni_mod_init(void)
{
if (!x86_match_cpu(pcmul_cpu_id))
return -ENODEV;
if (!boot_cpu_has(X86_FEATURE_AVX))
return -ENODEV;
return crypto_register_shash(&polyval_alg);
}
static void __exit polyval_clmulni_mod_exit(void)
{
crypto_unregister_shash(&polyval_alg);
}
module_init(polyval_clmulni_mod_init);
module_exit(polyval_clmulni_mod_exit);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("POLYVAL hash function accelerated by PCLMULQDQ-NI");
MODULE_ALIAS_CRYPTO("polyval");
MODULE_ALIAS_CRYPTO("polyval-clmulni");

@@ -696,7 +696,7 @@ config CRYPTO_ECB
config CRYPTO_HCTR2
tristate "HCTR2"
select CRYPTO_XCTR
select CRYPTO_POLYVAL
select CRYPTO_LIB_POLYVAL
select CRYPTO_MANAGER
help
HCTR2 length-preserving encryption mode
@@ -881,6 +881,7 @@ menu "Hashes, digests, and MACs"
config CRYPTO_BLAKE2B
tristate "BLAKE2b"
select CRYPTO_HASH
select CRYPTO_LIB_BLAKE2B
help
BLAKE2b cryptographic hash function (RFC 7693)
@@ -947,16 +948,6 @@ config CRYPTO_MICHAEL_MIC
This algorithm is required for TKIP, but it should not be used for
other purposes because of the weakness of the algorithm.
config CRYPTO_POLYVAL
tristate
select CRYPTO_HASH
select CRYPTO_LIB_GF128MUL
help
POLYVAL hash function for HCTR2
This is used in HCTR2. It is not a general-purpose
cryptographic hash function.
config CRYPTO_RMD160
tristate "RIPEMD-160"
select CRYPTO_HASH
@@ -1005,6 +996,7 @@ config CRYPTO_SHA512
config CRYPTO_SHA3
tristate "SHA-3"
select CRYPTO_HASH
select CRYPTO_LIB_SHA3
help
SHA-3 secure hash algorithms (FIPS 202, ISO/IEC 10118-3)

@@ -78,13 +78,12 @@ obj-$(CONFIG_CRYPTO_RMD160) += rmd160.o
obj-$(CONFIG_CRYPTO_SHA1) += sha1.o
obj-$(CONFIG_CRYPTO_SHA256) += sha256.o
obj-$(CONFIG_CRYPTO_SHA512) += sha512.o
obj-$(CONFIG_CRYPTO_SHA3) += sha3_generic.o
obj-$(CONFIG_CRYPTO_SHA3) += sha3.o
obj-$(CONFIG_CRYPTO_SM3_GENERIC) += sm3_generic.o
obj-$(CONFIG_CRYPTO_STREEBOG) += streebog_generic.o
obj-$(CONFIG_CRYPTO_WP512) += wp512.o
CFLAGS_wp512.o := $(call cc-option,-fno-schedule-insns) # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79149
obj-$(CONFIG_CRYPTO_BLAKE2B) += blake2b_generic.o
CFLAGS_blake2b_generic.o := -Wframe-larger-than=4096 # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105930
obj-$(CONFIG_CRYPTO_BLAKE2B) += blake2b.o
obj-$(CONFIG_CRYPTO_ECB) += ecb.o
obj-$(CONFIG_CRYPTO_CBC) += cbc.o
obj-$(CONFIG_CRYPTO_PCBC) += pcbc.o
@@ -173,7 +172,6 @@ jitterentropy_rng-y := jitterentropy.o jitterentropy-kcapi.o
obj-$(CONFIG_CRYPTO_JITTERENTROPY_TESTINTERFACE) += jitterentropy-testing.o
obj-$(CONFIG_CRYPTO_BENCHMARK) += tcrypt.o
obj-$(CONFIG_CRYPTO_GHASH) += ghash-generic.o
obj-$(CONFIG_CRYPTO_POLYVAL) += polyval-generic.o
obj-$(CONFIG_CRYPTO_USER_API) += af_alg.o
obj-$(CONFIG_CRYPTO_USER_API_HASH) += algif_hash.o
obj-$(CONFIG_CRYPTO_USER_API_SKCIPHER) += algif_skcipher.o

@@ -4,7 +4,7 @@
*/
#include <asm/cpufeature.h>
#include <asm/neon.h>
#include <asm/simd.h>
#include "aegis.h"
#include "aegis-neon.h"
@@ -24,32 +24,28 @@ void crypto_aegis128_init_simd(struct aegis_state *state,
const union aegis_block *key,
const u8 *iv)
{
kernel_neon_begin();
crypto_aegis128_init_neon(state, key, iv);
kernel_neon_end();
scoped_ksimd()
crypto_aegis128_init_neon(state, key, iv);
}
void crypto_aegis128_update_simd(struct aegis_state *state, const void *msg)
{
kernel_neon_begin();
crypto_aegis128_update_neon(state, msg);
kernel_neon_end();
scoped_ksimd()
crypto_aegis128_update_neon(state, msg);
}
void crypto_aegis128_encrypt_chunk_simd(struct aegis_state *state, u8 *dst,
const u8 *src, unsigned int size)
{
kernel_neon_begin();
crypto_aegis128_encrypt_chunk_neon(state, dst, src, size);
kernel_neon_end();
scoped_ksimd()
crypto_aegis128_encrypt_chunk_neon(state, dst, src, size);
}
void crypto_aegis128_decrypt_chunk_simd(struct aegis_state *state, u8 *dst,
const u8 *src, unsigned int size)
{
kernel_neon_begin();
crypto_aegis128_decrypt_chunk_neon(state, dst, src, size);
kernel_neon_end();
scoped_ksimd()
crypto_aegis128_decrypt_chunk_neon(state, dst, src, size);
}
int crypto_aegis128_final_simd(struct aegis_state *state,
@@ -58,12 +54,7 @@ int crypto_aegis128_final_simd(struct aegis_state *state,
unsigned int cryptlen,
unsigned int authsize)
{
int ret;
kernel_neon_begin();
ret = crypto_aegis128_final_neon(state, tag_xor, assoclen, cryptlen,
authsize);
kernel_neon_end();
return ret;
scoped_ksimd()
return crypto_aegis128_final_neon(state, tag_xor, assoclen,
cryptlen, authsize);
}

crypto/blake2b.c (new file, 111 lines)

@@ -0,0 +1,111 @@
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* Crypto API support for BLAKE2b
*
* Copyright 2025 Google LLC
*/
#include <crypto/blake2b.h>
#include <crypto/internal/hash.h>
#include <linux/kernel.h>
#include <linux/module.h>
struct blake2b_tfm_ctx {
unsigned int keylen;
u8 key[BLAKE2B_KEY_SIZE];
};
static int crypto_blake2b_setkey(struct crypto_shash *tfm,
const u8 *key, unsigned int keylen)
{
struct blake2b_tfm_ctx *tctx = crypto_shash_ctx(tfm);
if (keylen > BLAKE2B_KEY_SIZE)
return -EINVAL;
memcpy(tctx->key, key, keylen);
tctx->keylen = keylen;
return 0;
}
#define BLAKE2B_CTX(desc) ((struct blake2b_ctx *)shash_desc_ctx(desc))
static int crypto_blake2b_init(struct shash_desc *desc)
{
const struct blake2b_tfm_ctx *tctx = crypto_shash_ctx(desc->tfm);
unsigned int digestsize = crypto_shash_digestsize(desc->tfm);
blake2b_init_key(BLAKE2B_CTX(desc), digestsize,
tctx->key, tctx->keylen);
return 0;
}
static int crypto_blake2b_update(struct shash_desc *desc,
const u8 *data, unsigned int len)
{
blake2b_update(BLAKE2B_CTX(desc), data, len);
return 0;
}
static int crypto_blake2b_final(struct shash_desc *desc, u8 *out)
{
blake2b_final(BLAKE2B_CTX(desc), out);
return 0;
}
static int crypto_blake2b_digest(struct shash_desc *desc,
const u8 *data, unsigned int len, u8 *out)
{
const struct blake2b_tfm_ctx *tctx = crypto_shash_ctx(desc->tfm);
unsigned int digestsize = crypto_shash_digestsize(desc->tfm);
blake2b(tctx->key, tctx->keylen, data, len, out, digestsize);
return 0;
}
#define BLAKE2B_ALG(name, digest_size) \
{ \
.base.cra_name = name, \
.base.cra_driver_name = name "-lib", \
.base.cra_priority = 300, \
.base.cra_flags = CRYPTO_ALG_OPTIONAL_KEY, \
.base.cra_blocksize = BLAKE2B_BLOCK_SIZE, \
.base.cra_ctxsize = sizeof(struct blake2b_tfm_ctx), \
.base.cra_module = THIS_MODULE, \
.digestsize = digest_size, \
.setkey = crypto_blake2b_setkey, \
.init = crypto_blake2b_init, \
.update = crypto_blake2b_update, \
.final = crypto_blake2b_final, \
.digest = crypto_blake2b_digest, \
.descsize = sizeof(struct blake2b_ctx), \
}
static struct shash_alg algs[] = {
BLAKE2B_ALG("blake2b-160", BLAKE2B_160_HASH_SIZE),
BLAKE2B_ALG("blake2b-256", BLAKE2B_256_HASH_SIZE),
BLAKE2B_ALG("blake2b-384", BLAKE2B_384_HASH_SIZE),
BLAKE2B_ALG("blake2b-512", BLAKE2B_512_HASH_SIZE),
};
static int __init crypto_blake2b_mod_init(void)
{
return crypto_register_shashes(algs, ARRAY_SIZE(algs));
}
module_init(crypto_blake2b_mod_init);
static void __exit crypto_blake2b_mod_exit(void)
{
crypto_unregister_shashes(algs, ARRAY_SIZE(algs));
}
module_exit(crypto_blake2b_mod_exit);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Crypto API support for BLAKE2b");
MODULE_ALIAS_CRYPTO("blake2b-160");
MODULE_ALIAS_CRYPTO("blake2b-160-lib");
MODULE_ALIAS_CRYPTO("blake2b-256");
MODULE_ALIAS_CRYPTO("blake2b-256-lib");
MODULE_ALIAS_CRYPTO("blake2b-384");
MODULE_ALIAS_CRYPTO("blake2b-384-lib");
MODULE_ALIAS_CRYPTO("blake2b-512");
MODULE_ALIAS_CRYPTO("blake2b-512-lib");

@@ -1,192 +0,0 @@
// SPDX-License-Identifier: (GPL-2.0-only OR Apache-2.0)
/*
* Generic implementation of the BLAKE2b digest algorithm. Based on the BLAKE2b
* reference implementation, but it has been heavily modified for use in the
* kernel. The reference implementation was:
*
* Copyright 2012, Samuel Neves <sneves@dei.uc.pt>. You may use this under
* the terms of the CC0, the OpenSSL Licence, or the Apache Public License
* 2.0, at your option. The terms of these licenses can be found at:
*
* - CC0 1.0 Universal : http://creativecommons.org/publicdomain/zero/1.0
* - OpenSSL license : https://www.openssl.org/source/license.html
* - Apache 2.0 : https://www.apache.org/licenses/LICENSE-2.0
*
* More information about BLAKE2 can be found at https://blake2.net.
*/
#include <crypto/internal/blake2b.h>
#include <crypto/internal/hash.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/string.h>
#include <linux/unaligned.h>
static const u8 blake2b_sigma[12][16] = {
{ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 },
{ 14, 10, 4, 8, 9, 15, 13, 6, 1, 12, 0, 2, 11, 7, 5, 3 },
{ 11, 8, 12, 0, 5, 2, 15, 13, 10, 14, 3, 6, 7, 1, 9, 4 },
{ 7, 9, 3, 1, 13, 12, 11, 14, 2, 6, 5, 10, 4, 0, 15, 8 },
{ 9, 0, 5, 7, 2, 4, 10, 15, 14, 1, 11, 12, 6, 8, 3, 13 },
{ 2, 12, 6, 10, 0, 11, 8, 3, 4, 13, 7, 5, 15, 14, 1, 9 },
{ 12, 5, 1, 15, 14, 13, 4, 10, 0, 7, 6, 3, 9, 2, 8, 11 },
{ 13, 11, 7, 14, 12, 1, 3, 9, 5, 0, 15, 4, 8, 6, 2, 10 },
{ 6, 15, 14, 9, 11, 3, 0, 8, 12, 2, 13, 7, 1, 4, 10, 5 },
{ 10, 2, 8, 4, 7, 6, 1, 5, 15, 11, 9, 14, 3, 12, 13, 0 },
{ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 },
{ 14, 10, 4, 8, 9, 15, 13, 6, 1, 12, 0, 2, 11, 7, 5, 3 }
};
static void blake2b_increment_counter(struct blake2b_state *S, const u64 inc)
{
S->t[0] += inc;
S->t[1] += (S->t[0] < inc);
}
#define G(r,i,a,b,c,d) \
do { \
a = a + b + m[blake2b_sigma[r][2*i+0]]; \
d = ror64(d ^ a, 32); \
c = c + d; \
b = ror64(b ^ c, 24); \
a = a + b + m[blake2b_sigma[r][2*i+1]]; \
d = ror64(d ^ a, 16); \
c = c + d; \
b = ror64(b ^ c, 63); \
} while (0)
#define ROUND(r) \
do { \
G(r,0,v[ 0],v[ 4],v[ 8],v[12]); \
G(r,1,v[ 1],v[ 5],v[ 9],v[13]); \
G(r,2,v[ 2],v[ 6],v[10],v[14]); \
G(r,3,v[ 3],v[ 7],v[11],v[15]); \
G(r,4,v[ 0],v[ 5],v[10],v[15]); \
G(r,5,v[ 1],v[ 6],v[11],v[12]); \
G(r,6,v[ 2],v[ 7],v[ 8],v[13]); \
G(r,7,v[ 3],v[ 4],v[ 9],v[14]); \
} while (0)
static void blake2b_compress_one_generic(struct blake2b_state *S,
const u8 block[BLAKE2B_BLOCK_SIZE])
{
u64 m[16];
u64 v[16];
size_t i;
for (i = 0; i < 16; ++i)
m[i] = get_unaligned_le64(block + i * sizeof(m[i]));
for (i = 0; i < 8; ++i)
v[i] = S->h[i];
v[ 8] = BLAKE2B_IV0;
v[ 9] = BLAKE2B_IV1;
v[10] = BLAKE2B_IV2;
v[11] = BLAKE2B_IV3;
v[12] = BLAKE2B_IV4 ^ S->t[0];
v[13] = BLAKE2B_IV5 ^ S->t[1];
v[14] = BLAKE2B_IV6 ^ S->f[0];
v[15] = BLAKE2B_IV7 ^ S->f[1];
ROUND(0);
ROUND(1);
ROUND(2);
ROUND(3);
ROUND(4);
ROUND(5);
ROUND(6);
ROUND(7);
ROUND(8);
ROUND(9);
ROUND(10);
ROUND(11);
#ifdef CONFIG_CC_IS_CLANG
#pragma nounroll /* https://llvm.org/pr45803 */
#endif
for (i = 0; i < 8; ++i)
S->h[i] = S->h[i] ^ v[i] ^ v[i + 8];
}
#undef G
#undef ROUND
static void blake2b_compress_generic(struct blake2b_state *state,
const u8 *block, size_t nblocks, u32 inc)
{
do {
blake2b_increment_counter(state, inc);
blake2b_compress_one_generic(state, block);
block += BLAKE2B_BLOCK_SIZE;
} while (--nblocks);
}
static int crypto_blake2b_update_generic(struct shash_desc *desc,
const u8 *in, unsigned int inlen)
{
return crypto_blake2b_update_bo(desc, in, inlen,
blake2b_compress_generic);
}
static int crypto_blake2b_finup_generic(struct shash_desc *desc, const u8 *in,
unsigned int inlen, u8 *out)
{
return crypto_blake2b_finup(desc, in, inlen, out,
blake2b_compress_generic);
}
#define BLAKE2B_ALG(name, driver_name, digest_size) \
{ \
.base.cra_name = name, \
.base.cra_driver_name = driver_name, \
.base.cra_priority = 100, \
.base.cra_flags = CRYPTO_ALG_OPTIONAL_KEY | \
CRYPTO_AHASH_ALG_BLOCK_ONLY | \
CRYPTO_AHASH_ALG_FINAL_NONZERO, \
.base.cra_blocksize = BLAKE2B_BLOCK_SIZE, \
.base.cra_ctxsize = sizeof(struct blake2b_tfm_ctx), \
.base.cra_module = THIS_MODULE, \
.digestsize = digest_size, \
.setkey = crypto_blake2b_setkey, \
.init = crypto_blake2b_init, \
.update = crypto_blake2b_update_generic, \
.finup = crypto_blake2b_finup_generic, \
.descsize = BLAKE2B_DESC_SIZE, \
.statesize = BLAKE2B_STATE_SIZE, \
}
static struct shash_alg blake2b_algs[] = {
BLAKE2B_ALG("blake2b-160", "blake2b-160-generic",
BLAKE2B_160_HASH_SIZE),
BLAKE2B_ALG("blake2b-256", "blake2b-256-generic",
BLAKE2B_256_HASH_SIZE),
BLAKE2B_ALG("blake2b-384", "blake2b-384-generic",
BLAKE2B_384_HASH_SIZE),
BLAKE2B_ALG("blake2b-512", "blake2b-512-generic",
BLAKE2B_512_HASH_SIZE),
};
static int __init blake2b_mod_init(void)
{
return crypto_register_shashes(blake2b_algs, ARRAY_SIZE(blake2b_algs));
}
static void __exit blake2b_mod_fini(void)
{
crypto_unregister_shashes(blake2b_algs, ARRAY_SIZE(blake2b_algs));
}
module_init(blake2b_mod_init);
module_exit(blake2b_mod_fini);
MODULE_AUTHOR("David Sterba <kdave@kernel.org>");
MODULE_DESCRIPTION("BLAKE2b generic implementation");
MODULE_LICENSE("GPL");
MODULE_ALIAS_CRYPTO("blake2b-160");
MODULE_ALIAS_CRYPTO("blake2b-160-generic");
MODULE_ALIAS_CRYPTO("blake2b-256");
MODULE_ALIAS_CRYPTO("blake2b-256-generic");
MODULE_ALIAS_CRYPTO("blake2b-384");
MODULE_ALIAS_CRYPTO("blake2b-384-generic");
MODULE_ALIAS_CRYPTO("blake2b-512");
MODULE_ALIAS_CRYPTO("blake2b-512-generic");

@@ -17,7 +17,6 @@
*/
#include <crypto/internal/cipher.h>
#include <crypto/internal/hash.h>
#include <crypto/internal/skcipher.h>
#include <crypto/polyval.h>
#include <crypto/scatterwalk.h>
@@ -37,23 +36,14 @@
struct hctr2_instance_ctx {
struct crypto_cipher_spawn blockcipher_spawn;
struct crypto_skcipher_spawn xctr_spawn;
struct crypto_shash_spawn polyval_spawn;
};
struct hctr2_tfm_ctx {
struct crypto_cipher *blockcipher;
struct crypto_skcipher *xctr;
struct crypto_shash *polyval;
struct polyval_key poly_key;
struct polyval_elem hashed_tweaklens[2];
u8 L[BLOCKCIPHER_BLOCK_SIZE];
int hashed_tweak_offset;
/*
* This struct is allocated with extra space for two exported hash
* states. Since the hash state size is not known at compile-time, we
* can't add these to the struct directly.
*
* hashed_tweaklen_divisible;
* hashed_tweaklen_remainder;
*/
};
struct hctr2_request_ctx {
@@ -63,39 +53,17 @@ struct hctr2_request_ctx {
struct scatterlist *bulk_part_src;
struct scatterlist sg_src[2];
struct scatterlist sg_dst[2];
struct polyval_elem hashed_tweak;
/*
* Sub-request sizes are unknown at compile-time, so they need to go
* after the members with known sizes.
* skcipher sub-request size is unknown at compile-time, so it needs to
* go after the members with known sizes.
*/
union {
struct shash_desc hash_desc;
struct polyval_ctx poly_ctx;
struct skcipher_request xctr_req;
} u;
/*
* This struct is allocated with extra space for one exported hash
* state. Since the hash state size is not known at compile-time, we
* can't add it to the struct directly.
*
* hashed_tweak;
*/
};
static inline u8 *hctr2_hashed_tweaklen(const struct hctr2_tfm_ctx *tctx,
bool has_remainder)
{
u8 *p = (u8 *)tctx + sizeof(*tctx);
if (has_remainder) /* For messages not a multiple of block length */
p += crypto_shash_statesize(tctx->polyval);
return p;
}
static inline u8 *hctr2_hashed_tweak(const struct hctr2_tfm_ctx *tctx,
struct hctr2_request_ctx *rctx)
{
return (u8 *)rctx + tctx->hashed_tweak_offset;
}
/*
* The input data for each HCTR2 hash step begins with a 16-byte block that
* contains the tweak length and a flag that indicates whether the input is evenly
@@ -106,24 +74,23 @@ static inline u8 *hctr2_hashed_tweak(const struct hctr2_tfm_ctx *tctx,
*
* These precomputed hashes are stored in hctr2_tfm_ctx.
*/
static int hctr2_hash_tweaklen(struct hctr2_tfm_ctx *tctx, bool has_remainder)
static void hctr2_hash_tweaklens(struct hctr2_tfm_ctx *tctx)
{
SHASH_DESC_ON_STACK(shash, tfm->polyval);
__le64 tweak_length_block[2];
int err;
struct polyval_ctx ctx;
shash->tfm = tctx->polyval;
memset(tweak_length_block, 0, sizeof(tweak_length_block));
for (int has_remainder = 0; has_remainder < 2; has_remainder++) {
const __le64 tweak_length_block[2] = {
cpu_to_le64(TWEAK_SIZE * 8 * 2 + 2 + has_remainder),
};
tweak_length_block[0] = cpu_to_le64(TWEAK_SIZE * 8 * 2 + 2 + has_remainder);
err = crypto_shash_init(shash);
if (err)
return err;
err = crypto_shash_update(shash, (u8 *)tweak_length_block,
POLYVAL_BLOCK_SIZE);
if (err)
return err;
return crypto_shash_export(shash, hctr2_hashed_tweaklen(tctx, has_remainder));
polyval_init(&ctx, &tctx->poly_key);
polyval_update(&ctx, (const u8 *)&tweak_length_block,
sizeof(tweak_length_block));
static_assert(sizeof(tweak_length_block) == POLYVAL_BLOCK_SIZE);
polyval_export_blkaligned(
&ctx, &tctx->hashed_tweaklens[has_remainder]);
}
memzero_explicit(&ctx, sizeof(ctx));
}
static int hctr2_setkey(struct crypto_skcipher *tfm, const u8 *key,
@@ -156,51 +123,42 @@ static int hctr2_setkey(struct crypto_skcipher *tfm, const u8 *key,
tctx->L[0] = 0x01;
crypto_cipher_encrypt_one(tctx->blockcipher, tctx->L, tctx->L);
crypto_shash_clear_flags(tctx->polyval, CRYPTO_TFM_REQ_MASK);
crypto_shash_set_flags(tctx->polyval, crypto_skcipher_get_flags(tfm) &
CRYPTO_TFM_REQ_MASK);
err = crypto_shash_setkey(tctx->polyval, hbar, BLOCKCIPHER_BLOCK_SIZE);
if (err)
return err;
static_assert(sizeof(hbar) == POLYVAL_BLOCK_SIZE);
polyval_preparekey(&tctx->poly_key, hbar);
memzero_explicit(hbar, sizeof(hbar));
return hctr2_hash_tweaklen(tctx, true) ?: hctr2_hash_tweaklen(tctx, false);
hctr2_hash_tweaklens(tctx);
return 0;
}
static int hctr2_hash_tweak(struct skcipher_request *req)
static void hctr2_hash_tweak(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
const struct hctr2_tfm_ctx *tctx = crypto_skcipher_ctx(tfm);
struct hctr2_request_ctx *rctx = skcipher_request_ctx(req);
struct shash_desc *hash_desc = &rctx->u.hash_desc;
int err;
struct polyval_ctx *poly_ctx = &rctx->u.poly_ctx;
bool has_remainder = req->cryptlen % POLYVAL_BLOCK_SIZE;
hash_desc->tfm = tctx->polyval;
err = crypto_shash_import(hash_desc, hctr2_hashed_tweaklen(tctx, has_remainder));
if (err)
return err;
err = crypto_shash_update(hash_desc, req->iv, TWEAK_SIZE);
if (err)
return err;
polyval_import_blkaligned(poly_ctx, &tctx->poly_key,
&tctx->hashed_tweaklens[has_remainder]);
polyval_update(poly_ctx, req->iv, TWEAK_SIZE);
// Store the hashed tweak, since we need it when computing both
// H(T || N) and H(T || V).
return crypto_shash_export(hash_desc, hctr2_hashed_tweak(tctx, rctx));
static_assert(TWEAK_SIZE % POLYVAL_BLOCK_SIZE == 0);
polyval_export_blkaligned(poly_ctx, &rctx->hashed_tweak);
}
static int hctr2_hash_message(struct skcipher_request *req,
struct scatterlist *sgl,
u8 digest[POLYVAL_DIGEST_SIZE])
static void hctr2_hash_message(struct skcipher_request *req,
struct scatterlist *sgl,
u8 digest[POLYVAL_DIGEST_SIZE])
{
static const u8 padding[BLOCKCIPHER_BLOCK_SIZE] = { 0x1 };
static const u8 padding = 0x1;
struct hctr2_request_ctx *rctx = skcipher_request_ctx(req);
struct shash_desc *hash_desc = &rctx->u.hash_desc;
struct polyval_ctx *poly_ctx = &rctx->u.poly_ctx;
const unsigned int bulk_len = req->cryptlen - BLOCKCIPHER_BLOCK_SIZE;
struct sg_mapping_iter miter;
unsigned int remainder = bulk_len % BLOCKCIPHER_BLOCK_SIZE;
int i;
int err = 0;
int n = 0;
sg_miter_start(&miter, sgl, sg_nents(sgl),
@@ -208,22 +166,13 @@ static int hctr2_hash_message(struct skcipher_request *req,
for (i = 0; i < bulk_len; i += n) {
sg_miter_next(&miter);
n = min_t(unsigned int, miter.length, bulk_len - i);
err = crypto_shash_update(hash_desc, miter.addr, n);
if (err)
break;
polyval_update(poly_ctx, miter.addr, n);
}
sg_miter_stop(&miter);
if (err)
return err;
if (remainder) {
err = crypto_shash_update(hash_desc, padding,
BLOCKCIPHER_BLOCK_SIZE - remainder);
if (err)
return err;
}
return crypto_shash_final(hash_desc, digest);
if (req->cryptlen % BLOCKCIPHER_BLOCK_SIZE)
polyval_update(poly_ctx, &padding, 1);
polyval_final(poly_ctx, digest);
}
static int hctr2_finish(struct skcipher_request *req)
@@ -231,19 +180,14 @@ static int hctr2_finish(struct skcipher_request *req)
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
const struct hctr2_tfm_ctx *tctx = crypto_skcipher_ctx(tfm);
struct hctr2_request_ctx *rctx = skcipher_request_ctx(req);
struct polyval_ctx *poly_ctx = &rctx->u.poly_ctx;
u8 digest[POLYVAL_DIGEST_SIZE];
struct shash_desc *hash_desc = &rctx->u.hash_desc;
int err;
// U = UU ^ H(T || V)
// or M = MM ^ H(T || N)
hash_desc->tfm = tctx->polyval;
err = crypto_shash_import(hash_desc, hctr2_hashed_tweak(tctx, rctx));
if (err)
return err;
err = hctr2_hash_message(req, rctx->bulk_part_dst, digest);
if (err)
return err;
polyval_import_blkaligned(poly_ctx, &tctx->poly_key,
&rctx->hashed_tweak);
hctr2_hash_message(req, rctx->bulk_part_dst, digest);
crypto_xor(rctx->first_block, digest, BLOCKCIPHER_BLOCK_SIZE);
// Copy U (or M) into dst scatterlist
@@ -269,7 +213,6 @@ static int hctr2_crypt(struct skcipher_request *req, bool enc)
struct hctr2_request_ctx *rctx = skcipher_request_ctx(req);
u8 digest[POLYVAL_DIGEST_SIZE];
int bulk_len = req->cryptlen - BLOCKCIPHER_BLOCK_SIZE;
int err;
// Requests must be at least one block
if (req->cryptlen < BLOCKCIPHER_BLOCK_SIZE)
@@ -287,12 +230,8 @@ static int hctr2_crypt(struct skcipher_request *req, bool enc)
// MM = M ^ H(T || N)
// or UU = U ^ H(T || V)
err = hctr2_hash_tweak(req);
if (err)
return err;
err = hctr2_hash_message(req, rctx->bulk_part_src, digest);
if (err)
return err;
hctr2_hash_tweak(req);
hctr2_hash_message(req, rctx->bulk_part_src, digest);
crypto_xor(digest, rctx->first_block, BLOCKCIPHER_BLOCK_SIZE);
// UU = E(MM)
@@ -338,8 +277,6 @@ static int hctr2_init_tfm(struct crypto_skcipher *tfm)
struct hctr2_tfm_ctx *tctx = crypto_skcipher_ctx(tfm);
struct crypto_skcipher *xctr;
struct crypto_cipher *blockcipher;
struct crypto_shash *polyval;
unsigned int subreq_size;
int err;
xctr = crypto_spawn_skcipher(&ictx->xctr_spawn);
@@ -352,31 +289,17 @@ static int hctr2_init_tfm(struct crypto_skcipher *tfm)
goto err_free_xctr;
}
polyval = crypto_spawn_shash(&ictx->polyval_spawn);
if (IS_ERR(polyval)) {
err = PTR_ERR(polyval);
goto err_free_blockcipher;
}
tctx->xctr = xctr;
tctx->blockcipher = blockcipher;
tctx->polyval = polyval;
BUILD_BUG_ON(offsetofend(struct hctr2_request_ctx, u) !=
sizeof(struct hctr2_request_ctx));
subreq_size = max(sizeof_field(struct hctr2_request_ctx, u.hash_desc) +
crypto_shash_descsize(polyval),
sizeof_field(struct hctr2_request_ctx, u.xctr_req) +
crypto_skcipher_reqsize(xctr));
tctx->hashed_tweak_offset = offsetof(struct hctr2_request_ctx, u) +
subreq_size;
crypto_skcipher_set_reqsize(tfm, tctx->hashed_tweak_offset +
crypto_shash_statesize(polyval));
crypto_skcipher_set_reqsize(
tfm, max(sizeof(struct hctr2_request_ctx),
offsetofend(struct hctr2_request_ctx, u.xctr_req) +
crypto_skcipher_reqsize(xctr)));
return 0;
err_free_blockcipher:
crypto_free_cipher(blockcipher);
err_free_xctr:
crypto_free_skcipher(xctr);
return err;
@@ -388,7 +311,6 @@ static void hctr2_exit_tfm(struct crypto_skcipher *tfm)
crypto_free_cipher(tctx->blockcipher);
crypto_free_skcipher(tctx->xctr);
crypto_free_shash(tctx->polyval);
}
static void hctr2_free_instance(struct skcipher_instance *inst)
@@ -397,21 +319,17 @@ static void hctr2_free_instance(struct skcipher_instance *inst)
crypto_drop_cipher(&ictx->blockcipher_spawn);
crypto_drop_skcipher(&ictx->xctr_spawn);
crypto_drop_shash(&ictx->polyval_spawn);
kfree(inst);
}
static int hctr2_create_common(struct crypto_template *tmpl,
struct rtattr **tb,
const char *xctr_name,
const char *polyval_name)
static int hctr2_create_common(struct crypto_template *tmpl, struct rtattr **tb,
const char *xctr_name)
{
struct skcipher_alg_common *xctr_alg;
u32 mask;
struct skcipher_instance *inst;
struct hctr2_instance_ctx *ictx;
struct crypto_alg *blockcipher_alg;
struct shash_alg *polyval_alg;
char blockcipher_name[CRYPTO_MAX_ALG_NAME];
int len;
int err;
@@ -457,19 +375,6 @@ static int hctr2_create_common(struct crypto_template *tmpl,
if (blockcipher_alg->cra_blocksize != BLOCKCIPHER_BLOCK_SIZE)
goto err_free_inst;
/* Polyval ε-∆U hash function */
err = crypto_grab_shash(&ictx->polyval_spawn,
skcipher_crypto_instance(inst),
polyval_name, 0, mask);
if (err)
goto err_free_inst;
polyval_alg = crypto_spawn_shash_alg(&ictx->polyval_spawn);
/* Ensure Polyval is being used */
err = -EINVAL;
if (strcmp(polyval_alg->base.cra_name, "polyval") != 0)
goto err_free_inst;
/* Instance fields */
err = -ENAMETOOLONG;
@@ -477,22 +382,16 @@ static int hctr2_create_common(struct crypto_template *tmpl,
blockcipher_alg->cra_name) >= CRYPTO_MAX_ALG_NAME)
goto err_free_inst;
if (snprintf(inst->alg.base.cra_driver_name, CRYPTO_MAX_ALG_NAME,
"hctr2_base(%s,%s)",
xctr_alg->base.cra_driver_name,
polyval_alg->base.cra_driver_name) >= CRYPTO_MAX_ALG_NAME)
"hctr2_base(%s,polyval-lib)",
xctr_alg->base.cra_driver_name) >= CRYPTO_MAX_ALG_NAME)
goto err_free_inst;
inst->alg.base.cra_blocksize = BLOCKCIPHER_BLOCK_SIZE;
inst->alg.base.cra_ctxsize = sizeof(struct hctr2_tfm_ctx) +
polyval_alg->statesize * 2;
inst->alg.base.cra_ctxsize = sizeof(struct hctr2_tfm_ctx);
inst->alg.base.cra_alignmask = xctr_alg->base.cra_alignmask;
/*
* The hash function is called twice, so it is weighted higher than the
* xctr and blockcipher.
*/
inst->alg.base.cra_priority = (2 * xctr_alg->base.cra_priority +
4 * polyval_alg->base.cra_priority +
blockcipher_alg->cra_priority) / 7;
blockcipher_alg->cra_priority) /
3;
inst->alg.setkey = hctr2_setkey;
inst->alg.encrypt = hctr2_encrypt;
@@ -525,8 +424,11 @@ static int hctr2_create_base(struct crypto_template *tmpl, struct rtattr **tb)
polyval_name = crypto_attr_alg_name(tb[2]);
if (IS_ERR(polyval_name))
return PTR_ERR(polyval_name);
if (strcmp(polyval_name, "polyval") != 0 &&
strcmp(polyval_name, "polyval-lib") != 0)
return -ENOENT;
return hctr2_create_common(tmpl, tb, xctr_name, polyval_name);
return hctr2_create_common(tmpl, tb, xctr_name);
}
static int hctr2_create(struct crypto_template *tmpl, struct rtattr **tb)
@@ -542,7 +444,7 @@ static int hctr2_create(struct crypto_template *tmpl, struct rtattr **tb)
blockcipher_name) >= CRYPTO_MAX_ALG_NAME)
return -ENAMETOOLONG;
return hctr2_create_common(tmpl, tb, xctr_name, "polyval");
return hctr2_create_common(tmpl, tb, xctr_name);
}
static struct crypto_template hctr2_tmpls[] = {
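The reworked hctr2.c above caches partially-hashed POLYVAL state with the new library calls instead of crypto_shash export/import. A hedged sketch of that caching pattern, with names mirroring the call sites in the diff (polyval_preparekey, polyval_init, polyval_update, polyval_export_blkaligned, polyval_import_blkaligned, polyval_final); the exact prototypes live in <crypto/polyval.h> and are only inferred here:

#include <crypto/polyval.h>

/*
 * Setkey-time caching as done above: hash a fixed, block-aligned prefix once,
 * export the state, and resume from it for every request. The key is assumed
 * to have been set up earlier with polyval_preparekey().
 */
static void example_cache_prefix(struct polyval_key *key,
				 struct polyval_elem *cached,
				 const u8 prefix[POLYVAL_BLOCK_SIZE])
{
	struct polyval_ctx ctx;

	polyval_init(&ctx, key);
	polyval_update(&ctx, prefix, POLYVAL_BLOCK_SIZE);
	polyval_export_blkaligned(&ctx, cached);
}

/* Per-request: resume from the cached state and hash the variable part. */
static void example_resume_and_finish(const struct polyval_key *key,
				      const struct polyval_elem *cached,
				      const u8 *msg, unsigned int len,
				      u8 digest[POLYVAL_DIGEST_SIZE])
{
	struct polyval_ctx ctx;

	polyval_import_blkaligned(&ctx, key, cached);
	polyval_update(&ctx, msg, len);
	polyval_final(&ctx, digest);
}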

@@ -48,7 +48,7 @@
#include "jitterentropy.h"
#define JENT_CONDITIONING_HASH "sha3-256-generic"
#define JENT_CONDITIONING_HASH "sha3-256"
/***************************************************************************
* Helper function
@@ -230,15 +230,7 @@ static int jent_kcapi_init(struct crypto_tfm *tfm)
spin_lock_init(&rng->jent_lock);
/*
* Use SHA3-256 as conditioner. We allocate only the generic
* implementation as we are not interested in high-performance. The
* execution time of the SHA3 operation is measured and adds to the
* Jitter RNG's unpredictable behavior. If we have a slower hash
* implementation, the execution timing variations are larger. When
* using a fast implementation, we would need to call it more often
* as its variations are lower.
*/
/* Use SHA3-256 as conditioner */
hash = crypto_alloc_shash(JENT_CONDITIONING_HASH, 0, 0);
if (IS_ERR(hash)) {
pr_err("Cannot allocate conditioning digest\n");

@@ -1,205 +0,0 @@
// SPDX-License-Identifier: GPL-2.0-only
/*
* POLYVAL: hash function for HCTR2.
*
* Copyright (c) 2007 Nokia Siemens Networks - Mikko Herranen <mh1@iki.fi>
* Copyright (c) 2009 Intel Corp.
* Author: Huang Ying <ying.huang@intel.com>
* Copyright 2021 Google LLC
*/
/*
* Code based on crypto/ghash-generic.c
*
* POLYVAL is a keyed hash function similar to GHASH. POLYVAL uses a different
* modulus for finite field multiplication which makes hardware accelerated
* implementations on little-endian machines faster. POLYVAL is used in the
* kernel to implement HCTR2, but was originally specified for AES-GCM-SIV
* (RFC 8452).
*
* For more information see:
* Length-preserving encryption with HCTR2:
* https://eprint.iacr.org/2021/1441.pdf
* AES-GCM-SIV: Nonce Misuse-Resistant Authenticated Encryption:
* https://datatracker.ietf.org/doc/html/rfc8452
*
* Like GHASH, POLYVAL is not a cryptographic hash function and should
* not be used outside of crypto modes explicitly designed to use POLYVAL.
*
* This implementation uses a convenient trick involving the GHASH and POLYVAL
* fields. This trick allows multiplication in the POLYVAL field to be
* implemented by using multiplication in the GHASH field as a subroutine. An
* element of the POLYVAL field can be converted to an element of the GHASH
* field by computing x*REVERSE(a), where REVERSE reverses the byte-ordering of
* a. Similarly, an element of the GHASH field can be converted back to the
* POLYVAL field by computing REVERSE(x^{-1}*a). For more information, see:
* https://datatracker.ietf.org/doc/html/rfc8452#appendix-A
*
* By using this trick, we do not need to implement the POLYVAL field for the
* generic implementation.
*
* Warning: this generic implementation is not intended to be used in practice
* and is not constant time. For practical use, a hardware accelerated
* implementation of POLYVAL should be used instead.
*
*/
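The conversion trick described above amounts to the identity from RFC 8452, Appendix A (sketched here in the comment's own notation, with multiplication taken in the GHASH field):

    POLYVAL(H, X_1, ..., X_n) =
        REVERSE(GHASH(x*REVERSE(H), REVERSE(X_1), ..., REVERSE(X_n)))

which is why polyval_setkey() below stores x*REVERSE(key) as the GHASH multiplier and polyval_finup() applies one final byte reversal to the accumulator.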
#include <crypto/gf128mul.h>
#include <crypto/internal/hash.h>
#include <crypto/polyval.h>
#include <crypto/utils.h>
#include <linux/errno.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/string.h>
#include <linux/unaligned.h>
struct polyval_tfm_ctx {
struct gf128mul_4k *gf128;
};
struct polyval_desc_ctx {
union {
u8 buffer[POLYVAL_BLOCK_SIZE];
be128 buffer128;
};
};
static void copy_and_reverse(u8 dst[POLYVAL_BLOCK_SIZE],
const u8 src[POLYVAL_BLOCK_SIZE])
{
u64 a = get_unaligned((const u64 *)&src[0]);
u64 b = get_unaligned((const u64 *)&src[8]);
put_unaligned(swab64(a), (u64 *)&dst[8]);
put_unaligned(swab64(b), (u64 *)&dst[0]);
}
static int polyval_setkey(struct crypto_shash *tfm,
const u8 *key, unsigned int keylen)
{
struct polyval_tfm_ctx *ctx = crypto_shash_ctx(tfm);
be128 k;
if (keylen != POLYVAL_BLOCK_SIZE)
return -EINVAL;
gf128mul_free_4k(ctx->gf128);
BUILD_BUG_ON(sizeof(k) != POLYVAL_BLOCK_SIZE);
copy_and_reverse((u8 *)&k, key);
gf128mul_x_lle(&k, &k);
ctx->gf128 = gf128mul_init_4k_lle(&k);
memzero_explicit(&k, POLYVAL_BLOCK_SIZE);
if (!ctx->gf128)
return -ENOMEM;
return 0;
}
static int polyval_init(struct shash_desc *desc)
{
struct polyval_desc_ctx *dctx = shash_desc_ctx(desc);
memset(dctx, 0, sizeof(*dctx));
return 0;
}
static int polyval_update(struct shash_desc *desc,
const u8 *src, unsigned int srclen)
{
struct polyval_desc_ctx *dctx = shash_desc_ctx(desc);
const struct polyval_tfm_ctx *ctx = crypto_shash_ctx(desc->tfm);
u8 tmp[POLYVAL_BLOCK_SIZE];
do {
copy_and_reverse(tmp, src);
crypto_xor(dctx->buffer, tmp, POLYVAL_BLOCK_SIZE);
gf128mul_4k_lle(&dctx->buffer128, ctx->gf128);
src += POLYVAL_BLOCK_SIZE;
srclen -= POLYVAL_BLOCK_SIZE;
} while (srclen >= POLYVAL_BLOCK_SIZE);
return srclen;
}
static int polyval_finup(struct shash_desc *desc, const u8 *src,
unsigned int len, u8 *dst)
{
struct polyval_desc_ctx *dctx = shash_desc_ctx(desc);
if (len) {
u8 tmp[POLYVAL_BLOCK_SIZE] = {};
memcpy(tmp, src, len);
polyval_update(desc, tmp, POLYVAL_BLOCK_SIZE);
}
copy_and_reverse(dst, dctx->buffer);
return 0;
}
static int polyval_export(struct shash_desc *desc, void *out)
{
struct polyval_desc_ctx *dctx = shash_desc_ctx(desc);
copy_and_reverse(out, dctx->buffer);
return 0;
}
static int polyval_import(struct shash_desc *desc, const void *in)
{
struct polyval_desc_ctx *dctx = shash_desc_ctx(desc);
copy_and_reverse(dctx->buffer, in);
return 0;
}
static void polyval_exit_tfm(struct crypto_shash *tfm)
{
struct polyval_tfm_ctx *ctx = crypto_shash_ctx(tfm);
gf128mul_free_4k(ctx->gf128);
}
static struct shash_alg polyval_alg = {
.digestsize = POLYVAL_DIGEST_SIZE,
.init = polyval_init,
.update = polyval_update,
.finup = polyval_finup,
.setkey = polyval_setkey,
.export = polyval_export,
.import = polyval_import,
.exit_tfm = polyval_exit_tfm,
.statesize = sizeof(struct polyval_desc_ctx),
.descsize = sizeof(struct polyval_desc_ctx),
.base = {
.cra_name = "polyval",
.cra_driver_name = "polyval-generic",
.cra_priority = 100,
.cra_flags = CRYPTO_AHASH_ALG_BLOCK_ONLY,
.cra_blocksize = POLYVAL_BLOCK_SIZE,
.cra_ctxsize = sizeof(struct polyval_tfm_ctx),
.cra_module = THIS_MODULE,
},
};
static int __init polyval_mod_init(void)
{
return crypto_register_shash(&polyval_alg);
}
static void __exit polyval_mod_exit(void)
{
crypto_unregister_shash(&polyval_alg);
}
module_init(polyval_mod_init);
module_exit(polyval_mod_exit);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("POLYVAL hash function");
MODULE_ALIAS_CRYPTO("polyval");
MODULE_ALIAS_CRYPTO("polyval-generic");

crypto/sha3.c (new file, 166 lines)

@@ -0,0 +1,166 @@
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* Crypto API support for SHA-3
* (https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.202.pdf)
*/
#include <crypto/internal/hash.h>
#include <crypto/sha3.h>
#include <linux/kernel.h>
#include <linux/module.h>
#define SHA3_CTX(desc) ((struct sha3_ctx *)shash_desc_ctx(desc))
static int crypto_sha3_224_init(struct shash_desc *desc)
{
sha3_224_init(SHA3_CTX(desc));
return 0;
}
static int crypto_sha3_256_init(struct shash_desc *desc)
{
sha3_256_init(SHA3_CTX(desc));
return 0;
}
static int crypto_sha3_384_init(struct shash_desc *desc)
{
sha3_384_init(SHA3_CTX(desc));
return 0;
}
static int crypto_sha3_512_init(struct shash_desc *desc)
{
sha3_512_init(SHA3_CTX(desc));
return 0;
}
static int crypto_sha3_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
sha3_update(SHA3_CTX(desc), data, len);
return 0;
}
static int crypto_sha3_final(struct shash_desc *desc, u8 *out)
{
sha3_final(SHA3_CTX(desc), out);
return 0;
}
static int crypto_sha3_224_digest(struct shash_desc *desc,
const u8 *data, unsigned int len, u8 *out)
{
sha3_224(data, len, out);
return 0;
}
static int crypto_sha3_256_digest(struct shash_desc *desc,
const u8 *data, unsigned int len, u8 *out)
{
sha3_256(data, len, out);
return 0;
}
static int crypto_sha3_384_digest(struct shash_desc *desc,
const u8 *data, unsigned int len, u8 *out)
{
sha3_384(data, len, out);
return 0;
}
static int crypto_sha3_512_digest(struct shash_desc *desc,
const u8 *data, unsigned int len, u8 *out)
{
sha3_512(data, len, out);
return 0;
}
static int crypto_sha3_export_core(struct shash_desc *desc, void *out)
{
memcpy(out, SHA3_CTX(desc), sizeof(struct sha3_ctx));
return 0;
}
static int crypto_sha3_import_core(struct shash_desc *desc, const void *in)
{
memcpy(SHA3_CTX(desc), in, sizeof(struct sha3_ctx));
return 0;
}
static struct shash_alg algs[] = { {
.digestsize = SHA3_224_DIGEST_SIZE,
.init = crypto_sha3_224_init,
.update = crypto_sha3_update,
.final = crypto_sha3_final,
.digest = crypto_sha3_224_digest,
.export_core = crypto_sha3_export_core,
.import_core = crypto_sha3_import_core,
.descsize = sizeof(struct sha3_ctx),
.base.cra_name = "sha3-224",
.base.cra_driver_name = "sha3-224-lib",
.base.cra_blocksize = SHA3_224_BLOCK_SIZE,
.base.cra_module = THIS_MODULE,
}, {
.digestsize = SHA3_256_DIGEST_SIZE,
.init = crypto_sha3_256_init,
.update = crypto_sha3_update,
.final = crypto_sha3_final,
.digest = crypto_sha3_256_digest,
.export_core = crypto_sha3_export_core,
.import_core = crypto_sha3_import_core,
.descsize = sizeof(struct sha3_ctx),
.base.cra_name = "sha3-256",
.base.cra_driver_name = "sha3-256-lib",
.base.cra_blocksize = SHA3_256_BLOCK_SIZE,
.base.cra_module = THIS_MODULE,
}, {
.digestsize = SHA3_384_DIGEST_SIZE,
.init = crypto_sha3_384_init,
.update = crypto_sha3_update,
.final = crypto_sha3_final,
.digest = crypto_sha3_384_digest,
.export_core = crypto_sha3_export_core,
.import_core = crypto_sha3_import_core,
.descsize = sizeof(struct sha3_ctx),
.base.cra_name = "sha3-384",
.base.cra_driver_name = "sha3-384-lib",
.base.cra_blocksize = SHA3_384_BLOCK_SIZE,
.base.cra_module = THIS_MODULE,
}, {
.digestsize = SHA3_512_DIGEST_SIZE,
.init = crypto_sha3_512_init,
.update = crypto_sha3_update,
.final = crypto_sha3_final,
.digest = crypto_sha3_512_digest,
.export_core = crypto_sha3_export_core,
.import_core = crypto_sha3_import_core,
.descsize = sizeof(struct sha3_ctx),
.base.cra_name = "sha3-512",
.base.cra_driver_name = "sha3-512-lib",
.base.cra_blocksize = SHA3_512_BLOCK_SIZE,
.base.cra_module = THIS_MODULE,
} };
static int __init crypto_sha3_mod_init(void)
{
return crypto_register_shashes(algs, ARRAY_SIZE(algs));
}
module_init(crypto_sha3_mod_init);
static void __exit crypto_sha3_mod_exit(void)
{
crypto_unregister_shashes(algs, ARRAY_SIZE(algs));
}
module_exit(crypto_sha3_mod_exit);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Crypto API support for SHA-3");
MODULE_ALIAS_CRYPTO("sha3-224");
MODULE_ALIAS_CRYPTO("sha3-224-lib");
MODULE_ALIAS_CRYPTO("sha3-256");
MODULE_ALIAS_CRYPTO("sha3-256-lib");
MODULE_ALIAS_CRYPTO("sha3-384");
MODULE_ALIAS_CRYPTO("sha3-384-lib");
MODULE_ALIAS_CRYPTO("sha3-512");
MODULE_ALIAS_CRYPTO("sha3-512-lib");


@@ -1,290 +0,0 @@
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* Cryptographic API.
*
* SHA-3, as specified in
* https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.202.pdf
*
* SHA-3 code by Jeff Garzik <jeff@garzik.org>
* Ard Biesheuvel <ard.biesheuvel@linaro.org>
*/
#include <crypto/internal/hash.h>
#include <crypto/sha3.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/string.h>
#include <linux/unaligned.h>
/*
* On some 32-bit architectures (h8300), GCC ends up using
* over 1 KB of stack if we inline the round calculation into the loop
* in keccakf(). On the other hand, on 64-bit architectures with plenty
* of [64-bit wide] general purpose registers, not inlining it severely
* hurts performance. So let's use 64-bitness as a heuristic to decide
* whether to inline or not.
*/
#ifdef CONFIG_64BIT
#define SHA3_INLINE inline
#else
#define SHA3_INLINE noinline
#endif
#define KECCAK_ROUNDS 24
static const u64 keccakf_rndc[24] = {
0x0000000000000001ULL, 0x0000000000008082ULL, 0x800000000000808aULL,
0x8000000080008000ULL, 0x000000000000808bULL, 0x0000000080000001ULL,
0x8000000080008081ULL, 0x8000000000008009ULL, 0x000000000000008aULL,
0x0000000000000088ULL, 0x0000000080008009ULL, 0x000000008000000aULL,
0x000000008000808bULL, 0x800000000000008bULL, 0x8000000000008089ULL,
0x8000000000008003ULL, 0x8000000000008002ULL, 0x8000000000000080ULL,
0x000000000000800aULL, 0x800000008000000aULL, 0x8000000080008081ULL,
0x8000000000008080ULL, 0x0000000080000001ULL, 0x8000000080008008ULL
};
/* update the state with given number of rounds */
static SHA3_INLINE void keccakf_round(u64 st[25])
{
u64 t[5], tt, bc[5];
/* Theta */
bc[0] = st[0] ^ st[5] ^ st[10] ^ st[15] ^ st[20];
bc[1] = st[1] ^ st[6] ^ st[11] ^ st[16] ^ st[21];
bc[2] = st[2] ^ st[7] ^ st[12] ^ st[17] ^ st[22];
bc[3] = st[3] ^ st[8] ^ st[13] ^ st[18] ^ st[23];
bc[4] = st[4] ^ st[9] ^ st[14] ^ st[19] ^ st[24];
t[0] = bc[4] ^ rol64(bc[1], 1);
t[1] = bc[0] ^ rol64(bc[2], 1);
t[2] = bc[1] ^ rol64(bc[3], 1);
t[3] = bc[2] ^ rol64(bc[4], 1);
t[4] = bc[3] ^ rol64(bc[0], 1);
st[0] ^= t[0];
/* Rho Pi */
tt = st[1];
st[ 1] = rol64(st[ 6] ^ t[1], 44);
st[ 6] = rol64(st[ 9] ^ t[4], 20);
st[ 9] = rol64(st[22] ^ t[2], 61);
st[22] = rol64(st[14] ^ t[4], 39);
st[14] = rol64(st[20] ^ t[0], 18);
st[20] = rol64(st[ 2] ^ t[2], 62);
st[ 2] = rol64(st[12] ^ t[2], 43);
st[12] = rol64(st[13] ^ t[3], 25);
st[13] = rol64(st[19] ^ t[4], 8);
st[19] = rol64(st[23] ^ t[3], 56);
st[23] = rol64(st[15] ^ t[0], 41);
st[15] = rol64(st[ 4] ^ t[4], 27);
st[ 4] = rol64(st[24] ^ t[4], 14);
st[24] = rol64(st[21] ^ t[1], 2);
st[21] = rol64(st[ 8] ^ t[3], 55);
st[ 8] = rol64(st[16] ^ t[1], 45);
st[16] = rol64(st[ 5] ^ t[0], 36);
st[ 5] = rol64(st[ 3] ^ t[3], 28);
st[ 3] = rol64(st[18] ^ t[3], 21);
st[18] = rol64(st[17] ^ t[2], 15);
st[17] = rol64(st[11] ^ t[1], 10);
st[11] = rol64(st[ 7] ^ t[2], 6);
st[ 7] = rol64(st[10] ^ t[0], 3);
st[10] = rol64( tt ^ t[1], 1);
/* Chi */
bc[ 0] = ~st[ 1] & st[ 2];
bc[ 1] = ~st[ 2] & st[ 3];
bc[ 2] = ~st[ 3] & st[ 4];
bc[ 3] = ~st[ 4] & st[ 0];
bc[ 4] = ~st[ 0] & st[ 1];
st[ 0] ^= bc[ 0];
st[ 1] ^= bc[ 1];
st[ 2] ^= bc[ 2];
st[ 3] ^= bc[ 3];
st[ 4] ^= bc[ 4];
bc[ 0] = ~st[ 6] & st[ 7];
bc[ 1] = ~st[ 7] & st[ 8];
bc[ 2] = ~st[ 8] & st[ 9];
bc[ 3] = ~st[ 9] & st[ 5];
bc[ 4] = ~st[ 5] & st[ 6];
st[ 5] ^= bc[ 0];
st[ 6] ^= bc[ 1];
st[ 7] ^= bc[ 2];
st[ 8] ^= bc[ 3];
st[ 9] ^= bc[ 4];
bc[ 0] = ~st[11] & st[12];
bc[ 1] = ~st[12] & st[13];
bc[ 2] = ~st[13] & st[14];
bc[ 3] = ~st[14] & st[10];
bc[ 4] = ~st[10] & st[11];
st[10] ^= bc[ 0];
st[11] ^= bc[ 1];
st[12] ^= bc[ 2];
st[13] ^= bc[ 3];
st[14] ^= bc[ 4];
bc[ 0] = ~st[16] & st[17];
bc[ 1] = ~st[17] & st[18];
bc[ 2] = ~st[18] & st[19];
bc[ 3] = ~st[19] & st[15];
bc[ 4] = ~st[15] & st[16];
st[15] ^= bc[ 0];
st[16] ^= bc[ 1];
st[17] ^= bc[ 2];
st[18] ^= bc[ 3];
st[19] ^= bc[ 4];
bc[ 0] = ~st[21] & st[22];
bc[ 1] = ~st[22] & st[23];
bc[ 2] = ~st[23] & st[24];
bc[ 3] = ~st[24] & st[20];
bc[ 4] = ~st[20] & st[21];
st[20] ^= bc[ 0];
st[21] ^= bc[ 1];
st[22] ^= bc[ 2];
st[23] ^= bc[ 3];
st[24] ^= bc[ 4];
}
static void keccakf(u64 st[25])
{
int round;
for (round = 0; round < KECCAK_ROUNDS; round++) {
keccakf_round(st);
/* Iota */
st[0] ^= keccakf_rndc[round];
}
}
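For reference, one Keccak-f[1600] round as implemented above computes, per FIPS 202 (x and y taken mod 5, A the 5x5 array of 64-bit lanes, RC[i] the round constant):

    Theta:  C[x] = A[x,0] ^ A[x,1] ^ A[x,2] ^ A[x,3] ^ A[x,4]
            D[x] = C[x-1] ^ rol64(C[x+1], 1), then A[x,y] ^= D[x]
    Rho/Pi: each lane is rotated by a fixed offset and moved to a new position
    Chi:    A[x,y] ^= ~A[x+1,y] & A[x+2,y]
    Iota:   A[0,0] ^= RC[i]

The code fuses Theta's D[x] XOR into the Rho/Pi rotations (only st[0] gets its t[0] applied separately), which is why the rol64() operands take the form st[j] ^ t[...].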
int crypto_sha3_init(struct shash_desc *desc)
{
struct sha3_state *sctx = shash_desc_ctx(desc);
memset(sctx->st, 0, sizeof(sctx->st));
return 0;
}
EXPORT_SYMBOL(crypto_sha3_init);
static int crypto_sha3_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
unsigned int rsiz = crypto_shash_blocksize(desc->tfm);
struct sha3_state *sctx = shash_desc_ctx(desc);
unsigned int rsizw = rsiz / 8;
do {
int i;
for (i = 0; i < rsizw; i++)
sctx->st[i] ^= get_unaligned_le64(data + 8 * i);
keccakf(sctx->st);
data += rsiz;
len -= rsiz;
} while (len >= rsiz);
return len;
}
static int crypto_sha3_finup(struct shash_desc *desc, const u8 *src,
unsigned int len, u8 *out)
{
unsigned int digest_size = crypto_shash_digestsize(desc->tfm);
unsigned int rsiz = crypto_shash_blocksize(desc->tfm);
struct sha3_state *sctx = shash_desc_ctx(desc);
__le64 block[SHA3_224_BLOCK_SIZE / 8] = {};
__le64 *digest = (__le64 *)out;
unsigned int rsizw = rsiz / 8;
u8 *p;
int i;
p = memcpy(block, src, len);
p[len++] = 0x06;
p[rsiz - 1] |= 0x80;
for (i = 0; i < rsizw; i++)
sctx->st[i] ^= le64_to_cpu(block[i]);
memzero_explicit(block, sizeof(block));
keccakf(sctx->st);
for (i = 0; i < digest_size / 8; i++)
put_unaligned_le64(sctx->st[i], digest++);
if (digest_size & 4)
put_unaligned_le32(sctx->st[i], (__le32 *)digest);
return 0;
}
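A note on the padding above (worked example, not from the patch): SHA-3 appends the domain-separation bits 01 followed by the pad10*1 rule, so for byte-aligned input the first padding byte is 0x06 and the last byte of the rate block gets 0x80 OR'd in (0x86 when both land on the same byte). For sha3-224, whose digest is 28 bytes, the output loop stores three full 64-bit words and the `digest_size & 4` branch writes the remaining 4 bytes.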
static struct shash_alg algs[] = { {
.digestsize = SHA3_224_DIGEST_SIZE,
.init = crypto_sha3_init,
.update = crypto_sha3_update,
.finup = crypto_sha3_finup,
.descsize = SHA3_STATE_SIZE,
.base.cra_name = "sha3-224",
.base.cra_driver_name = "sha3-224-generic",
.base.cra_flags = CRYPTO_AHASH_ALG_BLOCK_ONLY,
.base.cra_blocksize = SHA3_224_BLOCK_SIZE,
.base.cra_module = THIS_MODULE,
}, {
.digestsize = SHA3_256_DIGEST_SIZE,
.init = crypto_sha3_init,
.update = crypto_sha3_update,
.finup = crypto_sha3_finup,
.descsize = SHA3_STATE_SIZE,
.base.cra_name = "sha3-256",
.base.cra_driver_name = "sha3-256-generic",
.base.cra_flags = CRYPTO_AHASH_ALG_BLOCK_ONLY,
.base.cra_blocksize = SHA3_256_BLOCK_SIZE,
.base.cra_module = THIS_MODULE,
}, {
.digestsize = SHA3_384_DIGEST_SIZE,
.init = crypto_sha3_init,
.update = crypto_sha3_update,
.finup = crypto_sha3_finup,
.descsize = SHA3_STATE_SIZE,
.base.cra_name = "sha3-384",
.base.cra_driver_name = "sha3-384-generic",
.base.cra_flags = CRYPTO_AHASH_ALG_BLOCK_ONLY,
.base.cra_blocksize = SHA3_384_BLOCK_SIZE,
.base.cra_module = THIS_MODULE,
}, {
.digestsize = SHA3_512_DIGEST_SIZE,
.init = crypto_sha3_init,
.update = crypto_sha3_update,
.finup = crypto_sha3_finup,
.descsize = SHA3_STATE_SIZE,
.base.cra_name = "sha3-512",
.base.cra_driver_name = "sha3-512-generic",
.base.cra_flags = CRYPTO_AHASH_ALG_BLOCK_ONLY,
.base.cra_blocksize = SHA3_512_BLOCK_SIZE,
.base.cra_module = THIS_MODULE,
} };
static int __init sha3_generic_mod_init(void)
{
return crypto_register_shashes(algs, ARRAY_SIZE(algs));
}
static void __exit sha3_generic_mod_fini(void)
{
crypto_unregister_shashes(algs, ARRAY_SIZE(algs));
}
module_init(sha3_generic_mod_init);
module_exit(sha3_generic_mod_fini);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("SHA-3 Secure Hash Algorithm");
MODULE_ALIAS_CRYPTO("sha3-224");
MODULE_ALIAS_CRYPTO("sha3-224-generic");
MODULE_ALIAS_CRYPTO("sha3-256");
MODULE_ALIAS_CRYPTO("sha3-256-generic");
MODULE_ALIAS_CRYPTO("sha3-384");
MODULE_ALIAS_CRYPTO("sha3-384-generic");
MODULE_ALIAS_CRYPTO("sha3-512");
MODULE_ALIAS_CRYPTO("sha3-512-generic");


@@ -1690,10 +1690,6 @@ static int do_test(const char *alg, u32 type, u32 mask, int m, u32 num_mb)
ret = min(ret, tcrypt_test("ccm(sm4)"));
break;
case 57:
ret = min(ret, tcrypt_test("polyval"));
break;
case 58:
ret = min(ret, tcrypt_test("gcm(aria)"));
break;


@@ -4332,6 +4332,7 @@ static const struct alg_test_desc alg_test_descs[] = {
.fips_allowed = 1,
}, {
.alg = "blake2b-160",
.generic_driver = "blake2b-160-lib",
.test = alg_test_hash,
.fips_allowed = 0,
.suite = {
@@ -4339,6 +4340,7 @@ static const struct alg_test_desc alg_test_descs[] = {
}
}, {
.alg = "blake2b-256",
.generic_driver = "blake2b-256-lib",
.test = alg_test_hash,
.fips_allowed = 0,
.suite = {
@@ -4346,6 +4348,7 @@ static const struct alg_test_desc alg_test_descs[] = {
}
}, {
.alg = "blake2b-384",
.generic_driver = "blake2b-384-lib",
.test = alg_test_hash,
.fips_allowed = 0,
.suite = {
@@ -4353,6 +4356,7 @@ static const struct alg_test_desc alg_test_descs[] = {
}
}, {
.alg = "blake2b-512",
.generic_driver = "blake2b-512-lib",
.test = alg_test_hash,
.fips_allowed = 0,
.suite = {
@@ -5055,8 +5059,7 @@ static const struct alg_test_desc alg_test_descs[] = {
}
}, {
.alg = "hctr2(aes)",
.generic_driver =
"hctr2_base(xctr(aes-generic),polyval-generic)",
.generic_driver = "hctr2_base(xctr(aes-generic),polyval-lib)",
.test = alg_test_skcipher,
.suite = {
.cipher = __VECS(aes_hctr2_tv_template)
@@ -5100,6 +5103,7 @@ static const struct alg_test_desc alg_test_descs[] = {
}
}, {
.alg = "hmac(sha3-224)",
.generic_driver = "hmac(sha3-224-lib)",
.test = alg_test_hash,
.fips_allowed = 1,
.suite = {
@@ -5107,6 +5111,7 @@ static const struct alg_test_desc alg_test_descs[] = {
}
}, {
.alg = "hmac(sha3-256)",
.generic_driver = "hmac(sha3-256-lib)",
.test = alg_test_hash,
.fips_allowed = 1,
.suite = {
@@ -5114,6 +5119,7 @@ static const struct alg_test_desc alg_test_descs[] = {
}
}, {
.alg = "hmac(sha3-384)",
.generic_driver = "hmac(sha3-384-lib)",
.test = alg_test_hash,
.fips_allowed = 1,
.suite = {
@@ -5121,6 +5127,7 @@ static const struct alg_test_desc alg_test_descs[] = {
}
}, {
.alg = "hmac(sha3-512)",
.generic_driver = "hmac(sha3-512-lib)",
.test = alg_test_hash,
.fips_allowed = 1,
.suite = {
@@ -5363,12 +5370,6 @@ static const struct alg_test_desc alg_test_descs[] = {
.alg = "pkcs1pad(rsa)",
.test = alg_test_null,
.fips_allowed = 1,
}, {
.alg = "polyval",
.test = alg_test_hash,
.suite = {
.hash = __VECS(polyval_tv_template)
}
}, {
.alg = "rfc3686(ctr(aes))",
.test = alg_test_skcipher,
@@ -5474,6 +5475,7 @@ static const struct alg_test_desc alg_test_descs[] = {
}
}, {
.alg = "sha3-224",
.generic_driver = "sha3-224-lib",
.test = alg_test_hash,
.fips_allowed = 1,
.suite = {
@@ -5481,6 +5483,7 @@ static const struct alg_test_desc alg_test_descs[] = {
}
}, {
.alg = "sha3-256",
.generic_driver = "sha3-256-lib",
.test = alg_test_hash,
.fips_allowed = 1,
.suite = {
@@ -5488,6 +5491,7 @@ static const struct alg_test_desc alg_test_descs[] = {
}
}, {
.alg = "sha3-384",
.generic_driver = "sha3-384-lib",
.test = alg_test_hash,
.fips_allowed = 1,
.suite = {
@@ -5495,6 +5499,7 @@ static const struct alg_test_desc alg_test_descs[] = {
}
}, {
.alg = "sha3-512",
.generic_driver = "sha3-512-lib",
.test = alg_test_hash,
.fips_allowed = 1,
.suite = {


@@ -36235,177 +36235,6 @@ static const struct cipher_testvec aes_xctr_tv_template[] = {
};
/*
* Test vectors generated using https://github.com/google/hctr2
*
* To ensure compatibility with RFC 8452, some tests were sourced from
* https://datatracker.ietf.org/doc/html/rfc8452
*/
static const struct hash_testvec polyval_tv_template[] = {
{ // From RFC 8452
.key = "\x31\x07\x28\xd9\x91\x1f\x1f\x38"
"\x37\xb2\x43\x16\xc3\xfa\xb9\xa0",
.plaintext = "\x65\x78\x61\x6d\x70\x6c\x65\x00"
"\x00\x00\x00\x00\x00\x00\x00\x00"
"\x48\x65\x6c\x6c\x6f\x20\x77\x6f"
"\x72\x6c\x64\x00\x00\x00\x00\x00"
"\x38\x00\x00\x00\x00\x00\x00\x00"
"\x58\x00\x00\x00\x00\x00\x00\x00",
.digest = "\xad\x7f\xcf\x0b\x51\x69\x85\x16"
"\x62\x67\x2f\x3c\x5f\x95\x13\x8f",
.psize = 48,
.ksize = 16,
},
{ // From RFC 8452
.key = "\xd9\xb3\x60\x27\x96\x94\x94\x1a"
"\xc5\xdb\xc6\x98\x7a\xda\x73\x77",
.plaintext = "\x00\x00\x00\x00\x00\x00\x00\x00"
"\x00\x00\x00\x00\x00\x00\x00\x00",
.digest = "\x00\x00\x00\x00\x00\x00\x00\x00"
"\x00\x00\x00\x00\x00\x00\x00\x00",
.psize = 16,
.ksize = 16,
},
{ // From RFC 8452
.key = "\xd9\xb3\x60\x27\x96\x94\x94\x1a"
"\xc5\xdb\xc6\x98\x7a\xda\x73\x77",
.plaintext = "\x01\x00\x00\x00\x00\x00\x00\x00"
"\x00\x00\x00\x00\x00\x00\x00\x00"
"\x00\x00\x00\x00\x00\x00\x00\x00"
"\x40\x00\x00\x00\x00\x00\x00\x00",
.digest = "\xeb\x93\xb7\x74\x09\x62\xc5\xe4"
"\x9d\x2a\x90\xa7\xdc\x5c\xec\x74",
.psize = 32,
.ksize = 16,
},
{ // From RFC 8452
.key = "\xd9\xb3\x60\x27\x96\x94\x94\x1a"
"\xc5\xdb\xc6\x98\x7a\xda\x73\x77",
.plaintext = "\x01\x00\x00\x00\x00\x00\x00\x00"
"\x00\x00\x00\x00\x00\x00\x00\x00"
"\x02\x00\x00\x00\x00\x00\x00\x00"
"\x00\x00\x00\x00\x00\x00\x00\x00"
"\x03\x00\x00\x00\x00\x00\x00\x00"
"\x00\x00\x00\x00\x00\x00\x00\x00"
"\x00\x00\x00\x00\x00\x00\x00\x00"
"\x80\x01\x00\x00\x00\x00\x00\x00",
.digest = "\x81\x38\x87\x46\xbc\x22\xd2\x6b"
"\x2a\xbc\x3d\xcb\x15\x75\x42\x22",
.psize = 64,
.ksize = 16,
},
{ // From RFC 8452
.key = "\xd9\xb3\x60\x27\x96\x94\x94\x1a"
"\xc5\xdb\xc6\x98\x7a\xda\x73\x77",
.plaintext = "\x01\x00\x00\x00\x00\x00\x00\x00"
"\x00\x00\x00\x00\x00\x00\x00\x00"
"\x02\x00\x00\x00\x00\x00\x00\x00"
"\x00\x00\x00\x00\x00\x00\x00\x00"
"\x03\x00\x00\x00\x00\x00\x00\x00"
"\x00\x00\x00\x00\x00\x00\x00\x00"
"\x04\x00\x00\x00\x00\x00\x00\x00"
"\x00\x00\x00\x00\x00\x00\x00\x00"
"\x00\x00\x00\x00\x00\x00\x00\x00"
"\x00\x02\x00\x00\x00\x00\x00\x00",
.digest = "\x1e\x39\xb6\xd3\x34\x4d\x34\x8f"
"\x60\x44\xf8\x99\x35\xd1\xcf\x78",
.psize = 80,
.ksize = 16,
},
{ // From RFC 8452
.key = "\xd9\xb3\x60\x27\x96\x94\x94\x1a"
"\xc5\xdb\xc6\x98\x7a\xda\x73\x77",
.plaintext = "\x01\x00\x00\x00\x00\x00\x00\x00"
"\x00\x00\x00\x00\x00\x00\x00\x00"
"\x02\x00\x00\x00\x00\x00\x00\x00"
"\x00\x00\x00\x00\x00\x00\x00\x00"
"\x03\x00\x00\x00\x00\x00\x00\x00"
"\x00\x00\x00\x00\x00\x00\x00\x00"
"\x04\x00\x00\x00\x00\x00\x00\x00"
"\x00\x00\x00\x00\x00\x00\x00\x00"
"\x05\x00\x00\x00\x00\x00\x00\x00"
"\x00\x00\x00\x00\x00\x00\x00\x00"
"\x08\x00\x00\x00\x00\x00\x00\x00"
"\x00\x02\x00\x00\x00\x00\x00\x00",
.digest = "\xff\xcd\x05\xd5\x77\x0f\x34\xad"
"\x92\x67\xf0\xa5\x99\x94\xb1\x5a",
.psize = 96,
.ksize = 16,
},
{ // Random ( 1)
.key = "\x90\xcc\xac\xee\xba\xd7\xd4\x68"
"\x98\xa6\x79\x70\xdf\x66\x15\x6c",
.plaintext = "",
.digest = "\x00\x00\x00\x00\x00\x00\x00\x00"
"\x00\x00\x00\x00\x00\x00\x00\x00",
.psize = 0,
.ksize = 16,
},
{ // Random ( 1)
.key = "\xc1\x45\x71\xf0\x30\x07\x94\xe7"
"\x3a\xdd\xe4\xc6\x19\x2d\x02\xa2",
.plaintext = "\xc1\x5d\x47\xc7\x4c\x7c\x5e\x07"
"\x85\x14\x8f\x79\xcc\x73\x83\xf7"
"\x35\xb8\xcb\x73\x61\xf0\x53\x31"
"\xbf\x84\xde\xb6\xde\xaf\xb0\xb8"
"\xb7\xd9\x11\x91\x89\xfd\x1e\x4c"
"\x84\x4a\x1f\x2a\x87\xa4\xaf\x62"
"\x8d\x7d\x58\xf6\x43\x35\xfc\x53"
"\x8f\x1a\xf6\x12\xe1\x13\x3f\x66"
"\x91\x4b\x13\xd6\x45\xfb\xb0\x7a"
"\xe0\x8b\x8e\x99\xf7\x86\x46\x37"
"\xd1\x22\x9e\x52\xf3\x3f\xd9\x75"
"\x2c\x2c\xc6\xbb\x0e\x08\x14\x29"
"\xe8\x50\x2f\xd8\xbe\xf4\xe9\x69"
"\x4a\xee\xf7\xae\x15\x65\x35\x1e",
.digest = "\x00\x4f\x5d\xe9\x3b\xc0\xd6\x50"
"\x3e\x38\x73\x86\xc6\xda\xca\x7f",
.psize = 112,
.ksize = 16,
},
{ // Random ( 1)
.key = "\x37\xbe\x68\x16\x50\xb9\x4e\xb0"
"\x47\xde\xe2\xbd\xde\xe4\x48\x09",
.plaintext = "\x87\xfc\x68\x9f\xff\xf2\x4a\x1e"
"\x82\x3b\x73\x8f\xc1\xb2\x1b\x7a"
"\x6c\x4f\x81\xbc\x88\x9b\x6c\xa3"
"\x9c\xc2\xa5\xbc\x14\x70\x4c\x9b"
"\x0c\x9f\x59\x92\x16\x4b\x91\x3d"
"\x18\x55\x22\x68\x12\x8c\x63\xb2"
"\x51\xcb\x85\x4b\xd2\xae\x0b\x1c"
"\x5d\x28\x9d\x1d\xb1\xc8\xf0\x77"
"\xe9\xb5\x07\x4e\x06\xc8\xee\xf8"
"\x1b\xed\x72\x2a\x55\x7d\x16\xc9"
"\xf2\x54\xe7\xe9\xe0\x44\x5b\x33"
"\xb1\x49\xee\xff\x43\xfb\x82\xcd"
"\x4a\x70\x78\x81\xa4\x34\x36\xe8"
"\x4c\x28\x54\xa6\x6c\xc3\x6b\x78"
"\xe7\xc0\x5d\xc6\x5d\x81\xab\x70"
"\x08\x86\xa1\xfd\xf4\x77\x55\xfd"
"\xa3\xe9\xe2\x1b\xdf\x99\xb7\x80"
"\xf9\x0a\x4f\x72\x4a\xd3\xaf\xbb"
"\xb3\x3b\xeb\x08\x58\x0f\x79\xce"
"\xa5\x99\x05\x12\x34\xd4\xf4\x86"
"\x37\x23\x1d\xc8\x49\xc0\x92\xae"
"\xa6\xac\x9b\x31\x55\xed\x15\xc6"
"\x05\x17\x37\x8d\x90\x42\xe4\x87"
"\x89\x62\x88\x69\x1c\x6a\xfd\xe3"
"\x00\x2b\x47\x1a\x73\xc1\x51\xc2"
"\xc0\x62\x74\x6a\x9e\xb2\xe5\x21"
"\xbe\x90\xb5\xb0\x50\xca\x88\x68"
"\xe1\x9d\x7a\xdf\x6c\xb7\xb9\x98"
"\xee\x28\x62\x61\x8b\xd1\x47\xf9"
"\x04\x7a\x0b\x5d\xcd\x2b\x65\xf5"
"\x12\xa3\xfe\x1a\xaa\x2c\x78\x42"
"\xb8\xbe\x7d\x74\xeb\x59\xba\xba",
.digest = "\xae\x11\xd4\x60\x2a\x5f\x9e\x42"
"\x89\x04\xc2\x34\x8d\x55\x94\x0a",
.psize = 256,
.ksize = 16,
},
};
/*
* Test vectors generated using https://github.com/google/hctr2
*/


@@ -90,19 +90,18 @@ static int acpi_tad_set_real_time(struct device *dev, struct acpi_tad_rt *rt)
args[0].buffer.pointer = (u8 *)rt;
args[0].buffer.length = sizeof(*rt);
pm_runtime_get_sync(dev);
PM_RUNTIME_ACQUIRE(dev, pm);
if (PM_RUNTIME_ACQUIRE_ERR(&pm))
return -ENXIO;
status = acpi_evaluate_integer(handle, "_SRT", &arg_list, &retval);
pm_runtime_put_sync(dev);
if (ACPI_FAILURE(status) || retval)
return -EIO;
return 0;
}
static int acpi_tad_get_real_time(struct device *dev, struct acpi_tad_rt *rt)
static int acpi_tad_evaluate_grt(struct device *dev, struct acpi_tad_rt *rt)
{
acpi_handle handle = ACPI_HANDLE(dev);
struct acpi_buffer output = { ACPI_ALLOCATE_BUFFER };
@@ -111,12 +110,7 @@ static int acpi_tad_get_real_time(struct device *dev, struct acpi_tad_rt *rt)
acpi_status status;
int ret = -EIO;
pm_runtime_get_sync(dev);
status = acpi_evaluate_object(handle, "_GRT", NULL, &output);
pm_runtime_put_sync(dev);
if (ACPI_FAILURE(status))
goto out_free;
@@ -139,6 +133,21 @@ out_free:
return ret;
}
static int acpi_tad_get_real_time(struct device *dev, struct acpi_tad_rt *rt)
{
int ret;
PM_RUNTIME_ACQUIRE(dev, pm);
if (PM_RUNTIME_ACQUIRE_ERR(&pm))
return -ENXIO;
ret = acpi_tad_evaluate_grt(dev, rt);
if (ret)
return ret;
return 0;
}
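PM_RUNTIME_ACQUIRE()/PM_RUNTIME_ACQUIRE_ERR() replace the manual pm_runtime_get_sync()/pm_runtime_put_sync() bracketing with a scope-bound guard, so the early error returns above can no longer leak a runtime-PM reference. The macro's definition is not part of this hunk; a minimal sketch of how such a guard can be built (hypothetical names, not the real implementation) might look like:

#include <linux/pm_runtime.h>

/* Hypothetical sketch only: resume the device for the current scope and
 * drop the reference automatically when the guard goes out of scope. */
struct rt_pm_guard {
	struct device *dev;
	int err;
};

static inline void rt_pm_guard_release(struct rt_pm_guard *g)
{
	if (!g->err)
		pm_runtime_put_sync(g->dev);
}

#define RT_PM_ACQUIRE(_dev, _name)						\
	struct rt_pm_guard _name __attribute__((cleanup(rt_pm_guard_release))) = \
		{ .dev = (_dev), .err = pm_runtime_resume_and_get(_dev) }

#define RT_PM_ACQUIRE_ERR(_g)	((_g)->err)

pm_runtime_resume_and_get() already drops its reference on failure, so the release helper only needs to put the device on success.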
static char *acpi_tad_rt_next_field(char *s, int *val)
{
char *p;
@@ -266,12 +275,11 @@ static int acpi_tad_wake_set(struct device *dev, char *method, u32 timer_id,
args[0].integer.value = timer_id;
args[1].integer.value = value;
pm_runtime_get_sync(dev);
PM_RUNTIME_ACQUIRE(dev, pm);
if (PM_RUNTIME_ACQUIRE_ERR(&pm))
return -ENXIO;
status = acpi_evaluate_integer(handle, method, &arg_list, &retval);
pm_runtime_put_sync(dev);
if (ACPI_FAILURE(status) || retval)
return -EIO;
@@ -314,12 +322,11 @@ static ssize_t acpi_tad_wake_read(struct device *dev, char *buf, char *method,
args[0].integer.value = timer_id;
pm_runtime_get_sync(dev);
PM_RUNTIME_ACQUIRE(dev, pm);
if (PM_RUNTIME_ACQUIRE_ERR(&pm))
return -ENXIO;
status = acpi_evaluate_integer(handle, method, &arg_list, &retval);
pm_runtime_put_sync(dev);
if (ACPI_FAILURE(status))
return -EIO;
@@ -370,12 +377,11 @@ static int acpi_tad_clear_status(struct device *dev, u32 timer_id)
args[0].integer.value = timer_id;
pm_runtime_get_sync(dev);
PM_RUNTIME_ACQUIRE(dev, pm);
if (PM_RUNTIME_ACQUIRE_ERR(&pm))
return -ENXIO;
status = acpi_evaluate_integer(handle, "_CWS", &arg_list, &retval);
pm_runtime_put_sync(dev);
if (ACPI_FAILURE(status) || retval)
return -EIO;
@@ -411,12 +417,11 @@ static ssize_t acpi_tad_status_read(struct device *dev, char *buf, u32 timer_id)
args[0].integer.value = timer_id;
pm_runtime_get_sync(dev);
PM_RUNTIME_ACQUIRE(dev, pm);
if (PM_RUNTIME_ACQUIRE_ERR(&pm))
return -ENXIO;
status = acpi_evaluate_integer(handle, "_GWS", &arg_list, &retval);
pm_runtime_put_sync(dev);
if (ACPI_FAILURE(status))
return -EIO;
@@ -563,8 +568,6 @@ static void acpi_tad_remove(struct platform_device *pdev)
device_init_wakeup(dev, false);
pm_runtime_get_sync(dev);
if (dd->capabilities & ACPI_TAD_RT)
sysfs_remove_group(&dev->kobj, &acpi_tad_time_attr_group);
@@ -573,14 +576,16 @@ static void acpi_tad_remove(struct platform_device *pdev)
sysfs_remove_group(&dev->kobj, &acpi_tad_attr_group);
acpi_tad_disable_timer(dev, ACPI_TAD_AC_TIMER);
acpi_tad_clear_status(dev, ACPI_TAD_AC_TIMER);
if (dd->capabilities & ACPI_TAD_DC_WAKE) {
acpi_tad_disable_timer(dev, ACPI_TAD_DC_TIMER);
acpi_tad_clear_status(dev, ACPI_TAD_DC_TIMER);
scoped_guard(pm_runtime_noresume, dev) {
acpi_tad_disable_timer(dev, ACPI_TAD_AC_TIMER);
acpi_tad_clear_status(dev, ACPI_TAD_AC_TIMER);
if (dd->capabilities & ACPI_TAD_DC_WAKE) {
acpi_tad_disable_timer(dev, ACPI_TAD_DC_TIMER);
acpi_tad_clear_status(dev, ACPI_TAD_DC_TIMER);
}
}
pm_runtime_put_sync(dev);
pm_runtime_suspend(dev);
pm_runtime_disable(dev);
acpi_remove_cmos_rtc_space_handler(handle);
}


@@ -169,9 +169,12 @@ acpi_ns_walk_namespace(acpi_object_type type,
if (start_node == ACPI_ROOT_OBJECT) {
start_node = acpi_gbl_root_node;
if (!start_node) {
return_ACPI_STATUS(AE_NO_NAMESPACE);
}
}
/* Avoid walking the namespace if the StartNode is NULL */
if (!start_node) {
return_ACPI_STATUS(AE_NO_NAMESPACE);
}
/* Null child means "get first node" */


@@ -91,7 +91,6 @@ enum {
};
struct acpi_battery {
struct mutex lock;
struct mutex update_lock;
struct power_supply *bat;
struct power_supply_desc bat_desc;
@@ -535,11 +534,9 @@ static int acpi_battery_get_info(struct acpi_battery *battery)
struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL };
acpi_status status = AE_ERROR;
mutex_lock(&battery->lock);
status = acpi_evaluate_object(battery->device->handle,
use_bix ? "_BIX":"_BIF",
NULL, &buffer);
mutex_unlock(&battery->lock);
if (ACPI_FAILURE(status)) {
acpi_handle_info(battery->device->handle,
@@ -576,11 +573,8 @@ static int acpi_battery_get_state(struct acpi_battery *battery)
msecs_to_jiffies(cache_time)))
return 0;
mutex_lock(&battery->lock);
status = acpi_evaluate_object(battery->device->handle, "_BST",
NULL, &buffer);
mutex_unlock(&battery->lock);
if (ACPI_FAILURE(status)) {
acpi_handle_info(battery->device->handle,
"_BST evaluation failed: %s",
@@ -628,11 +622,8 @@ static int acpi_battery_set_alarm(struct acpi_battery *battery)
!test_bit(ACPI_BATTERY_ALARM_PRESENT, &battery->flags))
return -ENODEV;
mutex_lock(&battery->lock);
status = acpi_execute_simple_method(battery->device->handle, "_BTP",
battery->alarm);
mutex_unlock(&battery->lock);
if (ACPI_FAILURE(status))
return -ENODEV;
@@ -1235,9 +1226,6 @@ static int acpi_battery_add(struct acpi_device *device)
strscpy(acpi_device_name(device), ACPI_BATTERY_DEVICE_NAME);
strscpy(acpi_device_class(device), ACPI_BATTERY_CLASS);
device->driver_data = battery;
result = devm_mutex_init(&device->dev, &battery->lock);
if (result)
return result;
result = devm_mutex_init(&device->dev, &battery->update_lock);
if (result)


@@ -1,4 +1,3 @@
# SPDX-License-Identifier: GPL-2.0-only
obj-$(CONFIG_ACPI) += int340x_thermal.o
obj-$(CONFIG_DPTF_POWER) += dptf_power.o
obj-$(CONFIG_DPTF_PCH_FIVR) += dptf_pch_fivr.o


@@ -41,7 +41,7 @@ static int pch_fivr_read(acpi_handle handle, char *method, struct pch_fivr_resp
ret = 0;
release_buffer:
kfree(buffer.pointer);
ACPI_FREE(buffer.pointer);
return ret;
}


@@ -240,6 +240,8 @@ static const struct acpi_device_id int3407_device_ids[] = {
{"INTC10D9", 0},
{"INTC1100", 0},
{"INTC1101", 0},
{"INTC10F7", 0},
{"INTC10F8", 0},
{"", 0},
};
MODULE_DEVICE_TABLE(acpi, int3407_device_ids);


@@ -1,94 +0,0 @@
// SPDX-License-Identifier: GPL-2.0-only
/*
* ACPI support for int340x thermal drivers
*
* Copyright (C) 2014, Intel Corporation
* Authors: Zhang Rui <rui.zhang@intel.com>
*/
#include <linux/acpi.h>
#include <linux/module.h>
#include "../internal.h"
#define INT3401_DEVICE 0X01
static const struct acpi_device_id int340x_thermal_device_ids[] = {
{"INT3400"},
{"INT3401", INT3401_DEVICE},
{"INT3402"},
{"INT3403"},
{"INT3404"},
{"INT3406"},
{"INT3407"},
{"INT3408"},
{"INT3409"},
{"INT340A"},
{"INT340B"},
{"INT3532"},
{"INTC1040"},
{"INTC1041"},
{"INTC1042"},
{"INTC1043"},
{"INTC1044"},
{"INTC1045"},
{"INTC1046"},
{"INTC1047"},
{"INTC1048"},
{"INTC1049"},
{"INTC1050"},
{"INTC1060"},
{"INTC1061"},
{"INTC1062"},
{"INTC1063"},
{"INTC1064"},
{"INTC1065"},
{"INTC1066"},
{"INTC1068"},
{"INTC1069"},
{"INTC106A"},
{"INTC106B"},
{"INTC106C"},
{"INTC106D"},
{"INTC10A0"},
{"INTC10A1"},
{"INTC10A2"},
{"INTC10A3"},
{"INTC10A4"},
{"INTC10A5"},
{"INTC10D4"},
{"INTC10D5"},
{"INTC10D6"},
{"INTC10D7"},
{"INTC10D8"},
{"INTC10D9"},
{"INTC10FC"},
{"INTC10FD"},
{"INTC10FE"},
{"INTC10FF"},
{"INTC1100"},
{"INTC1101"},
{"INTC1102"},
{""},
};
static int int340x_thermal_handler_attach(struct acpi_device *adev,
const struct acpi_device_id *id)
{
if (IS_ENABLED(CONFIG_INT340X_THERMAL))
acpi_create_platform_device(adev, NULL);
/* Intel SoC DTS thermal driver needs INT3401 to set IRQ descriptor */
else if (IS_ENABLED(CONFIG_INTEL_SOC_DTS_THERMAL) &&
id->driver_data == INT3401_DEVICE)
acpi_create_platform_device(adev, NULL);
return 1;
}
static struct acpi_scan_handler int340x_thermal_handler = {
.ids = int340x_thermal_device_ids,
.attach = int340x_thermal_handler_attach,
};
void __init acpi_int340x_thermal_init(void)
{
acpi_scan_add_handler(&int340x_thermal_handler);
}


@@ -2294,7 +2294,8 @@ static int acpi_ec_init_workqueues(void)
ec_wq = alloc_ordered_workqueue("kec", 0);
if (!ec_query_wq)
ec_query_wq = alloc_workqueue("kec_query", 0, ec_max_queries);
ec_query_wq = alloc_workqueue("kec_query", WQ_PERCPU,
ec_max_queries);
if (!ec_wq || !ec_query_wq) {
acpi_ec_destroy_workqueues();


@@ -11,6 +11,7 @@
#define _ACPI_FAN_H_
#include <linux/kconfig.h>
#include <linux/limits.h>
#define ACPI_FAN_DEVICE_IDS \
{"INT3404", }, /* Fan */ \
@@ -21,6 +22,7 @@
{"INTC10A2", }, /* Fan for Raptor Lake generation */ \
{"INTC10D6", }, /* Fan for Panther Lake generation */ \
{"INTC10FE", }, /* Fan for Wildcat Lake generation */ \
{"INTC10F5", }, /* Fan for Nova Lake generation */ \
{"PNP0C0B", } /* Generic ACPI fan */
#define ACPI_FPS_NAME_LEN 20
@@ -55,19 +57,58 @@ struct acpi_fan {
struct acpi_fan_fif fif;
struct acpi_fan_fps *fps;
int fps_count;
/* A value of 0 means that trip-point-related functions are not supported */
u32 fan_trip_granularity;
#if IS_REACHABLE(CONFIG_HWMON)
struct device *hdev;
#endif
struct thermal_cooling_device *cdev;
struct device_attribute fst_speed;
struct device_attribute fine_grain_control;
};
/**
* acpi_fan_speed_valid - Check if fan speed value is valid
* @speed: Speed value returned by the ACPI firmware
*
* Check if the fan speed value returned by the ACPI firmware is valid. This function is
* necessary as ACPI firmware implementations can return 0xFFFFFFFF to signal that the
* ACPI fan does not support speed reporting. Additionally, some buggy ACPI firmware
* implementations return a value larger than the 32-bit integer value defined by
* the ACPI specification when using placeholder values. Such invalid values are also
* detected by this function.
*
* Returns: True if the fan speed value is valid, false otherwise.
*/
static inline bool acpi_fan_speed_valid(u64 speed)
{
return speed < U32_MAX;
}
/**
* acpi_fan_power_valid - Check if fan power value is valid
* @power: Power value returned by the ACPI firmware
*
* Check if the fan power value returned by the ACPI firmware is valid.
* See acpi_fan_speed_valid() for details.
*
* Returns: True if the fan power value is valid, false otherwise.
*/
static inline bool acpi_fan_power_valid(u64 power)
{
return power < U32_MAX;
}
int acpi_fan_get_fst(acpi_handle handle, struct acpi_fan_fst *fst);
int acpi_fan_create_attributes(struct acpi_device *device);
void acpi_fan_delete_attributes(struct acpi_device *device);
#if IS_REACHABLE(CONFIG_HWMON)
int devm_acpi_fan_create_hwmon(struct device *dev);
void acpi_fan_notify_hwmon(struct device *dev);
#else
static inline int devm_acpi_fan_create_hwmon(struct device *dev) { return 0; };
static inline void acpi_fan_notify_hwmon(struct device *dev) { };
#endif
#endif


@@ -7,11 +7,16 @@
* Copyright (C) 2022 Intel Corporation. All rights reserved.
*/
#include <linux/bits.h>
#include <linux/kernel.h>
#include <linux/limits.h>
#include <linux/math.h>
#include <linux/math64.h>
#include <linux/module.h>
#include <linux/init.h>
#include <linux/types.h>
#include <linux/uaccess.h>
#include <linux/uuid.h>
#include <linux/thermal.h>
#include <linux/acpi.h>
#include <linux/platform_device.h>
@@ -19,6 +24,26 @@
#include "fan.h"
#define ACPI_FAN_NOTIFY_STATE_CHANGED 0x80
/*
* Defined inside the "Fan Noise Signal" section at
* https://learn.microsoft.com/en-us/windows-hardware/design/device-experiences/design-guide.
*/
static const guid_t acpi_fan_microsoft_guid = GUID_INIT(0xA7611840, 0x99FE, 0x41AE, 0xA4, 0x88,
0x35, 0xC7, 0x59, 0x26, 0xC8, 0xEB);
#define ACPI_FAN_DSM_GET_TRIP_POINT_GRANULARITY 1
#define ACPI_FAN_DSM_SET_TRIP_POINTS 2
#define ACPI_FAN_DSM_GET_OPERATING_RANGES 3
/*
* Ensures that fans with a very low trip point granularity
* do not send too many notifications.
*/
static uint min_trip_distance = 100;
module_param(min_trip_distance, uint, 0);
MODULE_PARM_DESC(min_trip_distance, "Minimum distance between fan speed trip points in RPM");
static const struct acpi_device_id fan_device_ids[] = {
ACPI_FAN_DEVICE_IDS,
{"", 0},
@@ -308,6 +333,182 @@ err:
return status;
}
static int acpi_fan_dsm_init(struct device *dev)
{
union acpi_object dummy = {
.package = {
.type = ACPI_TYPE_PACKAGE,
.count = 0,
.elements = NULL,
},
};
struct acpi_fan *fan = dev_get_drvdata(dev);
union acpi_object *obj;
int ret = 0;
if (!acpi_check_dsm(fan->handle, &acpi_fan_microsoft_guid, 0,
BIT(ACPI_FAN_DSM_GET_TRIP_POINT_GRANULARITY) |
BIT(ACPI_FAN_DSM_SET_TRIP_POINTS)))
return 0;
dev_info(dev, "Using Microsoft fan extensions\n");
obj = acpi_evaluate_dsm_typed(fan->handle, &acpi_fan_microsoft_guid, 0,
ACPI_FAN_DSM_GET_TRIP_POINT_GRANULARITY, &dummy,
ACPI_TYPE_INTEGER);
if (!obj)
return -EIO;
if (obj->integer.value > U32_MAX)
ret = -EOVERFLOW;
else
fan->fan_trip_granularity = obj->integer.value;
kfree(obj);
return ret;
}
static int acpi_fan_dsm_set_trip_points(struct device *dev, u64 upper, u64 lower)
{
union acpi_object args[2] = {
{
.integer = {
.type = ACPI_TYPE_INTEGER,
.value = lower,
},
},
{
.integer = {
.type = ACPI_TYPE_INTEGER,
.value = upper,
},
},
};
struct acpi_fan *fan = dev_get_drvdata(dev);
union acpi_object in = {
.package = {
.type = ACPI_TYPE_PACKAGE,
.count = ARRAY_SIZE(args),
.elements = args,
},
};
union acpi_object *obj;
obj = acpi_evaluate_dsm(fan->handle, &acpi_fan_microsoft_guid, 0,
ACPI_FAN_DSM_SET_TRIP_POINTS, &in);
kfree(obj);
return 0;
}
static int acpi_fan_dsm_start(struct device *dev)
{
struct acpi_fan *fan = dev_get_drvdata(dev);
int ret;
if (!fan->fan_trip_granularity)
return 0;
/*
* Some firmware implementations only update the values returned by the
* _FST control method when a notification is received. This usually
* works with Microsoft Windows as setting up trip points will keep
* triggering said notifications, but will cause issues when using _FST
* without the Microsoft-specific trip point extension.
*
* Because of this, an initial notification needs to be triggered to
* start the cycle of trip points updates. This is achieved by setting
* the trip points sequentially to two separate ranges. Per the
* Microsoft specification, the firmware should trigger a notification
* immediately if the fan speed is outside the trip point range. This
* _should_ result in at least one notification as both ranges do not
* overlap, meaning that the current fan speed needs to be outside at
* least one range.
*/
ret = acpi_fan_dsm_set_trip_points(dev, fan->fan_trip_granularity, 0);
if (ret < 0)
return ret;
return acpi_fan_dsm_set_trip_points(dev, fan->fan_trip_granularity * 3,
fan->fan_trip_granularity * 2);
}
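Concretely, with a hypothetical granularity of 100 RPM the two windows programmed here are [0, 100] and then [200, 300]; since they do not overlap, the current fan speed necessarily lies outside at least one of them, so the firmware is expected to raise at least one initial notification, after which acpi_fan_dsm_update_trips_points() keeps the cycle going.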
static int acpi_fan_dsm_update_trips_points(struct device *dev, struct acpi_fan_fst *fst)
{
struct acpi_fan *fan = dev_get_drvdata(dev);
u64 upper, lower;
if (!fan->fan_trip_granularity)
return 0;
if (!acpi_fan_speed_valid(fst->speed))
return -EINVAL;
upper = roundup_u64(fst->speed + min_trip_distance, fan->fan_trip_granularity);
if (fst->speed <= min_trip_distance) {
lower = 0;
} else {
/*
* Valid fan speed values cannot be larger than 32 bit, so
* we can safely assume that no overflow will happen here.
*/
lower = rounddown((u32)fst->speed - min_trip_distance, fan->fan_trip_granularity);
}
return acpi_fan_dsm_set_trip_points(dev, upper, lower);
}
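As a worked example with hypothetical readings: min_trip_distance = 100 RPM, a granularity of 50 RPM and a reported speed of 1230 RPM give upper = roundup(1330, 50) = 1350 and lower = rounddown(1130, 50) = 1100, so the next notification fires once the fan speed leaves the 1100..1350 RPM window.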
static void acpi_fan_notify_handler(acpi_handle handle, u32 event, void *context)
{
struct device *dev = context;
struct acpi_fan_fst fst;
int ret;
switch (event) {
case ACPI_FAN_NOTIFY_STATE_CHANGED:
/*
* The ACPI specification says that we must evaluate _FST when we
* receive an ACPI event indicating that the fan state has changed.
*/
ret = acpi_fan_get_fst(handle, &fst);
if (ret < 0) {
dev_err(dev, "Error retrieving current fan status: %d\n", ret);
} else {
ret = acpi_fan_dsm_update_trips_points(dev, &fst);
if (ret < 0)
dev_err(dev, "Failed to update trip points: %d\n", ret);
}
acpi_fan_notify_hwmon(dev);
acpi_bus_generate_netlink_event("fan", dev_name(dev), event, 0);
break;
default:
dev_dbg(dev, "Unsupported ACPI notification 0x%x\n", event);
break;
}
}
static void acpi_fan_notify_remove(void *data)
{
struct acpi_fan *fan = data;
acpi_remove_notify_handler(fan->handle, ACPI_DEVICE_NOTIFY, acpi_fan_notify_handler);
}
static int devm_acpi_fan_notify_init(struct device *dev)
{
struct acpi_fan *fan = dev_get_drvdata(dev);
acpi_status status;
status = acpi_install_notify_handler(fan->handle, ACPI_DEVICE_NOTIFY,
acpi_fan_notify_handler, dev);
if (ACPI_FAILURE(status))
return -EIO;
return devm_add_action_or_reset(dev, acpi_fan_notify_remove, fan);
}
static int acpi_fan_probe(struct platform_device *pdev)
{
int result = 0;
@@ -347,10 +548,24 @@ static int acpi_fan_probe(struct platform_device *pdev)
}
if (fan->has_fst) {
result = acpi_fan_dsm_init(&pdev->dev);
if (result)
return result;
result = devm_acpi_fan_create_hwmon(&pdev->dev);
if (result)
return result;
result = devm_acpi_fan_notify_init(&pdev->dev);
if (result)
return result;
result = acpi_fan_dsm_start(&pdev->dev);
if (result) {
dev_err(&pdev->dev, "Failed to start Microsoft fan extensions\n");
return result;
}
result = acpi_fan_create_attributes(device);
if (result)
return result;
@@ -436,8 +651,14 @@ static int acpi_fan_suspend(struct device *dev)
static int acpi_fan_resume(struct device *dev)
{
int result;
struct acpi_fan *fan = dev_get_drvdata(dev);
int result;
if (fan->has_fst) {
result = acpi_fan_dsm_start(dev);
if (result)
dev_err(dev, "Failed to start Microsoft fan extensions: %d\n", result);
}
if (fan->acpi4)
return 0;


@@ -15,10 +15,6 @@
#include "fan.h"
/* Returned when the ACPI fan does not support speed reporting */
#define FAN_SPEED_UNAVAILABLE U32_MAX
#define FAN_POWER_UNAVAILABLE U32_MAX
static struct acpi_fan_fps *acpi_fan_get_current_fps(struct acpi_fan *fan, u64 control)
{
unsigned int i;
@@ -77,7 +73,7 @@ static umode_t acpi_fan_hwmon_is_visible(const void *drvdata, enum hwmon_sensor_
* when the associated attribute should not be created.
*/
for (i = 0; i < fan->fps_count; i++) {
if (fan->fps[i].power != FAN_POWER_UNAVAILABLE)
if (acpi_fan_power_valid(fan->fps[i].power))
return 0444;
}
@@ -106,7 +102,7 @@ static int acpi_fan_hwmon_read(struct device *dev, enum hwmon_sensor_types type,
case hwmon_fan:
switch (attr) {
case hwmon_fan_input:
if (fst.speed == FAN_SPEED_UNAVAILABLE)
if (!acpi_fan_speed_valid(fst.speed))
return -ENODEV;
if (fst.speed > LONG_MAX)
@@ -134,7 +130,7 @@ static int acpi_fan_hwmon_read(struct device *dev, enum hwmon_sensor_types type,
if (!fps)
return -EIO;
if (fps->power == FAN_POWER_UNAVAILABLE)
if (!acpi_fan_power_valid(fps->power))
return -ENODEV;
if (fps->power > LONG_MAX / MICROWATT_PER_MILLIWATT)
@@ -166,12 +162,19 @@ static const struct hwmon_chip_info acpi_fan_hwmon_chip_info = {
.info = acpi_fan_hwmon_info,
};
void acpi_fan_notify_hwmon(struct device *dev)
{
struct acpi_fan *fan = dev_get_drvdata(dev);
hwmon_notify_event(fan->hdev, hwmon_fan, hwmon_fan_input, 0);
}
int devm_acpi_fan_create_hwmon(struct device *dev)
{
struct acpi_fan *fan = dev_get_drvdata(dev);
struct device *hdev;
hdev = devm_hwmon_device_register_with_info(dev, "acpi_fan", fan, &acpi_fan_hwmon_chip_info,
NULL);
return PTR_ERR_OR_ZERO(hdev);
fan->hdev = devm_hwmon_device_register_with_info(dev, "acpi_fan", fan,
&acpi_fan_hwmon_chip_info, NULL);
return PTR_ERR_OR_ZERO(fan->hdev);
}


@@ -27,7 +27,6 @@ static inline void acpi_pci_link_init(void) {}
void acpi_processor_init(void);
void acpi_platform_init(void);
void acpi_pnp_init(void);
void acpi_int340x_thermal_init(void);
int acpi_sysfs_init(void);
void acpi_gpe_apply_masked_gpes(void);
void acpi_container_init(void);


@@ -398,7 +398,7 @@ static void acpi_os_drop_map_ref(struct acpi_ioremap *map)
list_del_rcu(&map->list);
INIT_RCU_WORK(&map->track.rwork, acpi_os_map_remove);
queue_rcu_work(system_wq, &map->track.rwork);
queue_rcu_work(system_percpu_wq, &map->track.rwork);
}
/**
@@ -1694,8 +1694,8 @@ acpi_status __init acpi_os_initialize(void)
acpi_status __init acpi_os_initialize1(void)
{
kacpid_wq = alloc_workqueue("kacpid", 0, 1);
kacpi_notify_wq = alloc_workqueue("kacpi_notify", 0, 0);
kacpid_wq = alloc_workqueue("kacpid", WQ_PERCPU, 1);
kacpi_notify_wq = alloc_workqueue("kacpi_notify", WQ_PERCPU, 0);
kacpi_hotplug_wq = alloc_ordered_workqueue("kacpi_hotplug", 0);
BUG_ON(!kacpid_wq);
BUG_ON(!kacpi_notify_wq);


@@ -54,7 +54,7 @@ static int map_x2apic_id(struct acpi_subtable_header *entry,
if (!(apic->lapic_flags & ACPI_MADT_ENABLED))
return -ENODEV;
if (device_declaration && (apic->uid == acpi_id)) {
if (apic->uid == acpi_id && (device_declaration || acpi_id < 255)) {
*apic_id = apic->local_apic_id;
return 0;
}


@@ -732,18 +732,16 @@ static int __cpuidle acpi_idle_enter_s2idle(struct cpuidle_device *dev,
return 0;
}
static int acpi_processor_setup_cpuidle_cx(struct acpi_processor *pr,
struct cpuidle_device *dev)
static void acpi_processor_setup_cpuidle_cx(struct acpi_processor *pr,
struct cpuidle_device *dev)
{
int i, count = ACPI_IDLE_STATE_START;
struct acpi_processor_cx *cx;
struct cpuidle_state *state;
if (max_cstate == 0)
max_cstate = 1;
for (i = 1; i < ACPI_PROCESSOR_MAX_POWER && i <= max_cstate; i++) {
state = &acpi_idle_driver.states[count];
cx = &pr->power.states[i];
if (!cx->valid)
@@ -751,27 +749,13 @@ static int acpi_processor_setup_cpuidle_cx(struct acpi_processor *pr,
per_cpu(acpi_cstate[count], dev->cpu) = cx;
if (lapic_timer_needs_broadcast(pr, cx))
state->flags |= CPUIDLE_FLAG_TIMER_STOP;
if (cx->type == ACPI_STATE_C3) {
state->flags |= CPUIDLE_FLAG_TLB_FLUSHED;
if (pr->flags.bm_check)
state->flags |= CPUIDLE_FLAG_RCU_IDLE;
}
count++;
if (count == CPUIDLE_STATE_MAX)
break;
}
if (!count)
return -EINVAL;
return 0;
}
static int acpi_processor_setup_cstates(struct acpi_processor *pr)
static void acpi_processor_setup_cstates(struct acpi_processor *pr)
{
int i, count;
struct acpi_processor_cx *cx;
@@ -818,17 +802,21 @@ static int acpi_processor_setup_cstates(struct acpi_processor *pr)
if (cx->type != ACPI_STATE_C1 && !acpi_idle_fallback_to_c1(pr))
state->enter_s2idle = acpi_idle_enter_s2idle;
if (lapic_timer_needs_broadcast(pr, cx))
state->flags |= CPUIDLE_FLAG_TIMER_STOP;
if (cx->type == ACPI_STATE_C3) {
state->flags |= CPUIDLE_FLAG_TLB_FLUSHED;
if (pr->flags.bm_check)
state->flags |= CPUIDLE_FLAG_RCU_IDLE;
}
count++;
if (count == CPUIDLE_STATE_MAX)
break;
}
drv->state_count = count;
if (!count)
return -EINVAL;
return 0;
}
static inline void acpi_processor_cstate_first_run_checks(void)
@@ -1243,7 +1231,8 @@ static int acpi_processor_setup_cpuidle_states(struct acpi_processor *pr)
if (pr->flags.has_lpi)
return acpi_processor_setup_lpi_states(pr);
return acpi_processor_setup_cstates(pr);
acpi_processor_setup_cstates(pr);
return 0;
}
/**
@@ -1263,7 +1252,8 @@ static int acpi_processor_setup_cpuidle_dev(struct acpi_processor *pr,
if (pr->flags.has_lpi)
return acpi_processor_ffh_lpi_probe(pr->id);
return acpi_processor_setup_cpuidle_cx(pr, dev);
acpi_processor_setup_cpuidle_cx(pr, dev);
return 0;
}
static int acpi_processor_get_power_info(struct acpi_processor *pr)


@@ -1280,7 +1280,7 @@ static int acpi_data_prop_read(const struct acpi_device_data *data,
ret = acpi_copy_property_array_uint(items, (u64 *)val, nval);
break;
case DEV_PROP_STRING:
nval = min_t(u32, nval, obj->package.count);
nval = min(nval, obj->package.count);
if (nval == 0)
return -ENODATA;
@@ -1329,13 +1329,14 @@ static int stop_on_next(struct acpi_device *adev, void *data)
return 0;
}
/**
/*
* acpi_get_next_subnode - Return the next child node handle for a fwnode
* @fwnode: Firmware node to find the next child node for.
* @child: Handle to one of the device's child nodes or a null handle.
*/
struct fwnode_handle *acpi_get_next_subnode(const struct fwnode_handle *fwnode,
struct fwnode_handle *child)
static struct fwnode_handle *
acpi_get_next_subnode(const struct fwnode_handle *fwnode,
struct fwnode_handle *child)
{
struct acpi_device *adev = to_acpi_device_node(fwnode);
@@ -1472,7 +1473,7 @@ static struct fwnode_handle *acpi_graph_get_next_endpoint(
if (!prev) {
do {
port = fwnode_get_next_child_node(fwnode, port);
port = acpi_get_next_subnode(fwnode, port);
/*
* The names of the port nodes begin with "port@"
* followed by the number of the port node and they also
@@ -1490,14 +1491,17 @@ static struct fwnode_handle *acpi_graph_get_next_endpoint(
if (!port)
return NULL;
endpoint = fwnode_get_next_child_node(port, prev);
while (!endpoint) {
port = fwnode_get_next_child_node(fwnode, port);
if (!port)
do {
endpoint = acpi_get_next_subnode(port, prev);
if (endpoint)
break;
if (is_acpi_graph_node(port, "port"))
endpoint = fwnode_get_next_child_node(port, NULL);
}
prev = NULL;
do {
port = acpi_get_next_subnode(fwnode, port);
} while (port && !is_acpi_graph_node(port, "port"));
} while (port);
/*
* The names of the endpoint nodes begin with "endpoint@" followed by
@@ -1714,6 +1718,7 @@ static int acpi_fwnode_graph_parse_endpoint(const struct fwnode_handle *fwnode,
if (fwnode_property_read_u32(fwnode, "reg", &endpoint->id))
fwnode_property_read_u32(fwnode, "endpoint", &endpoint->id);
fwnode_handle_put(port_fwnode);
return 0;
}


@@ -2397,7 +2397,7 @@ static bool acpi_scan_clear_dep_queue(struct acpi_device *adev)
* initial enumeration of devices is complete, put it into the unbound
* workqueue.
*/
queue_work(system_unbound_wq, &cdw->work);
queue_work(system_dfl_wq, &cdw->work);
return true;
}
@@ -2711,7 +2711,6 @@ void __init acpi_scan_init(void)
acpi_watchdog_init();
acpi_pnp_init();
acpi_power_resources_init();
acpi_int340x_thermal_init();
acpi_init_lpit();
acpi_scan_add_handler(&generic_device_handler);


@@ -642,7 +642,7 @@ static int acpi_suspend_enter(suspend_state_t pm_state)
/*
* Disable all GPE and clear their status bits before interrupts are
* enabled. Some GPEs (like wakeup GPEs) have no handlers and this can
* prevent them from producing spurious interrups.
* prevent them from producing spurious interrupts.
*
* acpi_leave_sleep_state() will reenable specific GPEs later.
*


@@ -17,10 +17,7 @@ static inline acpi_status acpi_set_waking_vector(u32 wakeup_address)
extern int acpi_s2idle_begin(void);
extern int acpi_s2idle_prepare(void);
extern int acpi_s2idle_prepare_late(void);
extern void acpi_s2idle_check(void);
extern bool acpi_s2idle_wake(void);
extern void acpi_s2idle_restore_early(void);
extern void acpi_s2idle_restore(void);
extern void acpi_s2idle_end(void);


@@ -1060,7 +1060,8 @@ static int __init acpi_thermal_init(void)
}
acpi_thermal_pm_queue = alloc_workqueue("acpi_thermal_pm",
WQ_HIGHPRI | WQ_MEM_RECLAIM, 0);
WQ_HIGHPRI | WQ_MEM_RECLAIM | WQ_PERCPU,
0);
if (!acpi_thermal_pm_queue)
return -ENODEV;


@@ -181,7 +181,7 @@ static void byt_i2c_setup(struct lpss_private_data *pdata)
acpi_status status;
u64 uid;
/* Expected to always be successfull, but better safe then sorry */
/* Expected to always be successful, but better safe than sorry */
if (!acpi_dev_uid_to_integer(pdata->adev, &uid) && uid) {
/* Detect I2C bus shared with PUNIT and ignore its d3 status */
status = acpi_evaluate_integer(handle, "_SEM", NULL, &shared_host);


@@ -299,34 +299,13 @@ free_acpi_buffer:
ACPI_FREE(out_obj);
}
/**
* acpi_get_lps0_constraint - Get the LPS0 constraint for a device.
* @adev: Device to get the constraint for.
*
* The LPS0 constraint is the shallowest (minimum) power state in which the
* device can be so as to allow the platform as a whole to achieve additional
* energy conservation by utilizing a system-wide low-power state.
*
* Returns:
* - ACPI power state value of the constraint for @adev on success.
* - Otherwise, ACPI_STATE_UNKNOWN.
*/
int acpi_get_lps0_constraint(struct acpi_device *adev)
{
struct lpi_constraints *entry;
for_each_lpi_constraint(entry) {
if (adev->handle == entry->handle)
return entry->min_dstate;
}
return ACPI_STATE_UNKNOWN;
}
static void lpi_check_constraints(void)
{
struct lpi_constraints *entry;
if (IS_ERR_OR_NULL(lpi_constraints_table))
return;
for_each_lpi_constraint(entry) {
struct acpi_device *adev = acpi_fetch_acpi_dev(entry->handle);
@@ -508,11 +487,6 @@ static int lps0_device_attach(struct acpi_device *adev,
lps0_device_handle = adev->handle;
if (acpi_s2idle_vendor_amd())
lpi_device_get_constraints_amd();
else
lpi_device_get_constraints();
/*
* Use suspend-to-idle by default if ACPI_FADT_LOW_POWER_S0 is set in
* the FADT and the default suspend mode was not set from the command
@@ -539,7 +513,26 @@ static struct acpi_scan_handler lps0_handler = {
.attach = lps0_device_attach,
};
int acpi_s2idle_prepare_late(void)
static int acpi_s2idle_begin_lps0(void)
{
if (pm_debug_messages_on && !lpi_constraints_table) {
if (acpi_s2idle_vendor_amd())
lpi_device_get_constraints_amd();
else
lpi_device_get_constraints();
/*
* Try to retrieve the constraints only once because failures
* to do so usually are sticky.
*/
if (!lpi_constraints_table)
lpi_constraints_table = ERR_PTR(-ENODATA);
}
return acpi_s2idle_begin();
}
static int acpi_s2idle_prepare_late_lps0(void)
{
struct acpi_s2idle_dev_ops *handler;
@@ -585,7 +578,7 @@ int acpi_s2idle_prepare_late(void)
return 0;
}
void acpi_s2idle_check(void)
static void acpi_s2idle_check_lps0(void)
{
struct acpi_s2idle_dev_ops *handler;
@@ -598,7 +591,7 @@ void acpi_s2idle_check(void)
}
}
void acpi_s2idle_restore_early(void)
static void acpi_s2idle_restore_early_lps0(void)
{
struct acpi_s2idle_dev_ops *handler;
@@ -636,12 +629,12 @@ void acpi_s2idle_restore_early(void)
}
static const struct platform_s2idle_ops acpi_s2idle_ops_lps0 = {
.begin = acpi_s2idle_begin,
.begin = acpi_s2idle_begin_lps0,
.prepare = acpi_s2idle_prepare,
.prepare_late = acpi_s2idle_prepare_late,
.check = acpi_s2idle_check,
.prepare_late = acpi_s2idle_prepare_late_lps0,
.check = acpi_s2idle_check_lps0,
.wake = acpi_s2idle_wake,
.restore_early = acpi_s2idle_restore_early,
.restore_early = acpi_s2idle_restore_early_lps0,
.restore = acpi_s2idle_restore,
.end = acpi_s2idle_end,
};


@@ -8,6 +8,13 @@
#include <linux/pm_runtime.h>
#include <linux/export.h>
#define CALL_PM_OP(dev, op) \
({ \
struct device *_dev = (dev); \
const struct dev_pm_ops *pm = _dev->driver ? _dev->driver->pm : NULL; \
pm && pm->op ? pm->op(_dev) : 0; \
})
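For example, CALL_PM_OP(dev, suspend_late) expands, as a GNU statement expression, to roughly: take dev once into _dev, look up _dev->driver->pm, and evaluate pm && pm->suspend_late ? pm->suspend_late(_dev) : 0. Each of the open-coded helpers below reduces to exactly this shape.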
#ifdef CONFIG_PM
/**
* pm_generic_runtime_suspend - Generic runtime suspend callback for subsystems.
@@ -19,12 +26,7 @@
*/
int pm_generic_runtime_suspend(struct device *dev)
{
const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
int ret;
ret = pm && pm->runtime_suspend ? pm->runtime_suspend(dev) : 0;
return ret;
return CALL_PM_OP(dev, runtime_suspend);
}
EXPORT_SYMBOL_GPL(pm_generic_runtime_suspend);
@@ -38,12 +40,7 @@ EXPORT_SYMBOL_GPL(pm_generic_runtime_suspend);
*/
int pm_generic_runtime_resume(struct device *dev)
{
const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
int ret;
ret = pm && pm->runtime_resume ? pm->runtime_resume(dev) : 0;
return ret;
return CALL_PM_OP(dev, runtime_resume);
}
EXPORT_SYMBOL_GPL(pm_generic_runtime_resume);
#endif /* CONFIG_PM */
@@ -72,9 +69,7 @@ int pm_generic_prepare(struct device *dev)
*/
int pm_generic_suspend_noirq(struct device *dev)
{
const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
return pm && pm->suspend_noirq ? pm->suspend_noirq(dev) : 0;
return CALL_PM_OP(dev, suspend_noirq);
}
EXPORT_SYMBOL_GPL(pm_generic_suspend_noirq);
@@ -84,9 +79,7 @@ EXPORT_SYMBOL_GPL(pm_generic_suspend_noirq);
*/
int pm_generic_suspend_late(struct device *dev)
{
const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
return pm && pm->suspend_late ? pm->suspend_late(dev) : 0;
return CALL_PM_OP(dev, suspend_late);
}
EXPORT_SYMBOL_GPL(pm_generic_suspend_late);
@@ -96,9 +89,7 @@ EXPORT_SYMBOL_GPL(pm_generic_suspend_late);
*/
int pm_generic_suspend(struct device *dev)
{
const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
return pm && pm->suspend ? pm->suspend(dev) : 0;
return CALL_PM_OP(dev, suspend);
}
EXPORT_SYMBOL_GPL(pm_generic_suspend);
@@ -108,9 +99,7 @@ EXPORT_SYMBOL_GPL(pm_generic_suspend);
*/
int pm_generic_freeze_noirq(struct device *dev)
{
const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
return pm && pm->freeze_noirq ? pm->freeze_noirq(dev) : 0;
return CALL_PM_OP(dev, freeze_noirq);
}
EXPORT_SYMBOL_GPL(pm_generic_freeze_noirq);
@@ -120,9 +109,7 @@ EXPORT_SYMBOL_GPL(pm_generic_freeze_noirq);
*/
int pm_generic_freeze(struct device *dev)
{
const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
return pm && pm->freeze ? pm->freeze(dev) : 0;
return CALL_PM_OP(dev, freeze);
}
EXPORT_SYMBOL_GPL(pm_generic_freeze);
@@ -132,9 +119,7 @@ EXPORT_SYMBOL_GPL(pm_generic_freeze);
*/
int pm_generic_poweroff_noirq(struct device *dev)
{
const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
return pm && pm->poweroff_noirq ? pm->poweroff_noirq(dev) : 0;
return CALL_PM_OP(dev, poweroff_noirq);
}
EXPORT_SYMBOL_GPL(pm_generic_poweroff_noirq);
@@ -144,9 +129,7 @@ EXPORT_SYMBOL_GPL(pm_generic_poweroff_noirq);
*/
int pm_generic_poweroff_late(struct device *dev)
{
const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
return pm && pm->poweroff_late ? pm->poweroff_late(dev) : 0;
return CALL_PM_OP(dev, poweroff_late);
}
EXPORT_SYMBOL_GPL(pm_generic_poweroff_late);
@@ -156,9 +139,7 @@ EXPORT_SYMBOL_GPL(pm_generic_poweroff_late);
*/
int pm_generic_poweroff(struct device *dev)
{
const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
return pm && pm->poweroff ? pm->poweroff(dev) : 0;
return CALL_PM_OP(dev, poweroff);
}
EXPORT_SYMBOL_GPL(pm_generic_poweroff);
@@ -168,9 +149,7 @@ EXPORT_SYMBOL_GPL(pm_generic_poweroff);
*/
int pm_generic_thaw_noirq(struct device *dev)
{
const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
return pm && pm->thaw_noirq ? pm->thaw_noirq(dev) : 0;
return CALL_PM_OP(dev, thaw_noirq);
}
EXPORT_SYMBOL_GPL(pm_generic_thaw_noirq);
@@ -180,9 +159,7 @@ EXPORT_SYMBOL_GPL(pm_generic_thaw_noirq);
*/
int pm_generic_thaw(struct device *dev)
{
const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
return pm && pm->thaw ? pm->thaw(dev) : 0;
return CALL_PM_OP(dev, thaw);
}
EXPORT_SYMBOL_GPL(pm_generic_thaw);
@@ -192,9 +169,7 @@ EXPORT_SYMBOL_GPL(pm_generic_thaw);
*/
int pm_generic_resume_noirq(struct device *dev)
{
const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
return pm && pm->resume_noirq ? pm->resume_noirq(dev) : 0;
return CALL_PM_OP(dev, resume_noirq);
}
EXPORT_SYMBOL_GPL(pm_generic_resume_noirq);
@@ -204,9 +179,7 @@ EXPORT_SYMBOL_GPL(pm_generic_resume_noirq);
*/
int pm_generic_resume_early(struct device *dev)
{
const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
return pm && pm->resume_early ? pm->resume_early(dev) : 0;
return CALL_PM_OP(dev, resume_early);
}
EXPORT_SYMBOL_GPL(pm_generic_resume_early);
@@ -216,9 +189,7 @@ EXPORT_SYMBOL_GPL(pm_generic_resume_early);
*/
int pm_generic_resume(struct device *dev)
{
const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
return pm && pm->resume ? pm->resume(dev) : 0;
return CALL_PM_OP(dev, resume);
}
EXPORT_SYMBOL_GPL(pm_generic_resume);
@@ -228,9 +199,7 @@ EXPORT_SYMBOL_GPL(pm_generic_resume);
*/
int pm_generic_restore_noirq(struct device *dev)
{
const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
return pm && pm->restore_noirq ? pm->restore_noirq(dev) : 0;
return CALL_PM_OP(dev, restore_noirq);
}
EXPORT_SYMBOL_GPL(pm_generic_restore_noirq);
@@ -240,9 +209,7 @@ EXPORT_SYMBOL_GPL(pm_generic_restore_noirq);
*/
int pm_generic_restore_early(struct device *dev)
{
const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
return pm && pm->restore_early ? pm->restore_early(dev) : 0;
return CALL_PM_OP(dev, restore_early);
}
EXPORT_SYMBOL_GPL(pm_generic_restore_early);
@@ -252,9 +219,7 @@ EXPORT_SYMBOL_GPL(pm_generic_restore_early);
*/
int pm_generic_restore(struct device *dev)
{
const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
return pm && pm->restore ? pm->restore(dev) : 0;
return CALL_PM_OP(dev, restore);
}
EXPORT_SYMBOL_GPL(pm_generic_restore);
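A bus type or PM domain that needs no special handling can simply point its dev_pm_ops at these generic helpers; a minimal sketch for a hypothetical bus (the struct name is an assumption, not from this diff):

static const struct dev_pm_ops example_bus_pm_ops = {
	.suspend	 = pm_generic_suspend,
	.resume		 = pm_generic_resume,
	.freeze		 = pm_generic_freeze,
	.thaw		 = pm_generic_thaw,
	.poweroff	 = pm_generic_poweroff,
	.restore	 = pm_generic_restore,
	.runtime_suspend = pm_generic_runtime_suspend,
	.runtime_resume	 = pm_generic_runtime_resume,
};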

View File

@@ -34,6 +34,7 @@
#include <linux/cpufreq.h>
#include <linux/devfreq.h>
#include <linux/timer.h>
#include <linux/nmi.h>
#include "../base.h"
#include "power.h"
@@ -95,6 +96,8 @@ static const char *pm_verb(int event)
return "restore";
case PM_EVENT_RECOVER:
return "recover";
case PM_EVENT_POWEROFF:
return "poweroff";
default:
return "(unknown PM event)";
}
@@ -367,6 +370,7 @@ static pm_callback_t pm_op(const struct dev_pm_ops *ops, pm_message_t state)
case PM_EVENT_FREEZE:
case PM_EVENT_QUIESCE:
return ops->freeze;
case PM_EVENT_POWEROFF:
case PM_EVENT_HIBERNATE:
return ops->poweroff;
case PM_EVENT_THAW:
@@ -401,6 +405,7 @@ static pm_callback_t pm_late_early_op(const struct dev_pm_ops *ops,
case PM_EVENT_FREEZE:
case PM_EVENT_QUIESCE:
return ops->freeze_late;
case PM_EVENT_POWEROFF:
case PM_EVENT_HIBERNATE:
return ops->poweroff_late;
case PM_EVENT_THAW:
@@ -435,6 +440,7 @@ static pm_callback_t pm_noirq_op(const struct dev_pm_ops *ops, pm_message_t stat
case PM_EVENT_FREEZE:
case PM_EVENT_QUIESCE:
return ops->freeze_noirq;
case PM_EVENT_POWEROFF:
case PM_EVENT_HIBERNATE:
return ops->poweroff_noirq;
case PM_EVENT_THAW:
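All three dispatch helpers route the new PM_EVENT_POWEROFF event to the driver's existing poweroff callbacks, so drivers need not implement anything new; illustratively (assuming the rest of the series defines the event in pm_message_t as shown in these hunks):

/* Illustrative: a poweroff transition reuses the hibernation callbacks. */
pm_message_t state = { .event = PM_EVENT_POWEROFF };
pm_callback_t cb = pm_op(dev->driver->pm, state);	/* resolves to ops->poweroff */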
@@ -515,6 +521,11 @@ struct dpm_watchdog {
#define DECLARE_DPM_WATCHDOG_ON_STACK(wd) \
struct dpm_watchdog wd
static bool __read_mostly dpm_watchdog_all_cpu_backtrace;
module_param(dpm_watchdog_all_cpu_backtrace, bool, 0644);
MODULE_PARM_DESC(dpm_watchdog_all_cpu_backtrace,
"Backtrace all CPUs on DPM watchdog timeout");
/**
* dpm_watchdog_handler - Driver suspend / resume watchdog handler.
* @t: The timer that PM watchdog depends on.
@@ -530,8 +541,12 @@ static void dpm_watchdog_handler(struct timer_list *t)
unsigned int time_left;
if (wd->fatal) {
unsigned int this_cpu = smp_processor_id();
dev_emerg(wd->dev, "**** DPM device timeout ****\n");
show_stack(wd->tsk, NULL, KERN_EMERG);
if (dpm_watchdog_all_cpu_backtrace)
trigger_allbutcpu_cpu_backtrace(this_cpu);
panic("%s %s: unrecoverable failure\n",
dev_driver_string(wd->dev), dev_name(wd->dev));
}

View File

@@ -90,7 +90,7 @@ static void update_pm_runtime_accounting(struct device *dev)
/*
* Because ktime_get_mono_fast_ns() is not monotonic during
* timekeeping updates, ensure that 'now' is after the last saved
* timesptamp.
* timestamp.
*/
if (now < last)
return;
@@ -217,7 +217,7 @@ static int dev_memalloc_noio(struct device *dev, void *data)
* resume/suspend callback of any one of its ancestors(or the
* block device itself), the deadlock may be triggered inside the
* memory allocation since it might not complete until the block
* device becomes active and the involed page I/O finishes. The
* device becomes active and the involved page I/O finishes. The
* situation is pointed out first by Alan Stern. Network device
* are involved in iSCSI kind of situation.
*
@@ -1210,7 +1210,7 @@ EXPORT_SYMBOL_GPL(__pm_runtime_resume);
*
* Otherwise, if its runtime PM status is %RPM_ACTIVE and (1) @ign_usage_count
* is set, or (2) @dev is not ignoring children and its active child count is
* nonero, or (3) the runtime PM usage counter of @dev is not zero, increment
* nonzero, or (3) the runtime PM usage counter of @dev is not zero, increment
* the usage counter of @dev and return 1.
*
* Otherwise, return 0 without changing the usage counter.
@@ -1664,9 +1664,12 @@ EXPORT_SYMBOL_GPL(devm_pm_runtime_get_noresume);
* pm_runtime_forbid - Block runtime PM of a device.
* @dev: Device to handle.
*
* Increase the device's usage count and clear its power.runtime_auto flag,
* so that it cannot be suspended at run time until pm_runtime_allow() is called
* for it.
* Resume @dev if already suspended and block runtime suspend of @dev in such
* a way that it can be unblocked via the /sys/devices/.../power/control
* interface, or otherwise by calling pm_runtime_allow().
*
* Calling this function many times in a row has the same effect as calling it
* once.
*/
void pm_runtime_forbid(struct device *dev)
{
@@ -1687,7 +1690,13 @@ EXPORT_SYMBOL_GPL(pm_runtime_forbid);
* pm_runtime_allow - Unblock runtime PM of a device.
* @dev: Device to handle.
*
* Decrease the device's usage count and set its power.runtime_auto flag.
* Unblock runtime suspend of @dev after it has been blocked by
* pm_runtime_forbid() (for instance, if it has been blocked via the
* /sys/devices/.../power/control interface), check if @dev can be
* suspended and suspend it in that case.
*
* Calling this function many times in a row has the same effect as calling it
* once.
*/
void pm_runtime_allow(struct device *dev)
{

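As a usage illustration (hypothetical driver code, not part of this diff), a probe routine that wants the device kept active until userspace explicitly writes "auto" to /sys/devices/.../power/control could pair the two calls like this:

static int example_probe(struct device *dev)
{
	pm_runtime_enable(dev);
	/* Resumes the device if needed and blocks runtime suspend until
	 * userspace opts in via power/control (i.e. pm_runtime_allow()).
	 */
	pm_runtime_forbid(dev);
	return 0;
}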
Some files were not shown because too many files have changed in this diff.