mirror of
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
synced 2026-01-11 17:10:13 +00:00
Add documentation for Intel thermal throttling reporting events. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Tested-by: Randy Dunlap <rdunlap@infradead.org> [ rjw: Subject adjustment, file name change, minor edits ] Link: https://patch.msgid.link/20251113212104.221632-1-srinivas.pandruvada@linux.intel.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
92 lines
4.1 KiB
ReStructuredText
.. SPDX-License-Identifier: GPL-2.0
.. include:: <isonum.txt>

=======================================
Intel thermal throttle events reporting
=======================================

:Author: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>

Introduction
------------

Intel processors have built-in automatic and adaptive thermal monitoring
mechanisms that force the processor to reduce its power consumption in order
to operate within predetermined temperature limits.

Refer to section "THERMAL MONITORING AND PROTECTION" in the "Intel® 64 and
IA-32 Architectures Software Developer’s Manual Volume 3 (3A, 3B, 3C, & 3D):
System Programming Guide" for more details.

In general, there are two mechanisms to control the core temperature of the
processor: Thermal Monitor 1 (TM1) and Thermal Monitor 2 (TM2).

The status of the temperature sensor that triggers the thermal monitor (TM1/TM2)
is indicated through the "thermal status flag" and "thermal status log flag" in
MSR_IA32_THERM_STATUS for the core level and MSR_IA32_PACKAGE_THERM_STATUS for
the package level.

Thermal Status flag, bit 0: When set, indicates that the processor core
temperature is currently at the trip temperature of the thermal monitor and that
the processor power consumption is being reduced via either TM1 or TM2, depending
on which is enabled. When clear, the flag indicates that the core temperature is
below the thermal monitor trip temperature. This flag is read-only.

Thermal Status Log flag, bit 1: When set, indicates that the thermal sensor has
tripped since the last power-up or reset or since the last time that software
cleared this flag. This flag is a sticky bit; once set, it remains set until
cleared by software or until a power-up or reset of the processor. The default
state is clear.

When a user reads MSR_IA32_THERM_STATUS or MSR_IA32_PACKAGE_THERM_STATUS,
TM1/TM2 may not be active at that moment. In this case, the "Thermal Status
flag" reads 0 and the "Thermal Status Log flag" is set if TM1/TM2 was activated
earlier. But since the log flag is a single sticky bit that has to be cleared by
software, it can't show the number of TM1/TM2 activations.

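The two flag bits described above can be decoded with a few lines of Python.
This is only an illustrative sketch: the helper names ``decode_therm_status``
and ``describe`` are hypothetical, and the raw value is assumed to have been
obtained elsewhere (reading MSRs directly is discouraged, as explained in the
next paragraph).

```python
# Bit positions as described in the text above.
THERM_STATUS_BIT = 1 << 0      # "Thermal Status flag" (read-only)
THERM_STATUS_LOG_BIT = 1 << 1  # "Thermal Status Log flag" (sticky)


def decode_therm_status(value):
    """Return (active, logged) for a raw thermal status MSR value."""
    active = bool(value & THERM_STATUS_BIT)
    logged = bool(value & THERM_STATUS_LOG_BIT)
    return active, logged


def describe(value):
    """Summarize what the two flags say; hypothetical helper for illustration."""
    active, logged = decode_therm_status(value)
    if active:
        return "throttling now"
    if logged:
        return "throttled earlier, not now"
    return "no throttling recorded"
```

Note that the "throttled earlier, not now" case carries no information about
how many times throttling happened, which is the limitation the counters below
address.
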
Hence, Linux provides counters of how many times the "Thermal Status flag" was
set. It also reports how long the "Thermal Status flag" was active, in
milliseconds. Using these counters, users can check whether performance was
limited because of thermal events. It is recommended to read from sysfs instead
of reading the MSRs directly, as the "Thermal Status Log flag" is reset by the
driver to implement rate control.

Sysfs Interface
---------------

Thermal throttling events are presented for each CPU under
"/sys/devices/system/cpu/cpuX/thermal_throttle/", where "X" is the CPU number.

All these counters are read-only and can't be reset to 0, so they can
potentially overflow after reaching the maximum 64-bit unsigned integer.

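Because the counters cannot be reset, a monitoring tool usually works with the
difference between two samples. A minimal sketch of a difference that tolerates
one wraparound is shown here, assuming the counter wraps modulo 2^64 as the
64-bit unsigned description suggests; ``counter_delta`` is a hypothetical
helper name, not a kernel interface.

```python
COUNTER_MODULUS = 1 << 64  # assumption: counters wrap as 64-bit unsigned values


def counter_delta(earlier, later):
    """Increase between two samples, tolerating a single wraparound."""
    return (later - earlier) % COUNTER_MODULUS
```

For example, ``counter_delta(5, 9)`` is 4, and a sample pair that straddles the
wrap, such as ``counter_delta(2**64 - 1, 2)``, is 3.
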
``core_throttle_count``
	Shows the number of times "Thermal Status flag" changed from 0 to 1 for
	this CPU since OS boot and initialization of the thermal vector. This is
	a 64-bit counter.

``package_throttle_count``
	Shows the number of times "Thermal Status flag" changed from 0 to 1 for
	the package containing this CPU since OS boot and initialization of the
	thermal vector. Package status is broadcast to all CPUs; all CPUs in the
	package increment this count. This is a 64-bit counter.

``core_throttle_max_time_ms``
	Shows the maximum amount of time for which "Thermal Status flag" has been
	set to 1 for this CPU at the core level since OS boot and initialization
	of the thermal vector.

``package_throttle_max_time_ms``
	Shows the maximum amount of time for which "Thermal Status flag" has been
	set to 1 for the package containing this CPU since OS boot and
	initialization of the thermal vector.

``core_throttle_total_time_ms``
	Shows the cumulative time for which "Thermal Status flag" has been set
	to 1 for this CPU at the core level since OS boot and initialization of
	the thermal vector.

``package_throttle_total_time_ms``
	Shows the cumulative time for which "Thermal Status flag" has been set
	to 1 for the package containing this CPU since OS boot and initialization
	of the thermal vector.

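The attributes listed above can be collected from user space along the
following lines. This is a sketch under stated assumptions: each file is taken
to contain a single decimal integer, attributes that are missing or unreadable
are simply skipped, and the function names are hypothetical. The mean duration
helper is a derived value for illustration and is not something the kernel
reports.

```python
import os

# Attribute names as documented above.
COUNTER_NAMES = (
    "core_throttle_count",
    "package_throttle_count",
    "core_throttle_max_time_ms",
    "package_throttle_max_time_ms",
    "core_throttle_total_time_ms",
    "package_throttle_total_time_ms",
)

SYSFS_CPU_ROOT = "/sys/devices/system/cpu"


def read_throttle_counters(cpu, root=SYSFS_CPU_ROOT):
    """Return {name: int} for the counters present for one CPU."""
    directory = os.path.join(root, "cpu%d" % cpu, "thermal_throttle")
    counters = {}
    for name in COUNTER_NAMES:
        try:
            with open(os.path.join(directory, name)) as f:
                counters[name] = int(f.read().strip())
        except (OSError, ValueError):
            # Attribute missing or unreadable on this system: skip it.
            continue
    return counters


def mean_core_throttle_ms(counters):
    """Average core throttle duration in ms, or None if not computable."""
    count = counters.get("core_throttle_count", 0)
    total = counters.get("core_throttle_total_time_ms")
    if not count or total is None:
        return None
    return total / count
```

Calling ``read_throttle_counters(0)`` before and after a workload and comparing
the two results gives a rough indication of whether CPU 0 was throttled while
the workload ran. Since every CPU in a package increments the package counters,
summing ``package_throttle_count`` across CPUs would overcount package events.
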