mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-01-11 09:00:12 +00:00

ASoC: Fixes for v6.19

We've been quite busy with fixes since the merge window, though not in
any particularly exciting ways - the standout thing is the fix for _SX
controls which were broken by a change to how we do clamping, otherwise
it's all fairly run-of-the-mill fixes and quirks.

Merge tag 'asoc-fix-v6.19-rc1' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

Takashi Iwai 2025-12-21 11:11:11 +01:00
commit 24f171c7e1
11897 changed files with 606798 additions and 184020 deletions


@ -140,8 +140,8 @@ ForEachMacros:
- 'damon_for_each_scheme_safe'
- 'damon_for_each_target'
- 'damon_for_each_target_safe'
- 'damos_for_each_filter'
- 'damos_for_each_filter_safe'
- 'damos_for_each_core_filter'
- 'damos_for_each_core_filter_safe'
- 'damos_for_each_ops_filter'
- 'damos_for_each_ops_filter_safe'
- 'damos_for_each_quota_goal'
@ -167,7 +167,7 @@ ForEachMacros:
- 'drm_connector_for_each_possible_encoder'
- 'drm_exec_for_each_locked_object'
- 'drm_exec_for_each_locked_object_reverse'
- 'drm_for_each_bridge_in_chain'
- 'drm_for_each_bridge_in_chain_scoped'
- 'drm_for_each_connector_iter'
- 'drm_for_each_crtc'
- 'drm_for_each_crtc_reverse'
@ -415,6 +415,7 @@ ForEachMacros:
- 'for_each_prop_dlc_cpus'
- 'for_each_prop_dlc_platforms'
- 'for_each_property_of_node'
- 'for_each_pt_level_entry'
- 'for_each_rdt_resource'
- 'for_each_reg'
- 'for_each_reg_filtered'

.gitignore

@ -41,6 +41,7 @@
*.o.*
*.patch
*.pyc
*.rlib
*.rmeta
*.rpm
*.rsi


@ -174,6 +174,7 @@ Carlos Bilbao <carlos.bilbao@kernel.org> <bilbao@vt.edu>
Changbin Du <changbin.du@intel.com> <changbin.du@gmail.com>
Chao Yu <chao@kernel.org> <chao2.yu@samsung.com>
Chao Yu <chao@kernel.org> <yuchao0@huawei.com>
Chen-Yu Tsai <wens@kernel.org> <wens@csie.org>
Chester Lin <chester62515@gmail.com> <clin@suse.com>
Chris Chiu <chris.chiu@canonical.com> <chiu@endlessm.com>
Chris Chiu <chris.chiu@canonical.com> <chiu@endlessos.org>
@ -185,6 +186,9 @@ Christian Brauner <brauner@kernel.org> <christian@brauner.io>
Christian Brauner <brauner@kernel.org> <christian.brauner@canonical.com>
Christian Brauner <brauner@kernel.org> <christian.brauner@ubuntu.com>
Christian Marangi <ansuelsmth@gmail.com>
Christophe Leroy <chleroy@kernel.org> <christophe.leroy@c-s.fr>
Christophe Leroy <chleroy@kernel.org> <christophe.leroy@csgroup.eu>
Christophe Leroy <chleroy@kernel.org> <christophe.leroy2@cs-soprasteria.com>
Christophe Ricard <christophe.ricard@gmail.com>
Christopher Obbard <christopher.obbard@linaro.org> <chris.obbard@collabora.com>
Christoph Hellwig <hch@lst.de>
@ -299,6 +303,7 @@ Hans de Goede <hansg@kernel.org> <hdegoede@redhat.com>
Hans Verkuil <hverkuil@kernel.org> <hverkuil@xs4all.nl>
Hans Verkuil <hverkuil@kernel.org> <hverkuil-cisco@xs4all.nl>
Hans Verkuil <hverkuil@kernel.org> <hansverk@cisco.com>
Hao Ge <hao.ge@linux.dev> <gehao@kylinos.cn>
Harry Yoo <harry.yoo@oracle.com> <42.hyeyoo@gmail.com>
Heiko Carstens <hca@linux.ibm.com> <h.carstens@de.ibm.com>
Heiko Carstens <hca@linux.ibm.com> <heiko.carstens@de.ibm.com>
@ -344,7 +349,8 @@ Jayachandran C <c.jayachandran@gmail.com> <jayachandranc@netlogicmicro.com>
Jayachandran C <c.jayachandran@gmail.com> <jchandra@broadcom.com>
Jayachandran C <c.jayachandran@gmail.com> <jchandra@digeo.com>
Jayachandran C <c.jayachandran@gmail.com> <jnair@caviumnetworks.com>
<jean-philippe@linaro.org> <jean-philippe.brucker@arm.com>
Jean-Philippe Brucker <jpb@kernel.org> <jean-philippe.brucker@arm.com>
Jean-Philippe Brucker <jpb@kernel.org> <jean-philippe@linaro.org>
Jean-Michel Hautbois <jeanmichel.hautbois@yoseli.org> <jeanmichel.hautbois@ideasonboard.com>
Jean Tourrilhes <jt@hpl.hp.com>
Jeevan Shriram <quic_jshriram@quicinc.com> <jshriram@codeaurora.org>
@ -498,9 +504,7 @@ Mark Brown <broonie@sirena.org.uk>
Mark Starovoytov <mstarovo@pm.me> <mstarovoitov@marvell.com>
Markus Schneider-Pargmann <msp@baylibre.com> <mpa@pengutronix.de>
Mark Yao <markyao0591@gmail.com> <mark.yao@rock-chips.com>
Martin Kepplinger <martink@posteo.de> <martin.kepplinger@ginzinger.com>
Martin Kepplinger <martink@posteo.de> <martin.kepplinger@puri.sm>
Martin Kepplinger <martink@posteo.de> <martin.kepplinger@theobroma-systems.com>
Martin Kepplinger-Novakovic <martink@posteo.de> <martin.kepplinger-novakovic@ginzinger.com>
Martyna Szapar-Mudlaw <martyna.szapar-mudlaw@linux.intel.com> <martyna.szapar-mudlaw@intel.com>
Mathieu Othacehe <othacehe@gnu.org> <m.othacehe@gmail.com>
Mat Martineau <martineau@kernel.org> <mathew.j.martineau@linux.intel.com>
@ -589,8 +593,8 @@ Nicolas Pitre <nico@fluxnic.net> <nicolas.pitre@linaro.org>
Nicolas Pitre <nico@fluxnic.net> <nico@linaro.org>
Nicolas Saenz Julienne <nsaenz@kernel.org> <nsaenzjulienne@suse.de>
Nicolas Saenz Julienne <nsaenz@kernel.org> <nsaenzjulienne@suse.com>
Nicolas Schier <nicolas.schier@linux.dev> <n.schier@avm.de>
Nicolas Schier <nicolas.schier@linux.dev> <nicolas@fjasle.eu>
Nicolas Schier <nsc@kernel.org> <n.schier@avm.de>
Nicolas Schier <nsc@kernel.org> <nicolas@fjasle.eu>
Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Nikolay Aleksandrov <razor@blackwall.org> <naleksan@redhat.com>
Nikolay Aleksandrov <razor@blackwall.org> <nikolay@redhat.com>
@ -637,6 +641,7 @@ Peter Oruba <peter.oruba@amd.com>
Peter Oruba <peter@oruba.de>
Pierre-Louis Bossart <pierre-louis.bossart@linux.dev> <pierre-louis.bossart@linux.intel.com>
Pratyush Anand <pratyush.anand@gmail.com> <pratyush.anand@st.com>
Pratyush Yadav <pratyush@kernel.org> <ptyadav@amazon.de>
Praveen BP <praveenbp@ti.com>
Pradeep Kumar Chitrapu <quic_pradeepc@quicinc.com> <pradeepc@codeaurora.org>
Prasad Sodagudi <quic_psodagud@quicinc.com> <psodagud@codeaurora.org>
@ -691,7 +696,10 @@ Sachin Mokashi <sachin.mokashi@intel.com> <sachinx.mokashi@intel.com>
Sachin P Sant <ssant@in.ibm.com>
Sai Prakash Ranjan <quic_saipraka@quicinc.com> <saiprakash.ranjan@codeaurora.org>
Sakari Ailus <sakari.ailus@linux.intel.com> <sakari.ailus@iki.fi>
Sam Protsenko <semen.protsenko@linaro.org>
Sam Protsenko <semen.protsenko@linaro.org> <semen.protsenko@globallogic.com>
Sam Ravnborg <sam@mars.ravnborg.org>
Samuel Kayode <samkay014@gmail.com> <samuel.kayode@savoirfairelinux.com>
Sankeerth Billakanti <quic_sbillaka@quicinc.com> <sbillaka@codeaurora.org>
Santosh Shilimkar <santosh.shilimkar@oracle.org>
Santosh Shilimkar <ssantosh@kernel.org>
@ -847,6 +855,9 @@ Vivien Didelot <vivien.didelot@gmail.com> <vivien.didelot@savoirfairelinux.com>
Vlad Dogaru <ddvlad@gmail.com> <vlad.dogaru@intel.com>
Vladimir Davydov <vdavydov.dev@gmail.com> <vdavydov@parallels.com>
Vladimir Davydov <vdavydov.dev@gmail.com> <vdavydov@virtuozzo.com>
WangYuli <wangyuli@aosc.io> <wangyl5933@chinaunicom.cn>
WangYuli <wangyuli@aosc.io> <wangyuli@deepin.org>
WangYuli <wangyuli@aosc.io> <wangyuli@uniontech.com>
Weiwen Hu <huweiwen@linux.alibaba.com> <sehuww@mail.scut.edu.cn>
WeiXiong Liao <gmpy.liaowx@gmail.com> <liaoweixiong@allwinnertech.com>
Wen Gong <quic_wgong@quicinc.com> <wgong@codeaurora.org>
@ -858,6 +869,7 @@ Yakir Yang <kuankuan.y@gmail.com> <ykk@rock-chips.com>
Yanteng Si <si.yanteng@linux.dev> <siyanteng@loongson.cn>
Ying Huang <huang.ying.caritas@gmail.com> <ying.huang@intel.com>
Yosry Ahmed <yosry.ahmed@linux.dev> <yosryahmed@google.com>
Yu-Chun Lin <eleanor.lin@realtek.com> <eleanor15x@gmail.com>
Yusuke Goda <goda.yusuke@renesas.com>
Zack Rusin <zack.rusin@broadcom.com> <zackr@vmware.com>
Zhu Yanjun <zyjzyj2000@gmail.com> <yanjunz@nvidia.com>


@ -1,2 +1,2 @@
[MASTER]
init-hook='import sys; sys.path += ["scripts/lib/kdoc", "scripts/lib/abi", "tools/docs/lib"]'
init-hook='import sys; sys.path += ["tools/lib/python"]'

CREDITS

@ -16,6 +16,10 @@ D: One of assisting postmasters for vger.kernel.org's lists
S: (ask for current address)
S: Finland
N: Kishon Vijay Abraham I
E: kishon@kernel.org
D: Generic Phy Framework
N: Thomas Abraham
E: thomas.ab@samsung.com
D: Samsung pin controller driver
@ -2056,16 +2060,15 @@ S: Korte Heul 95
S: 1403 ND BUSSUM
S: The Netherlands
N: Martin Kepplinger
N: Martin Kepplinger-Novakovic
E: martink@posteo.de
E: martin.kepplinger@puri.sm
W: http://www.martinkepplinger.com
P: 4096R/5AB387D3 F208 2B88 0F9E 4239 3468 6E3F 5003 98DF 5AB3 87D3
D: mma8452 accelerators iio driver
D: pegasus_notetaker input driver
D: imx8m media and hi846 sensor driver
D: Kernel fixes and cleanups
S: Garnisonstraße 26
S: 4020 Linz
S: Keplerstr. 6
S: 4050 Traun
S: Austria
N: Karl Keyte


@ -0,0 +1,71 @@
NOTE: all the ABIs listed in this file are deprecated and will be removed after 2028.
Here are the alternative ABIs:
+------------------------------------+-----------------------------------------+
| Deprecated                         | Alternative                             |
+------------------------------------+-----------------------------------------+
| /sys/kernel/kexec_loaded           | /sys/kernel/kexec/loaded                |
+------------------------------------+-----------------------------------------+
| /sys/kernel/kexec_crash_loaded     | /sys/kernel/kexec/crash_loaded          |
+------------------------------------+-----------------------------------------+
| /sys/kernel/kexec_crash_size       | /sys/kernel/kexec/crash_size            |
+------------------------------------+-----------------------------------------+
| /sys/kernel/crash_elfcorehdr_size  | /sys/kernel/kexec/crash_elfcorehdr_size |
+------------------------------------+-----------------------------------------+
| /sys/kernel/kexec_crash_cma_ranges | /sys/kernel/kexec/crash_cma_ranges      |
+------------------------------------+-----------------------------------------+
What: /sys/kernel/kexec_loaded
Date: Jun 2006
Contact: kexec@lists.infradead.org
Description: read only
Indicates whether a new kernel image has been loaded
into memory using the kexec system call. It shows 1 if
a kexec image is present and ready to boot, or 0 if none
is loaded.
User: kexec tools, kdump service
What: /sys/kernel/kexec_crash_loaded
Date: Jun 2006
Contact: kexec@lists.infradead.org
Description: read only
Indicates whether a crash (kdump) kernel is currently
loaded into memory. It shows 1 if a crash kernel has been
successfully loaded for panic handling, or 0 if no crash
kernel is present.
User: Kexec tools, Kdump service
What: /sys/kernel/kexec_crash_size
Date: Dec 2009
Contact: kexec@lists.infradead.org
Description: read/write
Shows the amount of memory reserved for loading the crash
(kdump) kernel. It reports the size, in bytes, of the
crash kernel area defined by the crashkernel= parameter.
This interface also allows reducing the crashkernel
reservation by writing a smaller value, and the reclaimed
space is added back to the system RAM.
User: Kdump service
What: /sys/kernel/crash_elfcorehdr_size
Date: Aug 2023
Contact: kexec@lists.infradead.org
Description: read only
Indicates the preferred size of the memory buffer for the
ELF core header used by the crash (kdump) kernel. It defines
how much space is needed to hold metadata about the crashed
system, including CPU and memory information. This information
is used by the user space utility kexec to support updating the
in-kernel kdump image during hotplug operations.
User: Kexec tools
What: /sys/kernel/kexec_crash_cma_ranges
Date: Nov 2025
Contact: kexec@lists.infradead.org
Description: read only
Provides information about the memory ranges reserved from
the Contiguous Memory Allocator (CMA) area that are allocated
to the crash (kdump) kernel. It lists the start and end physical
addresses of CMA regions assigned for crashkernel use.
User: kdump service
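
A minimal sketch of how a userspace tool might prefer the alternative paths from the table in this file while still falling back to the deprecated ones. The helper names (preferred_path, kexec_loaded) and the exists hook are hypothetical and not part of any kexec tool; only the path pairs and the 1/0 meaning of the loaded file come from the text above.

```python
from pathlib import Path

# Deprecated path -> alternative path, copied from the table in this file.
KEXEC_ALTERNATIVES = {
    "/sys/kernel/kexec_loaded": "/sys/kernel/kexec/loaded",
    "/sys/kernel/kexec_crash_loaded": "/sys/kernel/kexec/crash_loaded",
    "/sys/kernel/kexec_crash_size": "/sys/kernel/kexec/crash_size",
    "/sys/kernel/crash_elfcorehdr_size": "/sys/kernel/kexec/crash_elfcorehdr_size",
    "/sys/kernel/kexec_crash_cma_ranges": "/sys/kernel/kexec/crash_cma_ranges",
}


def preferred_path(deprecated, exists=None):
    """Return the alternative path if it exists, else the deprecated one.

    `exists` is a hypothetical hook (used for testing); by default the
    real filesystem is checked.
    """
    if exists is None:
        exists = lambda p: Path(p).exists()
    alternative = KEXEC_ALTERNATIVES.get(deprecated, deprecated)
    return alternative if exists(alternative) else deprecated


def kexec_loaded(exists=None):
    """True if the 'loaded' file shows 1, i.e. a kexec image is present."""
    path = preferred_path("/sys/kernel/kexec_loaded", exists)
    return Path(path).read_text().strip() == "1"
```

Falling back rather than failing keeps such a tool usable on kernels that only expose the deprecated files.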


@ -0,0 +1,19 @@
What: /sys/bus/pci/drivers/qaic/XXXX:XX:XX.X/accel/accel<minor_nr>/dbc<N>_state
Date: October 2025
KernelVersion: 6.19
Contact: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Description: Represents the current state of DMA Bridge channel (DBC). Below are the possible
states:
=================== ==========================================================
IDLE (0)            DBC is free and can be activated
ASSIGNED (1)        DBC is activated and a workload is running on device
BEFORE_SHUTDOWN (2) Sub-system associated with this workload has crashed and
                    it will shut down soon
AFTER_SHUTDOWN (3)  Sub-system associated with this workload has crashed and
                    it has shut down
BEFORE_POWER_UP (4) Sub-system associated with this workload is shut down and
                    it will be powered up soon
AFTER_POWER_UP (5)  Sub-system associated with this workload is now powered up
=================== ==========================================================
Users: Any userspace application or clients interested in DBC state.
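
The state table above could be mirrored in a client roughly as follows. This is only a sketch: it assumes the attribute reads back as the decimal number shown in parentheses, which the text does not state, and DbcState and parse_dbc_state are hypothetical names.

```python
from enum import IntEnum


class DbcState(IntEnum):
    """DBC states and numeric values as listed in the table above."""
    IDLE = 0
    ASSIGNED = 1
    BEFORE_SHUTDOWN = 2
    AFTER_SHUTDOWN = 3
    BEFORE_POWER_UP = 4
    AFTER_POWER_UP = 5


def parse_dbc_state(text):
    """Convert the attribute contents to a DbcState.

    Assumption: the file contains a decimal integer, optionally followed
    by a newline. Unknown values raise ValueError.
    """
    return DbcState(int(text.strip()))
```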


@ -20,9 +20,10 @@ Description:
rule format: action [condition ...]
action: measure | dont_measure | appraise | dont_appraise |
audit | hash | dont_hash
audit | dont_audit | hash | dont_hash
condition:= base | lsm [option]
base: [[func=] [mask=] [fsmagic=] [fsuuid=] [fsname=]
[fs_subtype=]
[uid=] [euid=] [gid=] [egid=]
[fowner=] [fgroup=]]
lsm: [[subj_user=] [subj_role=] [subj_type=]


@ -14,7 +14,7 @@ Description:
for RTCs that support alarms
* RTC_ALM_READ, RTC_ALM_SET: Read or set the alarm time for
RTCs that support alarms. Can be set upto 24 hours in the
RTCs that support alarms. Can be set up to 24 hours in the
future. Requires a separate RTC_AIE_ON call to enable the
alarm interrupt. (Prefer to use RTC_WKALM_*)


@ -0,0 +1,90 @@
What: /sys/.../message
Date: October 2021
KernelVersion: 5.16
Description:
Controls the text message displayed on character line displays.
Reading returns the current message with a trailing newline.
Writing updates the displayed message. Messages longer than the
display width will automatically scroll. Trailing newlines in
input are automatically trimmed.
Writing an empty string clears the display.
Example:
echo "Hello World" > message
cat message # Returns "Hello World\n"
What: /sys/.../num_chars
Date: November 2025
KernelVersion: 6.18
Contact: Jean-François Lessard <jefflessard3@gmail.com>
Description:
Read-only attribute showing the character width capacity of
the line display device. Messages longer than this will scroll.
Example:
cat num_chars # Returns "16\n" for 16-char display
What: /sys/.../scroll_step_ms
Date: October 2021
KernelVersion: 5.16
Description:
Controls the scrolling speed for messages longer than the display
width, specified in milliseconds per scroll step.
Setting to 0 disables scrolling. Default is 500ms.
Example:
echo "250" > scroll_step_ms # 4Hz scrolling
cat scroll_step_ms # Returns "250\n"
What: /sys/.../map_seg7
Date: January 2024
KernelVersion: 6.9
Description:
Read/write binary blob representing the ASCII-to-7-segment
display conversion table used by the linedisp driver, as defined
by struct seg7_conversion_map in <linux/map_to_7segment.h>.
Only visible on displays with 7-segment capability.
This attribute is not human-readable. Writes must match the
struct size exactly, else -EINVAL is returned; reads return the
entire mapping as a binary blob.
This interface and its implementation match existing conventions
used in segment-mapped display drivers since 2005.
ABI note: This style of binary sysfs attribute *is an exception*
to current "one value per file, text only" sysfs rules, for
historical compatibility and driver uniformity. New drivers are
discouraged from introducing additional binary sysfs ABIs.
Reference interface guidance:
- include/uapi/linux/map_to_7segment.h
What: /sys/.../map_seg14
Date: January 2024
KernelVersion: 6.9
Description:
Read/write binary blob representing the ASCII-to-14-segment
display conversion table used by the linedisp driver, as defined
by struct seg14_conversion_map in <linux/map_to_14segment.h>.
Only visible on displays with 14-segment capability.
This attribute is not human-readable. Writes must match the
struct size exactly, else -EINVAL is returned; reads return the
entire mapping as a binary blob.
This interface and its implementation match existing conventions
used by segment-mapped display drivers since 2005.
ABI note: This style of binary sysfs attribute *is an exception*
to current "one value per file, text only" sysfs rules, for
historical compatibility and driver uniformity. New drivers are
discouraged from introducing additional binary sysfs ABIs.
Reference interface guidance:
- include/uapi/linux/map_to_14segment.h
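
As an illustration of the scroll_step_ms and num_chars semantics described in this file, the following sketch converts a step time into a scroll rate and checks whether a message would scroll. The function names are hypothetical; the 0-disables-scrolling rule, the width comparison, and the trimming of trailing newlines come from the descriptions above.

```python
def scroll_rate_hz(scroll_step_ms):
    """Scroll steps per second, or None when scrolling is disabled (0)."""
    if scroll_step_ms < 0:
        raise ValueError("scroll_step_ms must not be negative")
    if scroll_step_ms == 0:
        return None
    return 1000.0 / scroll_step_ms


def will_scroll(message, num_chars):
    """True if the message is longer than the display width.

    Trailing newlines are dropped first, mirroring how written input
    is trimmed.
    """
    return len(message.rstrip("\n")) > num_chars
```

With the values from the examples above, a 250 ms step corresponds to 4 Hz, and "Hello World" fits a 16-character display without scrolling.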


@ -106,13 +106,6 @@ Description:
will be discarded from the cache. Should not be turned off with
writeback caching enabled.
What: /sys/block/<disk>/bcache/discard
Date: November 2010
Contact: Kent Overstreet <kent.overstreet@gmail.com>
Description:
For a cache, a boolean allowing discard/TRIM to be turned off
or back on if the device supports it.
What: /sys/block/<disk>/bcache/bucket_size
Date: November 2010
Contact: Kent Overstreet <kent.overstreet@gmail.com>


@ -496,8 +496,17 @@ Description:
changed, only freed by writing 0. The kernel makes no guarantees
that data is maintained over an address space freeing event, and
there is no guarantee that a free followed by an allocate
results in the same address being allocated.
results in the same address being allocated. If extended linear
cache is present, the size indicates extended linear cache size
plus the CXL region size.
What: /sys/bus/cxl/devices/regionZ/extended_linear_cache_size
Date: October, 2025
KernelVersion: v6.19
Contact: linux-cxl@vger.kernel.org
Description:
(RO) The size of extended linear cache, if there is an extended
linear cache. Otherwise the attribute will not be visible.
What: /sys/bus/cxl/devices/regionZ/mode
Date: January, 2023


@ -898,6 +898,7 @@ What: /sys/.../iio:deviceX/events/in_tempY_thresh_rising_en
What: /sys/.../iio:deviceX/events/in_tempY_thresh_falling_en
What: /sys/.../iio:deviceX/events/in_capacitanceY_thresh_rising_en
What: /sys/.../iio:deviceX/events/in_capacitanceY_thresh_falling_en
What: /sys/.../iio:deviceX/events/in_pressure_thresh_rising_en
KernelVersion: 2.6.37
Contact: linux-iio@vger.kernel.org
Description:
@ -926,6 +927,7 @@ What: /sys/.../iio:deviceX/events/in_accel_y_roc_rising_en
What: /sys/.../iio:deviceX/events/in_accel_y_roc_falling_en
What: /sys/.../iio:deviceX/events/in_accel_z_roc_rising_en
What: /sys/.../iio:deviceX/events/in_accel_z_roc_falling_en
What: /sys/.../iio:deviceX/events/in_accel_x&y&z_roc_rising_en
What: /sys/.../iio:deviceX/events/in_anglvel_x_roc_rising_en
What: /sys/.../iio:deviceX/events/in_anglvel_x_roc_falling_en
What: /sys/.../iio:deviceX/events/in_anglvel_y_roc_rising_en
@ -1001,6 +1003,7 @@ Description:
to the raw signal, allowing slow tracking to resume and the
adaptive threshold event detection to function as expected.
What: /sys/.../events/in_accel_mag_adaptive_rising_value
What: /sys/.../events/in_accel_thresh_rising_value
What: /sys/.../events/in_accel_thresh_falling_value
What: /sys/.../events/in_accel_x_raw_thresh_rising_value
@ -1045,6 +1048,7 @@ What: /sys/.../events/in_capacitanceY_thresh_rising_value
What: /sys/.../events/in_capacitanceY_thresh_falling_value
What: /sys/.../events/in_capacitanceY_thresh_adaptive_rising_value
What: /sys/.../events/in_capacitanceY_thresh_falling_rising_value
What: /sys/.../events/in_pressure_thresh_rising_value
KernelVersion: 2.6.37
Contact: linux-iio@vger.kernel.org
Description:
@ -1147,6 +1151,7 @@ Description:
will get activated once in_voltage0_raw goes above 1200 and will become
deactivated again once the value falls below 1150.
What: /sys/.../events/in_accel_roc_rising_value
What: /sys/.../events/in_accel_x_raw_roc_rising_value
What: /sys/.../events/in_accel_x_raw_roc_falling_value
What: /sys/.../events/in_accel_y_raw_roc_rising_value
@ -1193,6 +1198,8 @@ Description:
value is in raw device units or in processed units (as _raw
and _input do on sysfs direct channel read attributes).
What: /sys/.../events/in_accel_mag_adaptive_rising_period
What: /sys/.../events/in_accel_roc_rising_period
What: /sys/.../events/in_accel_x_thresh_rising_period
What: /sys/.../events/in_accel_x_thresh_falling_period
What: /sys/.../events/in_accel_x_roc_rising_period
@ -1362,6 +1369,15 @@ Description:
number or direction is not specified, applies to all channels of
this type.
What: /sys/.../iio:deviceX/events/in_accel_x_mag_adaptive_rising_en
What: /sys/.../iio:deviceX/events/in_accel_y_mag_adaptive_rising_en
What: /sys/.../iio:deviceX/events/in_accel_z_mag_adaptive_rising_en
KernelVersion: 2.6.37
Contact: linux-iio@vger.kernel.org
Description:
Similar to in_accel_x_thresh[_rising|_falling]_en, but here the
magnitude of the channel is compared to the adaptive threshold.
What: /sys/.../iio:deviceX/events/in_accel_mag_referenced_en
What: /sys/.../iio:deviceX/events/in_accel_mag_referenced_rising_en
What: /sys/.../iio:deviceX/events/in_accel_mag_referenced_falling_en
@ -2422,3 +2438,23 @@ Description:
Value representing the user's attention to the system expressed
in units as percentage. This usually means if the user is
looking at the screen or not.
What: /sys/.../events/in_accel_value_available
KernelVersion: 6.18
Contact: linux-iio@vger.kernel.org
Description:
List of available threshold values for acceleration event
generation. Applies to all event types on in_accel channels.
Units after application of scale and offset are m/s^2.
Expressed as:
- a range specified as "[min step max]"
What: /sys/.../events/in_accel_period_available
KernelVersion: 6.18
Contact: linux-iio@vger.kernel.org
Description:
List of available periods for accelerometer event detection in
seconds, expressed as:
- a range specified as "[min step max]"
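
A small sketch of how a client might parse the "[min step max]" form used by the two _available attributes above. The function name and the sample values in the usage note are hypothetical; only the bracketed three-field layout comes from the descriptions.

```python
def parse_available_range(text):
    """Parse "[min step max]" into a (min, step, max) tuple of floats."""
    body = text.strip()
    if not (body.startswith("[") and body.endswith("]")):
        raise ValueError("expected a range in square brackets")
    fields = body[1:-1].split()
    if len(fields) != 3:
        raise ValueError("expected exactly three fields: min step max")
    minimum, step, maximum = (float(field) for field in fields)
    return minimum, step, maximum
```

For example, parse_available_range("[0.5 0.25 2.0]") yields the tuple (0.5, 0.25, 2.0).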


@ -621,3 +621,84 @@ Description:
number extended capability. The file is read only and due to
the possible sensitivity of accessible serial numbers, admin
only.
What: /sys/bus/pci/devices/.../tsm/
Contact: linux-coco@lists.linux.dev
Description:
This directory only appears if a physical device function
supports authentication (PCIe CMA-SPDM), interface security
(PCIe TDISP), and is accepted for secure operation by the
platform TSM driver. This attribute directory appears
dynamically after the platform TSM driver loads. So, only after
the /sys/class/tsm/tsm0 device arrives can tools assume that
devices without a tsm/ attribute directory will never have one;
before that, the security capabilities of the device relative to
the platform TSM are unknown. See
Documentation/ABI/testing/sysfs-class-tsm.
What: /sys/bus/pci/devices/.../tsm/connect
Contact: linux-coco@lists.linux.dev
Description:
(RW) Write the name of a TSM (TEE Security Manager) device from
/sys/class/tsm to this file to establish a connection with the
device. This typically includes an SPDM (DMTF Security
Protocols and Data Models) session over PCIe DOE (Data Object
Exchange) and may also include PCIe IDE (Integrity and Data
Encryption) establishment. Reads from this attribute return the
name of the connected TSM or the empty string if not
connected. A TSM device signals its readiness to accept PCI
connection via a KOBJ_CHANGE event.
What: /sys/bus/pci/devices/.../tsm/disconnect
Contact: linux-coco@lists.linux.dev
Description:
(WO) Write the name of the TSM device that was specified
to 'connect' to tear down the connection.
What: /sys/bus/pci/devices/.../tsm/dsm
Contact: linux-coco@lists.linux.dev
Description: (RO) Return PCI device name of this device's DSM (Device
Security Manager). When a device is in the connected state it
indicates that the platform TSM (TEE Security Manager) has made
a secure-session connection with a device's DSM. A DSM is always
physical function 0 and when the device supports TDISP (TEE
Device Interface Security Protocol) its managed functions also
populate this tsm/dsm attribute. The managed functions of a DSM
are SR-IOV (Single Root I/O Virtualization) virtual functions,
non-zero functions of a multi-function device, or downstream
endpoints depending on whether the DSM is an SR-IOV physical
function, function0 of a multi-function device, or an upstream
PCIe switch port. This is a "link" TSM attribute, see
Documentation/ABI/testing/sysfs-class-tsm.
What: /sys/bus/pci/devices/.../tsm/bound
Contact: linux-coco@lists.linux.dev
Description: (RO) Return the device name of the TSM when the device is in a
TDISP (TEE Device Interface Security Protocol) operational state
(LOCKED, RUN, or ERROR, not UNLOCKED). Bound devices consume
platform TSM resources and depend on the device's configuration
(e.g. BME (Bus Master Enable) and MSE (Memory Space Enable)
among other settings) to remain stable for the duration of the
bound state. This attribute is only visible for devices that
support TDISP operation, and it is only populated after
successful connect and TSM bind. The TSM bind operation is
initiated by VFIO/IOMMUFD. This is a "link" TSM attribute, see
Documentation/ABI/testing/sysfs-class-tsm.
What: /sys/bus/pci/devices/.../authenticated
Contact: linux-pci@vger.kernel.org
Description:
When the device's tsm/ directory is present, device
authentication (PCIe CMA-SPDM) and link encryption (PCIe IDE)
are handled by the platform TSM (TEE Security Manager). When the
tsm/ directory is not present this attribute reflects only the
native CMA-SPDM authentication state with the kernel's
certificate store.
If the attribute is not present, it indicates that
authentication is unsupported by the device, or the TSM has no
available authentication methods for the device.
When present and the tsm/ attribute directory is present, the
authenticated attribute is an alias for the device 'connect'
state. See the 'tsm/connect' attribute for more details.


@ -23,6 +23,8 @@ Description: This file contains a space-separated list of profiles supported
power consumption with a slight bias
towards performance
performance High performance operation
max-power Higher performance operation that may exceed
internal battery draw limits when on AC power
custom Driver defined custom profile
==================== ========================================


@ -0,0 +1,30 @@
What: /sys/class/power_supply/rt9756-*/watchdog_timer
Date: Dec 2025
KernelVersion: 6.19
Contact: ChiYuan Huang <cy_huang@richtek.com>
Description:
This entry shows and sets the watchdog timer used while the rt9756
charger operates in charging mode. When the timer expires, the device
disables charging. Any host communication restarts the timer and
so prevents it from expiring.
Access: Read, Write
Valid values:
- 500, 1000, 5000, 30000, 40000, 80000, 128000 or 255000 (milliseconds),
- 0: disabled
What: /sys/class/power_supply/rt9756-*/operation_mode
Date: Dec 2025
KernelVersion: 6.19
Contact: ChiYuan Huang <cy_huang@richtek.com>
Description:
This entry shows and sets the operation mode used while the rt9756
charger operates in the charging phase. In 'bypass' mode, the internal
path connects vbus directly to vbat. Otherwise, the default 'div2'
mode is used for switch-cap charging.
Access: Read, Write
Valid values:
- 'bypass' or 'div2'


@ -0,0 +1,19 @@
What: /sys/class/tsm/tsmN
Contact: linux-coco@lists.linux.dev
Description:
"tsmN" is a device that represents the generic attributes of a
platform TEE Security Manager. It is typically a child of a
platform enumerated TSM device. /sys/class/tsm/tsmN/uevent
signals when the PCI layer is able to support establishment of
link encryption and other device-security features coordinated
through a platform tsm.
What: /sys/class/tsm/tsmN/streamH.R.E
Contact: linux-pci@vger.kernel.org
Description:
(RO) When a host bridge has established a secure connection via
the platform TSM, this symlink appears. Its primary function is
to provide a system-global review of TSM resource consumption
across host bridges. The link points to the endpoint PCI device
and matches the same link published by the host bridge. See
Documentation/ABI/testing/sysfs-devices-pci-host-bridge.


@ -254,3 +254,31 @@ Contact: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Description:
The PPS Power Limited bit indicates whether or not the source
supply will exceed the rated output power if requested.
Standard Power Range (SPR) Adjustable Voltage Supplies
What: /sys/class/usb_power_delivery/.../<capability>/<position>:spr_adjustable_voltage_supply
Date: Oct 2025
Contact: Badhri Jagan Sridharan <badhri@google.com>
Description:
Adjustable Voltage Supply (AVS) Augmented PDO (APDO).
What: /sys/class/usb_power_delivery/.../<capability>/<position>:spr_adjustable_voltage_supply/maximum_current_9V_to_15V
Date: Oct 2025
Contact: Badhri Jagan Sridharan <badhri@google.com>
Description:
Maximum Current for 9V to 15V range in milliamperes.
What: /sys/class/usb_power_delivery/.../<capability>/<position>:spr_adjustable_voltage_supply/maximum_current_15V_to_20V
Date: Oct 2025
Contact: Badhri Jagan Sridharan <badhri@google.com>
Description:
Maximum Current for the range above 15V up to 20V, in
milliamperes.
What: /sys/class/usb_power_delivery/.../<capability>/<position>:spr_adjustable_voltage_supply/peak_current
Date: Oct 2025
Contact: Badhri Jagan Sridharan <badhri@google.com>
Description:
This file shows the value of the Adjustable Voltage Supply Peak Current
Capability field.


@ -0,0 +1,45 @@
What: /sys/devices/pciDDDD:BB
/sys/devices/.../pciDDDD:BB
Contact: linux-pci@vger.kernel.org
Description:
A PCI host bridge device parents a PCI bus device topology. PCI
controllers may also parent host bridges. The DDDD:BB format
conveys the PCI domain (ACPI segment) number and root bus number
(in hexadecimal) of the host bridge. Note that the domain number
may be larger than the 16 bits that the "DDDD" format implies
for emulated host-bridges.
What: pciDDDD:BB/firmware_node
Contact: linux-pci@vger.kernel.org
Description:
(RO) Symlink to the platform firmware device object "companion"
of the host bridge. For example, an ACPI device with an _HID of
PNP0A08 (/sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00). See
/sys/devices/pciDDDD:BB entry for details about the DDDD:BB
format.
What: pciDDDD:BB/streamH.R.E
Contact: linux-pci@vger.kernel.org
Description:
(RO) When a platform has established a secure connection, PCIe
IDE, between two Partner Ports, this symlink appears. A stream
consumes a Stream ID slot in each of the Host bridge (H), Root
Port (R) and Endpoint (E). The link points to the Endpoint PCI
device in the Selective IDE Stream pairing. Specifically, "R"
and "E" represent the assigned Selective IDE Stream Register
Block in the Root Port and Endpoint, and "H" represents a
platform specific pool of stream resources shared by the Root
Ports in a host bridge. See /sys/devices/pciDDDD:BB entry for
details about the DDDD:BB format.
What: pciDDDD:BB/available_secure_streams
Contact: linux-pci@vger.kernel.org
Description:
(RO) When a host bridge has Root Ports that support PCIe IDE
(link encryption and integrity protection) there may be a
limited number of Selective IDE Streams that can be used for
establishing new end-to-end secure links. This attribute
decrements upon secure link setup, and increments upon secure
link teardown. The in-use stream count is determined by counting
stream symlinks. See /sys/devices/pciDDDD:BB entry for details
about the DDDD:BB format.
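
The naming conventions in this file can be decoded roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, the domain is allowed more than four hex digits because the text notes it may exceed 16 bits, and the H, R and E fields of a stream name are assumed to be decimal integers, which the text does not specify.

```python
import re

# "pciDDDD:BB": hexadecimal domain (four or more digits) and root bus number.
_BRIDGE_RE = re.compile(r"pci([0-9a-fA-F]{4,}):([0-9a-fA-F]{2})")

# "streamH.R.E": assumed here to be three decimal integers.
_STREAM_RE = re.compile(r"stream(\d+)\.(\d+)\.(\d+)")


def parse_host_bridge(name):
    """Return (domain, root_bus) for a name such as "pci0000:00"."""
    match = _BRIDGE_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"not a host bridge name: {name!r}")
    return int(match.group(1), 16), int(match.group(2), 16)


def parse_stream(name):
    """Return (host_bridge_slot, root_port_slot, endpoint_slot)."""
    match = _STREAM_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"not a stream name: {name!r}")
    return tuple(int(group) for group in match.groups())
```

Because the in-use stream count is determined by counting stream symlinks, a tool could apply parse_stream to each entry of a host bridge directory and count the names that parse.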


@ -764,6 +764,17 @@ Description:
participate in load balancing. These CPUs are set by
boot parameter "isolcpus=".
What: /sys/devices/system/cpu/housekeeping
Date: Oct 2025
Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org>
Description:
(RO) the list of logical CPUs that are designated by the kernel as
"housekeeping". These CPUs are responsible for handling essential
system-wide background tasks, including RCU callbacks, delayed
timer callbacks, and unbound workqueues, minimizing scheduling
jitter on low-latency, isolated CPUs. These CPUs are set when boot
parameter "isolcpus=nohz" or "nohz_full=" is specified.
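A small sketch of how a tool might expand the contents of this file into individual CPU numbers. It assumes the file uses the same comma-separated range notation (for example "0-3,8") as the other CPU list files in this directory; the function name is hypothetical.

```python
def parse_cpu_list(text):
    """Expand a CPU list such as "0-3,8" into a list of CPU numbers."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            # An empty file (or stray comma) contributes no CPUs.
            continue
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus
```

The input would be the text read from /sys/devices/system/cpu/housekeeping; an empty file yields an empty list.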
What: /sys/devices/system/cpu/crash_hotplug
Date: Aug 2023
Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org>

@ -0,0 +1,159 @@
What: /sys/bus/pci/drivers/xe/.../sriov_admin/
Date: October 2025
KernelVersion: 6.19
Contact: intel-xe@lists.freedesktop.org
Description:
This directory appears for the particular Intel Xe device when:
- device supports SR-IOV, and
- device is a Physical Function (PF), and
- driver support for the SR-IOV PF is enabled on given device.
This directory is used as a root for all attributes required to
manage both Physical Function (PF) and Virtual Functions (VFs).
What: /sys/bus/pci/drivers/xe/.../sriov_admin/pf/
Date: October 2025
KernelVersion: 6.19
Contact: intel-xe@lists.freedesktop.org
Description:
This directory holds attributes related to the SR-IOV Physical
Function (PF).
What: /sys/bus/pci/drivers/xe/.../sriov_admin/vf1/
What: /sys/bus/pci/drivers/xe/.../sriov_admin/vf2/
What: /sys/bus/pci/drivers/xe/.../sriov_admin/vf<N>/
Date: October 2025
KernelVersion: 6.19
Contact: intel-xe@lists.freedesktop.org
Description:
These directories hold attributes related to the SR-IOV Virtual
Functions (VFs).
Note that the VF number <N> is 1-based, as described in the PCI SR-IOV
specification, because the Xe driver follows that naming scheme.
There could be "vf1", "vf2" and so on, up to "vf<N>", where <N>
matches the value of the "sriov_totalvfs" attribute.
What: /sys/bus/pci/drivers/xe/.../sriov_admin/pf/profile/exec_quantum_ms
What: /sys/bus/pci/drivers/xe/.../sriov_admin/pf/profile/preempt_timeout_us
What: /sys/bus/pci/drivers/xe/.../sriov_admin/pf/profile/sched_priority
What: /sys/bus/pci/drivers/xe/.../sriov_admin/vf<n>/profile/exec_quantum_ms
What: /sys/bus/pci/drivers/xe/.../sriov_admin/vf<n>/profile/preempt_timeout_us
What: /sys/bus/pci/drivers/xe/.../sriov_admin/vf<n>/profile/sched_priority
Date: October 2025
KernelVersion: 6.19
Contact: intel-xe@lists.freedesktop.org
Description:
These files expose scheduling parameters for the PF and its VFs, and
are visible only on Intel Xe platforms that use time-sliced GPU sharing.
They can be changed even if VFs are enabled and running and reflect the
settings of all tiles/GTs assigned to the given function.
exec_quantum_ms: (RW) unsigned integer
The GT execution quantum (EQ) in [ms] for the given function.
Actual quantum value might be aligned per HW/FW requirements.
Default is 0 (unlimited).
preempt_timeout_us: (RW) unsigned integer
The GT preemption timeout in [us] of the given function.
Actual timeout value might be aligned per HW/FW requirements.
Default is 0 (unlimited).
sched_priority: (RW/RO) string
The GT scheduling priority of the given function.
"low" - function will be scheduled on the GPU for its EQ/PT
only if function has any work already submitted.
"normal" - function will be scheduled on the GPU for its EQ/PT
irrespective of whether it has submitted any work or not.
"high" - function will be scheduled on the GPU for its EQ/PT
in the next time-slice after the current one completes
and function has work submitted.
Default is "low".
When read, this file will display the current and available
scheduling priorities. The currently active priority level will
be enclosed in square brackets, like:
[low] normal high
This file can be read-only if changing the priority is not
supported.
Writes to these attributes may fail with errors like:
-EINVAL if provided input is malformed or not recognized,
-EPERM if change is not applicable on given HW/FW,
-EIO if FW refuses to change the provisioning.
Reads from these attributes may fail with:
-EUCLEAN if value is not consistent across all tiles/GTs.
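The bracketed read format of sched_priority described above can be split into the active and available priorities as follows. This is an illustrative sketch only; the function name is hypothetical and the input is assumed to be the single line returned by a read of the attribute.

```python
def parse_sched_priority(text):
    """Return (current, available) from a line such as "[low] normal high"."""
    current = None
    available = []
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            # The bracketed entry is the currently active priority.
            token = token[1:-1]
            current = token
        available.append(token)
    return current, available
```

A caller could check that a desired priority is in the available list before writing it back, which avoids one source of -EINVAL.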
What: /sys/bus/pci/drivers/xe/.../sriov_admin/.bulk_profile/exec_quantum_ms
What: /sys/bus/pci/drivers/xe/.../sriov_admin/.bulk_profile/preempt_timeout_us
What: /sys/bus/pci/drivers/xe/.../sriov_admin/.bulk_profile/sched_priority
Date: October 2025
KernelVersion: 6.19
Contact: intel-xe@lists.freedesktop.org
Description:
These files allow bulk reconfiguration of the scheduling parameters
of the PF or VFs and are available only on Intel Xe platforms that use
time-sliced GPU sharing. These scheduling parameters can be changed
even if VFs are enabled and running.
exec_quantum_ms: (WO) unsigned integer
The GT execution quantum (EQ) in [ms] to be applied to all functions.
See sriov_admin/{pf,vf<N>}/profile/exec_quantum_ms for more details.
preempt_timeout_us: (WO) unsigned integer
The GT preemption timeout (PT) in [us] to be applied to all functions.
See sriov_admin/{pf,vf<N>}/profile/preempt_timeout_us for more details.
sched_priority: (RW/RO) string
The GT scheduling priority to be applied for all functions.
See sriov_admin/{pf,vf<N>}/profile/sched_priority for more details.
Writes to these attributes may fail with errors like:
-EINVAL if provided input is malformed or not recognized,
-EPERM if change is not applicable on given HW/FW,
-EIO if FW refuses to change the provisioning.
What: /sys/bus/pci/drivers/xe/.../sriov_admin/vf<n>/stop
Date: October 2025
KernelVersion: 6.19
Contact: intel-xe@lists.freedesktop.org
Description:
This file controls scheduling of the VF on Intel Xe GPU platforms.
It allows implementing a custom policy mechanism in case VFs
are misbehaving or triggering adverse events above defined thresholds.
stop: (WO) bool
All GT executions of given function shall be immediately stopped.
To allow scheduling this VF again, the VF FLR must be triggered.
Writes to this attribute may fail with errors like:
-EINVAL if provided input is malformed or not recognized,
-EPERM if change is not applicable on given HW/FW,
-EIO if FW refuses to change the scheduling.
What: /sys/bus/pci/drivers/xe/.../sriov_admin/pf/device
What: /sys/bus/pci/drivers/xe/.../sriov_admin/vf<n>/device
Date: October 2025
KernelVersion: 6.19
Contact: intel-xe@lists.freedesktop.org
Description:
These are symlinks to the underlying PCI device entry representing
given Xe SR-IOV function. For the PF, this link is always present.
For VFs, this link is present only for currently enabled VFs.

@ -0,0 +1,29 @@
What: /sys/bus/pci/drivers/uio_pci_sva/<pci_dev>/pasid
Date: September 2025
Contact: Yaxing Guo <guoyaxing@bosc.ac.cn>
Description:
Process Address Space ID (PASID) assigned by the IOMMU driver to
the device for use with Shared Virtual Addressing (SVA).
This read-only attribute exposes the PASID (a 20-bit identifier
used in PCIe Address Translation Services and IOMMU table walks)
allocated by the IOMMU driver during SVA device binding.
User-space UIO applications must read this attribute to obtain
the PASID and program it into the device's configuration registers.
This enables the device to perform DMA using user-space virtual
addresses, with address translation handled by the IOMMU.
User-space UIO applications must:
- Open the device and map the device's register space via /dev/uioX
(this triggers the IOMMU driver to allocate the PASID)
- Read the PASID from sysfs
- Write the PASID to a device-specific register (with example offset)
The code may look like:
map = mmap(..., "/dev/uio0", ...);
f = fopen("/sys/.../pasid", "r");
fscanf(f, "%d", &pasid);
map[REG_PASID_OFFSET] = pasid;

@ -0,0 +1,53 @@
What: /sys/bus/platform/devices/INOU0000:XX/fn_lock_toggle_enable
Date: November 2025
KernelVersion: 6.19
Contact: Armin Wolf <W_Armin@gmx.de>
Description:
Allows userspace applications to enable/disable the FN lock feature
of the integrated keyboard by writing "1"/"0" into this file.
Reading this file returns the current enable status of the FN lock functionality.
What: /sys/bus/platform/devices/INOU0000:XX/super_key_toggle_enable
Date: November 2025
KernelVersion: 6.19
Contact: Armin Wolf <W_Armin@gmx.de>
Description:
Allows userspace applications to enable/disable the super key functionality
of the integrated keyboard by writing "1"/"0" into this file.
Reading this file returns the current enable status of the super key functionality.
What: /sys/bus/platform/devices/INOU0000:XX/touchpad_toggle_enable
Date: November 2025
KernelVersion: 6.19
Contact: Armin Wolf <W_Armin@gmx.de>
Description:
Allows userspace applications to enable/disable the touchpad toggle functionality
of the integrated touchpad by writing "1"/"0" into this file.
Reading this file returns the current enable status of the touchpad toggle
functionality.
What: /sys/bus/platform/devices/INOU0000:XX/rainbow_animation
Date: November 2025
KernelVersion: 6.19
Contact: Armin Wolf <W_Armin@gmx.de>
Description:
Forces the integrated lightbar to display a rainbow animation when the machine
is not suspended. Writing "1"/"0" into this file enables/disables this
functionality.
Reading this file returns the current status of the rainbow animation functionality.
What: /sys/bus/platform/devices/INOU0000:XX/breathing_in_suspend
Date: November 2025
KernelVersion: 6.19
Contact: Armin Wolf <W_Armin@gmx.de>
Description:
Causes the integrated lightbar to display a breathing animation when the machine
has been suspended and is running on AC power. Writing "1"/"0" into this file
enables/disables this functionality.
Reading this file returns the current status of the breathing animation
functionality.

@ -643,6 +643,12 @@ Contact: "Jaegeuk Kim" <jaegeuk@kernel.org>
Description: Shows the number of unusable blocks in a section which was defined by
the zone capacity reported by underlying zoned device.
What: /sys/fs/f2fs/<disk>/max_open_zones
Date: November 2025
Contact: "Yongpeng Yang" <yangyongpeng@xiaomi.com>
Description: Shows the max number of zones that F2FS can write concurrently when a zoned
device is mounted.
What: /sys/fs/f2fs/<disk>/current_atomic_write
Date: July 2022
Contact: "Daeho Jeong" <daehojeong@google.com>

@ -0,0 +1,61 @@
What: /sys/kernel/kexec/*
Date: Nov 2025
Contact: kexec@lists.infradead.org
Description:
The /sys/kernel/kexec/* directory contains sysfs files
that provide information about the configuration status
of kexec and kdump.
What: /sys/kernel/kexec/loaded
Date: Nov 2025
Contact: kexec@lists.infradead.org
Description: read only
Indicates whether a new kernel image has been loaded
into memory using the kexec system call. It shows 1 if
a kexec image is present and ready to boot, or 0 if none
is loaded.
User: kexec tools, kdump service
What: /sys/kernel/kexec/crash_loaded
Date: Nov 2025
Contact: kexec@lists.infradead.org
Description: read only
Indicates whether a crash (kdump) kernel is currently
loaded into memory. It shows 1 if a crash kernel has been
successfully loaded for panic handling, or 0 if no crash
kernel is present.
User: Kexec tools, Kdump service
What: /sys/kernel/kexec/crash_size
Date: Nov 2025
Contact: kexec@lists.infradead.org
Description: read/write
Shows the amount of memory reserved for loading the crash
(kdump) kernel. It reports the size, in bytes, of the
crash kernel area defined by the crashkernel= parameter.
This interface also allows reducing the crashkernel
reservation by writing a smaller value, and the reclaimed
space is added back to the system RAM.
User: Kdump service
What: /sys/kernel/kexec/crash_elfcorehdr_size
Date: Nov 2025
Contact: kexec@lists.infradead.org
Description: read only
Indicates the preferred size of the memory buffer for the
ELF core header used by the crash (kdump) kernel. It defines
how much space is needed to hold metadata about the crashed
system, including CPU and memory information. This information
is used by the user space utility kexec to support updating the
in-kernel kdump image during hotplug operations.
User: Kexec tools
What: /sys/kernel/kexec/crash_cma_ranges
Date: Nov 2025
Contact: kexec@lists.infradead.org
Description: read only
Provides information about the memory ranges reserved from
the Contiguous Memory Allocator (CMA) area that are allocated
to the crash (kdump) kernel. It lists the start and end physical
addresses of CMA regions assigned for crashkernel use.
User: kdump service

@ -164,6 +164,13 @@ Description: Writing to and reading from this file sets and gets the pid of
the target process if the context is for virtual address spaces
monitoring, respectively.
What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/targets/<T>/obsolete_target
Date: Oct 2025
Contact: SeongJae Park <sj@kernel.org>
Description: Writing to and reading from this file sets and gets the
obsoleteness of the matching parameters commit destination
target.
What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/targets/<T>/regions/nr_regions
Date: Mar 2022
Contact: SeongJae Park <sj@kernel.org>
@ -303,6 +310,12 @@ Contact: SeongJae Park <sj@kernel.org>
Description: Writing to and reading from this file sets and gets the nid
parameter of the goal.
What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/quotas/goals/<G>/path
Date: Oct 2025
Contact: SeongJae Park <sj@kernel.org>
Description: Writing to and reading from this file sets and gets the path
parameter of the goal.
What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/quotas/weights/sz_permil
Date: Mar 2022
Contact: SeongJae Park <sj@kernel.org>

@ -59,6 +59,8 @@ Description: Module taint flags:
F force-loaded module
C staging driver module
E unsigned module
K livepatch module
N in-kernel test module
== =====================
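As a rough illustration, a tool could translate a taint string into the descriptions from the table. The sketch below covers only the flags visible in this excerpt (the full table contains more), the function name is hypothetical, and it assumes the string is what a read of a module's taint attribute returns.

```python
# Only the flags visible in the table excerpt above; the full table has more.
TAINT_FLAGS = {
    "F": "force-loaded module",
    "C": "staging driver module",
    "E": "unsigned module",
    "K": "livepatch module",
    "N": "in-kernel test module",
}


def describe_taint(flags):
    """Map each flag character to its description, marking unknown ones."""
    return [TAINT_FLAGS.get(ch, "unknown flag " + ch) for ch in flags.strip()]
```

Flags outside this partial mapping are reported as unknown rather than dropped, so extending the dictionary is enough to cover the rest of the table.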
What: /sys/module/grant_table/parameters/free_per_iteration

@ -63,6 +63,7 @@ Date: Aug 2022
KernelVersion: 6.1
Contact: "Luke Jones" <luke@ljones.dev>
Description:
DEPRECATED, WILL BE REMOVED SOON: please use asus-armoury
Switch the GPU hardware MUX mode. Laptops with this feature can
be toggled to boot with only the dGPU (discrete mode) or in
standard Optimus/Hybrid mode. On switch a reboot is required:
@ -75,6 +76,7 @@ Date: Aug 2022
KernelVersion: 5.17
Contact: "Luke Jones" <luke@ljones.dev>
Description:
DEPRECATED, WILL BE REMOVED SOON: please use asus-armoury
Disable discrete GPU:
* 0 - Enable dGPU,
* 1 - Disable dGPU
@ -84,6 +86,7 @@ Date: Aug 2022
KernelVersion: 5.17
Contact: "Luke Jones" <luke@ljones.dev>
Description:
DEPRECATED, WILL BE REMOVED SOON: please use asus-armoury
Enable the external GPU paired with ROG X-Flow laptops.
Toggling this setting will also trigger ACPI to disable the dGPU:
@ -95,6 +98,7 @@ Date: Aug 2022
KernelVersion: 5.17
Contact: "Luke Jones" <luke@ljones.dev>
Description:
DEPRECATED, WILL BE REMOVED SOON: please use asus-armoury
Enable an LCD response-time boost to reduce or remove ghosting:
* 0 - Disable,
* 1 - Enable
@ -104,6 +108,7 @@ Date: Jun 2023
KernelVersion: 6.5
Contact: "Luke Jones" <luke@ljones.dev>
Description:
DEPRECATED, WILL BE REMOVED SOON: please use asus-armoury
Get the current charging mode being used:
* 1 - Barrel connected charger,
* 2 - USB-C charging
@ -114,6 +119,7 @@ Date: Jun 2023
KernelVersion: 6.5
Contact: "Luke Jones" <luke@ljones.dev>
Description:
DEPRECATED, WILL BE REMOVED SOON: please use asus-armoury
Show if the egpu (XG Mobile) is correctly connected:
* 0 - False,
* 1 - True
@ -123,6 +129,7 @@ Date: Jun 2023
KernelVersion: 6.5
Contact: "Luke Jones" <luke@ljones.dev>
Description:
DEPRECATED, WILL BE REMOVED SOON: please use asus-armoury
Change the mini-LED mode:
* 0 - Single-zone,
* 1 - Multi-zone
@ -133,6 +140,7 @@ Date: Apr 2024
KernelVersion: 6.10
Contact: "Luke Jones" <luke@ljones.dev>
Description:
DEPRECATED, WILL BE REMOVED SOON: please use asus-armoury
List the available mini-led modes.
What: /sys/devices/platform/<platform>/ppt_pl1_spl
@ -140,6 +148,7 @@ Date: Jun 2023
KernelVersion: 6.5
Contact: "Luke Jones" <luke@ljones.dev>
Description:
DEPRECATED, WILL BE REMOVED SOON: please use asus-armoury
Set the Package Power Target total of CPU: PL1 on Intel, SPL on AMD.
Shown on Intel+Nvidia or AMD+Nvidia based systems:
@ -150,6 +159,7 @@ Date: Jun 2023
KernelVersion: 6.5
Contact: "Luke Jones" <luke@ljones.dev>
Description:
DEPRECATED, WILL BE REMOVED SOON: please use asus-armoury
Set the Slow Package Power Tracking Limit of CPU: PL2 on Intel, SPPT
on AMD. Shown on Intel+Nvidia or AMD+Nvidia based systems:
@ -160,6 +170,7 @@ Date: Jun 2023
KernelVersion: 6.5
Contact: "Luke Jones" <luke@ljones.dev>
Description:
DEPRECATED, WILL BE REMOVED SOON: please use asus-armoury
Set the Fast Package Power Tracking Limit of CPU. AMD+Nvidia only:
* min=5, max=250
@ -168,6 +179,7 @@ Date: Jun 2023
KernelVersion: 6.5
Contact: "Luke Jones" <luke@ljones.dev>
Description:
DEPRECATED, WILL BE REMOVED SOON: please use asus-armoury
Set the APU SPPT limit. Shown on full AMD systems only:
* min=5, max=130
@ -176,6 +188,7 @@ Date: Jun 2023
KernelVersion: 6.5
Contact: "Luke Jones" <luke@ljones.dev>
Description:
DEPRECATED, WILL BE REMOVED SOON: please use asus-armoury
Set the platform SPPT limit. Shown on full AMD systems only:
* min=5, max=130
@ -184,6 +197,7 @@ Date: Jun 2023
KernelVersion: 6.5
Contact: "Luke Jones" <luke@ljones.dev>
Description:
DEPRECATED, WILL BE REMOVED SOON: please use asus-armoury
Set the dynamic boost limit of the Nvidia dGPU:
* min=5, max=25
@ -192,6 +206,7 @@ Date: Jun 2023
KernelVersion: 6.5
Contact: "Luke Jones" <luke@ljones.dev>
Description:
DEPRECATED, WILL BE REMOVED SOON: please use asus-armoury
Set the target temperature limit of the Nvidia dGPU:
* min=75, max=87
@ -200,6 +215,7 @@ Date: Apr 2024
KernelVersion: 6.10
Contact: "Luke Jones" <luke@ljones.dev>
Description:
DEPRECATED, WILL BE REMOVED SOON: please use asus-armoury
Set if the BIOS POST sound is played on boot.
* 0 - False,
* 1 - True
@ -209,6 +225,7 @@ Date: Apr 2024
KernelVersion: 6.10
Contact: "Luke Jones" <luke@ljones.dev>
Description:
DEPRECATED, WILL BE REMOVED SOON: please use asus-armoury
Set if the MCU can go into low-power mode on system sleep
* 0 - False,
* 1 - True

@ -0,0 +1,19 @@
What: /sys/devices/platform/ayaneo-ec/controller_power
Date: Nov 2025
KernelVersion: 6.19
Contact: "Antheas Kapenekakis" <lkml@antheas.dev>
Description:
Current controller power state. Allows turning on and off
the controller power (e.g. for power savings). Write 1 to
turn on, 0 to turn off. File is readable and writable.
What: /sys/devices/platform/ayaneo-ec/controller_modules
Date: Nov 2025
KernelVersion: 6.19
Contact: "Antheas Kapenekakis" <lkml@antheas.dev>
Description:
Shows which controller modules are currently connected to
the device. Possible values are "left", "right" and "both".
File is read-only. The Windows software for this device
will only set controller power to 1 if both module sides
are connected (i.e. this file returns "both").

@ -454,3 +454,19 @@ Description:
disables it. Reads from the file return the current value.
The default is "1" if the build-time "SUSPEND_SKIP_SYNC" config
flag is unset, or "0" otherwise.
What: /sys/power/hibernate_compression_threads
Date: October 2025
Contact: <luoxueqin@kylinos.cn>
Description:
Controls the number of threads used for compression
and decompression of hibernation images.
The value can be adjusted at runtime to balance
performance and CPU utilization.
The change takes effect on the next hibernation or
resume operation.
Minimum value: 1
Default value: 3

@ -19,7 +19,7 @@ config WARN_ABI_ERRORS
described at Documentation/ABI/README. Yet, as they're manually
written, it would be possible that some of those files would
have errors that would break them for being parsed by
scripts/get_abi.pl. Add a check to verify them.
tools/docs/get_abi.py. Add a check to verify them.
If unsure, select 'N'.

@ -8,12 +8,12 @@ subdir- := devicetree/bindings
ifneq ($(MAKECMDGOALS),cleandocs)
# Check for broken documentation file references
ifeq ($(CONFIG_WARN_MISSING_DOCUMENTS),y)
$(shell $(srctree)/scripts/documentation-file-ref-check --warn)
$(shell $(srctree)/tools/docs/documentation-file-ref-check --warn)
endif
# Check for broken ABI files
ifeq ($(CONFIG_WARN_ABI_ERRORS),y)
$(shell $(srctree)/scripts/get_abi.py --dir $(srctree)/Documentation/ABI validate)
$(shell $(srctree)/tools/docs/get_abi.py --dir $(srctree)/Documentation/ABI validate)
endif
endif
@ -23,21 +23,22 @@ SPHINXOPTS =
SPHINXDIRS = .
DOCS_THEME =
DOCS_CSS =
_SPHINXDIRS = $(sort $(patsubst $(srctree)/Documentation/%/index.rst,%,$(wildcard $(srctree)/Documentation/*/index.rst)))
SPHINX_CONF = conf.py
RUSTDOC =
PAPER =
BUILDDIR = $(obj)/output
PDFLATEX = xelatex
LATEXOPTS = -interaction=batchmode -no-shell-escape
PYTHONPYCACHEPREFIX ?= $(abspath $(BUILDDIR)/__pycache__)
# Wrapper for sphinx-build
BUILD_WRAPPER = $(srctree)/tools/docs/sphinx-build-wrapper
# For denylisting "variable font" files
# Can be overridden by setting as an env variable
FONTS_CONF_DENY_VF ?= $(HOME)/deny-vf
ifeq ($(findstring 1, $(KBUILD_VERBOSE)),)
SPHINXOPTS += "-q"
endif
# User-friendly check for sphinx-build
HAVE_SPHINX := $(shell if which $(SPHINXBUILD) >/dev/null 2>&1; then echo 1; else echo 0; fi)
@ -46,141 +47,46 @@ ifeq ($(HAVE_SPHINX),0)
.DEFAULT:
$(warning The '$(SPHINXBUILD)' command was not found. Make sure you have Sphinx installed and in PATH, or set the SPHINXBUILD make variable to point to the full path of the '$(SPHINXBUILD)' executable.)
@echo
@$(srctree)/scripts/sphinx-pre-install
@$(srctree)/tools/docs/sphinx-pre-install
@echo " SKIP Sphinx $@ target."
else # HAVE_SPHINX
# User-friendly check for pdflatex and latexmk
HAVE_PDFLATEX := $(shell if which $(PDFLATEX) >/dev/null 2>&1; then echo 1; else echo 0; fi)
HAVE_LATEXMK := $(shell if which latexmk >/dev/null 2>&1; then echo 1; else echo 0; fi)
# Common documentation targets
htmldocs mandocs infodocs texinfodocs latexdocs epubdocs xmldocs pdfdocs linkcheckdocs:
$(Q)PYTHONPYCACHEPREFIX="$(PYTHONPYCACHEPREFIX)" \
$(srctree)/tools/docs/sphinx-pre-install --version-check
+$(Q)PYTHONPYCACHEPREFIX="$(PYTHONPYCACHEPREFIX)" \
$(PYTHON3) $(BUILD_WRAPPER) $@ \
--sphinxdirs="$(SPHINXDIRS)" $(RUSTDOC) \
--builddir="$(BUILDDIR)" --deny-vf=$(FONTS_CONF_DENY_VF) \
--theme=$(DOCS_THEME) --css=$(DOCS_CSS) --paper=$(PAPER)
ifeq ($(HAVE_LATEXMK),1)
PDFLATEX := latexmk -$(PDFLATEX)
endif #HAVE_LATEXMK
# Internal variables.
PAPEROPT_a4 = -D latex_elements.papersize=a4paper
PAPEROPT_letter = -D latex_elements.papersize=letterpaper
ALLSPHINXOPTS = -D kerneldoc_srctree=$(srctree) -D kerneldoc_bin=$(KERNELDOC)
ALLSPHINXOPTS += $(PAPEROPT_$(PAPER)) $(SPHINXOPTS)
ifneq ($(wildcard $(srctree)/.config),)
ifeq ($(CONFIG_RUST),y)
# Let Sphinx know we will include rustdoc
ALLSPHINXOPTS += -t rustdoc
endif
endif
# the i18n builder cannot share the environment and doctrees with the others
I18NSPHINXOPTS = $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) .
# commands; the 'cmd' from scripts/Kbuild.include is not *loopable*
loop_cmd = $(echo-cmd) $(cmd_$(1)) || exit;
# $2 sphinx builder e.g. "html"
# $3 name of the build subfolder / e.g. "userspace-api/media", used as:
# * dest folder relative to $(BUILDDIR) and
# * cache folder relative to $(BUILDDIR)/.doctrees
# $4 dest subfolder e.g. "man" for man pages at userspace-api/media/man
# $5 reST source folder relative to $(src),
# e.g. "userspace-api/media" for the linux-tv book-set at ./Documentation/userspace-api/media
PYTHONPYCACHEPREFIX ?= $(abspath $(BUILDDIR)/__pycache__)
quiet_cmd_sphinx = SPHINX $@ --> file://$(abspath $(BUILDDIR)/$3/$4)
cmd_sphinx = \
PYTHONPYCACHEPREFIX="$(PYTHONPYCACHEPREFIX)" \
BUILDDIR=$(abspath $(BUILDDIR)) SPHINX_CONF=$(abspath $(src)/$5/$(SPHINX_CONF)) \
$(PYTHON3) $(srctree)/scripts/jobserver-exec \
$(CONFIG_SHELL) $(srctree)/Documentation/sphinx/parallel-wrapper.sh \
$(SPHINXBUILD) \
-b $2 \
-c $(abspath $(src)) \
-d $(abspath $(BUILDDIR)/.doctrees/$3) \
-D version=$(KERNELVERSION) -D release=$(KERNELRELEASE) \
$(ALLSPHINXOPTS) \
$(abspath $(src)/$5) \
$(abspath $(BUILDDIR)/$3/$4) && \
if [ "x$(DOCS_CSS)" != "x" ]; then \
cp $(if $(patsubst /%,,$(DOCS_CSS)),$(abspath $(srctree)/$(DOCS_CSS)),$(DOCS_CSS)) $(BUILDDIR)/$3/_static/; \
fi
htmldocs:
@$(srctree)/scripts/sphinx-pre-install --version-check
@+$(foreach var,$(SPHINXDIRS),$(call loop_cmd,sphinx,html,$(var),,$(var)))
htmldocs-redirects: $(srctree)/Documentation/.renames.txt
@tools/docs/gen-redirects.py --output $(BUILDDIR) < $<
# If Rust support is available and .config exists, add rustdoc generated contents.
# If there are any, the errors from this make rustdoc will be displayed but
# won't stop the execution of htmldocs
ifneq ($(wildcard $(srctree)/.config),)
ifeq ($(CONFIG_RUST),y)
$(Q)$(MAKE) rustdoc || true
endif
endif
texinfodocs:
@$(srctree)/scripts/sphinx-pre-install --version-check
@+$(foreach var,$(SPHINXDIRS),$(call loop_cmd,sphinx,texinfo,$(var),texinfo,$(var)))
# Note: the 'info' Make target is generated by sphinx itself when
# running the texinfodocs target define above.
infodocs: texinfodocs
$(MAKE) -C $(BUILDDIR)/texinfo info
linkcheckdocs:
@$(foreach var,$(SPHINXDIRS),$(call loop_cmd,sphinx,linkcheck,$(var),,$(var)))
latexdocs:
@$(srctree)/scripts/sphinx-pre-install --version-check
@+$(foreach var,$(SPHINXDIRS),$(call loop_cmd,sphinx,latex,$(var),latex,$(var)))
ifeq ($(HAVE_PDFLATEX),0)
pdfdocs:
$(warning The '$(PDFLATEX)' command was not found. Make sure you have it installed and in PATH to produce PDF output.)
@echo " SKIP Sphinx $@ target."
else # HAVE_PDFLATEX
pdfdocs: DENY_VF = XDG_CONFIG_HOME=$(FONTS_CONF_DENY_VF)
pdfdocs: latexdocs
@$(srctree)/scripts/sphinx-pre-install --version-check
$(foreach var,$(SPHINXDIRS), \
$(MAKE) PDFLATEX="$(PDFLATEX)" LATEXOPTS="$(LATEXOPTS)" $(DENY_VF) -C $(BUILDDIR)/$(var)/latex || sh $(srctree)/scripts/check-variable-fonts.sh || exit; \
mkdir -p $(BUILDDIR)/$(var)/pdf; \
mv $(subst .tex,.pdf,$(wildcard $(BUILDDIR)/$(var)/latex/*.tex)) $(BUILDDIR)/$(var)/pdf/; \
)
endif # HAVE_PDFLATEX
epubdocs:
@$(srctree)/scripts/sphinx-pre-install --version-check
@+$(foreach var,$(SPHINXDIRS),$(call loop_cmd,sphinx,epub,$(var),epub,$(var)))
xmldocs:
@$(srctree)/scripts/sphinx-pre-install --version-check
@+$(foreach var,$(SPHINXDIRS),$(call loop_cmd,sphinx,xml,$(var),xml,$(var)))
endif # HAVE_SPHINX
# The following targets are independent of HAVE_SPHINX, and the rules should
# work or silently pass without Sphinx.
htmldocs-redirects: $(srctree)/Documentation/.renames.txt
@tools/docs/gen-redirects.py --output $(BUILDDIR) < $<
refcheckdocs:
$(Q)cd $(srctree);scripts/documentation-file-ref-check
$(Q)cd $(srctree); tools/docs/documentation-file-ref-check
cleandocs:
$(Q)rm -rf $(BUILDDIR)
# Used only on help
_SPHINXDIRS = $(shell printf "%s\n" $(patsubst $(srctree)/Documentation/%/index.rst,%,$(wildcard $(srctree)/Documentation/*/index.rst)) | sort -f)
dochelp:
@echo ' Linux kernel internal documentation in different formats from ReST:'
@echo ' htmldocs - HTML'
@echo ' htmldocs-redirects - generate HTML redirects for moved pages'
@echo ' texinfodocs - Texinfo'
@echo ' infodocs - Info'
@echo ' mandocs - Man pages'
@echo ' latexdocs - LaTeX'
@echo ' pdfdocs - PDF'
@echo ' epubdocs - EPUB'
@ -192,13 +98,17 @@ dochelp:
@echo ' cleandocs - clean all generated files'
@echo
@echo ' make SPHINXDIRS="s1 s2" [target] Generate only docs of folder s1, s2'
@echo ' valid values for SPHINXDIRS are: $(_SPHINXDIRS)'
@echo
@echo ' make SPHINX_CONF={conf-file} [target] use *additional* sphinx-build'
@echo ' configuration. This is e.g. useful to build with nit-picking config.'
@echo ' top level values for SPHINXDIRS are: $(_SPHINXDIRS)'
@echo ' you may also use a subdirectory like SPHINXDIRS=userspace-api/media,'
@echo ' provided that there is an index.rst file at the subdirectory.'
@echo
@echo ' make DOCS_THEME={sphinx-theme} selects a different Sphinx theme.'
@echo
@echo ' make DOCS_CSS={a .css file} adds a DOCS_CSS override file for html/epub output.'
@echo
@echo ' make PAPER={a4|letter} Specifies the paper size used for LaTeX/PDF output.'
@echo
@echo ' make FONTS_CONF_DENY_VF={path} sets a deny list to block variable Noto CJK fonts'
@echo ' for PDF build. See tools/lib/python/kdoc/latex_fonts.py for more details'
@echo
@echo ' Default location for the generated documents is Documentation/output'

@ -326,6 +326,21 @@ be recovered, there is nothing more that can be done; the platform
will typically report a "permanent failure" in such a case. The
device will be considered "dead" in this case.
Drivers typically need to call pci_restore_state() after reset to
re-initialize the device's config space registers and thereby
bring it from D0\ :sub:`uninitialized` into D0\ :sub:`active` state
(PCIe r7.0 sec 5.3.1.1). The PCI core invokes pci_save_state()
on enumeration after initializing config space to ensure that a
saved state is available for subsequent error recovery.
Drivers which modify config space on probe may need to invoke
pci_save_state() afterwards to record those changes for later
error recovery. When going into system suspend, pci_save_state()
is called for every PCI device and that state will be restored
not only on resume, but also on any subsequent error recovery.
In the unlikely event that the saved state recorded on suspend
is unsuitable for error recovery, drivers should call
pci_save_state() on resume.
Drivers for multi-function cards will need to coordinate among
themselves as to which driver instance will perform any "one-shot"
or global device initialization. For example, the Symbios sym53cxx2

@ -2637,15 +2637,16 @@ synchronize_srcu() for some other domain ``ss1``, and if an
that was held across an ``ss``-domain synchronize_srcu(), deadlock
would again be possible. Such a deadlock cycle could extend across an
arbitrarily large number of different SRCU domains. Again, with great
power comes great responsibility.
power comes great responsibility, though lockdep is now able to detect
this sort of deadlock.
Unlike the other RCU flavors, SRCU read-side critical sections can run
on idle and even offline CPUs. This ability requires that
srcu_read_lock() and srcu_read_unlock() contain memory barriers,
which means that SRCU readers will run a bit slower than would RCU
readers. It also motivates the smp_mb__after_srcu_read_unlock() API,
which, in combination with srcu_read_unlock(), guarantees a full
memory barrier.
Unlike the other RCU flavors, SRCU read-side critical sections can run on
idle and even offline CPUs, with the exception of srcu_read_lock_fast()
and friends. This ability requires that srcu_read_lock() and
srcu_read_unlock() contain memory barriers, which means that SRCU
readers will run a bit slower than would RCU readers. It also motivates
the smp_mb__after_srcu_read_unlock() API, which, in combination with
srcu_read_unlock(), guarantees a full memory barrier.
Also unlike other RCU flavors, synchronize_srcu() may **not** be
invoked from CPU-hotplug notifiers, due to the fact that SRCU grace
@ -2681,15 +2682,15 @@ run some tests first. SRCU just might need a few adjustment to deal with
that sort of load. Of course, your mileage may vary based on the speed
of your CPUs and the size of your memory.
The `SRCU
API <https://lwn.net/Articles/609973/#RCU%20Per-Flavor%20API%20Table>`__
The `SRCU API
<https://lwn.net/Articles/609973/#RCU%20Per-Flavor%20API%20Table>`__
includes srcu_read_lock(), srcu_read_unlock(),
srcu_dereference(), srcu_dereference_check(),
synchronize_srcu(), synchronize_srcu_expedited(),
call_srcu(), srcu_barrier(), and srcu_read_lock_held(). It
also includes DEFINE_SRCU(), DEFINE_STATIC_SRCU(), and
init_srcu_struct() APIs for defining and initializing
``srcu_struct`` structures.
srcu_dereference(), srcu_dereference_check(), synchronize_srcu(),
synchronize_srcu_expedited(), call_srcu(), srcu_barrier(),
and srcu_read_lock_held(). It also includes DEFINE_SRCU(),
DEFINE_STATIC_SRCU(), DEFINE_SRCU_FAST(), DEFINE_STATIC_SRCU_FAST(),
init_srcu_struct(), and init_srcu_struct_fast() APIs for defining and
initializing ``srcu_struct`` structures.
More recently, the SRCU API has added polling interfaces:


@ -417,11 +417,13 @@ over a rather long period of time, but improvements are always welcome!
you should be using RCU rather than SRCU, because RCU is almost
always faster and easier to use than is SRCU.
Also unlike other forms of RCU, explicit initialization and
cleanup is required either at build time via DEFINE_SRCU()
or DEFINE_STATIC_SRCU() or at runtime via init_srcu_struct()
and cleanup_srcu_struct(). These last two are passed a
"struct srcu_struct" that defines the scope of a given
Also unlike other forms of RCU, explicit initialization
and cleanup is required either at build time via
DEFINE_SRCU(), DEFINE_STATIC_SRCU(), DEFINE_SRCU_FAST(),
or DEFINE_STATIC_SRCU_FAST() or at runtime via either
init_srcu_struct() or init_srcu_struct_fast() and
cleanup_srcu_struct(). These last three are passed a
`struct srcu_struct` that defines the scope of a given
SRCU domain. Once initialized, the srcu_struct is passed
to srcu_read_lock(), srcu_read_unlock(), synchronize_srcu(),
synchronize_srcu_expedited(), and call_srcu(). A given


@ -1227,7 +1227,10 @@ SRCU: Initialization/cleanup/ordering::
DEFINE_SRCU
DEFINE_STATIC_SRCU
DEFINE_SRCU_FAST // for srcu_read_lock_fast() and friends
DEFINE_STATIC_SRCU_FAST // for srcu_read_lock_fast() and friends
init_srcu_struct
init_srcu_struct_fast
cleanup_srcu_struct
smp_mb__after_srcu_read_unlock


@ -487,8 +487,8 @@ one user crashes, the fallout of that should be limited to that workload and not
impact other workloads. SSR accomplishes this.
If a particular workload crashes, QSM notifies the host via the QAIC_SSR MHI
channel. This notification identifies the workload by it's assigned DBC. A
multi-stage recovery process is then used to cleanup both sides, and get the
channel. This notification identifies the workload by its assigned DBC. A
multi-stage recovery process is then used to clean up both sides and get the
DBC/NSPs into a working state.
When SSR occurs, any state in the workload is lost. Any inputs that were in
@ -496,6 +496,27 @@ process, or queued by not yet serviced, are lost. The loaded artifacts will
remain in on-card DDR, but the host will need to re-activate the workload if
it desires to recover the workload.
When SSR occurs for a specific NSP, the assigned DBC goes through the
following state transitions in order:
DBC_STATE_BEFORE_SHUTDOWN
Indicates that the affected NSP was found in an unrecoverable error
condition.
DBC_STATE_AFTER_SHUTDOWN
Indicates that the NSP is under reset.
DBC_STATE_BEFORE_POWER_UP
Indicates that the NSP's debug information has been collected, and is
ready to be collected by the host (if desired). At that stage the NSP
is restarted by QSM.
DBC_STATE_AFTER_POWER_UP
Indicates that the NSP has been restarted, is fully operational, and is
in the idle state.
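The ordering of the four states above can be sketched in a few lines of Python. This is only an illustration of the sequence as documented here: the state names come from the list above, while ``DbcState``, ``next_state()`` and ``follows_ssr_order()`` are hypothetical helpers and not part of the QAIC driver or its uAPI.

```python
from enum import Enum


class DbcState(Enum):
    # Listed in the order the documentation says SSR walks through them.
    BEFORE_SHUTDOWN = "DBC_STATE_BEFORE_SHUTDOWN"
    AFTER_SHUTDOWN = "DBC_STATE_AFTER_SHUTDOWN"
    BEFORE_POWER_UP = "DBC_STATE_BEFORE_POWER_UP"
    AFTER_POWER_UP = "DBC_STATE_AFTER_POWER_UP"


# Enum iteration follows definition order, so this is the documented sequence.
SSR_ORDER = list(DbcState)


def next_state(current):
    """Return the state that follows `current`, or None after the last one."""
    idx = SSR_ORDER.index(current)
    return SSR_ORDER[idx + 1] if idx + 1 < len(SSR_ORDER) else None


def follows_ssr_order(observed):
    """True if `observed` is exactly the documented SSR sequence."""
    return list(observed) == SSR_ORDER
```

A test harness might use ``follows_ssr_order()`` to check a recorded trace of state changes against the documented order.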
SSR also has an optional crashdump collection feature. If enabled, the host can
collect the memory dump for the crashed NSP and dump it to the user space via
the dev_coredump subsystem. The host can also decline the crashdump collection
request from the device.
Reliability, Accessibility, Serviceability (RAS)
================================================


@ -36,7 +36,7 @@ polling mode and reenables the IRQ line.
This mitigation in QAIC is very effective. The same lprnet usecase that
generates 100k IRQs per second (per /proc/interrupts) is reduced to roughly 64
IRQs over 5 minutes while keeping the host system stable, and having the same
workload throughput performance (within run to run noise variation).
workload throughput performance (within run-to-run noise variation).
Single MSI Mode
---------------
@ -49,7 +49,7 @@ useful to be able to fall back to a single MSI when needed.
To support this fallback, we allow the case where only one MSI is able to be
allocated, and share that one MSI between MHI and the DBCs. The device detects
when only one MSI has been configured and directs the interrupts for the DBCs
to the interrupt normally used for MHI. Unfortunately this means that the
to the interrupt normally used for MHI. Unfortunately, this means that the
interrupt handlers for every DBC and MHI wake up for every interrupt that
arrives; however, the DBC threaded irq handlers only are started when work to be
done is detected (MHI will always start its threaded handler).
@ -62,9 +62,9 @@ never disabled, allowing each new entry to the FIFO to trigger a new interrupt.
Neural Network Control (NNC) Protocol
=====================================
The implementation of NNC is split between the KMD (QAIC) and UMD. In general
The implementation of NNC is split between the KMD (QAIC) and UMD. In general,
QAIC understands how to encode/decode NNC wire protocol, and elements of the
protocol which require kernel space knowledge to process (for example, mapping
protocol which requires kernel space knowledge to process (for example, mapping
host memory to device IOVAs). QAIC understands the structure of a message, and
all of the transactions. QAIC does not understand commands (the payload of a
passthrough transaction).


@ -76,41 +76,43 @@ The messages are in the format::
The taskstats payload is one of the following three kinds:
1. Commands: Sent from user to kernel. Commands to get data on
a pid/tgid consist of one attribute, of type TASKSTATS_CMD_ATTR_PID/TGID,
containing a u32 pid or tgid in the attribute payload. The pid/tgid denotes
the task/process for which userspace wants statistics.
a pid/tgid consist of one attribute, of type TASKSTATS_CMD_ATTR_PID/TGID,
containing a u32 pid or tgid in the attribute payload. The pid/tgid denotes
the task/process for which userspace wants statistics.
Commands to register/deregister interest in exit data from a set of cpus
consist of one attribute, of type
TASKSTATS_CMD_ATTR_REGISTER/DEREGISTER_CPUMASK and contain a cpumask in the
attribute payload. The cpumask is specified as an ascii string of
comma-separated cpu ranges e.g. to listen to exit data from cpus 1,2,3,5,7,8
the cpumask would be "1-3,5,7-8". If userspace forgets to deregister interest
in cpus before closing the listening socket, the kernel cleans up its interest
set over time. However, for the sake of efficiency, an explicit deregistration
is advisable.
Commands to register/deregister interest in exit data from a set of cpus
consist of one attribute, of type
TASKSTATS_CMD_ATTR_REGISTER/DEREGISTER_CPUMASK and contain a cpumask in the
attribute payload. The cpumask is specified as an ascii string of
comma-separated cpu ranges e.g. to listen to exit data from cpus 1,2,3,5,7,8
the cpumask would be "1-3,5,7-8". If userspace forgets to deregister
interest in cpus before closing the listening socket, the kernel cleans up
its interest set over time. However, for the sake of efficiency, an explicit
deregistration is advisable.
2. Response for a command: sent from the kernel in response to a userspace
command. The payload is a series of three attributes of type:
command. The payload is a series of three attributes of type:
a) TASKSTATS_TYPE_AGGR_PID/TGID : attribute containing no payload but indicates
a pid/tgid will be followed by some stats.
a) TASKSTATS_TYPE_AGGR_PID/TGID: attribute containing no payload but
indicates a pid/tgid will be followed by some stats.
b) TASKSTATS_TYPE_PID/TGID: attribute whose payload is the pid/tgid whose stats
are being returned.
b) TASKSTATS_TYPE_PID/TGID: attribute whose payload is the pid/tgid whose
stats are being returned.
c) TASKSTATS_TYPE_STATS: attribute with a struct taskstats as payload. The
same structure is used for both per-pid and per-tgid stats.
c) TASKSTATS_TYPE_STATS: attribute with a struct taskstats as payload. The
same structure is used for both per-pid and per-tgid stats.
3. New message sent by kernel whenever a task exits. The payload consists of a
series of attributes of the following type:
a) TASKSTATS_TYPE_AGGR_PID: indicates next two attributes will be pid+stats
b) TASKSTATS_TYPE_PID: contains exiting task's pid
c) TASKSTATS_TYPE_STATS: contains the exiting task's per-pid stats
d) TASKSTATS_TYPE_AGGR_TGID: indicates next two attributes will be tgid+stats
e) TASKSTATS_TYPE_TGID: contains tgid of process to which task belongs
f) TASKSTATS_TYPE_STATS: contains the per-tgid stats for exiting task's process
a) TASKSTATS_TYPE_AGGR_PID: indicates next two attributes will be pid+stats
b) TASKSTATS_TYPE_PID: contains exiting task's pid
c) TASKSTATS_TYPE_STATS: contains the exiting task's per-pid stats
d) TASKSTATS_TYPE_AGGR_TGID: indicates next two attributes will be
tgid+stats
e) TASKSTATS_TYPE_TGID: contains tgid of process to which task belongs
f) TASKSTATS_TYPE_STATS: contains the per-tgid stats for exiting task's
process
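The cpumask string used by the register/deregister commands in item 1 can be built and expanded with a small helper. The following is a userspace sketch in Python, not part of the taskstats interface; ``format_cpu_ranges()`` and ``parse_cpu_ranges()`` are hypothetical names, and only the comma-separated range syntax is taken from the text above.

```python
def format_cpu_ranges(cpus):
    """Collapse CPU numbers into a range string such as "1-3,5,7-8"."""
    ordered = sorted(set(cpus))
    ranges = []
    start = prev = None
    for cpu in ordered:
        if start is None:
            start = prev = cpu
        elif cpu == prev + 1:
            prev = cpu  # extend the current run
        else:
            ranges.append((start, prev))
            start = prev = cpu
    if start is not None:
        ranges.append((start, prev))
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


def parse_cpu_ranges(text):
    """Expand a range string such as "1-3,5,7-8" into a sorted CPU list."""
    cpus = set()
    for part in text.split(","):
        first, _, last = part.partition("-")
        cpus.update(range(int(first), int(last or first) + 1))
    return sorted(cpus)
```

For the example in the text, ``format_cpu_ranges([1, 2, 3, 5, 7, 8])`` yields ``"1-3,5,7-8"``.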
per-tgid stats


@ -601,10 +601,15 @@ specification.
Task Attribute
~~~~~~~~~~~~~~
The Smack label of a process can be read from /proc/<pid>/attr/current. A
process can read its own Smack label from /proc/self/attr/current. A
The Smack label of a process can be read from ``/proc/<pid>/attr/current``. A
process can read its own Smack label from ``/proc/self/attr/current``. A
privileged process can change its own Smack label by writing to
/proc/self/attr/current but not the label of another process.
``/proc/self/attr/current`` but not the label of another process.
The accepted write format is either the label alone or the label followed by one
of three trailers: ``\n`` (by common agreement for ``/proc/...`` interfaces),
``\0`` (because some applications incorrectly include it),
``\n\0`` (because we think some applications may incorrectly include it).
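The trailer rule can be illustrated with a short Python sketch. It mirrors only the accepted formats listed above and is not the kernel's parsing code; ``strip_label_trailer()`` and ``ALLOWED_TRAILERS`` are hypothetical names.

```python
# Trailers accepted after the label, longest first so that "\n\0" is not
# mistaken for a bare "\0" with a stray newline left in the label.
ALLOWED_TRAILERS = (b"\n\0", b"\n", b"\0")


def strip_label_trailer(data: bytes) -> bytes:
    """Return the label with at most one accepted trailer removed."""
    for trailer in ALLOWED_TRAILERS:
        if data.endswith(trailer):
            return data[: -len(trailer)]
    return data
```

All four of ``b"System"``, ``b"System\n"``, ``b"System\0"`` and ``b"System\n\0"`` reduce to the same label under this rule.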
File Attribute
~~~~~~~~~~~~~~
@ -696,6 +701,11 @@ sockets.
A privileged program may set this to match the label of another
task with which it hopes to communicate.
A UNIX domain socket (UDS) with a BSD address functions both as a file in a
filesystem and as a socket. As a file, it carries the SMACK64 attribute. This
attribute is not involved in Smack security enforcement and is immutably
assigned the label "*".
Smack Netlabel Exceptions
~~~~~~~~~~~~~~~~~~~~~~~~~


@ -95,7 +95,20 @@ languages when these scripts are invoked by passing these program files
to the interpreter. This is because of the way interpreters execute these
files; the scripts themselves are not evaluated as executable code
through one of IPE's hooks, but they are merely text files that are read
(as opposed to compiled executables) [#interpreters]_.
(as opposed to compiled executables). However, with the introduction of the
``AT_EXECVE_CHECK`` flag (:doc:`AT_EXECVE_CHECK </userspace-api/check_exec>`),
interpreters can use it to signal the kernel that a script file will be executed,
and request the kernel to perform LSM security checks on it.
IPE's EXECUTE operation enforcement differs between compiled executables and
interpreted scripts: For compiled executables, enforcement is triggered
automatically by the kernel during ``execve()``, ``execveat()``, ``mmap()``
and ``mprotect()`` syscalls when loading executable content. For interpreted
scripts, enforcement requires explicit interpreter integration using
``execveat()`` with the ``AT_EXECVE_CHECK`` flag. Unlike exec syscalls that IPE
intercepts during the execution process, this mechanism needs the interpreter
to take the initiative, and existing interpreters won't be automatically
supported unless the signal call is added.
Threat Model
------------
@ -806,8 +819,6 @@ A:
.. [#digest_cache_lsm] https://lore.kernel.org/lkml/20240415142436.2545003-1-roberto.sassu@huaweicloud.com/
.. [#interpreters] There is `some interest in solving this issue <https://lore.kernel.org/lkml/20220321161557.495388-1-mic@digikod.net/>`_.
.. [#devdoc] Please see :doc:`the design docs </security/ipe>` for more on
this topic.


@ -406,24 +406,8 @@ index of the MC::
|->mc2
....
Under each ``mcX`` directory each ``csrowX`` is again represented by a
``csrowX``, where ``X`` is the csrow index::
.../mc/mc0/
|
|->csrow0
|->csrow2
|->csrow3
....
Notice that there is no csrow1, which indicates that csrow0 is composed
of a single ranked DIMMs. This should also apply in both Channels, in
order to have dual-channel mode be operational. Since both csrow2 and
csrow3 are populated, this indicates a dual ranked set of DIMMs for
channels 0 and 1.
Within each of the ``mcX`` and ``csrowX`` directories are several EDAC
control and attribute files.
Within each ``mcX`` directory are several EDAC control and
attribute files.
``mcX`` directories
-------------------
@ -569,7 +553,7 @@ this ``X`` memory module:
- Unbuffered-DDR
.. [#f5] On some systems, the memory controller doesn't have any logic
to identify the memory module. On such systems, the directory is called ``rankX`` and works on a similar way as the ``csrowX`` directories.
to identify the memory module. On such systems, the directory is called ``rankX``.
On modern Intel memory controllers, the memory controller identifies the
memory modules directly. On such systems, the directory is called ``dimmX``.
@ -577,126 +561,6 @@ this ``X`` memory module:
symlinks inside the sysfs mapping that are automatically created by
the sysfs subsystem. Currently, they serve no purpose.
``csrowX`` directories
----------------------
When CONFIG_EDAC_LEGACY_SYSFS is enabled, sysfs will contain the ``csrowX``
directories. As this API doesn't work properly for Rambus, FB-DIMMs and
modern Intel Memory Controllers, this is being deprecated in favor of
``dimmX`` directories.
In the ``csrowX`` directories are EDAC control and attribute files for
this ``X`` instance of csrow:
- ``ue_count`` - Total Uncorrectable Errors count attribute file
This attribute file displays the total count of uncorrectable
errors that have occurred on this csrow. If panic_on_ue is set
this counter will not have a chance to increment, since EDAC
will panic the system.
- ``ce_count`` - Total Correctable Errors count attribute file
This attribute file displays the total count of correctable
errors that have occurred on this csrow. This count is very
important to examine. CEs provide early indications that a
DIMM is beginning to fail. This count field should be
monitored for non-zero values and report such information
to the system administrator.
- ``size_mb`` - Total memory managed by this csrow attribute file
This attribute file displays, in count of megabytes, the memory
that this csrow contains.
- ``mem_type`` - Memory Type attribute file
This attribute file will display what type of memory is currently
on this csrow. Normally, either buffered or unbuffered memory.
Examples:
- Registered-DDR
- Unbuffered-DDR
- ``edac_mode`` - EDAC Mode of operation attribute file
This attribute file will display what type of Error detection
and correction is being utilized.
- ``dev_type`` - Device type attribute file
This attribute file will display what type of DRAM device is
being utilized on this DIMM.
Examples:
- x1
- x2
- x4
- x8
- ``ch0_ce_count`` - Channel 0 CE Count attribute file
This attribute file will display the count of CEs on this
DIMM located in channel 0.
- ``ch0_ue_count`` - Channel 0 UE Count attribute file
This attribute file will display the count of UEs on this
DIMM located in channel 0.
- ``ch0_dimm_label`` - Channel 0 DIMM Label control file
This control file allows this DIMM to have a label assigned
to it. With this label in the module, when errors occur
the output can provide the DIMM label in the system log.
This becomes vital for panic events to isolate the
cause of the UE event.
DIMM Labels must be assigned after booting, with information
that correctly identifies the physical slot with its
silk screen label. This information is currently very
motherboard specific and determination of this information
must occur in userland at this time.
- ``ch1_ce_count`` - Channel 1 CE Count attribute file
This attribute file will display the count of CEs on this
DIMM located in channel 1.
- ``ch1_ue_count`` - Channel 1 UE Count attribute file
This attribute file will display the count of UEs on this
DIMM located in channel 0.
- ``ch1_dimm_label`` - Channel 1 DIMM Label control file
This control file allows this DIMM to have a label assigned
to it. With this label in the module, when errors occur
the output can provide the DIMM label in the system log.
This becomes vital for panic events to isolate the
cause of the UE event.
DIMM Labels must be assigned after booting, with information
that correctly identifies the physical slot with its
silk screen label. This information is currently very
motherboard specific and determination of this information
must occur in userland at this time.
System Logging
--------------


@ -17,8 +17,7 @@ The latest bcache kernel code can be found from mainline Linux kernel:
It's designed around the performance characteristics of SSDs - it only allocates
in erase block sized buckets, and it uses a hybrid btree/log to track cached
extents (which can be anywhere from a single sector to the bucket size). It's
designed to avoid random writes at all costs; it fills up an erase block
sequentially, then issues a discard before reusing it.
designed to avoid random writes at all costs.
Both writethrough and writeback caching are supported. Writeback defaults to
off, but can be switched on and off arbitrarily at runtime. Bcache goes to
@ -618,19 +617,11 @@ bucket_size
cache_replacement_policy
One of either lru, fifo or random.
discard
Boolean; if on a discard/TRIM will be issued to each bucket before it is
reused. Defaults to off, since SATA TRIM is an unqueued command (and thus
slow).
freelist_percent
Size of the freelist as a percentage of nbuckets. Can be written to to
increase the number of buckets kept on the freelist, which lets you
artificially reduce the size of the cache at runtime. Mostly for testing
purposes (i.e. testing how different size caches affect your hit rate), but
since buckets are discarded when they move on to the freelist will also make
the SSD's garbage collection easier by effectively giving it more reserved
space.
purposes (i.e. testing how different size caches affect your hit rate).
io_errors
Number of errors that have occurred, decayed by io_error_halflife.


@ -68,30 +68,43 @@ The options available for the add command can be listed by reading the
In more details, the options that can be used with the "add" command are as
follows.
================ ===========================================================
id Device number (the X in /dev/zloopX).
Default: automatically assigned.
capacity_mb Device total capacity in MiB. This is always rounded up to
the nearest higher multiple of the zone size.
Default: 16384 MiB (16 GiB).
zone_size_mb Device zone size in MiB. Default: 256 MiB.
zone_capacity_mb Device zone capacity (must always be equal to or lower than
the zone size. Default: zone size.
conv_zones Total number of conventional zones starting from sector 0.
Default: 8.
base_dir Path to the base directory where to create the directory
containing the zone files of the device.
Default=/var/local/zloop.
The device directory containing the zone files is always
named with the device ID. E.g. the default zone file
directory for /dev/zloop0 is /var/local/zloop/0.
nr_queues Number of I/O queues of the zoned block device. This value is
always capped by the number of online CPUs
Default: 1
queue_depth Maximum I/O queue depth per I/O queue.
Default: 64
buffered_io Do buffered IOs instead of direct IOs (default: false)
================ ===========================================================
=================== =========================================================
id Device number (the X in /dev/zloopX).
Default: automatically assigned.
capacity_mb Device total capacity in MiB. This is always rounded up
to the nearest higher multiple of the zone size.
Default: 16384 MiB (16 GiB).
zone_size_mb Device zone size in MiB. Default: 256 MiB.
zone_capacity_mb Device zone capacity (must always be equal to or lower
than the zone size). Default: zone size.
conv_zones Total number of conventional zones starting from
sector 0
Default: 8
base_dir Path to the base directory where to create the directory
containing the zone files of the device.
Default=/var/local/zloop.
The device directory containing the zone files is always
named with the device ID. E.g. the default zone file
directory for /dev/zloop0 is /var/local/zloop/0.
nr_queues Number of I/O queues of the zoned block device. This
value is always capped by the number of online CPUs
Default: 1
queue_depth Maximum I/O queue depth per I/O queue.
Default: 64
buffered_io Do buffered IOs instead of direct IOs (default: false)
zone_append Enable or disable a zloop device native zone append
support.
Default: 1 (enabled).
If native zone append support is disabled, the block layer
will emulate this operation using regular write
operations.
ordered_zone_append Enable zloop mitigation of zone append reordering.
Default: disabled.
This is useful for testing a file system's file data mapping
(extents): when enabled, it can significantly reduce
the number of data extents needed for a file data
mapping.
=================== =========================================================
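The rounding rule for ``capacity_mb`` can be worked through in a few lines of Python. This is a sketch of the arithmetic described in the table, using the documented defaults (16384 MiB capacity, 256 MiB zones); ``zloop_geometry()`` is a hypothetical helper, not a zloop interface.

```python
def zloop_geometry(capacity_mb=16384, zone_size_mb=256):
    """Return (effective capacity in MiB, number of zones).

    The capacity is rounded up to the nearest multiple of the zone size,
    as the description of capacity_mb states.
    """
    if capacity_mb <= 0 or zone_size_mb <= 0:
        raise ValueError("capacity and zone size must be positive")
    nr_zones = -(-capacity_mb // zone_size_mb)  # ceiling division
    return nr_zones * zone_size_mb, nr_zones
```

With the defaults this gives 64 zones and an unchanged 16384 MiB capacity, while a requested capacity of 1000 MiB would be rounded up to 1024 MiB (4 zones).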
3) Deleting a Zoned Device
--------------------------


@ -53,7 +53,8 @@ v1 is available under :ref:`Documentation/admin-guide/cgroup-v1/index.rst <cgrou
5-2. Memory
5-2-1. Memory Interface Files
5-2-2. Usage Guidelines
5-2-3. Memory Ownership
5-2-3. Reclaim Protection
5-2-4. Memory Ownership
5-3. IO
5-3-1. IO Interface Files
5-3-2. Writeback
@ -1317,7 +1318,7 @@ PAGE_SIZE multiple when read back.
smaller overages.
Effective min boundary is limited by memory.min values of
all ancestor cgroups. If there is memory.min overcommitment
ancestor cgroups. If there is memory.min overcommitment
(child cgroup or cgroups are requiring more protected memory
than parent will allow), then each child cgroup will get
the part of parent's protection proportional to its
@ -1326,9 +1327,6 @@ PAGE_SIZE multiple when read back.
Putting more memory than generally available under this
protection is discouraged and may lead to constant OOMs.
If a memory cgroup is not populated with processes,
its memory.min is ignored.
memory.low
A read-write single value file which exists on non-root
cgroups. The default is "0".
@ -1343,7 +1341,7 @@ PAGE_SIZE multiple when read back.
smaller overages.
Effective low boundary is limited by memory.low values of
all ancestor cgroups. If there is memory.low overcommitment
ancestor cgroups. If there is memory.low overcommitment
(child cgroup or cgroups are requiring more protected memory
than parent will allow), then each child cgroup will get
the part of parent's protection proportional to its
@ -1515,6 +1513,10 @@ The following nested keys are defined.
oom_group_kill
The number of times a group OOM has occurred.
sock_throttled
The number of times network sockets associated with
this cgroup are throttled.
memory.events.local
Similar to memory.events but the fields in the file are local
to the cgroup i.e. not hierarchical. The file modified event
@ -1934,6 +1936,27 @@ memory - is necessary to determine whether a workload needs more
memory; unfortunately, memory pressure monitoring mechanism isn't
implemented yet.
Reclaim Protection
~~~~~~~~~~~~~~~~~~
The protection configured with "memory.low" or "memory.min" applies relatively
to the target of the reclaim (i.e. any of memory cgroup limits, proactive
memory.reclaim or global reclaim apparently located in the root cgroup).
The protection value configured for B applies unchanged to the reclaim
targeting A (i.e. caused by competition with the sibling E)::
root - ... - A - B - C
\ ` D
` E
When the reclaim targets ancestors of A, the effective protection of B is
capped by the protection value configured for A (and any other intermediate
ancestors between A and the target).
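The capping rule can be expressed as a one-line minimum. The Python sketch below is a deliberate simplification: it models only the cap described in the two paragraphs above and ignores proportional distribution among siblings, actual usage, and memory_recursiveprot. ``effective_protection()`` and the example values are hypothetical.

```python
def effective_protection(configured):
    """Effective protection of a cgroup for a given reclaim target.

    `configured` lists the memory.low (or memory.min) values starting at
    the cgroup itself and walking up to, but not including, the reclaim
    target. The cgroup's own value is capped by every intermediate
    ancestor's value.
    """
    return min(configured)


# Reclaim targeting A: only B's own value is on the path, so it applies
# unchanged.
b_when_target_is_a = effective_protection([100])

# Reclaim targeting an ancestor of A: B's value is capped by A's value.
b_when_target_is_above_a = effective_protection([100, 50])
```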
To express indifference about relative sibling protection, it is suggested to
use memory_recursiveprot. Configuring all descendants of a parent with finite
protection to "max" works but it may unnecessarily skew the memory.events:low
field.
Memory Ownership
~~~~~~~~~~~~~~~~


@ -20,10 +20,10 @@ The target is named "raid" and it accepts the following parameters::
raid0 RAID0 striping (no resilience)
raid1 RAID1 mirroring
raid4 RAID4 with dedicated last parity disk
raid5_n RAID5 with dedicated last parity disk supporting takeover
raid5_n RAID5 with dedicated last parity disk supporting takeover from/to raid1
Same as raid4
- Transitory layout
- Transitory layout for takeover from/to raid1
raid5_la RAID5 left asymmetric
- rotating parity 0 with data continuation
@ -48,8 +48,8 @@ The target is named "raid" and it accepts the following parameters::
raid6_n_6 RAID6 with dedicated parity disks
- parity and Q-syndrome on the last 2 disks;
layout for takeover from/to raid4/raid5_n
raid6_la_6 Same as "raid_la" plus dedicated last Q-syndrome disk
layout for takeover from/to raid0/raid4/raid5_n
raid6_la_6 Same as "raid5_la" plus dedicated last Q-syndrome disk supporting takeover from/to raid5
- layout for takeover from raid5_la from/to raid6
raid6_ra_6 Same as "raid5_ra" plus dedicated last Q-syndrome disk
@ -173,9 +173,9 @@ The target is named "raid" and it accepts the following parameters::
The delta_disks option value (-251 < N < +251) triggers
device removal (negative value) or device addition (positive
value) to any reshape supporting raid levels 4/5/6 and 10.
RAID levels 4/5/6 allow for addition of devices (metadata
and data device tuple), raid10_near and raid10_offset only
allow for device addition. raid10_far does not support any
RAID levels 4/5/6 allow for addition and removal of devices
(metadata and data device tuple), raid10_near and raid10_offset
only allow for device addition. raid10_far does not support any
reshaping at all.
A minimum of devices have to be kept to enforce resilience,
which is 3 devices for raid4/5 and 4 devices for raid6.
@ -372,6 +372,72 @@ to safely enable discard support for RAID 4/5/6:
'devices_handle_discards_safely'
Takeover/Reshape Support
------------------------
The target natively supports these two types of MDRAID conversions:
o Takeover: Converts an array from one RAID level to another
o Reshape: Changes the internal layout while maintaining the current RAID level
Each operation is only valid under specific constraints imposed by the existing array's layout and configuration.
Takeover:
linear -> raid1 with N >= 2 mirrors
raid0 -> raid4 (add dedicated parity device)
raid0 -> raid5 (add dedicated parity device)
raid0 -> raid10 with near layout and N >= 2 mirror groups (raid0 stripes have to become first member within mirror groups)
raid1 -> linear
raid1 -> raid5 with 2 mirrors
raid4 -> raid5 w/ rotating parity
raid5 with dedicated parity device -> raid4
raid5 -> raid6 (with dedicated Q-syndrome)
raid6 (with dedicated Q-syndrome) -> raid5
raid10 with near layout and even number of disks -> raid0 (select any in-sync device from each mirror group)
Reshape:
linear: not possible
raid0: not possible
raid1: change number of mirrors
raid4: add and remove stripes (minimum 3), change stripesize
raid5: add and remove stripes (minimum 3, special case 2 for raid1 takeover), change rotating parity algorithms, change stripesize
raid6: add and remove stripes (minimum 4), change rotating syndrome algorithms, change stripesize
raid10 near: add stripes (minimum 4), change stripesize, no stripe removal possible, change to offset layout
raid10 offset: add stripes, change stripesize, no stripe removal possible, change to near layout
raid10 far: not possible
Table line examples:
### raid1 -> raid5
#
# 2 devices limitation in raid1.
# raid5 personality is able to just map 2 like raid1.
# Reshape after takeover to change to full raid5 layout
0 1960886272 raid raid1 3 0 region_size 2048 2 /dev/dm-0 /dev/dm-1 /dev/dm-2 /dev/dm-3
# dm-0 and dm-2 are e.g. 4MiB large metadata devices, dm-1 and dm-3 have to be at least 1960886272 sectors in size.
#
# Table line to takeover to raid5
0 1960886272 raid raid5 3 0 region_size 2048 2 /dev/dm-0 /dev/dm-1 /dev/dm-2 /dev/dm-3
# Add required out-of-place reshape space to the beginning of the given 2 data devices,
# allocate another metadata/data device tuple with the same sizes for the parity space
# and zero the first 4K of the metadata device.
#
# Example table of the out-of-place reshape space addition for one data device, e.g. dm-1
0 8192 linear 8:0 0 1960903888 # <- must be free space segment
8192 1960886272 linear 8:0 0 2048 # previous data segment
# Mapping table for e.g. raid5_rs reshape causing the size of the raid device to double once the reshape finishes.
# Check the status output (e.g. "dmsetup status $RaidDev") for progress.
0 $((2 * 1960886272)) raid raid5 7 0 region_size 2048 data_offset 8192 delta_disks 1 2 /dev/dm-0 /dev/dm-1 /dev/dm-2 /dev/dm-3
Version History
---------------


@ -236,8 +236,10 @@ is available at the cryptsetup project's wiki page
Status
======
V (for Valid) is returned if every check performed so far was valid.
If any check failed, C (for Corruption) is returned.
1. V (for Valid) is returned if every check performed so far was valid.
If any check failed, C (for Corruption) is returned.
2. Number of corrected blocks by Forward Error Correction.
'-' if Forward Error Correction is not enabled.
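A consumer of the two status fields might decode them as sketched below. This Python example assumes the two fields arrive as whitespace-separated tokens (for instance the tail of a ``dmsetup status`` line), which is an assumption and not something stated above; ``parse_verity_status()`` is a hypothetical helper.

```python
def parse_verity_status(fields):
    """Decode the two dm-verity status fields described above.

    `fields` is assumed to look like "V 0", "C 12" or "V -".
    """
    state, fec = fields.split()
    if state not in ("V", "C"):
        raise ValueError(f"unexpected verity state: {state!r}")
    return {
        "valid": state == "V",  # V = every check so far was valid
        # '-' means Forward Error Correction is not enabled
        "corrected_blocks": None if fec == "-" else int(fec),
    }
```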
Example
=======


@ -223,12 +223,13 @@ The flags are::
f Include the function name
s Include the source file name
l Include line number
d Include call trace
For ``print_hex_dump_debug()`` and ``print_hex_dump_bytes()``, only
the ``p`` flag has meaning, other flags are ignored.
Note the regexp ``^[-+=][fslmpt_]+$`` matches a flags specification.
To clear all flags at once, use ``=_`` or ``-fslmpt``.
Note the regexp ``^[-+=][fslmptd_]+$`` matches a flags specification.
To clear all flags at once, use ``=_`` or ``-fslmptd``.
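The regexp above can be tried out directly. The Python sketch below reuses the pattern exactly as quoted, including the new ``d`` flag; ``is_flags_spec()`` is a hypothetical helper for illustration and not part of dynamic debug.

```python
import re

# Pattern copied from the text above: one operator, then one or more flags.
FLAGS_SPEC = re.compile(r"^[-+=][fslmptd_]+$")


def is_flags_spec(text):
    """True if `text` looks like a dynamic debug flags specification."""
    return FLAGS_SPEC.match(text) is not None
```

For example, ``+p``, ``=_`` and ``-fslmptd`` match, while a bare ``p`` (no operator) or ``+x`` (unknown flag) do not.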
Debug messages during Boot Process


@ -79,6 +79,9 @@ because the image we're executing is interpreted by the EFI shell,
which understands relative paths, whereas the rest of the command line
is passed to bzImage.efi.
.. hint::
It is also possible to provide an initrd using a Linux-specific UEFI
protocol at boot time. See :ref:`pe-coff-entry-point` for details.
The "dtb=" option
-----------------


@ -31,7 +31,7 @@ specifically opt into the feature to enable it.
Mitigation
----------
When PR_SET_L1D_FLUSH is enabled for a task a flush of the L1D cache is
When PR_SPEC_L1D_FLUSH is enabled for a task a flush of the L1D cache is
performed when the task is scheduled out and the incoming task belongs to a
different process and therefore to a different address space.


@ -406,7 +406,7 @@ The possible values in this file are:
- Single threaded indirect branch prediction (STIBP) status for protection
between different hyper threads. This feature can be controlled through
prctl per process, or through kernel command line options. This is x86
prctl per process, or through kernel command line options. This is an x86
only feature. For more details see below.
==================== ========================================================


@ -110,102 +110,7 @@ The parameters listed below are only valid if certain kernel build options
were enabled and if respective hardware is present. This list should be kept
in alphabetical order. The text in square brackets at the beginning
of each description states the restrictions within which a parameter
is applicable::
ACPI ACPI support is enabled.
AGP AGP (Accelerated Graphics Port) is enabled.
ALSA ALSA sound support is enabled.
APIC APIC support is enabled.
APM Advanced Power Management support is enabled.
APPARMOR AppArmor support is enabled.
ARM ARM architecture is enabled.
ARM64 ARM64 architecture is enabled.
AX25 Appropriate AX.25 support is enabled.
CLK Common clock infrastructure is enabled.
CMA Contiguous Memory Area support is enabled.
DRM Direct Rendering Management support is enabled.
DYNAMIC_DEBUG Build in debug messages and enable them at runtime
EARLY Parameter processed too early to be embedded in initrd.
EDD BIOS Enhanced Disk Drive Services (EDD) is enabled
EFI EFI Partitioning (GPT) is enabled
EVM Extended Verification Module
FB The frame buffer device is enabled.
FTRACE Function tracing enabled.
GCOV GCOV profiling is enabled.
HIBERNATION HIBERNATION is enabled.
HW Appropriate hardware is enabled.
HYPER_V HYPERV support is enabled.
IMA Integrity measurement architecture is enabled.
IP_PNP IP DHCP, BOOTP, or RARP is enabled.
IPV6 IPv6 support is enabled.
ISAPNP ISA PnP code is enabled.
ISDN Appropriate ISDN support is enabled.
ISOL CPU Isolation is enabled.
JOY Appropriate joystick support is enabled.
KGDB Kernel debugger support is enabled.
KVM Kernel Virtual Machine support is enabled.
LIBATA Libata driver is enabled
LOONGARCH LoongArch architecture is enabled.
LOOP Loopback device support is enabled.
LP Printer support is enabled.
M68k M68k architecture is enabled.
These options have more detailed description inside of
Documentation/arch/m68k/kernel-options.rst.
MDA MDA console support is enabled.
MIPS MIPS architecture is enabled.
MOUSE Appropriate mouse support is enabled.
MSI Message Signaled Interrupts (PCI).
MTD MTD (Memory Technology Device) support is enabled.
NET Appropriate network support is enabled.
NFS Appropriate NFS support is enabled.
NUMA NUMA support is enabled.
OF Devicetree is enabled.
PARISC The PA-RISC architecture is enabled.
PCI PCI bus support is enabled.
PCIE PCI Express support is enabled.
PCMCIA The PCMCIA subsystem is enabled.
PNP Plug & Play support is enabled.
PPC PowerPC architecture is enabled.
PPT Parallel port support is enabled.
PS2 Appropriate PS/2 support is enabled.
PV_OPS A paravirtualized kernel is enabled.
RAM RAM disk support is enabled.
RDT Intel Resource Director Technology.
RISCV RISCV architecture is enabled.
S390 S390 architecture is enabled.
SCSI Appropriate SCSI support is enabled.
A lot of drivers have their options described inside
the Documentation/scsi/ sub-directory.
SDW SoundWire support is enabled.
SECURITY Different security models are enabled.
SELINUX SELinux support is enabled.
SERIAL Serial support is enabled.
SH SuperH architecture is enabled.
SMP The kernel is an SMP kernel.
SPARC Sparc architecture is enabled.
SUSPEND System suspend states are enabled.
SWSUSP Software suspend (hibernation) is enabled.
TPM TPM drivers are enabled.
UMS USB Mass Storage support is enabled.
USB USB support is enabled.
USBHID USB Human Interface Device support is enabled.
V4L Video For Linux support is enabled.
VGA The VGA console has been enabled.
VMMIO Driver for memory mapped virtio devices is enabled.
VT Virtual terminal support is enabled.
WDT Watchdog support is enabled.
X86-32 X86-32, aka i386 architecture is enabled.
X86-64 X86-64 architecture is enabled.
X86 Either 32-bit or 64-bit x86 (same as X86-32+X86-64)
X86_UV SGI UV support is enabled.
XEN Xen support is enabled
XTENSA xtensa architecture is enabled.
In addition, the following text indicates that the option::
BOOT Is a boot loader parameter.
BUGS= Relates to possible processor bugs on the said processor.
KNL Is a kernel start-up parameter.
is applicable.
Parameters denoted with BOOT are actually interpreted by the boot
loader, and have no meaning to the kernel directly.

@ -1,3 +1,101 @@
ACPI ACPI support is enabled.
AGP AGP (Accelerated Graphics Port) is enabled.
ALSA ALSA sound support is enabled.
APIC APIC support is enabled.
APM Advanced Power Management support is enabled.
APPARMOR AppArmor support is enabled.
ARM ARM architecture is enabled.
ARM64 ARM64 architecture is enabled.
AX25 Appropriate AX.25 support is enabled.
CLK Common clock infrastructure is enabled.
CMA Contiguous Memory Area support is enabled.
DRM Direct Rendering Management support is enabled.
DYNAMIC_DEBUG Build in debug messages and enable them at runtime
EARLY Parameter processed too early to be embedded in initrd.
EDD BIOS Enhanced Disk Drive Services (EDD) is enabled
EFI EFI Partitioning (GPT) is enabled
EVM Extended Verification Module
FB The frame buffer device is enabled.
FTRACE Function tracing enabled.
GCOV GCOV profiling is enabled.
HIBERNATION HIBERNATION is enabled.
HW Appropriate hardware is enabled.
HYPER_V HYPERV support is enabled.
IMA Integrity measurement architecture is enabled.
IP_PNP IP DHCP, BOOTP, or RARP is enabled.
IPV6 IPv6 support is enabled.
ISAPNP ISA PnP code is enabled.
ISDN Appropriate ISDN support is enabled.
ISOL CPU Isolation is enabled.
JOY Appropriate joystick support is enabled.
KGDB Kernel debugger support is enabled.
KVM Kernel Virtual Machine support is enabled.
LIBATA Libata driver is enabled
LOONGARCH LoongArch architecture is enabled.
LOOP Loopback device support is enabled.
LP Printer support is enabled.
M68k M68k architecture is enabled.
These options have more detailed description inside of
Documentation/arch/m68k/kernel-options.rst.
MDA MDA console support is enabled.
MIPS MIPS architecture is enabled.
MOUSE Appropriate mouse support is enabled.
MSI Message Signaled Interrupts (PCI).
MTD MTD (Memory Technology Device) support is enabled.
NET Appropriate network support is enabled.
NFS Appropriate NFS support is enabled.
NUMA NUMA support is enabled.
OF Devicetree is enabled.
PARISC The PA-RISC architecture is enabled.
PCI PCI bus support is enabled.
PCIE PCI Express support is enabled.
PCMCIA The PCMCIA subsystem is enabled.
PNP Plug & Play support is enabled.
PPC PowerPC architecture is enabled.
PPT Parallel port support is enabled.
PS2 Appropriate PS/2 support is enabled.
PV_OPS A paravirtualized kernel is enabled.
RAM RAM disk support is enabled.
RDT Intel Resource Director Technology.
RISCV RISCV architecture is enabled.
S390 S390 architecture is enabled.
SCSI Appropriate SCSI support is enabled.
A lot of drivers have their options described inside
the Documentation/scsi/ sub-directory.
SDW SoundWire support is enabled.
SECURITY Different security models are enabled.
SELINUX SELinux support is enabled.
SERIAL Serial support is enabled.
SH SuperH architecture is enabled.
SMP The kernel is an SMP kernel.
SPARC Sparc architecture is enabled.
SUSPEND System suspend states are enabled.
SWSUSP Software suspend (hibernation) is enabled.
TPM TPM drivers are enabled.
UMS USB Mass Storage support is enabled.
USB USB support is enabled.
USBHID USB Human Interface Device support is enabled.
V4L Video For Linux support is enabled.
VGA The VGA console has been enabled.
VMMIO Driver for memory mapped virtio devices is enabled.
VT Virtual terminal support is enabled.
WDT Watchdog support is enabled.
X86-32 X86-32, aka i386 architecture is enabled.
X86-64 X86-64 architecture is enabled.
X86 Either 32-bit or 64-bit x86 (same as X86-32+X86-64)
X86_UV SGI UV support is enabled.
XEN Xen support is enabled
XTENSA xtensa architecture is enabled.
In addition, the following text indicates that the option
BOOT Is a boot loader parameter.
BUGS= Relates to possible processor bugs on the said processor.
KNL Is a kernel start-up parameter.
Kernel parameters
accept_memory= [MM]
Format: { eager | lazy }
default: lazy
@ -669,6 +767,14 @@
nokmem -- Disable kernel memory accounting.
nobpf -- Disable BPF memory accounting.
check_pages= [MM,EARLY] Enable sanity checking of pages after
allocations / before freeing. This adds checks to catch
double-frees, use-after-frees, and other sources of
page corruption by inspecting page internals (flags,
mapcount/refcount, memcg_data, etc.).
Format: { "0" | "1" }
Default: 0 (1 if CONFIG_DEBUG_VM is set)
checkreqprot= [SELINUX] Set initial checkreqprot flag value.
Format: { "0" | "1" }
See security/selinux/Kconfig help text.
@ -1013,7 +1119,7 @@
It will be ignored when crashkernel=X,high is not used
or memory reserved is below 4G.
crashkernel=size[KMG],cma
[KNL, X86] Reserve additional crash kernel memory from
[KNL, X86, ppc] Reserve additional crash kernel memory from
CMA. This reservation is usable by the first system's
userspace memory and kernel movable allocations (memory
balloon, zswap). Pages allocated from this memory range
@ -1113,12 +1219,8 @@
debugfs= [KNL,EARLY] This parameter enables what is exposed to
userspace and debugfs internal clients.
Format: { on, no-mount, off }
Format: { on, off }
on: All functions are enabled.
no-mount:
Filesystem is not registered but kernel clients can
access APIs and a crashkernel can be used to read
its content. There is nothing to mount.
off: Filesystem is not registered and clients
get a -EPERM as result when trying to register files
or directories within debugfs.
@ -1907,6 +2009,16 @@
/sys/power/pm_test). Only available when CONFIG_PM_DEBUG
is set. Default value is 5.
hibernate_compression_threads=
[HIBERNATION]
Set the number of threads used for compressing or decompressing
hibernation images.
Format: <integer>
Default: 3
Minimum: 1
Example: hibernate_compression_threads=4
highmem=nn[KMG] [KNL,BOOT,EARLY] forces the highmem zone to have an exact
size of <nn>. This works even on boxes that have no
highmem otherwise. This also works to reduce highmem
@ -2010,14 +2122,20 @@
the added memory block itself do not be affected.
hung_task_panic=
[KNL] Should the hung task detector generate panics.
Format: 0 | 1
[KNL] Number of hung tasks to trigger kernel panic.
Format: <int>
A value of 1 instructs the kernel to panic when a
hung task is detected. The default value is controlled
by the CONFIG_BOOTPARAM_HUNG_TASK_PANIC build-time
option. The value selected by this boot parameter can
be changed later by the kernel.hung_task_panic sysctl.
When set to a non-zero value, a kernel panic will be triggered if
the number of detected hung tasks reaches this value.
0: don't panic
1: panic immediately on first hung task
N: panic after N hung tasks are detected in a single scan
The default value is controlled by the
CONFIG_BOOTPARAM_HUNG_TASK_PANIC build-time option. The value
selected by this boot parameter can be changed later by the
kernel.hung_task_panic sysctl.
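The threshold semantics listed above can be modelled with a minimal sketch. This is an illustration only, not the kernel's implementation; the helper name ``should_panic`` and its arguments are hypothetical.

```python
def should_panic(hung_task_panic: int, hung_in_scan: int) -> bool:
    """Illustrative model of the hung_task_panic threshold (hypothetical helper).

    hung_task_panic: value given on the command line or via the sysctl.
    hung_in_scan:    number of hung tasks detected in a single scan.
    """
    if hung_task_panic == 0:
        # 0: don't panic, however many tasks are hung
        return False
    # 1 panics on the first hung task; N panics once N are detected
    return hung_in_scan >= hung_task_panic


print(should_panic(0, 5))  # False
print(should_panic(1, 1))  # True
print(should_panic(3, 2))  # False
```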
hvc_iucv= [S390] Number of z/VM IUCV hypervisor console (HVC)
terminal devices. Valid values: 0..8
@ -6207,7 +6325,7 @@
rdt= [HW,X86,RDT]
Turn on/off individual RDT features. List is:
cmt, mbmtotal, mbmlocal, l3cat, l3cdp, l2cat, l2cdp,
mba, smba, bmec, abmc.
mba, smba, bmec, abmc, sdciae.
E.g. to turn on cmt and turn off mba use:
rdt=cmt,!mba
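As a rough illustration of how such a value reads, the sketch below splits an ``rdt=`` string into per-feature on/off flags, treating a leading ``!`` as "turn off". It is not the kernel's parser; the function name ``parse_rdt`` is hypothetical and no validation against the feature list is attempted.

```python
def parse_rdt(value: str) -> dict:
    """Map each listed feature to True (on) or False (off, '!' prefix)."""
    result = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # ignore empty entries such as a trailing comma
        if item.startswith("!"):
            result[item[1:]] = False
        else:
            result[item] = True
    return result


print(parse_rdt("cmt,!mba"))  # {'cmt': True, 'mba': False}
```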
@ -6404,7 +6522,7 @@
that don't.
off - no mitigation
auto - automatically select a migitation
auto - automatically select a mitigation
auto,nosmt - automatically select a mitigation,
disabling SMT if necessary for
the full mitigation (only on Zen1
@ -6500,6 +6618,10 @@
Memory area to be used by remote processor image,
managed by CMA.
rseq_debug= [KNL] Enable or disable restartable sequence
debug mode. Defaults to CONFIG_RSEQ_DEBUG_DEFAULT_ENABLE.
Format: <bool>
rt_group_sched= [KNL] Enable or disable SCHED_RR/FIFO group scheduling
when CONFIG_RT_GROUP_SCHED=y. Defaults to
!CONFIG_RT_GROUP_SCHED_DEFAULT_DISABLED.
@ -7150,7 +7272,7 @@
limit. Default value is 8191 pools.
stacktrace [FTRACE]
Enabled the stack tracer on boot up.
Enable the stack tracer on boot up.
stacktrace_filter=[function-list]
[FTRACE] Limit the functions that the stack tracer
@ -7192,6 +7314,9 @@
them frequently to increase the rate of SLB faults
on kernel addresses.
no_slb_preload [PPC,EARLY]
Disables slb preloading for userspace.
sunrpc.min_resvport=
sunrpc.max_resvport=
[NFS,SUNRPC]

@ -17,3 +17,4 @@ Laptop Drivers
sonypi
thinkpad-acpi
toshiba_haps
uniwill-laptop

@ -0,0 +1,60 @@
.. SPDX-License-Identifier: GPL-2.0+
Uniwill laptop extra features
=============================
On laptops manufactured by Uniwill (either directly or as ODM), the ``uniwill-laptop`` driver
handles various platform-specific features.
Module Loading
--------------
The ``uniwill-laptop`` driver relies on a DMI table to automatically load on supported devices.
When using the ``force`` module parameter, this DMI check will be omitted, allowing the driver
to be loaded on unsupported devices for testing purposes.
Hotkeys
-------
Usually the FN keys work without a special driver. However, as soon as the ``uniwill-laptop`` driver
is loaded, the FN keys need to be handled manually. This is done automatically by the driver itself.
Keyboard settings
-----------------
The ``uniwill-laptop`` driver allows the user to enable/disable:
- the FN and super key lock functionality of the integrated keyboard
- the touchpad toggle functionality of the integrated touchpad
See Documentation/ABI/testing/sysfs-driver-uniwill-laptop for details.
Hwmon interface
---------------
The ``uniwill-laptop`` driver supports reading of the CPU and GPU temperature and supports up to
two fans. Userspace applications can access sensor readings over the hwmon sysfs interface.
Platform profile
----------------
Support for changing the platform performance mode is currently not implemented.
Battery Charging Control
------------------------
The ``uniwill-laptop`` driver supports controlling the battery charge limit. This happens over
the standard ``charge_control_end_threshold`` power supply sysfs attribute. All values
between 1 and 100 percent are supported.
Additionally the driver signals the presence of battery charging issues through the standard
``health`` power supply sysfs attribute.
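A minimal sketch of setting the charge limit from userspace follows. The battery name ``BAT0`` is an assumption (the actual power supply name depends on the system), the helper name ``set_charge_limit`` is hypothetical, and writing the attribute normally requires sufficient privileges.

```python
from pathlib import Path


def set_charge_limit(percent: int, battery: str = "BAT0",
                     root: str = "/sys/class/power_supply") -> Path:
    """Write a charge limit to charge_control_end_threshold.

    'battery' is an assumed power supply name; adjust it for the system.
    """
    # The driver documentation states that 1..100 percent is supported.
    if not 1 <= percent <= 100:
        raise ValueError("charge limit must be between 1 and 100 percent")
    attr = Path(root) / battery / "charge_control_end_threshold"
    attr.write_text(f"{percent}\n")
    return attr


# Example (not executed here): set_charge_limit(80)
```

The ``root`` argument exists only so the helper can be exercised against a scratch directory instead of the real sysfs tree.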
Lightbar
--------
The ``uniwill-laptop`` driver exposes the lightbar found on some models as a standard multicolor
LED class device. The default name of this LED class device is ``uniwill:multicolor:status``.
See Documentation/ABI/testing/sysfs-driver-uniwill-laptop for details on how to control the various
animation modes of the lightbar.

@ -238,6 +238,16 @@ All md devices contain:
the number of devices in a raid4/5/6, or to support external
metadata formats which mandate such clipping.
logical_block_size
Configure the array's logical block size (LBS) in bytes. This attribute
is only supported for 1.x metadata. Write the value before starting
the array. The final array LBS is the maximum of this
configuration and the LBS of all combined devices. Note that the
LBS cannot exceed PAGE_SIZE until RAID supports folios.
WARNING: Arrays created on a new kernel cannot be assembled on an old
kernel due to a padding check. Set the module parameter 'check_new_feature'
to false to bypass this check, but data loss may occur.
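The rule for the final logical block size can be sketched as follows. This only models the description above; the function name ``effective_lbs``, the default ``page_size`` of 4096 and the choice to raise an error when the limit is exceeded are assumptions, not the md driver's behaviour.

```python
def effective_lbs(configured: int, member_lbs: list, page_size: int = 4096) -> int:
    """Return the maximum of the configured LBS and all member device LBS values."""
    lbs = max([configured, *member_lbs])
    if lbs > page_size:
        # Assumed handling: the text only states that LBS cannot exceed PAGE_SIZE.
        raise ValueError("logical block size exceeds PAGE_SIZE")
    return lbs


print(effective_lbs(512, [512, 4096]))   # 4096
print(effective_lbs(4096, [512, 512]))   # 4096
```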
reshape_position
This is either ``none`` or a sector number within the devices of
the array where ``reshape`` is up to. If this is set, the three

@ -0,0 +1,19 @@
digraph board {
rankdir=TB
n00000001 [label="{{} | mali-c55 tpg\n/dev/v4l-subdev0 | {<port0> 0}}", shape=Mrecord, style=filled, fillcolor=green]
n00000001:port0 -> n00000003:port0 [style=dashed]
n00000003 [label="{{<port0> 0} | mali-c55 isp\n/dev/v4l-subdev1 | {<port1> 1 | <port2> 2}}", shape=Mrecord, style=filled, fillcolor=green]
n00000003:port1 -> n00000007:port0 [style=bold]
n00000003:port2 -> n00000007:port2 [style=bold]
n00000003:port1 -> n0000000b:port0 [style=bold]
n00000007 [label="{{<port0> 0 | <port2> 2} | mali-c55 resizer fr\n/dev/v4l-subdev2 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
n00000007:port1 -> n0000000e [style=bold]
n0000000b [label="{{<port0> 0} | mali-c55 resizer ds\n/dev/v4l-subdev3 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
n0000000b:port1 -> n00000012 [style=bold]
n0000000e [label="mali-c55 fr\n/dev/video0", shape=box, style=filled, fillcolor=yellow]
n00000012 [label="mali-c55 ds\n/dev/video1", shape=box, style=filled, fillcolor=yellow]
n00000022 [label="{{<port0> 0} | csi2-rx\n/dev/v4l-subdev4 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
n00000022:port1 -> n00000003:port0
n00000027 [label="{{} | imx415 1-001a\n/dev/v4l-subdev5 | {<port0> 0}}", shape=Mrecord, style=filled, fillcolor=green]
n00000027:port0 -> n00000022:port0 [style=bold]
}

@ -0,0 +1,413 @@
.. SPDX-License-Identifier: GPL-2.0
==========================================
ARM Mali-C55 Image Signal Processor driver
==========================================
Introduction
============
This file documents the driver for ARM's Mali-C55 Image Signal Processor. The
driver is located under drivers/media/platform/arm/mali-c55.
The Mali-C55 ISP receives data in either raw Bayer format or RGB/YUV format from
sensors through either a parallel interface or a memory bus before processing it
and outputting it through an internal DMA engine. Two output pipelines are
possible (though one may not be fitted, depending on the implementation). These
are referred to as "Full resolution" and "Downscale", but the naming is historic
and both pipes are capable of cropping/scaling operations. The full resolution
pipe is also capable of outputting RAW data, bypassing much of the ISP's
processing. The downscale pipe cannot output RAW data. An integrated test
pattern generator can be used to drive the ISP and produce image data in the
absence of a connected camera sensor. The driver module is named mali_c55, and
is enabled through the CONFIG_VIDEO_MALI_C55 config option.
The driver implements V4L2, Media Controller and V4L2 Subdevice interfaces and
expects camera sensors connected to the ISP to have V4L2 subdevice interfaces.
Mali-C55 ISP hardware
=====================
A high level functional view of the Mali-C55 ISP is presented below. The ISP
takes input from either a live source or through a DMA engine for memory input,
depending on the SoC integration.::
+---------+ +----------+ +--------+
| Sensor |--->| CSI-2 Rx | "Full Resolution" | DMA |
+---------+ +----------+ |\ Output +--->| Writer |
| | \ | +--------+
| | \ +----------+ +------+---> Streaming I/O
+------------+ +------->| | | | |
| | | |-->| Mali-C55 |--+
| DMA Reader |--------------->| | | ISP | |
| | | / | | | +---> Streaming I/O
+------------+ | / +----------+ | |
|/ +------+
| +--------+
+--->| DMA |
"Downscaled" | Writer |
Output +--------+
Media Controller Topology
=========================
An example of the ISP's topology (as implemented in a system with an IMX415
camera sensor and generic CSI-2 receiver) is below:
.. kernel-figure:: mali-c55-graph.dot
:alt: mali-c55-graph.dot
:align: center
The driver has 4 V4L2 subdevices:
- `mali-c55 isp`: Responsible for configuring input crop and color space
conversion
- `mali-c55 tpg`: The test pattern generator, emulating a camera sensor.
- `mali-c55 resizer fr`: The Full-Resolution pipe resizer
- `mali-c55 resizer ds`: The Downscale pipe resizer
The driver has 3 V4L2 video devices:
- `mali-c55 fr`: The full-resolution pipe's capture device
- `mali-c55 ds`: The downscale pipe's capture device
- `mali-c55 3a stats`: The 3A statistics capture device
Frame sequences are synchronised across the two capture devices, meaning if one
pipe is started later than the other the sequence numbers returned in its
buffers will match those of the other pipe rather than starting from zero.
Idiosyncrasies
--------------
**mali-c55 isp**
The `mali-c55 isp` subdevice has a single sink pad to which all sources of data
should be connected. The active source is selected by enabling the appropriate
media link and disabling all others. The ISP has two source pads, reflecting the
different paths through which it can internally route data. Tap points within
the ISP allow users to divert data to avoid processing by some or all of the
hardware's processing steps. The diagram below is intended only to highlight how
the bypassing works and is not a true reflection of those processing steps; for
a high-level functional block diagram see ARM's developer page for the
ISP [3]_::
+--------------------------------------------------------------+
| Possible Internal ISP Data Routes |
| +------------+ +----------+ +------------+ |
+---+ | | | | | Colour | +---+
| 0 |--+-->| Processing |->| Demosaic |->| Space |--->| 1 |
+---+ | | | | | | Conversion | +---+
| | +------------+ +----------+ +------------+ |
| | +---+
| +---------------------------------------------------| 2 |
| +---+
| |
+--------------------------------------------------------------+
.. flat-table::
:header-rows: 1
* - Pad
- Direction
- Purpose
* - 0
- sink
- Data input, connected to the TPG and camera sensors
* - 1
- source
- RGB/YUV data, connected to the FR and DS V4L2 subdevices
* - 2
- source
- RAW bayer data, connected to the FR V4L2 subdevice
The ISP is limited to both input and output resolutions between 640x480 and
8192x8192, and this is reflected in the ISP and resizer subdevice's .set_fmt()
operations.
**mali-c55 resizer fr**
The `mali-c55 resizer fr` subdevice has two _sink_ pads to reflect the different
insertion points in the hardware (either RAW or demosaiced data):
.. flat-table::
:header-rows: 1
* - Pad
- Direction
- Purpose
* - 0
- sink
- Data input connected to the ISP's demosaiced stream.
* - 1
- source
- Data output connected to the capture video device
* - 2
- sink
- Data input connected to the ISP's raw data stream
The data source in use is selected through the routing API; two routes each of a
single stream are available:
.. flat-table::
:header-rows: 1
* - Sink Pad
- Source Pad
- Purpose
* - 0
- 1
- Demosaiced data route
* - 2
- 1
- Raw data route
If the demosaiced route is active then the FR pipe is only capable of output
in RGB/YUV formats. If the raw route is active then the output reflects the
input (which may be either Bayer or RGB/YUV data).
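The two routes can be expressed as a routing argument for media-ctl's ``-R`` option, using the same syntax as the examples later in this document. The sketch below only builds the argument string; the helper name ``fr_routing_arg`` is hypothetical.

```python
def fr_routing_arg(raw: bool) -> str:
    """Build the media-ctl -R argument for the FR resizer.

    Exactly one of the two routes (sink 0 or sink 2 to source 1) is
    flagged active: 1 means enabled, 0 means disabled.
    """
    demosaiced_flag = 0 if raw else 1
    raw_flag = 1 if raw else 0
    return ("'mali-c55 resizer fr'"
            f"[0/0->1/0[{demosaiced_flag}],2/0->1/0[{raw_flag}]]")


print(fr_routing_arg(raw=False))
# 'mali-c55 resizer fr'[0/0->1/0[1],2/0->1/0[0]]
print(fr_routing_arg(raw=True))
# 'mali-c55 resizer fr'[0/0->1/0[0],2/0->1/0[1]]
```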
Using the driver to capture video
=================================
Using the media controller APIs we can configure the input source and ISP to
capture images in a variety of formats. In the examples below, configuring the
media graph is done with the v4l-utils [1]_ package's media-ctl utility.
Capturing the images is done with yavta [2]_.
Configuring the input source
----------------------------
The first step is to select the input source we wish to use by enabling the
correct media link. Using the example topology above, we can select the TPG as follows:
.. code-block:: none
media-ctl -l "'lte-csi2-rx':1->'mali-c55 isp':0[0]"
media-ctl -l "'mali-c55 tpg':0->'mali-c55 isp':0[1]"
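Since exactly one link into the ISP's sink pad should be enabled, the link arguments can be generated rather than typed by hand. The sketch below is one way to do that under stated assumptions: the helper name ``source_link_args`` is hypothetical, the entity and pad strings are taken verbatim from the commands above, and emitting the disabled links first simply mirrors the order used there.

```python
def source_link_args(sources: list, active: str) -> list:
    """Return media-ctl -l arguments that enable one source and disable the rest."""
    if active not in sources:
        raise ValueError(f"unknown source: {active}")
    sink = "'mali-c55 isp':0"
    # [0] disables a link, [1] enables it
    disable = [f"{src}->{sink}[0]" for src in sources if src != active]
    enable = [f"{active}->{sink}[1]"]
    return disable + enable


for arg in source_link_args(["'lte-csi2-rx':1", "'mali-c55 tpg':0"],
                            active="'mali-c55 tpg':0"):
    print(f'media-ctl -l "{arg}"')
```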
Configuring which video devices will stream data
------------------------------------------------
The driver will wait for all video devices to have their VIDIOC_STREAMON ioctl
called before it tells the sensor to start streaming. To facilitate this we need
to enable links to the video devices that we want to use. In the example below
we enable the links to both of the image capture video devices:
.. code-block:: none
media-ctl -l "'mali-c55 resizer fr':1->'mali-c55 fr':0[1]"
media-ctl -l "'mali-c55 resizer ds':1->'mali-c55 ds':0[1]"
Capturing bayer data from the source and processing to RGB/YUV
--------------------------------------------------------------
To capture 1920x1080 bayer data from the source and push it through the ISP's
full processing pipeline, we configure the data formats appropriately on the
source, ISP and resizer subdevices and set the FR resizer's routing to select
processed data. The media bus format on the resizer's source pad will be either
RGB121212_1X36 or YUV10_1X30, depending on whether you want to capture RGB or
YUV. The ISP's debayering block outputs RGB data natively; setting the source
pad format to YUV10_1X30 enables the colour space conversion block.
In this example we target RGB565 output, so select RGB121212_1X36 as the resizer
source pad's format:
.. code-block:: none
# Set formats on the TPG and ISP
media-ctl -V "'mali-c55 tpg':0[fmt:SRGGB20_1X20/1920x1080]"
media-ctl -V "'mali-c55 isp':0[fmt:SRGGB20_1X20/1920x1080]"
media-ctl -V "'mali-c55 isp':1[fmt:SRGGB20_1X20/1920x1080]"
# Set routing on the FR resizer
media-ctl -R "'mali-c55 resizer fr'[0/0->1/0[1],2/0->1/0[0]]"
# Set format on the resizer, must be done AFTER the routing.
media-ctl -V "'mali-c55 resizer fr':1[fmt:RGB121212_1X36/1920x1080]"
The downscale output can also be used to stream data at the same time. In this
case since only processed data can be captured through the downscale output no
routing need be set:
.. code-block:: none
# Set format on the resizer
media-ctl -V "'mali-c55 resizer ds':1[fmt:RGB121212_1X36/1920x1080]"
Images can then be captured from the video devices of both the FR and DS
outputs (simultaneously, if desired):
.. code-block:: none
yavta -f RGB565 -s 1920x1080 -c10 /dev/video0
yavta -f RGB565 -s 1920x1080 -c10 /dev/video1
Cropping the image
~~~~~~~~~~~~~~~~~~
Both the full resolution and downscale pipes can crop to a minimum resolution of
640x480. To crop the image simply configure the resizer's sink pad's crop and
compose rectangles and set the format on the video device:
.. code-block:: none
media-ctl -V "'mali-c55 resizer fr':0[fmt:RGB121212_1X36/1920x1080 crop:(480,270)/640x480 compose:(0,0)/640x480]"
media-ctl -V "'mali-c55 resizer fr':1[fmt:RGB121212_1X36/640x480]"
yavta -f RGB565 -s 640x480 -c10 /dev/video0
Downscaling the image
~~~~~~~~~~~~~~~~~~~~~
Both the full resolution and downscale pipes can downscale the image by up to 8x
provided the minimum 640x480 output resolution is adhered to. For the best image
result the scaling ratio for each direction should be the same. To configure
scaling we use the compose rectangle on the resizer's sink pad:
.. code-block:: none
media-ctl -V "'mali-c55 resizer fr':0[fmt:RGB121212_1X36/1920x1080 crop:(0,0)/1920x1080 compose:(0,0)/640x480]"
media-ctl -V "'mali-c55 resizer fr':1[fmt:RGB121212_1X36/640x480]"
yavta -f RGB565 -s 640x480 -c10 /dev/video0
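Before applying a compose rectangle it can be useful to check it against the limits stated above (at most 8x downscaling, at least 640x480 output). The following sketch does only that; the helper name ``check_downscale`` is hypothetical, it does not query the driver, and it makes no statement about upscaling.

```python
MIN_WIDTH, MIN_HEIGHT = 640, 480   # minimum output resolution from the text
MAX_RATIO = 8                      # maximum downscale factor from the text


def check_downscale(src: tuple, dst: tuple) -> tuple:
    """Validate a (width, height) downscale and return the two scaling ratios."""
    src_w, src_h = src
    dst_w, dst_h = dst
    if dst_w < MIN_WIDTH or dst_h < MIN_HEIGHT:
        raise ValueError("output is below the 640x480 minimum")
    if src_w > dst_w * MAX_RATIO or src_h > dst_h * MAX_RATIO:
        raise ValueError("downscale ratio exceeds 8x")
    # Equal ratios in both directions give the best image result.
    return (src_w / dst_w, src_h / dst_h)


print(check_downscale((1920, 1080), (640, 480)))  # (3.0, 2.25)
```

Note that the 1920x1080 to 640x480 example yields different horizontal and vertical ratios; a 640x360 target would keep them equal but falls below the minimum height.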
Capturing images in YUV formats
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If we need to output YUV data rather than RGB the color space conversion block
needs to be active, which is achieved by setting MEDIA_BUS_FMT_YUV10_1X30 on the
resizer's source pad. We can then configure a capture format like NV12 (here in
its multi-planar variant):
.. code-block:: none
media-ctl -V "'mali-c55 resizer fr':1[fmt:YUV10_1X30/1920x1080]"
yavta -f NV12M -s 1920x1080 -c10 /dev/video0
Capturing RGB data from the source and processing it with the resizers
----------------------------------------------------------------------
The Mali-C55 ISP can work with sensors capable of outputting RGB data. In this
case although none of the image quality blocks would be used it can still
crop/scale the data in the usual way. For this reason RGB data input to the ISP
still goes through the ISP subdevice's pad 1 to the resizer.
To achieve this, the ISP's sink pad's format is set to
MEDIA_BUS_FMT_RGB202020_1X60 - this reflects the format that data must be in to
work with the ISP. Converting the camera sensor's output to that format is the
responsibility of external hardware.
In this example we ask the test pattern generator to give us RGB data instead of
bayer.
.. code-block:: none
media-ctl -V "'mali-c55 tpg':0[fmt:RGB202020_1X60/1920x1080]"
media-ctl -V "'mali-c55 isp':0[fmt:RGB202020_1X60/1920x1080]"
Cropping or scaling the data can be done in exactly the same way as outlined
earlier.
Capturing raw data from the source and outputting it unmodified
-----------------------------------------------------------------
The ISP can additionally capture raw data from the source and output it on the
full resolution pipe only, completely unmodified. In this case the downscale
pipe can still process the data normally and be used at the same time.
To configure raw bypass the FR resizer's subdevice's routing table needs to be
configured, followed by formats in the appropriate places:
.. code-block:: none
media-ctl -R "'mali-c55 resizer fr'[0/0->1/0[0],2/0->1/0[1]]"
media-ctl -V "'mali-c55 isp':0[fmt:RGB202020_1X60/1920x1080]"
media-ctl -V "'mali-c55 resizer fr':2[fmt:RGB202020_1X60/1920x1080]"
media-ctl -V "'mali-c55 resizer fr':1[fmt:RGB202020_1X60/1920x1080]"
# Set format on the video device and stream
yavta -f RGB565 -s 1920x1080 -c10 /dev/video0
.. _mali-c55-3a-stats:
Capturing ISP Statistics
========================
The ISP is capable of producing statistics for consumption by image processing
algorithms running in userspace. These statistics can be captured by queueing
buffers to the `mali-c55 3a stats` V4L2 Device whilst the ISP is streaming. Only
the :ref:`V4L2_META_FMT_MALI_C55_STATS <v4l2-meta-fmt-mali-c55-stats>`
format is supported, so no format-setting need be done:
.. code-block:: none
# We assume the media graph has been configured to support RGB565 capture
# from the mali-c55 fr V4L2 Device, which is at /dev/video0. The statistics
# V4L2 device is at /dev/video3
yavta -f RGB565 -s 1920x1080 -c32 /dev/video0 && \
yavta -c10 -F /dev/video3
The layout of the buffer is described by :c:type:`mali_c55_stats_buffer`,
but broadly statistics are generated to support three image processing
algorithms: AEXP (Auto-Exposure), AWB (Auto-White Balance) and AF (Auto-Focus).
These stats can be drawn from various places in the Mali C55 ISP pipeline, known
as "tap points". This high-level block diagram is intended to explain where in
the processing flow the statistics can be drawn from::
+--> AEXP-2 +----> AEXP-1 +--> AF-0
| +----> AF-1 |
| | |
+---------+ | +--------------+ | +--------------+ |
| Input +-+-->+ Digital Gain +---+-->+ Black Level +---+---+
+---------+ +--------------+ +--------------+ |
+-----------------------------------------------------------------+
|
| +--------------+ +---------+ +----------------+
+-->| Sinter Noise +-+ White +--+--->| Lens Shading +--+---------------+
| Reduction | | Balance | | | | | |
+--------------+ +---------+ | +----------------+ | |
+---> AEXP-0 (A) +--> AEXP-0 (B) |
+--------------------------------------------------------------------------+
|
| +----------------+ +--------------+ +----------------+
+-->| Tone mapping +-+--->| Demosaicing +->+ Purple Fringe +-+-----------+
| | | +--------------+ | Correction | | |
+----------------+ +-> AEXP-IRIDIX +----------------+ +---> AWB-0 |
+----------------------------------------------------------------------------+
| +-------------+ +-------------+
+------------------->| Colour +---+--->| Output |
| Correction | | | Pipelines |
+-------------+ | +-------------+
+--> AWB-1
By default all statistics are drawn from the 0th tap point for each algorithm;
i.e. AEXP statistics from AEXP-0 (A), AWB statistics from AWB-0 and AF
statistics from AF-0. This is configurable for AEXP and AWB statistics through
programming the ISP's parameters.
.. _mali-c55-3a-params:
Programming ISP Parameters
==========================
The ISP can be programmed with various parameters from userspace to apply to the
hardware before and during video streaming. This allows userspace to dynamically
change values such as black level, white balance and lens shading gains and so
on.
The buffer format and how to populate it are described by the
:ref:`V4L2_META_FMT_MALI_C55_PARAMS <v4l2-meta-fmt-mali-c55-params>` format,
which should be set as the data format for the `mali-c55 3a params` video node.
References
==========
.. [1] https://git.linuxtv.org/v4l-utils.git/
.. [2] https://git.ideasonboard.org/yavta.git
.. [3] https://developer.arm.com/Processors/Mali-C55

@ -18,8 +18,6 @@ am437x-vpfe TI AM437x VPFE
aspeed-video Aspeed AST2400 and AST2500
atmel-isc ATMEL Image Sensor Controller (ISC)
atmel-isi ATMEL Image Sensor Interface (ISI)
c8sectpfe SDR platform devices
c8sectpfe SDR platform devices
cafe_ccic Marvell 88ALP01 (Cafe) CMOS Camera Controller
cdns-csi2rx Cadence MIPI-CSI2 RX Controller
cdns-csi2tx Cadence MIPI-CSI2 TX Controller

@ -30,7 +30,6 @@ radio-terratec TerraTec ActiveRadio ISA Standalone
radio-timb Enable the Timberdale radio driver
radio-trust Trust FM radio card
radio-typhoon Typhoon Radio (a.k.a. EcoRadio)
radio-wl1273 Texas Instruments WL1273 I2C FM Radio
fm_drv ISA radio devices
fm_drv ISA radio devices
radio-zoltrix Zoltrix Radio

@ -0,0 +1,8 @@
digraph board {
rankdir=TB
n00000001 [label="{{<port0> 0} | rkcif-dvp0\n/dev/v4l-subdev0 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
n00000001:port1 -> n00000004
n00000004 [label="rkcif-dvp0-id0\n/dev/video0", shape=box, style=filled, fillcolor=yellow]
n00000025 [label="{{} | it6801 2-0048\n/dev/v4l-subdev1 | {<port0> 0}}", shape=Mrecord, style=filled, fillcolor=green]
n00000025:port0 -> n00000001:port0
}

View File

@ -0,0 +1,79 @@
.. SPDX-License-Identifier: GPL-2.0
=========================================
Rockchip Camera Interface (CIF)
=========================================
Introduction
============
The Rockchip Camera Interface (CIF) is featured in many Rockchip SoCs in
different variants.
The different variants are combinations of common building blocks, such as
* INTERFACE blocks of different types, namely
* the Digital Video Port (DVP, a parallel data interface)
* the interface block for the MIPI CSI-2 receiver
* CROP units
* MIPI CSI-2 receiver (not available on all variants): This unit is referred
to as MIPI CSI HOST in the Rockchip documentation.
Technically, it is a separate hardware block, but it is strongly coupled to
the CIF and therefore included here.
* MUX units (not available on all variants) that pass the video data to an
image signal processor (ISP)
* SCALE units (not available on all variants)
* DMA engines that transfer video data into system memory using a
double-buffering mechanism called ping-pong mode
* Support for four streams per INTERFACE block (not available on all
variants), e.g., for MIPI CSI-2 Virtual Channels (VCs)
This document describes the different variants of the CIF, their hardware
layout, as well as their representation in the media controller centric rkcif
device driver, which is located under drivers/media/platform/rockchip/rkcif.
Variants
========
Rockchip PX30 Video Input Processor (VIP)
-----------------------------------------
The PX30 Video Input Processor (VIP) features a digital video port that accepts
parallel video data or BT.656.
Since these protocols do not feature multiple streams, the VIP has one DMA
engine that transfers the input video data into system memory.
The rkcif driver represents this hardware variant by exposing one V4L2 subdevice
(the DVP INTERFACE/CROP block) and one V4L2 device (the DVP DMA engine).
Rockchip RK3568 Video Capture (VICAP)
-------------------------------------
The RK3568 Video Capture (VICAP) unit features a digital video port and a MIPI
CSI-2 receiver that can receive video data independently.
The DVP accepts parallel video data, BT.656 and BT.1120.
Since the BT.1120 protocol may feature more than one stream, the RK3568 VICAP
DVP features four DMA engines that can capture different streams.
Similarly, the RK3568 VICAP MIPI CSI-2 receiver features four DMA engines to
handle different Virtual Channels (VCs).
The rkcif driver represents this hardware variant by exposing the following
V4L2 subdevices:
* rkcif-dvp0: INTERFACE/CROP block for the DVP
and the following video devices:
* rkcif-dvp0-id0: The support for multiple streams on the DVP is not yet
implemented, as it is hard to find test hardware. Thus, this video device
represents the first DMA engine of the RK3568 DVP.
.. kernel-figure:: rkcif-rk3568-vicap.dot
:alt: Topology of the RK3568 Video Capture (VICAP) unit
:align: center

View File

@ -19,12 +19,14 @@ Video4Linux (V4L) driver-specific documentation
ipu3
ipu6-isys
ivtv
mali-c55
mgb4
omap3isp
philips
qcom_camss
raspberrypi-pisp-be
rcar-fdp1
rkcif
rkisp1
raspberrypi-rp1-cfe
saa7134

View File

@ -211,6 +211,28 @@ End of target memory region in physical address.
The end physical address of memory region that DAMON_LRU_SORT will do work
against. By default, biggest System RAM is used as the region.
addr_unit
---------
A scale factor for memory addresses and bytes.
This parameter is for setting and getting the :ref:`address unit
<damon_design_addr_unit>` parameter of the DAMON instance for DAMON_LRU_SORT.
``monitor_region_start`` and ``monitor_region_end`` should be provided in this
unit. For example, let's suppose ``addr_unit``, ``monitor_region_start`` and
``monitor_region_end`` are set as ``1024``, ``0`` and ``10``, respectively.
Then DAMON_LRU_SORT will work for 10 KiB length of physical address range that
starts from address zero (``[0 * 1024, 10 * 1024)`` in bytes).
Stat parameters having ``bytes_`` prefix are also in this unit. For example,
let's suppose values of ``addr_unit``, ``bytes_lru_sort_tried_hot_regions`` and
``bytes_lru_sorted_hot_regions`` are ``1024``, ``42``, and ``32``,
respectively. Then it means DAMON_LRU_SORT tried to LRU-sort 42 KiB of hot
memory and successfully LRU-sorted 32 KiB of the memory in total.
If unsure, use only the default value (``1``) and forget about this.
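The scaling described above is plain multiplication. The following is a minimal sketch of it, not kernel code; the helper name ``region_in_bytes`` is hypothetical and only the example values come from the text.

```python
def region_in_bytes(addr_unit, region_start, region_end):
    """Return the half-open byte range [start, end) for values given in addr_unit."""
    if addr_unit < 1:
        raise ValueError("addr_unit must be a positive integer")
    # Both boundaries are expressed in units of addr_unit bytes.
    return region_start * addr_unit, region_end * addr_unit


# Values from the example in the text: addr_unit=1024, start=0, end=10.
start, end = region_in_bytes(1024, 0, 10)
print(f"[{start}, {end}) bytes, {(end - start) // 1024} KiB long")
```

With the default ``addr_unit`` of ``1`` the function returns its inputs unchanged, which matches the advice to leave the parameter alone when unsure.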
kdamond_pid
-----------

View File

@ -232,6 +232,28 @@ The end physical address of memory region that DAMON_RECLAIM will do work
against. That is, DAMON_RECLAIM will find cold memory regions in this region
and reclaims. By default, biggest System RAM is used as the region.
addr_unit
---------
A scale factor for memory addresses and bytes.
This parameter is for setting and getting the :ref:`address unit
<damon_design_addr_unit>` parameter of the DAMON instance for DAMON_RECLAIM.
``monitor_region_start`` and ``monitor_region_end`` should be provided in this
unit. For example, let's suppose ``addr_unit``, ``monitor_region_start`` and
``monitor_region_end`` are set as ``1024``, ``0`` and ``10``, respectively.
Then DAMON_RECLAIM will work for 10 KiB length of physical address range that
starts from address zero (``[0 * 1024, 10 * 1024)`` in bytes).
``bytes_reclaim_tried_regions`` and ``bytes_reclaimed_regions`` are also in
this unit. For example, let's suppose values of ``addr_unit``,
``bytes_reclaim_tried_regions`` and ``bytes_reclaimed_regions`` are ``1024``,
``42``, and ``32``, respectively. Then it means DAMON_RECLAIM tried to reclaim
42 KiB memory and successfully reclaimed 32 KiB memory in total.
If unsure, use only the default value (``1``) and forget about this.
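As a rough illustration of reading the statistics back in bytes, the sketch below multiplies each ``bytes_`` counter by ``addr_unit``. It assumes the module parameters are exposed under ``/sys/module/damon_reclaim/parameters``; the helper names are hypothetical.

```python
from pathlib import Path

# Assumed location of the DAMON_RECLAIM module parameters.
PARAM_DIR = Path("/sys/module/damon_reclaim/parameters")


def read_param(name, param_dir=PARAM_DIR):
    """Read one integer module parameter."""
    return int((Path(param_dir) / name).read_text().strip())


def reclaim_stats_in_bytes(param_dir=PARAM_DIR):
    """Return the reclaim statistics converted from addr_unit to bytes."""
    unit = read_param("addr_unit", param_dir)
    return {
        name: read_param(name, param_dir) * unit
        for name in ("bytes_reclaim_tried_regions", "bytes_reclaimed_regions")
    }
```

For the values in the example above (``1024``, ``42`` and ``32``), ``reclaim_stats_in_bytes()`` would report 43008 and 32768 bytes, that is, 42 KiB tried and 32 KiB reclaimed.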
skip_anon
---------

View File

@ -10,6 +10,8 @@ on the system's entire physical memory using DAMON, and provides simplified
access monitoring results statistics, namely idle time percentiles and
estimated memory bandwidth.
.. _damon_stat_monitoring_accuracy_overhead:
Monitoring Accuracy and Overhead
================================
@ -17,9 +19,11 @@ DAMON_STAT uses monitoring intervals :ref:`auto-tuning
<damon_design_monitoring_intervals_autotuning>` to make its accuracy high and
overhead minimum. It auto-tunes the intervals aiming 4 % of observable access
events to be captured in each snapshot, while limiting the resulting sampling
events to be 5 milliseconds in minimum and 10 seconds in maximum. On a few
interval to be 5 milliseconds in minimum and 10 seconds in maximum. On a few
production server systems, it resulted in consuming only 0.x % single CPU time,
while capturing reasonable quality of access patterns.
while capturing reasonable quality of access patterns. The tuning-resulting
intervals can be retrieved via ``aggr_interval_us`` :ref:`parameter
<damon_stat_aggr_interval_us>`.
Interface: Module Parameters
============================
@ -41,6 +45,18 @@ You can enable DAMON_STAT by setting the value of this parameter as ``Y``.
Setting it as ``N`` disables DAMON_STAT. The default value is set by
``CONFIG_DAMON_STAT_ENABLED_DEFAULT`` build config option.
.. _damon_stat_aggr_interval_us:
aggr_interval_us
----------------
Auto-tuned aggregation time interval in microseconds.
Users can read the aggregation interval of DAMON that is being used by the
DAMON instance for DAMON_STAT. It is :ref:`auto-tuned
<damon_stat_monitoring_accuracy_overhead>` and therefore the value is
dynamically changed.
estimated_memory_bandwidth
--------------------------
@ -58,12 +74,13 @@ memory_idle_ms_percentiles
Per-byte idle time (milliseconds) percentiles of the system.
DAMON_STAT calculates how long each byte of the memory was not accessed until
now (idle time), based on the current DAMON results snapshot. If DAMON found a
region of access frequency (nr_accesses) larger than zero, every byte of the
region gets zero idle time. If a region has zero access frequency
(nr_accesses), how long the region was keeping the zero access frequency (age)
becomes the idle time of every byte of the region. Then, DAMON_STAT exposes
the percentiles of the idle time values via this read-only parameter. Reading
the parameter returns 101 idle time values in milliseconds, separated by comma.
now (idle time), based on the current DAMON results snapshot. For regions
having access frequency (nr_accesses) larger than zero, how long the current
access frequency level was kept multiplied by ``-1`` becomes the idle time of
every byte of the region. If a region has zero access frequency (nr_accesses),
how long the region was keeping the zero access frequency (age) becomes the
idle time of every byte of the region. Then, DAMON_STAT exposes the
percentiles of the idle time values via this read-only parameter. Reading the
parameter returns 101 idle time values in milliseconds, separated by comma.
Each value represents 0-th, 1st, 2nd, 3rd, ..., 99th and 100th percentile idle
times.
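A user space consumer might parse the parameter along the following lines. This is only a sketch: the helper names are hypothetical, the sample input is synthetic rather than real output, and the parameter is assumed to live at ``/sys/module/damon_stat/parameters/memory_idle_ms_percentiles``.

```python
def parse_idle_percentiles(text):
    """Parse the comma-separated parameter value into 101 integers."""
    values = [int(v) for v in text.strip().split(",")]
    if len(values) != 101:
        raise ValueError(f"expected 101 values, got {len(values)}")
    return values


def summarize(values):
    """Pick a few percentiles; index N holds the N-th percentile."""
    return {
        "p50_ms": values[50],
        "p99_ms": values[99],
        "max_ms": values[100],
        # Negative idle time marks bytes of regions with non-zero access
        # frequency, so this counts the percentile points still being accessed.
        "accessed_points": sum(1 for v in values if v < 0),
    }


# Synthetic input for illustration only: -20, -19, ..., 80.
sample = ",".join(str(v) for v in range(-20, 81))
print(summarize(parse_idle_percentiles(sample)))
```

Because the list is sorted by percentile, the count of negative entries gives a coarse estimate of the share of memory that was recently accessed.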

View File

@ -67,7 +67,7 @@ comma (",").
│ │ │ │ │ │ │ intervals_goal/access_bp,aggrs,min_sample_us,max_sample_us
│ │ │ │ │ │ nr_regions/min,max
│ │ │ │ │ :ref:`targets <sysfs_targets>`/nr_targets
│ │ │ │ │ │ :ref:`0 <sysfs_target>`/pid_target
│ │ │ │ │ │ :ref:`0 <sysfs_target>`/pid_target,obsolete_target
│ │ │ │ │ │ │ :ref:`regions <sysfs_regions>`/nr_regions
│ │ │ │ │ │ │ │ :ref:`0 <sysfs_region>`/start,end
│ │ │ │ │ │ │ │ ...
@ -81,7 +81,7 @@ comma (",").
│ │ │ │ │ │ │ :ref:`quotas <sysfs_quotas>`/ms,bytes,reset_interval_ms,effective_bytes
│ │ │ │ │ │ │ │ weights/sz_permil,nr_accesses_permil,age_permil
│ │ │ │ │ │ │ │ :ref:`goals <sysfs_schemes_quota_goals>`/nr_goals
│ │ │ │ │ │ │ │ │ 0/target_metric,target_value,current_value,nid
│ │ │ │ │ │ │ │ │ 0/target_metric,target_value,current_value,nid,path
│ │ │ │ │ │ │ :ref:`watermarks <sysfs_watermarks>`/metric,interval_us,high,mid,low
│ │ │ │ │ │ │ :ref:`{core_,ops_,}filters <sysfs_filters>`/nr_filters
│ │ │ │ │ │ │ │ 0/type,matching,allow,memcg_path,addr_start,addr_end,target_idx,min,max
@ -134,7 +134,8 @@ Users can write below commands for the kdamond to the ``state`` file.
- ``on``: Start running.
- ``off``: Stop running.
- ``commit``: Read the user inputs in the sysfs files except ``state`` file
again.
again. Monitoring :ref:`target region <sysfs_regions>` inputs are also
ignored if no target region is specified.
- ``update_tuned_intervals``: Update the contents of ``sample_us`` and
``aggr_us`` files of the kdamond with the auto-tuning applied ``sampling
interval`` and ``aggregation interval`` for the files. Please refer to
@ -264,13 +265,20 @@ to ``N-1``. Each directory represents each monitoring target.
targets/<N>/
------------
In each target directory, one file (``pid_target``) and one directory
(``regions``) exist.
In each target directory, two files (``pid_target`` and ``obsolete_target``)
and one directory (``regions``) exist.
If you wrote ``vaddr`` to the ``contexts/<N>/operations``, each target should
be a process. You can specify the process to DAMON by writing the pid of the
process to the ``pid_target`` file.
Users can selectively remove targets in the middle of the targets array by
writing a non-zero value to the ``obsolete_target`` file and committing it
(writing ``commit`` to the ``state`` file). DAMON will remove the matching
targets from its internal targets array. Users are responsible for constructing
the target directories again so that they correctly represent the changed
internal targets array.
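The removal sequence could be scripted roughly as below. This is a sketch under assumptions: the function name is hypothetical, and the kdamond directory is passed in by the caller (typically something like ``/sys/kernel/mm/damon/admin/kdamonds/0``).

```python
from pathlib import Path


def remove_target(kdamond_dir, context_idx, target_idx):
    """Mark one monitoring target obsolete and commit the change."""
    kdamond = Path(kdamond_dir)
    target = kdamond / "contexts" / str(context_idx) / "targets" / str(target_idx)
    # Any non-zero value marks the target for removal.
    (target / "obsolete_target").write_text("1")
    # The removal only takes effect once the inputs are committed.
    (kdamond / "state").write_text("commit")
```

After the commit, the sysfs target directories still describe the old array, so the caller would need to rewrite ``nr_targets`` and the remaining ``pid_target`` files to match what DAMON now holds.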
.. _sysfs_regions:
targets/<N>/regions
@ -289,6 +297,11 @@ In the beginning, this directory has only one file, ``nr_regions``. Writing a
number (``N``) to the file creates the number of child directories named ``0``
to ``N-1``. Each directory represents each initial monitoring target region.
If ``nr_regions`` is zero when committing new DAMON parameters online (writing
``commit`` to ``state`` file of :ref:`kdamond <sysfs_kdamond>`), the commit
logic ignores the target regions. In other words, the current monitoring
results for the target are preserved.
.. _sysfs_region:
regions/<N>/
@ -402,9 +415,9 @@ number (``N``) to the file creates the number of child directories named ``0``
to ``N-1``. Each directory represents each goal and current achievement.
Among the multiple feedback, the best one is used.
Each goal directory contains four files, namely ``target_metric``,
``target_value``, ``current_value`` and ``nid``. Users can set and get the
four parameters for the quota auto-tuning goals that specified on the
Each goal directory contains five files, namely ``target_metric``,
``target_value``, ``current_value``, ``nid`` and ``path``. Users can set and
get the five parameters for the quota auto-tuning goals that specified on the
:ref:`design doc <damon_design_damos_quotas_auto_tuning>` by writing to and
reading from each of the files. Note that users should further write
``commit_schemes_quota_goals`` to the ``state`` file of the :ref:`kdamond

View File

@ -39,7 +39,6 @@ the Linux memory management.
shrinker_debugfs
slab
soft-dirty
swap_numa
transhuge
userfaultfd
zswap

View File

@ -115,7 +115,8 @@ Short descriptions to the page flags
A free memory block managed by the buddy system allocator.
The buddy system organizes free memory in blocks of various orders.
An order N block has 2^N physically contiguous pages, with the BUDDY flag
set for and _only_ for the first page.
set for all pages.
Before 4.6 only the first page of the block had the flag set.
15 - COMPOUND_HEAD
A compound page with order N consists of 2^N physically contiguous pages.
A compound page with order 2 takes the form of "HTTT", where H denotes its

View File

@ -1,78 +0,0 @@
===========================================
Automatically bind swap device to numa node
===========================================
If the system has more than one swap device and swap device has the node
information, we can make use of this information to decide which swap
device to use in get_swap_pages() to get better performance.
How to use this feature
=======================
Swap device has priority and that decides the order of it to be used. To make
use of automatically binding, there is no need to manipulate priority settings
for swap devices. e.g. on a 2 node machine, assume 2 swap devices swapA and
swapB, with swapA attached to node 0 and swapB attached to node 1, are going
to be swapped on. Simply swapping them on by doing::
# swapon /dev/swapA
# swapon /dev/swapB
Then node 0 will use the two swap devices in the order of swapA then swapB and
node 1 will use the two swap devices in the order of swapB then swapA. Note
that the order of them being swapped on doesn't matter.
A more complex example on a 4 node machine. Assume 6 swap devices are going to
be swapped on: swapA and swapB are attached to node 0, swapC is attached to
node 1, swapD and swapE are attached to node 2 and swapF is attached to node3.
The way to swap them on is the same as above::
# swapon /dev/swapA
# swapon /dev/swapB
# swapon /dev/swapC
# swapon /dev/swapD
# swapon /dev/swapE
# swapon /dev/swapF
Then node 0 will use them in the order of::
swapA/swapB -> swapC -> swapD -> swapE -> swapF
swapA and swapB will be used in a round robin mode before any other swap device.
node 1 will use them in the order of::
swapC -> swapA -> swapB -> swapD -> swapE -> swapF
node 2 will use them in the order of::
swapD/swapE -> swapA -> swapB -> swapC -> swapF
Similarly, swapD and swapE will be used in a round robin mode before any
other swap devices.
node 3 will use them in the order of::
swapF -> swapA -> swapB -> swapC -> swapD -> swapE
Implementation details
======================
The current code uses a priority based list, swap_avail_list, to decide
which swap device to use and if multiple swap devices share the same
priority, they are used round robin. This change here replaces the single
global swap_avail_list with a per-numa-node list, i.e. for each numa node,
it sees its own priority based list of available swap devices. Swap
device's priority can be promoted on its matching node's swap_avail_list.
The current swap device's priority is set as: user can set a >=0 value,
or the system will pick one starting from -1 then downwards. The priority
value in the swap_avail_list is the negated value of the swap device's
due to plist being sorted from low to high. The new policy doesn't change
the semantics for priority >=0 cases, the previous starting from -1 then
downwards now becomes starting from -2 then downwards and -1 is reserved
as the promoted value. So if multiple swap devices are attached to the same
node, they will all be promoted to priority -1 on that node's plist and will
be used round robin before any other swap devices.

View File

@ -381,6 +381,11 @@ hugepage allocation policy for the tmpfs mount by using the kernel parameter
four valid policies for tmpfs (``always``, ``within_size``, ``advise``,
``never``). The tmpfs mount default policy is ``never``.
Additionally, Kconfig options are available to set the default hugepage
policies for shmem (``CONFIG_TRANSPARENT_HUGEPAGE_SHMEM_HUGE_*``) and tmpfs
(``CONFIG_TRANSPARENT_HUGEPAGE_TMPFS_HUGE_*``) at build time. Refer to the
Kconfig help for more details.
In the same manner as ``thp_anon`` controls each supported anonymous THP
size, ``thp_shmem`` controls each supported shmem THP size. ``thp_shmem``
has the same format as ``thp_anon``, but also supports the policy

View File

@ -59,11 +59,11 @@ returned by the allocation routine and that handle must be mapped before being
accessed. The compressed memory pool grows on demand and shrinks as compressed
pages are freed. The pool is not preallocated.
When a swap page is passed from swapout to zswap, zswap maintains a mapping
of the swap entry, a combination of the swap type and swap offset, to the
zsmalloc handle that references that compressed swap page. This mapping is
achieved with a red-black tree per swap type. The swap offset is the search
key for the tree nodes.
When a swap page is passed from swapout to zswap, zswap maintains a mapping of
the swap entry, a combination of the swap type and swap offset, to the zsmalloc
handle that references that compressed swap page. This mapping is achieved
with an xarray per swap type. The swap offset is the search key for the xarray
nodes.
During a page fault on a PTE that is a swap entry, the swapin code calls the
zswap load function to decompress the page into the page allocated by the page

View File

@ -580,6 +580,15 @@ the given CPU as the upper limit for the exit latency of the idle states that
they are allowed to select for that CPU. They should never select any idle
states with exit latency beyond that limit.
While the above CPU QoS constraints apply to CPU idle time management, user
space may also request a CPU system wakeup latency QoS limit, via the
`cpu_wakeup_latency` file. This QoS constraint is respected when selecting a
suitable idle state for the CPUs while entering the system-wide suspend-to-idle
sleep state, and also in regular CPU idle time management.
Note that, from the user space point of view, the `cpu_wakeup_latency` file is
managed in the same way as the `cpu_dma_latency` file. Moreover, the unit
is also microseconds.
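Since the file is said to behave like ``cpu_dma_latency``, a request could presumably be made as sketched below: open the file, write a signed 32-bit value in microseconds, and keep the descriptor open for as long as the limit should hold. The ``/dev/cpu_wakeup_latency`` path and the helper names are assumptions, not taken from the text above.

```python
import os
import struct


def request_latency_limit(usec, path="/dev/cpu_wakeup_latency"):
    """Register a wakeup latency limit and return the open descriptor."""
    fd = os.open(path, os.O_WRONLY)
    # The value is written as a native-endian signed 32-bit integer.
    os.write(fd, struct.pack("=i", usec))
    return fd


def drop_latency_limit(fd):
    """Closing the descriptor withdraws the request."""
    os.close(fd)
```

The request is tied to the lifetime of the descriptor, so a caller would hold on to the returned value until the limit is no longer needed and then pass it to ``drop_latency_limit()``.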
Idle States Control Via Kernel Command Line
===========================================

View File

@ -48,8 +48,9 @@ only way to pass early-configuration-time parameters to it is via the kernel
command line. However, its configuration can be adjusted via ``sysfs`` to a
great extent. In some configurations it even is possible to unregister it via
``sysfs`` which allows another ``CPUFreq`` scaling driver to be loaded and
registered (see `below <status_attr_>`_).
registered (see :ref:`below <status_attr>`).
.. _operation_modes:
Operation Modes
===============
@ -62,6 +63,8 @@ a certain performance scaling algorithm. Which of them will be in effect
depends on what kernel command line options are used and on the capabilities of
the processor.
.. _active_mode:
Active Mode
-----------
@ -94,6 +97,8 @@ Which of the P-state selection algorithms is used by default depends on the
Namely, if that option is set, the ``performance`` algorithm will be used by
default, and the other one will be used by default if it is not set.
.. _active_mode_hwp:
Active Mode With HWP
~~~~~~~~~~~~~~~~~~~~
@ -123,7 +128,7 @@ Energy-Performance Bias (EPB) knob (otherwise), which means that the processor's
internal P-state selection logic is expected to focus entirely on performance.
This will override the EPP/EPB setting coming from the ``sysfs`` interface
(see `Energy vs Performance Hints`_ below). Moreover, any attempts to change
(see :ref:`energy_performance_hints` below). Moreover, any attempts to change
the EPP/EPB to a value different from 0 ("performance") via ``sysfs`` in this
configuration will be rejected.
@ -192,6 +197,8 @@ This is the default P-state selection algorithm if the
:c:macro:`CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE` kernel configuration option
is not set.
.. _passive_mode:
Passive Mode
------------
@ -289,12 +296,12 @@ Unlike ``_PSS`` objects in the ACPI tables, ``intel_pstate`` always exposes
the entire range of available P-states, including the whole turbo range, to the
``CPUFreq`` core and (in the passive mode) to generic scaling governors. This
generally causes turbo P-states to be set more often when ``intel_pstate`` is
used relative to ACPI-based CPU performance scaling (see `below <acpi-cpufreq_>`_
for more information).
used relative to ACPI-based CPU performance scaling (see
:ref:`below <acpi-cpufreq>` for more information).
Moreover, since ``intel_pstate`` always knows what the real turbo threshold is
(even if the Configurable TDP feature is enabled in the processor), its
``no_turbo`` attribute in ``sysfs`` (described `below <no_turbo_attr_>`_) should
``no_turbo`` attribute in ``sysfs`` (described :ref:`below <no_turbo_attr>`) should
work as expected in all cases (that is, if set to disable turbo P-states, it
always should prevent ``intel_pstate`` from using them).
@ -307,12 +314,12 @@ pieces of information on it to be known, including:
* The minimum supported P-state.
* The maximum supported `non-turbo P-state <turbo_>`_.
* The maximum supported :ref:`non-turbo P-state <turbo>`.
* Whether or not turbo P-states are supported at all.
* The maximum supported `one-core turbo P-state <turbo_>`_ (if turbo P-states
are supported).
* The maximum supported :ref:`one-core turbo P-state <turbo>` (if turbo
P-states are supported).
* The scaling formula to translate the driver's internal representation
of P-states into frequencies and the other way around.
@ -400,10 +407,10 @@ Energy-Aware Scheduling Support
If ``CONFIG_ENERGY_MODEL`` has been set during kernel configuration and
``intel_pstate`` runs on a hybrid processor without SMT, in addition to enabling
`CAS <CAS_>`_ it registers an Energy Model for the processor. This allows the
:ref:`CAS` it registers an Energy Model for the processor. This allows the
Energy-Aware Scheduling (EAS) support to be enabled in the CPU scheduler if
``schedutil`` is used as the ``CPUFreq`` governor which requires ``intel_pstate``
to operate in the `passive mode <Passive Mode_>`_.
to operate in the :ref:`passive mode <passive_mode>`.
The Energy Model registered by ``intel_pstate`` is artificial (that is, it is
based on abstract cost values and it does not include any real power numbers)
@ -432,6 +439,8 @@ the ``energy_model`` directory in ``debugfs`` (typically mounted on
User Space Interface in ``sysfs``
=================================
.. _global_attributes:
Global Attributes
-----------------
@ -444,8 +453,8 @@ argument is passed to the kernel in the command line.
``max_perf_pct``
Maximum P-state the driver is allowed to set in percent of the
maximum supported performance level (the highest supported `turbo
P-state <turbo_>`_).
maximum supported performance level (the highest supported :ref:`turbo
P-state <turbo>`).
This attribute will not be exposed if the
``intel_pstate=per_cpu_perf_limits`` argument is present in the kernel
@ -453,8 +462,8 @@ argument is passed to the kernel in the command line.
``min_perf_pct``
Minimum P-state the driver is allowed to set in percent of the
maximum supported performance level (the highest supported `turbo
P-state <turbo_>`_).
maximum supported performance level (the highest supported :ref:`turbo
P-state <turbo>`).
This attribute will not be exposed if the
``intel_pstate=per_cpu_perf_limits`` argument is present in the kernel
@ -463,18 +472,18 @@ argument is passed to the kernel in the command line.
``num_pstates``
Number of P-states supported by the processor (between 0 and 255
inclusive) including both turbo and non-turbo P-states (see
`Turbo P-states Support`_).
:ref:`turbo`).
This attribute is present only if the value exposed by it is the same
for all of the CPUs in the system.
The value of this attribute is not affected by the ``no_turbo``
setting described `below <no_turbo_attr_>`_.
setting described :ref:`below <no_turbo_attr>`.
This attribute is read-only.
``turbo_pct``
Ratio of the `turbo range <turbo_>`_ size to the size of the entire
Ratio of the :ref:`turbo range <turbo>` size to the size of the entire
range of supported P-states, in percent.
This attribute is present only if the value exposed by it is the same
@ -486,7 +495,7 @@ argument is passed to the kernel in the command line.
``no_turbo``
If set (equal to 1), the driver is not allowed to set any turbo P-states
(see `Turbo P-states Support`_). If unset (equal to 0, which is the
(see :ref:`turbo`). If unset (equal to 0, which is the
default), turbo P-states can be set by the driver.
[Note that ``intel_pstate`` does not support the general ``boost``
attribute (supported by some other scaling drivers) which is replaced
@ -495,11 +504,11 @@ argument is passed to the kernel in the command line.
This attribute does not affect the maximum supported frequency value
supplied to the ``CPUFreq`` core and exposed via the policy interface,
but it affects the maximum possible value of per-policy P-state limits
(see `Interpretation of Policy Attributes`_ below for details).
(see :ref:`policy_attributes_interpretation` below for details).
``hwp_dynamic_boost``
This attribute is only present if ``intel_pstate`` works in the
`active mode with the HWP feature enabled <Active Mode With HWP_>`_ in
:ref:`active mode with the HWP feature enabled <active_mode_hwp>` in
the processor. If set (equal to 1), it causes the minimum P-state limit
to be increased dynamically for a short time whenever a task previously
waiting on I/O is selected to run on a given logical CPU (the purpose
@ -514,12 +523,12 @@ argument is passed to the kernel in the command line.
Operation mode of the driver: "active", "passive" or "off".
"active"
The driver is functional and in the `active mode
<Active Mode_>`_.
The driver is functional and in the :ref:`active mode
<active_mode>`.
"passive"
The driver is functional and in the `passive mode
<Passive Mode_>`_.
The driver is functional and in the :ref:`passive mode
<passive_mode>`.
"off"
The driver is not functional (it is not registered as a scaling
@ -547,13 +556,15 @@ argument is passed to the kernel in the command line.
attribute to "1" enables the energy-efficiency optimizations and setting
to "0" disables them.
.. _policy_attributes_interpretation:
Interpretation of Policy Attributes
-----------------------------------
The interpretation of some ``CPUFreq`` policy attributes described in
Documentation/admin-guide/pm/cpufreq.rst is special with ``intel_pstate``
as the current scaling driver and it generally depends on the driver's
`operation mode <Operation Modes_>`_.
:ref:`operation mode <operation_modes>`.
First of all, the values of the ``cpuinfo_max_freq``, ``cpuinfo_min_freq`` and
``scaling_cur_freq`` attributes are produced by applying a processor-specific
@ -562,9 +573,10 @@ Also, the values of the ``scaling_max_freq`` and ``scaling_min_freq``
attributes are capped by the frequency corresponding to the maximum P-state that
the driver is allowed to set.
If the ``no_turbo`` `global attribute <no_turbo_attr_>`_ is set, the driver is
not allowed to use turbo P-states, so the maximum value of ``scaling_max_freq``
and ``scaling_min_freq`` is limited to the maximum non-turbo P-state frequency.
If the ``no_turbo`` :ref:`global attribute <no_turbo_attr>` is set, the driver
is not allowed to use turbo P-states, so the maximum value of
``scaling_max_freq`` and ``scaling_min_freq`` is limited to the maximum
non-turbo P-state frequency.
Accordingly, setting ``no_turbo`` causes ``scaling_max_freq`` and
``scaling_min_freq`` to go down to that value if they were above it before.
However, the old values of ``scaling_max_freq`` and ``scaling_min_freq`` will be
@ -576,7 +588,7 @@ and ``scaling_min_freq`` corresponds to the maximum supported turbo P-state,
which also is the value of ``cpuinfo_max_freq`` in either case.
Next, the following policy attributes have special meaning if
``intel_pstate`` works in the `active mode <Active Mode_>`_:
``intel_pstate`` works in the :ref:`active mode <active_mode>`:
``scaling_available_governors``
List of P-state selection algorithms provided by ``intel_pstate``.
@ -597,20 +609,22 @@ processor:
Shows the base frequency of the CPU. Any frequency above this will be
in the turbo frequency range.
The meaning of these attributes in the `passive mode <Passive Mode_>`_ is the
The meaning of these attributes in the :ref:`passive mode <passive_mode>` is the
same as for other scaling drivers.
Additionally, the value of the ``scaling_driver`` attribute for ``intel_pstate``
depends on the operation mode of the driver. Namely, it is either
"intel_pstate" (in the `active mode <Active Mode_>`_) or "intel_cpufreq" (in the
`passive mode <Passive Mode_>`_).
"intel_pstate" (in the :ref:`active mode <active_mode>`) or "intel_cpufreq"
(in the :ref:`passive mode <passive_mode>`).
.. _pstate_limits_coordination:
Coordination of P-State Limits
------------------------------
``intel_pstate`` allows P-state limits to be set in two ways: with the help of
the ``max_perf_pct`` and ``min_perf_pct`` `global attributes
<Global Attributes_>`_ or via the ``scaling_max_freq`` and ``scaling_min_freq``
the ``max_perf_pct`` and ``min_perf_pct`` :ref:`global attributes
<global_attributes>` or via the ``scaling_max_freq`` and ``scaling_min_freq``
``CPUFreq`` policy attributes. The coordination between those limits is based
on the following rules, regardless of the current operation mode of the driver:
@ -632,17 +646,18 @@ on the following rules, regardless of the current operation mode of the driver:
3. The global and per-policy limits can be set independently.
In the `active mode with the HWP feature enabled <Active Mode With HWP_>`_, the
In the :ref:`active mode with the HWP feature enabled <active_mode_hwp>`, the
resulting effective values are written into hardware registers whenever the
limits change in order to request its internal P-state selection logic to always
set P-states within these limits. Otherwise, the limits are taken into account
by scaling governors (in the `passive mode <Passive Mode_>`_) and by the driver
every time before setting a new P-state for a CPU.
by scaling governors (in the :ref:`passive mode <passive_mode>`) and by the
driver every time before setting a new P-state for a CPU.
Additionally, if the ``intel_pstate=per_cpu_perf_limits`` command line argument
is passed to the kernel, ``max_perf_pct`` and ``min_perf_pct`` are not exposed
at all and the only way to set the limits is by using the policy attributes.
.. _energy_performance_hints:
Energy vs Performance Hints
---------------------------
@ -702,9 +717,9 @@ output.
On those systems each ``_PSS`` object returns a list of P-states supported by
the corresponding CPU which basically is a subset of the P-states range that can
be used by ``intel_pstate`` on the same system, with one exception: the whole
`turbo range <turbo_>`_ is represented by one item in it (the topmost one). By
convention, the frequency returned by ``_PSS`` for that item is greater by 1 MHz
than the frequency of the highest non-turbo P-state listed by it, but the
:ref:`turbo range <turbo>` is represented by one item in it (the topmost one).
By convention, the frequency returned by ``_PSS`` for that item is greater by
1 MHz than the frequency of the highest non-turbo P-state listed by it, but the
corresponding P-state representation (following the hardware specification)
returned for it matches the maximum supported turbo P-state (or is the
special value 255 meaning essentially "go as high as you can get").
@ -730,18 +745,18 @@ benefit from running at turbo frequencies will be given non-turbo P-states
instead.
One more issue related to that may appear on systems supporting the
`Configurable TDP feature <turbo_>`_ allowing the platform firmware to set the
turbo threshold. Namely, if that is not coordinated with the lists of P-states
returned by ``_PSS`` properly, there may be more than one item corresponding to
a turbo P-state in those lists and there may be a problem with avoiding the
turbo range (if desirable or necessary). Usually, to avoid using turbo
P-states overall, ``acpi-cpufreq`` simply avoids using the topmost state listed
by ``_PSS``, but that is not sufficient when there are other turbo P-states in
the list returned by it.
:ref:`Configurable TDP feature <turbo>` allowing the platform firmware to set
the turbo threshold. Namely, if that is not coordinated with the lists of
P-states returned by ``_PSS`` properly, there may be more than one item
corresponding to a turbo P-state in those lists and there may be a problem with
avoiding the turbo range (if desirable or necessary). Usually, to avoid using
turbo P-states overall, ``acpi-cpufreq`` simply avoids using the topmost state
listed by ``_PSS``, but that is not sufficient when there are other turbo
P-states in the list returned by it.
Apart from the above, ``acpi-cpufreq`` works like ``intel_pstate`` in the
`passive mode <Passive Mode_>`_, except that the number of P-states it can set
is limited to the ones listed by the ACPI ``_PSS`` objects.
:ref:`passive mode <passive_mode>`, except that the number of P-states it can
set is limited to the ones listed by the ACPI ``_PSS`` objects.
Kernel Command Line Options for ``intel_pstate``
@ -756,11 +771,11 @@ of them have to be prepended with the ``intel_pstate=`` prefix.
processor is supported by it.
``active``
Register ``intel_pstate`` in the `active mode <Active Mode_>`_ to start
with.
Register ``intel_pstate`` in the :ref:`active mode <active_mode>` to
start with.
``passive``
Register ``intel_pstate`` in the `passive mode <Passive Mode_>`_ to
Register ``intel_pstate`` in the :ref:`passive mode <passive_mode>` to
start with.
``force``
@ -793,12 +808,12 @@ of them have to be prepended with the ``intel_pstate=`` prefix.
and this option has no effect.
``per_cpu_perf_limits``
Use per-logical-CPU P-State limits (see `Coordination of P-state
Limits`_ for details).
Use per-logical-CPU P-State limits (see
:ref:`pstate_limits_coordination` for details).
``no_cas``
Do not enable `capacity-aware scheduling <CAS_>`_ which is enabled by
default on hybrid systems without SMT.
Do not enable :ref:`capacity-aware scheduling <CAS>` which is enabled
by default on hybrid systems without SMT.
Diagnostics and Tuning
======================
@ -810,7 +825,7 @@ There are two static trace events that can be used for ``intel_pstate``
diagnostics. One of them is the ``cpu_frequency`` trace event generally used
by ``CPUFreq``, and the other one is the ``pstate_sample`` trace event specific
to ``intel_pstate``. Both of them are triggered by ``intel_pstate`` only if
it works in the `active mode <Active Mode_>`_.
it works in the :ref:`active mode <active_mode>`.
The following sequence of shell commands can be used to enable them and see
their output (if the kernel is generally configured to support event tracing)::
@ -822,7 +837,7 @@ their output (if the kernel is generally configured to support event tracing)::
gnome-terminal--4510 [001] ..s. 1177.680733: pstate_sample: core_busy=107 scaled=94 from=26 to=26 mperf=1143818 aperf=1230607 tsc=29838618 freq=2474476
cat-5235 [002] ..s. 1177.681723: cpu_frequency: state=2900000 cpu_id=2
If ``intel_pstate`` works in the `passive mode <Passive Mode_>`_, the
If ``intel_pstate`` works in the :ref:`passive mode <passive_mode>`, the
``cpu_frequency`` trace event will be triggered either by the ``schedutil``
scaling governor (for the policies it is attached to), or by the ``CPUFreq``
core (for the policies with other scaling governors).

View File

@ -397,13 +397,14 @@ a hung task is detected.
hung_task_panic
===============
Controls the kernel's behavior when a hung task is detected.
When set to a non-zero value, a kernel panic will be triggered if the
number of hung tasks found during a single scan reaches this value.
This file shows up if ``CONFIG_DETECT_HUNG_TASK`` is enabled.
= =================================================
= =======================================================
0 Continue operation. This is the default behavior.
1 Panic immediately.
= =================================================
N Panic when N hung tasks are found during a single scan.
= =======================================================
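The threshold rule in the table above can be modelled in a few lines. This is only a sketch of the documented behaviour, not the kernel's code; the function name ``should_panic`` and its arguments are hypothetical.

```python
def should_panic(hung_tasks_in_scan, hung_task_panic):
    """Model of the documented rule for hung_task_panic.

    0 means "continue operation"; a non-zero N means "panic once a single
    scan has found at least N hung tasks".
    """
    if hung_task_panic == 0:
        return False
    return hung_tasks_in_scan >= hung_task_panic
```

With a setting of 1 this reduces to the older "panic immediately" behaviour, because the first hung task found in a scan already reaches the threshold.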
hung_task_check_count
@ -421,6 +422,11 @@ the system boot.
This file shows up if ``CONFIG_DETECT_HUNG_TASK`` is enabled.
hung_task_sys_info
==================
A comma-separated list of extra system information to be dumped when a
hung task is detected, for example, "tasks,mem,timers,locks,...".
Refer to the 'panic_sys_info' section below for more details.
hung_task_timeout_secs
======================
@ -515,6 +521,15 @@ default), only processes with the CAP_SYS_ADMIN capability may create
io_uring instances.
kernel_sys_info
===============
A comma-separated list of extra system information to be dumped when a
soft/hard lockup is detected, for example, "tasks,mem,timers,locks,...".
Refer to the 'panic_sys_info' section below for more details.
It serves as the default kernel control knob, which will take effect
when a kernel module calls sys_info() with parameter==0.
kexec_load_disabled
===================
@ -576,6 +591,11 @@ if leaking kernel pointer values to unprivileged users is a concern.
When ``kptr_restrict`` is set to 2, kernel pointers printed using
%pK will be replaced with 0s regardless of privileges.
softlockup_sys_info & hardlockup_sys_info
=========================================
A comma-separated list of extra system information to be dumped when a
soft/hard lockup is detected, for example, "tasks,mem,timers,locks,...".
Refer to the 'panic_sys_info' section below for more details.
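A user-space tool that prepares a value for one of the ``*_sys_info`` knobs might validate it first. The sketch below is illustrative only: ``parse_sys_info`` is a hypothetical helper, and the accepted names are limited to those visible in the 'panic_sys_info' table of this document, so the kernel may accept more.

```python
# Names copied from the 'panic_sys_info' table in this document. The set is
# an assumption for illustration; it is not read from the kernel.
KNOWN_SYS_INFO = {
    "tasks", "mem", "timers", "locks", "ftrace", "all_bt", "blocked_tasks",
}


def parse_sys_info(value):
    """Split a comma-separated sys_info string and reject unknown names."""
    names = [item.strip() for item in value.split(",") if item.strip()]
    unknown = [name for name in names if name not in KNOWN_SYS_INFO]
    if unknown:
        raise ValueError("unknown sys_info item(s): " + ", ".join(unknown))
    return names
```

In this sketch the older singular spellings ``timer`` and ``lock`` are rejected, matching the renamed table entries.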
modprobe
========
@ -910,8 +930,8 @@ to 'panic_print'. Possible values are:
============= ===================================================
tasks print all tasks info
mem print system memory info
timer print timers info
lock print locks info if CONFIG_LOCKDEP is on
timers print timers info
locks print locks info if CONFIG_LOCKDEP is on
ftrace print ftrace buffer
all_bt print all CPUs backtrace (if available in the arch)
blocked_tasks print only tasks in uninterruptible (blocked) state

View File

@ -212,6 +212,14 @@ mem_pcpu_rsv
Per-cpu reserved forward alloc cache size in page units. Default 1MB per CPU.
bypass_prot_mem
---------------
Skip charging socket buffers to the global per-protocol memory
accounting controlled by net.ipv4.tcp_mem, net.ipv4.udp_mem, etc.
Default: 0 (off)
rmem_default
------------
@ -347,9 +355,9 @@ skb_defer_max
-------------
Max size (in skbs) of the per-cpu list of skbs being freed
by the cpu which allocated them. Used by TCP stack so far.
by the cpu which allocated them.
Default: 64
Default: 128
optmem_max
----------
@ -406,6 +414,23 @@ to SOCK_TXREHASH_DEFAULT (i. e. not overridden by setsockopt).
If set to 1 (default), hash rethink is performed on listening socket.
If set to 0, hash rethink is not performed.
txq_reselection_ms
------------------
Controls how often (in ms) a busy connected flow can select another tx queue.
A reselection is desirable when a user thread has migrated and XPS
would select a different queue. The same can occur without XPS
if the flow hash has changed.
But switching txq can introduce reorders, especially if the
old queue is under high pressure. Modern TCP stacks deal
well with reorders if they do not happen too often.
To disable this feature, set the value to 0.
Default: 1000
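The rate limit described above amounts to a simple elapsed-time check. The following is a minimal sketch under stated assumptions: the function name and the millisecond timestamps are hypothetical, and the kernel's own bookkeeping is not reproduced here.

```python
def may_reselect_txq(now_ms, last_selection_ms, txq_reselection_ms=1000):
    """Return True if a busy flow may pick a new tx queue.

    A value of 0 disables reselection, as documented. Otherwise at least
    txq_reselection_ms milliseconds must have passed since the last choice.
    """
    if txq_reselection_ms == 0:
        return False
    return now_ms - last_selection_ms >= txq_reselection_ms
```

A larger interval trades slower adaptation to thread migration for fewer opportunities to reorder packets.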
gro_normal_batch
----------------

View File

@ -186,6 +186,6 @@ More detailed explanation for tainting
18) ``N`` if an in-kernel test, such as a KUnit test, has been run.
19) ``J`` if userpace opened /dev/fwctl/* and performed a FWTCL_RPC_DEBUG_WRITE
19) ``J`` if userspace opened /dev/fwctl/* and performed a FWTCL_RPC_DEBUG_WRITE
to use the device's debugging features. Device debugging features could
cause the device to malfunction in undefined ways.

View File

@ -6,3 +6,4 @@ Thermal Subsystem
:maxdepth: 1
intel_powerclamp
intel_thermal_throttle

View File

@ -0,0 +1,91 @@
.. SPDX-License-Identifier: GPL-2.0
.. include:: <isonum.txt>
=======================================
Intel thermal throttle events reporting
=======================================
:Author: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Introduction
------------
Intel processors have built in automatic and adaptive thermal monitoring
mechanisms that force the processor to reduce its power consumption in order
to operate within predetermined temperature limits.
Refer to section "THERMAL MONITORING AND PROTECTION" in the "Intel® 64 and
IA-32 Architectures Software Developers Manual Volume 3 (3A, 3B, 3C, & 3D):
System Programming Guide" for more details.
In general, there are two mechanisms to control the core temperature of the
processor. They are called "Thermal Monitor 1 (TM1)" and "Thermal Monitor 2 (TM2)".
The status of the temperature sensor that triggers the thermal monitor (TM1/TM2)
is indicated through the "thermal status flag" and "thermal status log flag" in
MSR_IA32_THERM_STATUS for core level and MSR_IA32_PACKAGE_THERM_STATUS for
package level.
Thermal Status flag, bit 0 — When set, indicates that the processor core
temperature is currently at the trip temperature of the thermal monitor and that
the processor power consumption is being reduced via either TM1 or TM2, depending
on which is enabled. When clear, the flag indicates that the core temperature is
below the thermal monitor trip temperature. This flag is read only.
Thermal Status Log flag, bit 1 — When set, indicates that the thermal sensor has
tripped since the last power-up or reset or since the last time that software
cleared this flag. This flag is a sticky bit; once set it remains set until
cleared by software or until a power-up or reset of the processor. The default
state is clear.
It is possible that when a user reads MSR_IA32_THERM_STATUS or
MSR_IA32_PACKAGE_THERM_STATUS, TM1/TM2 is not active. In this case, the
"Thermal Status flag" will read "0" and the "Thermal Status Log flag" will be set
to show any previous "TM1/TM2" activation. But since it needs to be cleared by
software, it can't show the number of occurrences of "TM1/TM2" activations.
Hence, Linux provides counters of how many times the "Thermal Status flag" was
set. It also reports how long the "Thermal Status flag" was active, in milliseconds.
Using these counters, users can check if the performance was limited because of
thermal events. It is recommended to read from sysfs instead of directly reading
MSRs as the "Thermal Status Log flag" is reset by the driver to implement rate
control.
Sysfs Interface
---------------
Thermal throttling events are presented for each CPU under
"/sys/devices/system/cpu/cpuX/thermal_throttle/", where "X" is the CPU number.
All these counters are read-only and can't be reset to 0, so they can potentially
wrap around after reaching the maximum 64-bit unsigned integer value.
``core_throttle_count``
Shows the number of times "Thermal Status flag" changed from 0 to 1 for this
CPU since OS boot and thermal vector is initialized. This is a 64 bit counter.
``package_throttle_count``
Shows the number of times "Thermal Status flag" changed from 0 to 1 for the
package containing this CPU since OS boot and thermal vector is initialized.
Package status is broadcast to all CPUs; all CPUs in the package increment
this count. This is a 64-bit counter.
``core_throttle_max_time_ms``
Shows the maximum amount of time for which "Thermal Status flag" has been
set to 1 for this CPU at the core level since OS boot and thermal vector
is initialized.
``package_throttle_max_time_ms``
Shows the maximum amount of time for which "Thermal Status flag" has been
set to 1 for the package containing this CPU since OS boot and thermal
vector is initialized.
``core_throttle_total_time_ms``
Shows the cumulative time for which "Thermal Status flag" has been
set to 1 for this CPU for core level since OS boot and thermal vector
is initialized.
``package_throttle_total_time_ms``
Shows the cumulative time for which "Thermal Status flag" has been set
to 1 for the package containing this CPU since OS boot and thermal vector
is initialized.
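The attributes listed above can be collected with a short script. This is a sketch rather than an official tool: the helper name ``read_thermal_throttle`` and the ``sysfs_root`` parameter are assumptions added for illustration, while the directory layout and file names are the ones given in this section.

```python
import os

# File names as listed in the "Sysfs Interface" section above.
THROTTLE_FILES = (
    "core_throttle_count",
    "package_throttle_count",
    "core_throttle_max_time_ms",
    "package_throttle_max_time_ms",
    "core_throttle_total_time_ms",
    "package_throttle_total_time_ms",
)


def read_thermal_throttle(cpu, sysfs_root="/sys/devices/system/cpu"):
    """Return a dict of throttle counters for one CPU.

    Attributes that are missing or unreadable are reported as None, which
    is what happens on hardware that does not expose this interface.
    """
    directory = os.path.join(sysfs_root, "cpu%d" % cpu, "thermal_throttle")
    counters = {}
    for name in THROTTLE_FILES:
        try:
            with open(os.path.join(directory, name)) as attr:
                counters[name] = int(attr.read().strip())
        except (OSError, ValueError):
            counters[name] = None
    return counters
```

Sampling the counters twice and comparing the values is one way to tell whether a workload was throttled during a given interval.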

View File

@ -203,10 +203,10 @@ host controller or a device, it is important that the firmware can be
upgraded to the latest where possible bugs in it have been fixed.
Typically OEMs provide this firmware from their support site.
There is also a central site which has links where to download firmware
for some machines:
`Thunderbolt Updates <https://thunderbolttechnology.net/updates>`_
Currently, the recommended method of updating firmware is through the "fwupd"
tool. It uses the LVFS (Linux Vendor Firmware Service) portal by default to get
the latest firmware from hardware vendors and updates connected devices if found
compatible. For details refer to: https://github.com/fwupd/fwupd.
Before you upgrade firmware on a device, host or retimer, please make
sure it is a suitable upgrade. Failing to do that may render the device
@ -215,18 +215,40 @@ tools!
Host NVM upgrade on Apple Macs is not supported.
Once the NVM image has been downloaded, you need to plug in a
Thunderbolt device so that the host controller appears. It does not
matter which device is connected (unless you are upgrading NVM on a
device - then you need to connect that particular device).
Fwupd is installed by default. If you don't have it on your system, simply
use your distro package manager to get it.
To see possible updates through fwupd, you need to plug in a Thunderbolt
device so that the host controller appears. It does not matter which
device is connected (unless you are upgrading NVM on a device - then you
need to connect that particular device).
Note an OEM-specific method to power the controller up ("force power") may
be available for your system in which case there is no need to plug in a
Thunderbolt device.
After that we can write the firmware to the non-active parts of the NVM
of the host or device. As an example here is how Intel NUC6i7KYK (Skull
Canyon) Thunderbolt controller NVM is upgraded::
Updating firmware using fwupd is straightforward - refer to the official
README in the fwupd GitHub repository.
If the firmware image is written successfully, the device shortly disappears.
Once it comes back, the driver notices it and initiates a full power
cycle. After a while the device appears again and this time it should be
fully functional.
The device of interest should display the new version under "Current version"
and "Update State: Success" in fwupd's interface.
Upgrading firmware manually
---------------------------------------------------------------
If possible, use fwupd to update the firmware. However, if your device OEM
has not uploaded the firmware to LVFS, but it is available for download
from their site, you can use the method below to upgrade the firmware
directly.
A manual firmware update can be done with the 'dd' tool. To update firmware
using this method, you need to write it to the non-active parts of the NVM
of the host or device. Here is an example of how to update the Intel
NUC6i7KYK (Skull Canyon) Thunderbolt controller NVM::
# dd if=KYK_TBT_FW_0018.bin of=/sys/bus/thunderbolt/devices/0-0/nvm_non_active0/nvmem
@ -235,10 +257,8 @@ upgrade process as follows::
# echo 1 > /sys/bus/thunderbolt/devices/0-0/nvm_authenticate
If no errors are returned, the host controller shortly disappears. Once
it comes back the driver notices it and initiates a full power cycle.
After a while the host controller appears again and this time it should
be fully functional.
If no errors are returned, the device should behave as described in the
previous section.
We can verify that the new NVM firmware is active by running the following
commands::

View File

@ -196,11 +196,11 @@ Lets checkout the latest Linux repository and build cscope database::
cscope -R -p10 # builds cscope.out database before starting browse session
cscope -d -p10 # starts browse session on cscope.out database
Note: Run "cscope -R -p10" to build the database and c"scope -d -p10" to
enter into the browsing session. cscope by default cscope.out database.
To get out of this mode press ctrl+d. -p option is used to specify the
number of file path components to display. -p10 is optimal for browsing
kernel sources.
Note: Run "cscope -R -p10" to build the database and "cscope -d -p10" to
enter into the browsing session. cscope by default uses the cscope.out
database. To get out of this mode press ctrl+d. -p option is used to
specify the number of file path components to display. -p10 is optimal
for browsing kernel sources.
What is perf and how do we use it?
==================================

View File

@ -391,13 +391,13 @@ Before jumping into the kernel, the following conditions must be met:
- SMCR_EL2.LEN must be initialised to the same value for all CPUs the
kernel will execute on.
- HWFGRTR_EL2.nTPIDR2_EL0 (bit 55) must be initialised to 0b01.
- HFGRTR_EL2.nTPIDR2_EL0 (bit 55) must be initialised to 0b01.
- HWFGWTR_EL2.nTPIDR2_EL0 (bit 55) must be initialised to 0b01.
- HFGWTR_EL2.nTPIDR2_EL0 (bit 55) must be initialised to 0b01.
- HWFGRTR_EL2.nSMPRI_EL1 (bit 54) must be initialised to 0b01.
- HFGRTR_EL2.nSMPRI_EL1 (bit 54) must be initialised to 0b01.
- HWFGWTR_EL2.nSMPRI_EL1 (bit 54) must be initialised to 0b01.
- HFGWTR_EL2.nSMPRI_EL1 (bit 54) must be initialised to 0b01.
For CPUs with the Scalable Matrix Extension FA64 feature (FEAT_SME_FA64):

View File

@ -402,6 +402,11 @@ The regset data starts with struct user_sve_header, containing:
streaming mode and any SETREGSET of NT_ARM_SSVE will enter streaming mode
if the target was not in streaming mode.
* On systems that do not support SVE it is permitted to use SETREGSET to
write SVE_PT_REGS_FPSIMD formatted data via NT_ARM_SVE; in this case the
vector length should be specified as 0. This allows streaming mode to be
disabled on systems with SME but not SVE.
* If any register data is provided along with SVE_PT_VL_ONEXEC then the
registers data will be interpreted with the current vector length, not
the vector length configured for use on exec.

View File

@ -249,6 +249,9 @@ The following keys are defined:
defined in the RISC-V ISA manual starting from commit e87412e621f1
("integrate Zaamo and Zalrsc text (#1304)").
* :c:macro:`RISCV_HWPROBE_EXT_ZALASR`: The Zalasr extension is supported as
frozen at commit 194f0094 ("Version 0.9 for freeze") of riscv-zalasr.
* :c:macro:`RISCV_HWPROBE_EXT_ZALRSC`: The Zalrsc extension is supported as
defined in the RISC-V ISA manual starting from commit e87412e621f1
("integrate Zaamo and Zalrsc text (#1304)").
@ -275,6 +278,9 @@ The following keys are defined:
ratified in commit 49f49c842ff9 ("Update to Rafified state") of
riscv-zabha.
* :c:macro:`RISCV_HWPROBE_EXT_ZICBOP`: The Zicbop extension is supported, as
ratified in commit 3dd606f ("Create cmobase-v1.0.pdf") of riscv-CMOs.
* :c:macro:`RISCV_HWPROBE_KEY_CPUPERF_0`: Deprecated. Returns similar values to
:c:macro:`RISCV_HWPROBE_KEY_MISALIGNED_SCALAR_PERF`, but the key was
mistakenly classified as a bitmask rather than a value.
@ -369,4 +375,7 @@ The following keys are defined:
* :c:macro:`RISCV_HWPROBE_VENDOR_EXT_XSFVFWMACCQQQ`: The Xsfvfwmaccqqq
vendor extension is supported in version 1.0 of Matrix Multiply Accumulate
Instruction Extensions Specification.
Instruction Extensions Specification.
* :c:macro:`RISCV_HWPROBE_KEY_ZICBOP_BLOCK_SIZE`: An unsigned int which
represents the size of the Zicbop block in bytes.

View File

@ -243,9 +243,8 @@ Examples:
Changing the size of debug areas
------------------------------------
It is possible the change the size of debug areas through piping
the number of pages to the debugfs file "pages". The resize request will
also flush the debug areas.
To resize a debug area, write the desired page count to the "pages" file.
Existing data is preserved if it fits; otherwise, oldest entries are dropped.
Example:

View File

@ -416,7 +416,7 @@ Offset/size: 0x210/1
Protocol: 2.00+
============ ==================
If your boot loader has an assigned id (see table below), enter
If your boot loader has an assigned ID (see table below), enter
0xTV here, where T is an identifier for the boot loader and V is
a version number. Otherwise, enter 0xFF here.
@ -431,31 +431,31 @@ Protocol: 2.00+
ext_loader_type <- 0x05
ext_loader_ver <- 0x23
Assigned boot loader ids (hexadecimal):
Assigned boot loader IDs:
== =======================================
0 LILO
(0x00 reserved for pre-2.00 bootloader)
1 Loadlin
2 bootsect-loader
(0x20, all other values reserved)
3 Syslinux
4 Etherboot/gPXE/iPXE
5 ELILO
7 GRUB
8 U-Boot
9 Xen
A Gujin
B Qemu
C Arcturus Networks uCbootloader
D kexec-tools
E Extended (see ext_loader_type)
F Special (0xFF = undefined)
10 Reserved
11 Minimal Linux Bootloader
<http://sebastian-plotz.blogspot.de>
12 OVMF UEFI virtualization stack
13 barebox
0x0 LILO
(0x00 reserved for pre-2.00 bootloader)
0x1 Loadlin
0x2 bootsect-loader
(0x20, all other values reserved)
0x3 Syslinux
0x4 Etherboot/gPXE/iPXE
0x5 ELILO
0x7 GRUB
0x8 U-Boot
0x9 Xen
0xA Gujin
0xB Qemu
0xC Arcturus Networks uCbootloader
0xD kexec-tools
0xE Extended (see ext_loader_type)
0xF Special (0xFF = undefined)
0x10 Reserved
0x11 Minimal Linux Bootloader
<http://sebastian-plotz.blogspot.de>
0x12 OVMF UEFI virtualization stack
0x13 barebox
== =======================================
Please contact <hpa@zytor.com> if you need a bootloader ID value assigned.
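The 0xTV layout and the extended fields can be illustrated with a small encoder. It relies on the ``ext_loader_type`` and ``ext_loader_ver`` semantics described elsewhere in the boot protocol (IDs of 0x10 and above are stored as 0xE in the type nibble plus ``ext_loader_type = T - 0x10``, and version bits above the low nibble go into ``ext_loader_ver``); the function name ``encode_loader_id`` is hypothetical and the code is a sketch, not part of any boot loader.

```python
def encode_loader_id(loader_id, version):
    """Return (type_of_loader, ext_loader_type, ext_loader_ver).

    loader_id is the assigned boot loader ID (T), version is V.
    """
    if not 0 <= version <= 0xFFF:
        raise ValueError("version must fit in 12 bits")
    if loader_id in (0xE, 0xF):
        raise ValueError("0xE and 0xF are markers, not loader IDs")
    if 0 <= loader_id < 0x10:
        type_nibble, ext_type = loader_id, 0
    elif loader_id - 0x10 <= 0xFF:
        # Extended ID: 0xE in the type nibble, remainder in ext_loader_type.
        type_nibble, ext_type = 0xE, loader_id - 0x10
    else:
        raise ValueError("loader ID out of range")
    type_of_loader = (type_nibble << 4) | (version & 0xF)
    ext_loader_ver = version >> 4
    return type_of_loader, ext_type, ext_loader_ver
```

For T = 0x15 and V = 0x234 this yields 0xE4, 0x05 and 0x23, consistent with the ``ext_loader_type`` and ``ext_loader_ver`` values shown in the example above.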
@ -1431,12 +1431,34 @@ The boot loader *must* fill out the following fields in bp::
All other fields should be zero.
.. note::
The EFI Handover Protocol is deprecated in favour of the ordinary PE/COFF
entry point, combined with the LINUX_EFI_INITRD_MEDIA_GUID based initrd
loading protocol (refer to [0] for an example of the bootloader side of
this), which removes the need for any knowledge on the part of the EFI
bootloader regarding the internal representation of boot_params or any
requirements/limitations regarding the placement of the command line
and ramdisk in memory, or the placement of the kernel image itself.
The EFI Handover Protocol is deprecated in favour of the ordinary PE/COFF
entry point described below.
[0] https://github.com/u-boot/u-boot/commit/ec80b4735a593961fe701cc3a5d717d4739b0fd0
.. _pe-coff-entry-point:
PE/COFF entry point
===================
When compiled with ``CONFIG_EFI_STUB=y``, the kernel can be executed as a
regular PE/COFF binary. See Documentation/admin-guide/efi-stub.rst for
implementation details.
The stub loader can request the initrd via a UEFI protocol. For this to work,
the firmware or bootloader needs to register a handle which carries
implementations of the ``EFI_LOAD_FILE2`` protocol and the device path
protocol exposing the ``LINUX_EFI_INITRD_MEDIA_GUID`` vendor media device path.
In this case, a kernel booting via the EFI stub will invoke
``LoadFile2::LoadFile()`` method on the registered protocol to instruct the
firmware to load the initrd into a memory location chosen by the kernel/EFI
stub.
This approach removes the need for any knowledge on the part of the EFI
bootloader regarding the internal representation of boot_params or any
requirements/limitations regarding the placement of the command line and
ramdisk in memory, or the placement of the kernel image itself.
For sample implementations, refer to `the original u-boot implementation`_ or
`the OVMF implementation`_.
.. _the original u-boot implementation: https://github.com/u-boot/u-boot/commit/ec80b4735a593961fe701cc3a5d717d4739b0fd0
.. _the OVMF implementation: https://github.com/tianocore/edk2/blob/1780373897f12c25075f8883e073144506441168/OvmfPkg/LinuxInitrdDynamicShellCommand/LinuxInitrdDynamicShellCommand.c

View File

@ -100,10 +100,26 @@ described in more detail in the footnotes.
| | | ``uretprobe.s+`` [#uprobe]_ | Yes |
+ + +----------------------------------+-----------+
| | | ``usdt+`` [#usdt]_ | |
+ + +----------------------------------+-----------+
| | | ``usdt.s+`` [#usdt]_ | Yes |
+ +----------------------------------------+----------------------------------+-----------+
| | ``BPF_TRACE_KPROBE_MULTI`` | ``kprobe.multi+`` [#kpmulti]_ | |
+ + +----------------------------------+-----------+
| | | ``kretprobe.multi+`` [#kpmulti]_ | |
+ +----------------------------------------+----------------------------------+-----------+
| | ``BPF_TRACE_KPROBE_SESSION`` | ``kprobe.session+`` [#kpmulti]_ | |
+ +----------------------------------------+----------------------------------+-----------+
| | ``BPF_TRACE_UPROBE_MULTI`` | ``uprobe.multi+`` [#upmul]_ | |
+ + +----------------------------------+-----------+
| | | ``uprobe.multi.s+`` [#upmul]_ | Yes |
+ + +----------------------------------+-----------+
| | | ``uretprobe.multi+`` [#upmul]_ | |
+ + +----------------------------------+-----------+
| | | ``uretprobe.multi.s+`` [#upmul]_ | Yes |
+ +----------------------------------------+----------------------------------+-----------+
| | ``BPF_TRACE_UPROBE_SESSION`` | ``uprobe.session+`` [#upmul]_ | |
+ + +----------------------------------+-----------+
| | | ``uprobe.session.s+`` [#upmul]_ | Yes |
+-------------------------------------------+----------------------------------------+----------------------------------+-----------+
| ``BPF_PROG_TYPE_LIRC_MODE2`` | ``BPF_LIRC_MODE2`` | ``lirc_mode2`` | |
+-------------------------------------------+----------------------------------------+----------------------------------+-----------+
@ -219,6 +235,8 @@ described in more detail in the footnotes.
non-negative integer.
.. [#ksyscall] The ``ksyscall`` attach format is ``ksyscall/<syscall>``.
.. [#uprobe] The ``uprobe`` attach format is ``uprobe[.s]/<path>:<function>[+<offset>]``.
.. [#upmul] The ``uprobe.multi`` attach format is ``uprobe.multi[.s]/<path>:<function-pattern>``
where ``function-pattern`` supports ``*`` and ``?`` wildcards.
.. [#usdt] The ``usdt`` attach format is ``usdt/<path>:<provider>:<name>``.
.. [#kpmulti] The ``kprobe.multi`` attach format is ``kprobe.multi/<pattern>`` where ``pattern``
supports ``*`` and ``?`` wildcards. Valid characters for pattern are

View File

@ -15,8 +15,9 @@ of constant size. The size of the array is defined in ``max_entries`` at
creation time. All array elements are pre-allocated and zero initialized when
created. ``BPF_MAP_TYPE_PERCPU_ARRAY`` uses a different memory region for each
CPU whereas ``BPF_MAP_TYPE_ARRAY`` uses the same memory region. The value
stored can be of any size, however, all array elements are aligned to 8
bytes.
stored can be of any size for ``BPF_MAP_TYPE_ARRAY`` and not more than
``PCPU_MIN_UNIT_SIZE`` (32 kB) for ``BPF_MAP_TYPE_PERCPU_ARRAY``. All
array elements are aligned to 8 bytes.
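The size rules above can be checked with a little arithmetic. The sketch below is a user-space model only: the helper name ``array_elem_size`` is hypothetical, and the 32 kB constant simply restates the ``PCPU_MIN_UNIT_SIZE`` figure quoted in the text.

```python
PCPU_MIN_UNIT_SIZE = 32 << 10  # 32 kB, the per-CPU limit quoted above


def array_elem_size(value_size, percpu=False):
    """Return the storage used per element, rounded up to 8 bytes.

    Raises ValueError when a per-CPU array value exceeds the documented
    PCPU_MIN_UNIT_SIZE limit.
    """
    if value_size <= 0:
        raise ValueError("value_size must be positive")
    if percpu and value_size > PCPU_MIN_UNIT_SIZE:
        raise ValueError("per-CPU array values are limited to 32 kB")
    return (value_size + 7) & ~7
```

A 5-byte value therefore occupies 8 bytes per element, and a per-CPU array occupies that amount once for each CPU.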
Since kernel 5.5, memory mapping may be enabled for ``BPF_MAP_TYPE_ARRAY`` by
setting the flag ``BPF_F_MMAPABLE``. The map definition is page-aligned and

View File

@ -18,8 +18,6 @@ import sphinx
# documentation root, use os.path.abspath to make it absolute, like shown here.
sys.path.insert(0, os.path.abspath("sphinx"))
from load_config import loadConfig # pylint: disable=C0413,E0401
# Minimal supported version
needs_sphinx = "3.4.3"
@ -93,8 +91,12 @@ def config_init(app, config):
# LaTeX and PDF output require a list of documents which depend on
# app.srcdir. Add them here
# When SPHINXDIRS is used, we just need to get index.rst, if it exists
# Handle the case where SPHINXDIRS is used
if not os.path.samefile(doctree, app.srcdir):
# Add a tag to mark that the build is actually a subproject
tags.add("subproject")
# get index.rst, if it exists
doc = os.path.basename(app.srcdir)
fname = "index"
if os.path.exists(os.path.join(app.srcdir, fname + ".rst")):
@ -583,13 +585,6 @@ pdf_documents = [
kerneldoc_bin = "../scripts/kernel-doc.py"
kerneldoc_srctree = ".."
# ------------------------------------------------------------------------------
# Since loadConfig overwrites settings from the global namespace, it has to be
# the last statement in the conf.py file
# ------------------------------------------------------------------------------
loadConfig(globals())
def setup(app):
"""Patterns need to be updated at init time on older Sphinx versions"""

View File

@ -92,18 +92,18 @@ There are two functions for dealing with the script:
void assoc_array_apply_edit(struct assoc_array_edit *edit);
This will perform the edit functions, interpolating various write barriers
to permit accesses under the RCU read lock to continue. The edit script
will then be passed to ``call_rcu()`` to free it and any dead stuff it points
to.
This will perform the edit functions, interpolating various write barriers
to permit accesses under the RCU read lock to continue. The edit script
will then be passed to ``call_rcu()`` to free it and any dead stuff it
points to.
2. Cancel an edit script::
void assoc_array_cancel_edit(struct assoc_array_edit *edit);
This frees the edit script and all preallocated memory immediately. If
this was for insertion, the new object is _not_ released by this function,
but must rather be released by the caller.
This frees the edit script and all preallocated memory immediately. If
this was for insertion, the new object is *not* released by this function,
but must rather be released by the caller.
These functions are guaranteed not to fail.
@ -123,43 +123,43 @@ This points to a number of methods, all of which need to be provided:
unsigned long (*get_key_chunk)(const void *index_key, int level);
This should return a chunk of caller-supplied index key starting at the
*bit* position given by the level argument. The level argument will be a
multiple of ``ASSOC_ARRAY_KEY_CHUNK_SIZE`` and the function should return
``ASSOC_ARRAY_KEY_CHUNK_SIZE bits``. No error is possible.
This should return a chunk of caller-supplied index key starting at the
*bit* position given by the level argument. The level argument will be a
multiple of ``ASSOC_ARRAY_KEY_CHUNK_SIZE`` and the function should return
``ASSOC_ARRAY_KEY_CHUNK_SIZE bits``. No error is possible.
2. Get a chunk of an object's index key::
unsigned long (*get_object_key_chunk)(const void *object, int level);
As the previous function, but gets its data from an object in the array
rather than from a caller-supplied index key.
As the previous function, but gets its data from an object in the array
rather than from a caller-supplied index key.
3. See if this is the object we're looking for::
bool (*compare_object)(const void *object, const void *index_key);
Compare the object against an index key and return ``true`` if it matches and
``false`` if it doesn't.
Compare the object against an index key and return ``true`` if it matches
and ``false`` if it doesn't.
4. Diff the index keys of two objects::
int (*diff_objects)(const void *object, const void *index_key);
Return the bit position at which the index key of the specified object
differs from the given index key or -1 if they are the same.
Return the bit position at which the index key of the specified object
differs from the given index key or -1 if they are the same.
5. Free an object::
void (*free_object)(void *object);
Free the specified object. Note that this may be called an RCU grace period
after ``assoc_array_apply_edit()`` was called, so ``synchronize_rcu()`` may be
necessary on module unloading.
Free the specified object. Note that this may be called an RCU grace period
after ``assoc_array_apply_edit()`` was called, so ``synchronize_rcu()`` may
be necessary on module unloading.
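As an illustration of the chunking contract described for ``get_key_chunk()`` above, here is a minimal userspace sketch. It is not taken from the kernel: ``struct demo_key``, ``demo_get_key_chunk()`` and ``DEMO_KEY_CHUNK_BITS`` are hypothetical names, the chunk size is assumed to be the width of an ``unsigned long``, and the byte order and zero padding past the end of the key are choices made only for this example.

```c
#include <limits.h>
#include <stddef.h>

/* Stand-in for ASSOC_ARRAY_KEY_CHUNK_SIZE; assumed to be the width of a long. */
#define DEMO_KEY_CHUNK_BITS ((int)(sizeof(unsigned long) * CHAR_BIT))

/* Hypothetical index key: a byte string with an explicit length. */
struct demo_key {
	const unsigned char *data;
	size_t len;
};

/*
 * Return the chunk of the key that starts at bit position 'level'.
 * Bytes are packed least-significant first, and bits past the end of
 * the key read as zero; both are assumptions of this sketch.
 */
static unsigned long demo_get_key_chunk(const void *index_key, int level)
{
	const struct demo_key *key = index_key;
	size_t offset = (size_t)level / CHAR_BIT;	/* level is in bits */
	unsigned long chunk = 0;
	size_t i;

	for (i = 0; i < sizeof(chunk) && offset + i < key->len; i++)
		chunk |= (unsigned long)key->data[offset + i] << (i * CHAR_BIT);

	return chunk;
}
```

A ``get_object_key_chunk()`` counterpart would typically locate the key inside the object and then do the same extraction, so that both methods agree on every chunk.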
Manipulation Functions
@ -171,7 +171,7 @@ There are a number of functions for manipulating an associative array:
void assoc_array_init(struct assoc_array *array);
This initialises the base structure for an associative array. It can't fail.
This initialises the base structure for an associative array. It can't fail.
2. Insert/replace an object in an associative array::
@ -182,21 +182,21 @@ This initialises the base structure for an associative array. It can't fail.
const void *index_key,
void *object);
This inserts the given object into the array. Note that the least
significant bit of the pointer must be zero as it's used to type-mark
pointers internally.
This inserts the given object into the array. Note that the least
significant bit of the pointer must be zero as it's used to type-mark
pointers internally.
If an object already exists for that key then it will be replaced with the
new object and the old one will be freed automatically.
If an object already exists for that key then it will be replaced with the
new object and the old one will be freed automatically.
The ``index_key`` argument should hold index key information and is
passed to the methods in the ops table when they are called.
The ``index_key`` argument should hold index key information and is
passed to the methods in the ops table when they are called.
This function makes no alteration to the array itself, but rather returns
an edit script that must be applied. ``-ENOMEM`` is returned in the case of
an out-of-memory error.
This function makes no alteration to the array itself, but rather returns
an edit script that must be applied. ``-ENOMEM`` is returned in the case of
an out-of-memory error.
The caller should lock exclusively against other modifiers of the array.
The caller should lock exclusively against other modifiers of the array.
3. Delete an object from an associative array::
@ -206,15 +206,15 @@ The caller should lock exclusively against other modifiers of the array.
const struct assoc_array_ops *ops,
const void *index_key);
This deletes an object that matches the specified data from the array.
This deletes an object that matches the specified data from the array.
The ``index_key`` argument should hold index key information and is
passed to the methods in the ops table when they are called.
The ``index_key`` argument should hold index key information and is
passed to the methods in the ops table when they are called.
This function makes no alteration to the array itself, but rather returns
an edit script that must be applied. ``-ENOMEM`` is returned in the case of
an out-of-memory error. ``NULL`` will be returned if the specified object is
not found within the array.
This function makes no alteration to the array itself, but rather returns
an edit script that must be applied. ``-ENOMEM`` is returned in the case of
an out-of-memory error. ``NULL`` will be returned if the specified object
is not found within the array.
The caller should lock exclusively against other modifiers of the array.
@ -225,14 +225,14 @@ The caller should lock exclusively against other modifiers of the array.
assoc_array_clear(struct assoc_array *array,
const struct assoc_array_ops *ops);
This deletes all the objects from an associative array and leaves it
completely empty.
This deletes all the objects from an associative array and leaves it
completely empty.
This function makes no alteration to the array itself, but rather returns
an edit script that must be applied. ``-ENOMEM`` is returned in the case of
an out-of-memory error.
This function makes no alteration to the array itself, but rather returns
an edit script that must be applied. ``-ENOMEM`` is returned in the case of
an out-of-memory error.
The caller should lock exclusively against other modifiers of the array.
The caller should lock exclusively against other modifiers of the array.
5. Destroy an associative array, deleting all objects::
@ -240,14 +240,14 @@ The caller should lock exclusively against other modifiers of the array.
void assoc_array_destroy(struct assoc_array *array,
const struct assoc_array_ops *ops);
This destroys the contents of the associative array and leaves it
completely empty. It is not permitted for another thread to be traversing
the array under the RCU read lock at the same time as this function is
destroying it as no RCU deferral is performed on memory release -
something that would require memory to be allocated.
This destroys the contents of the associative array and leaves it
completely empty. It is not permitted for another thread to be traversing
the array under the RCU read lock at the same time as this function is
destroying it as no RCU deferral is performed on memory release -
something that would require memory to be allocated.
The caller should lock exclusively against other modifiers and accessors
of the array.
The caller should lock exclusively against other modifiers and accessors
of the array.
6. Garbage collect an associative array::
@ -257,24 +257,24 @@ of the array.
bool (*iterator)(void *object, void *iterator_data),
void *iterator_data);
This iterates over the objects in an associative array and passes each one to
``iterator()``. If ``iterator()`` returns ``true``, the object is kept. If it
returns ``false``, the object will be freed. If the ``iterator()`` function
returns ``true``, it must perform any appropriate refcount incrementing on the
object before returning.
This iterates over the objects in an associative array and passes each one
to ``iterator()``. If ``iterator()`` returns ``true``, the object is kept.
If it returns ``false``, the object will be freed. If the ``iterator()``
function returns ``true``, it must perform any appropriate refcount
incrementing on the object before returning.
The internal tree will be packed down if possible as part of the iteration
to reduce the number of nodes in it.
The internal tree will be packed down if possible as part of the iteration
to reduce the number of nodes in it.
The ``iterator_data`` is passed directly to ``iterator()`` and is otherwise
ignored by the function.
The ``iterator_data`` is passed directly to ``iterator()`` and is otherwise
ignored by the function.
The function will return ``0`` if successful and ``-ENOMEM`` if there wasn't
enough memory.
The function will return ``0`` if successful and ``-ENOMEM`` if there wasn't
enough memory.
It is possible for other threads to iterate over or search the array under
the RCU read lock while this function is in progress. The caller should
lock exclusively against other modifiers of the array.
It is possible for other threads to iterate over or search the array under
the RCU read lock while this function is in progress. The caller should
lock exclusively against other modifiers of the array.
Access Functions
@ -289,19 +289,19 @@ There are two functions for accessing an associative array:
void *iterator_data),
void *iterator_data);
This passes each object in the array to the iterator callback function.
``iterator_data`` is private data for that function.
This passes each object in the array to the iterator callback function.
``iterator_data`` is private data for that function.
This may be used on an array at the same time as the array is being
modified, provided the RCU read lock is held. Under such circumstances,
it is possible for the iteration function to see some objects twice. If
this is a problem, then modification should be locked against. The
iteration algorithm should not, however, miss any objects.
This may be used on an array at the same time as the array is being
modified, provided the RCU read lock is held. Under such circumstances,
it is possible for the iteration function to see some objects twice. If
this is a problem, then modification should be locked against. The
iteration algorithm should not, however, miss any objects.
The function will return ``0`` if no objects were in the array or else it will
return the result of the last iterator function called. Iteration stops
immediately if any call to the iteration function results in a non-zero
return.
The function will return ``0`` if no objects were in the array or else it
will return the result of the last iterator function called. Iteration
stops immediately if any call to the iteration function results in a
non-zero return.
2. Find an object in an associative array::
@ -310,14 +310,14 @@ return.
const struct assoc_array_ops *ops,
const void *index_key);
This walks through the array's internal tree directly to the object
specified by the index key..
This walks through the array's internal tree directly to the object
specified by the index key.
This may be used on an array at the same time as the array is being
modified, provided the RCU read lock is held.
This may be used on an array at the same time as the array is being
modified, provided the RCU read lock is held.
The function will return the object if found (and set ``*_type`` to the object
type) or will return ``NULL`` if the object was not found.
The function will return the object if found (and set ``*_type`` to the
object type) or will return ``NULL`` if the object was not found.
Index Key Form
@ -399,10 +399,11 @@ fixed levels. For example::
In the above example, there are 7 nodes (A-G), each with 16 slots (0-f).
Assuming no other meta data nodes in the tree, the key space is divided
thusly::
thusly:
=========== ====
KEY PREFIX NODE
========== ====
=========== ====
137* D
138* E
13[0-69-f]* C
@ -410,10 +411,12 @@ thusly::
e6* G
e[0-57-f]* F
[02-df]* A
=========== ====
So, for instance, keys with the following example index keys will be found in
the appropriate nodes::
the appropriate nodes:
=============== ======= ====
INDEX KEY PREFIX NODE
=============== ======= ====
13694892892489 13 C
@ -422,12 +425,13 @@ the appropriate nodes::
138bbb89003093 138 E
1394879524789 13 C
1458952489 1 B
9431809de993ba - A
b4542910809cd - A
9431809de993ba \- A
b4542910809cd \- A
e5284310def98 e F
e68428974237 e6 G
e7fffcbd443 e F
f3842239082 - A
f3842239082 \- A
=============== ======= ====
To save memory, if a node can hold all the leaves in its portion of keyspace,
then the node will have all those leaves in it and will not have any metadata
@ -441,8 +445,9 @@ metadata pointer. If the metadata pointer is there, any leaf whose key matches
the metadata key prefix must be in the subtree that the metadata pointer points
to.
In the above example list of index keys, node A will contain::
In the above example list of index keys, node A will contain:
==== =============== ==================
SLOT CONTENT INDEX KEY (PREFIX)
==== =============== ==================
1 PTR TO NODE B 1*
@ -450,11 +455,16 @@ In the above example list of index keys, node A will contain::
any LEAF b4542910809cd
e PTR TO NODE F e*
any LEAF f3842239082
==== =============== ==================
and node B::
and node B:
3 PTR TO NODE C 13*
any LEAF 1458952489
==== =============== ==================
SLOT CONTENT INDEX KEY (PREFIX)
==== =============== ==================
3 PTR TO NODE C 13*
any LEAF 1458952489
==== =============== ==================
Shortcuts


@ -138,6 +138,7 @@ Documents that don't fit elsewhere or which have yet to be categorized.
:maxdepth: 1
librs
liveupdate
netlink
.. only:: subproject and html


@ -70,5 +70,5 @@ in the FDT. That state is called the KHO finalization phase.
Public API
==========
.. kernel-doc:: kernel/kexec_handover.c
.. kernel-doc:: kernel/liveupdate/kexec_handover.c
:export:


@ -0,0 +1,61 @@
.. SPDX-License-Identifier: GPL-2.0
========================
Live Update Orchestrator
========================
:Author: Pasha Tatashin <pasha.tatashin@soleen.com>
.. kernel-doc:: kernel/liveupdate/luo_core.c
:doc: Live Update Orchestrator (LUO)
LUO Sessions
============
.. kernel-doc:: kernel/liveupdate/luo_session.c
:doc: LUO Sessions
LUO Preserving File Descriptors
===============================
.. kernel-doc:: kernel/liveupdate/luo_file.c
:doc: LUO File Descriptors
Live Update Orchestrator ABI
============================
.. kernel-doc:: include/linux/kho/abi/luo.h
:doc: Live Update Orchestrator ABI
The following types of file descriptors can be preserved:
.. toctree::
:maxdepth: 1
../mm/memfd_preservation
Public API
==========
.. kernel-doc:: include/linux/liveupdate.h
.. kernel-doc:: include/linux/kho/abi/luo.h
:functions:
.. kernel-doc:: kernel/liveupdate/luo_core.c
:export:
.. kernel-doc:: kernel/liveupdate/luo_file.c
:export:
Internal API
============
.. kernel-doc:: kernel/liveupdate/luo_core.c
:internal:
.. kernel-doc:: kernel/liveupdate/luo_session.c
:internal:
.. kernel-doc:: kernel/liveupdate/luo_file.c
:internal:
See Also
========
- :doc:`Live Update uAPI </userspace-api/liveupdate>`
- :doc:`/core-api/kho/concepts`


@ -547,11 +547,13 @@ Time and date
%pt[RT]s YYYY-mm-dd HH:MM:SS
%pt[RT]d YYYY-mm-dd
%pt[RT]t HH:MM:SS
%pt[RT][dt][r][s]
%ptSp <seconds>.<nanoseconds>
%pt[RST][dt][r][s]
For printing date and time as represented by::
R struct rtc_time structure
R content of struct rtc_time
S content of struct timespec64
T time64_t type
in human readable format.
@ -563,6 +565,11 @@ The %pt[RT]s (space) will override ISO 8601 separator by using ' ' (space)
instead of 'T' (Capital T) between date and time. It won't have any effect
when date or time is omitted.
The %ptSp is equivalent to %lld.%09ld for the content of the struct timespec64.
When the other specifiers are given, it becomes the respective equivalent of
%ptT[dt][r][s].%09ld. In other words, the seconds are printed in
human readable format, followed by a dot and the nanoseconds.
Passed by reference.
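The plain ``%ptSp`` layout described above can be approximated in userspace as follows. This is only a sketch: it uses the C library's ``struct timespec`` as a stand-in for ``struct timespec64``, and ``demo_format_ptSp()`` is a hypothetical helper, not a kernel function.

```c
#include <stdio.h>
#include <time.h>

/*
 * Format a timestamp as <seconds>.<nanoseconds>, with the nanoseconds
 * zero-padded to nine digits, mirroring the %lld.%09ld equivalence.
 */
static int demo_format_ptSp(char *buf, size_t size, const struct timespec *ts)
{
	return snprintf(buf, size, "%lld.%09ld",
			(long long)ts->tv_sec, (long)ts->tv_nsec);
}
```

For example, a timestamp of 5 seconds and 42 nanoseconds is rendered as ``5.000000042``.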
struct clk


@ -27,3 +27,4 @@ for cryptographic use cases, as well as programming examples.
descore-readme
device_drivers/index
krb5
sha3


@ -0,0 +1,130 @@
.. SPDX-License-Identifier: GPL-2.0-or-later
==========================
SHA-3 Algorithm Collection
==========================
.. contents::
Overview
========
The SHA-3 family of algorithms, as specified in NIST FIPS-202 [1]_, contains six
algorithms based on the Keccak sponge function. The differences between them
are: the "rate" (how much of the state buffer gets updated with new data between
invocations of the Keccak function, analogous to the "block size"), what
domain separation suffix gets appended to the input data, and how much output
data is extracted at the end. The Keccak sponge function is designed such that
arbitrary amounts of output can be obtained for certain algorithms.
Four digest algorithms are provided:
- SHA3-224
- SHA3-256
- SHA3-384
- SHA3-512
Additionally, two Extendable-Output Functions (XOFs) are provided:
- SHAKE128
- SHAKE256
The SHA-3 library API supports all six of these algorithms. The four digest
algorithms are also supported by the crypto_shash and crypto_ahash APIs.
This document describes the SHA-3 library API.
Digests
=======
The following functions compute SHA-3 digests::
void sha3_224(const u8 *in, size_t in_len, u8 out[SHA3_224_DIGEST_SIZE]);
void sha3_256(const u8 *in, size_t in_len, u8 out[SHA3_256_DIGEST_SIZE]);
void sha3_384(const u8 *in, size_t in_len, u8 out[SHA3_384_DIGEST_SIZE]);
void sha3_512(const u8 *in, size_t in_len, u8 out[SHA3_512_DIGEST_SIZE]);
For users that need to pass in data incrementally, an incremental API is also
provided. The incremental API uses the following struct::
struct sha3_ctx { ... };
Initialization is done with one of::
void sha3_224_init(struct sha3_ctx *ctx);
void sha3_256_init(struct sha3_ctx *ctx);
void sha3_384_init(struct sha3_ctx *ctx);
void sha3_512_init(struct sha3_ctx *ctx);
Input data is then added with any number of calls to::
void sha3_update(struct sha3_ctx *ctx, const u8 *in, size_t in_len);
Finally, the digest is generated using::
void sha3_final(struct sha3_ctx *ctx, u8 *out);
which also zeroizes the context. The length of the digest is determined by the
initialization function that was called.
Extendable-Output Functions
===========================
The following functions compute the SHA-3 extendable-output functions (XOFs)::
void shake128(const u8 *in, size_t in_len, u8 *out, size_t out_len);
void shake256(const u8 *in, size_t in_len, u8 *out, size_t out_len);
For users that need to provide the input data incrementally and/or receive the
output data incrementally, an incremental API is also provided. The incremental
API uses the following struct::
struct shake_ctx { ... };
Initialization is done with one of::
void shake128_init(struct shake_ctx *ctx);
void shake256_init(struct shake_ctx *ctx);
Input data is then added with any number of calls to::
void shake_update(struct shake_ctx *ctx, const u8 *in, size_t in_len);
Finally, the output data is extracted with any number of calls to::
void shake_squeeze(struct shake_ctx *ctx, u8 *out, size_t out_len);
where ``out_len`` specifies how much data should be extracted. Note that
performing multiple squeezes, with the output laid out consecutively in a
buffer, produces exactly the same output as a single squeeze of the combined
amount into the same buffer.
More input data cannot be added after squeezing has started.
Once all the desired output has been extracted, zeroize the context::
void shake_zeroize_ctx(struct shake_ctx *ctx);
Testing
=======
To test the SHA-3 code, use sha3_kunit (CONFIG_CRYPTO_LIB_SHA3_KUNIT_TEST).
Since the SHA-3 algorithms are FIPS-approved, when the kernel is booted in FIPS
mode the SHA-3 library also performs a simple self-test. This is purely to meet
a FIPS requirement. Normal testing done by kernel developers and integrators
should use the much more comprehensive KUnit test suite instead.
References
==========
.. [1] https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.202.pdf
API Function Reference
======================
.. kernel-doc:: include/crypto/sha3.h


@ -302,10 +302,9 @@ follows:
Depending on the RNG type, the RNG must be seeded. The seed is provided
using the setsockopt interface to set the key. For example, the
ansi_cprng requires a seed. The DRBGs do not require a seed, but may be
seeded. The seed is also known as a *Personalization String* in NIST SP 800-90A
standard.
using the setsockopt interface to set the key. The SP800-90A DRBGs do
not require a seed, but may be seeded. The seed is also known as a
*Personalization String* in the NIST SP 800-90A standard.
Using the read()/recvmsg() system calls, random numbers can be obtained.
The kernel generates at most 128 bytes in one call. If user space


@ -461,16 +461,9 @@ Comments
line comments is::
/*
* This is the preferred style
* for multi line comments.
*/
The networking comment style is a bit different, with the first line
not empty like the former::
/* This is the preferred comment style
* for files in net/ and drivers/net/
*/
* This is the preferred style
* for multi line comments.
*/
See: https://www.kernel.org/doc/html/latest/process/coding-style.html#commenting
@ -1009,6 +1002,29 @@ Functions and Variables
return bar;
**UNINITIALIZED_PTR_WITH_FREE**
Pointers with the __free attribute should be declared at the place of use
and initialized (see include/linux/cleanup.h). In this case, the rule that
declarations go at the top of the function can be relaxed. Not doing so
may lead to undefined behavior, because whatever value the pointer holds
(garbage, if it was never initialized) is freed automatically when the
pointer goes out of scope.
Also see: https://lore.kernel.org/lkml/58fd478f408a34b578ee8d949c5c4b4da4d4f41d.camel@HansenPartnership.com/
Example::
type var __free(free_func);
... // var not used, but, in future someone might add a return here
var = malloc(var_size);
...
should be initialized as::
...
type var __free(free_func) = malloc(var_size);
...
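A compilable userspace sketch of the recommended pattern is shown below. It assumes a GCC- or Clang-compatible compiler and uses the ``cleanup`` variable attribute directly; ``demo_auto_free``, ``demo_free()``, ``demo_free_calls`` and ``demo_copy_len()`` are hypothetical names introduced for illustration and are not the kernel's helpers.

```c
#include <stdlib.h>
#include <string.h>

static int demo_free_calls;	/* counts cleanup runs, for illustration only */

/* Cleanup handler: the compiler passes a pointer to the annotated variable. */
static void demo_free(char **p)
{
	free(*p);
	demo_free_calls++;
}

/* Hypothetical stand-in for __free(); wraps the compiler's cleanup attribute. */
#define demo_auto_free __attribute__((cleanup(demo_free)))

static size_t demo_copy_len(const char *src, int bail_early)
{
	if (bail_early)
		return 0;	/* safe: the pointer is not in scope yet */

	/* Declared at the place of use and initialized in the same statement. */
	char *copy demo_auto_free = malloc(strlen(src) + 1);

	if (!copy)
		return 0;	/* cleanup runs here too; free(NULL) is a no-op */

	strcpy(copy, src);
	return strlen(copy);	/* length is computed before the cleanup runs */
}
```

Because the variable only exists after its initializer has run, an early return added above the declaration can never trigger the cleanup handler on an indeterminate pointer.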
Permissions
-----------
@ -1245,6 +1261,16 @@ Others
The patch file does not appear to be in unified-diff format. Please
regenerate the patch file before sending it to the maintainer.
**PLACEHOLDER_USE**
Detects unhandled placeholder text left in cover letters or commit headers/logs.
Common placeholders include lines like::
*** SUBJECT HERE ***
*** BLURB HERE ***
These typically come from autogenerated templates. Replace them with a proper
subject and description before sending.
**PRINTF_0XDECIMAL**
Prefixing 0x with decimal output is defective and should be corrected.


@ -35,6 +35,12 @@ or be built into the kernel.
a good way of quickly testing everything applicable to the current
config.
KUnit can be enabled or disabled at boot time with the kunit.enable
kernel parameter. kunit.enable defaults to 1 because KUNIT_DEFAULT_ENABLED
is enabled by default. To ensure that tests are executed as expected,
verify that kunit.enable=1 at boot time.
Once we have built our kernel (and/or modules), it is simple to run
the tests. If the tests are built-in, they will run automatically on the
kernel boot. The results will be written to the kernel log (``dmesg``)


@ -30,7 +30,7 @@ rules:
document-start:
present: true
empty-lines:
max: 3
max: 1
max-end: 1
empty-values:
forbid-in-block-mappings: true

Some files were not shown because too many files have changed in this diff.