Pull drm updates from Dave Airlie:
"There are two external interactions of note, the msm tree pull in some
opp tree, hopefully the opp tree arrives from the same git tree
however it normally does.
There is also a new cgroup controller for device memory, that is used
by drm, so is merging through my tree. This will hopefully help open
up gpu cgroup usage a bit more and move us forward.
There is a new accelerator driver for the AMD XDNA Ryzen AI NPUs.
Then the usual xe/amdgpu/i915/msm leaders and lots of changes and
refactors across the board:
core:
- device memory cgroup controller added
- Remove driver date from drm_driver
- Add drm_printer based hex dumper
- drm memory stats docs update
- scheduler documentation improvements
new driver:
- amdxdna - Ryzen AI NPU support
connector:
- add a mutex to protect ELD
- make connector setup two-step
panels:
- Introduce backlight quirks infrastructure
- New panels: KDB KD116N2130B12, Tianma TM070JDHG34-00,
- Multi-Inno Technology MI1010Z1T-1CP11
bridge:
- ti-sn65dsi83: Add ti,lvds-vod-swing optional properties
- Provide default implementation of atomic_check for HDMI bridges
- it605: HDCP improvements, MCCS Support
xe:
- make OA buffer size configurable
- GuC capture fixes
- add ufence and g2h flushes
- restore system memory GGTT mappings
- ioctl fixes
- SRIOV PF scheduling priority
- allow fault injection
- lots of improvements/refactors
- Enable GuC's WA_DUAL_QUEUE for newer platforms
- IRQ related fixes and improvements
i915:
- More accurate engine busyness metrics with GuC submission
- Ensure partial BO segment offset never exceeds allowed max
- Flush GuC CT receive tasklet during reset preparation
- Some DG2 refactor to fix DG2 bugs when operating with certain CPUs
- Fix DG1 power gate sequence
- Enabling uncompressed 128b/132b UHBR SST
- Handle hdmi connector init failures, and no HDMI/DP cases
- More robust engine resets on Haswell and older
i915/xe display:
- HDCP fixes for Xe3Lpd
- New GSC FW ARL-H/ARL-U
- support 3 VDSC engines 12 slices
- MBUS joining sanitisation
- reconcile i915/xe display power mgmt
- Xe3Lpd fixes
- UHBR rates for Thunderbolt
amdgpu:
- DRM panic support
- track BO memory stats at runtime
- Fix max surface handling in DC
- Cleaner shader support for gfx10.3 dGPUs
- fix drm buddy trim handling
- SDMA engine reset updates
- Fix doorbell ttm cleanup
- RAS updates
- ISP updates
- SDMA queue reset support
- Rework DPM powergating interfaces
- Documentation updates and cleanups
- DCN 3.5 updates
- Use a pm notifier to more gracefully handle VRAM eviction on
suspend or hibernate
- Add debugfs interfaces for forcing scheduling to specific engine
instances
- GG 9.5 updates
- IH 4.4 updates
- Make missing optional firmware less noisy
- PSP 13.x updates
- SMU 13.x updates
- VCN 5.x updates
- JPEG 5.x updates
- GC 12.x updates
- DC FAMS updates
amdkfd:
- GG 9.5 updates
- Logging improvements
- Shader debugger fixes
- Trap handler cleanup
- Cleanup includes
- Eviction fence wq fix
msm:
- MDSS:
- properly described UBWC registers
- added SM6150 (aka QCS615) support
- DPU:
- added SM6150 (aka QCS615) support
- enabled wide planes if virtual planes are enabled (by using two
SSPPs for a single plane)
- added CWB hardware blocks support
- DSI:
- added SM6150 (aka QCS615) support
- GPU:
- Print GMU core fw version
- GMU bandwidth voting for a740 and a750
- Expose uche trap base via uapi
- UAPI error reporting
rcar-du:
- Add r8a779h0 Support
ivpu:
- Fix qemu crash when using passthrough
nouveau:
- expose GSP-RM logging buffers via debugfs
panfrost:
- Add MT8188 Mali-G57 MC3 support
rockchip:
- Gamma LUT support
hisilicon:
- new HIBMC support
virtio-gpu:
- convert to helpers
- add prime support for scanout buffers
v3d:
- Add DRM_IOCTL_V3D_PERFMON_SET_GLOBAL
vc4:
- Add support for BCM2712
vkms:
- line-per-line compositing algorithm to improve performance
zynqmp:
- Add DP audio support
mediatek:
- dp: Add sdp path reset
- dp: Support flexible length of DP calibration data
etnaviv:
- add fdinfo memory support
- add explicit reset handling"
* tag 'drm-next-2025-01-17' of https://gitlab.freedesktop.org/drm/kernel: (1070 commits)
drm/bridge: fix documentation for the hdmi_audio_prepare() callback
doc/cgroup: Fix title underline length
drm/doc: Include new drm-compute documentation
cgroup/dmem: Fix parameters documentation
cgroup/dmem: Select PAGE_COUNTER
kernel/cgroup: Remove the unused variable climit
drm/display: hdmi: Do not read EDID on disconnected connectors
drm/tests: hdmi: Add connector disablement test
drm/connector: hdmi: Do atomic check when necessary
drm/amd/display: 3.2.316
drm/amd/display: avoid reset DTBCLK at clock init
drm/amd/display: improve dpia pre-train
drm/amd/display: Apply DML21 Patches
drm/amd/display: Use HW lock mgr for PSR1
drm/amd/display: Revised for Replay Pseudo vblank control
drm/amd/display: Add a new flag for replay low hz
drm/amd/display: Remove unused read_ono_state function from Hwss module
drm/amd/display: Do not elevate mem_type change to full update
drm/amd/display: Do not wait for PSR disable on vbl enable
drm/amd/display: Remove unnecessary eDP power down
...
Pull io_uring updates from Jens Axboe:
"Not a lot in terms of features this time around, mostly just cleanups
and code consolidation:
- Support for PI meta data read/write via io_uring, with NVMe and
SCSI covered
- Cleanup the per-op structure caching, making it consistent across
various command types
- Consolidate the various user mapped features into a concept called
regions, making the various users of that consistent
- Various cleanups and fixes"
* tag 'for-6.14/io_uring-20250119' of git://git.kernel.dk/linux: (56 commits)
io_uring/fdinfo: fix io_uring_show_fdinfo() misuse of ->d_iname
io_uring: reuse io_should_terminate_tw() for cmds
io_uring: Factor out a function to parse restrictions
io_uring/rsrc: require cloned buffers to share accounting contexts
io_uring: simplify the SQPOLL thread check when cancelling requests
io_uring: expose read/write attribute capability
io_uring/rw: don't gate retry on completion context
io_uring/rw: handle -EAGAIN retry at IO completion time
io_uring/rw: use io_rw_recycle() from cleanup path
io_uring/rsrc: simplify the bvec iter count calculation
io_uring: ensure io_queue_deferred() is out-of-line
io_uring/rw: always clear ->bytes_done on io_async_rw setup
io_uring/rw: use NULL for rw->free_iovec assigment
io_uring/rw: don't mask in f_iocb_flags
io_uring/msg_ring: Drop custom destructor
io_uring: Move old async data allocation helper to header
io_uring/rw: Allocate async data through helper
io_uring/net: Allocate msghdr async data through helper
io_uring/uring_cmd: Allocate async data through generic helper
io_uring/poll: Allocate apoll with generic alloc_cache helper
...
Pull block updates from Jens Axboe:
- NVMe pull requests via Keith:
- Target support for PCI-Endpoint transport (Damien)
- TCP IO queue spreading fixes (Sagi, Chaitanya)
- Target handling for "limited retry" flags (Guixen)
- Poll type fix (Yongsoo)
- Xarray storage error handling (Keisuke)
- Host memory buffer free size fix on error (Francis)
- MD pull requests via Song:
- Reintroduce md-linear (Yu Kuai)
- md-bitmap refactor and fix (Yu Kuai)
- Replace kmap_atomic with kmap_local_page (David Reaver)
- Quite a few queue freeze and debugfs deadlock fixes
Ming introduced lockdep support for this in the 6.13 kernel, and it
has (unsurprisingly) uncovered quite a few issues
- Use const attributes for IO schedulers
- Remove bio ioprio wrappers
- Fixes for stacked device atomic write support
- Refactor queue affinity helpers, in preparation for better supporting
isolated CPUs
- Cleanups of loop O_DIRECT handling
- Cleanup of BLK_MQ_F_* flags
- Add rotational support for null_blk
- Various fixes and cleanups
* tag 'for-6.14/block-20250118' of git://git.kernel.dk/linux: (106 commits)
block: Don't trim an atomic write
block: Add common atomic writes enable flag
md/md-linear: Fix a NULL vs IS_ERR() bug in linear_add()
block: limit disk max sectors to (LLONG_MAX >> 9)
block: Change blk_stack_atomic_writes_limits() unit_min check
block: Ensure start sector is aligned for stacking atomic writes
blk-mq: Move more error handling into blk_mq_submit_bio()
block: Reorder the request allocation code in blk_mq_submit_bio()
nvme: fix bogus kzalloc() return check in nvme_init_effects_log()
md/md-bitmap: move bitmap_{start, end}write to md upper layer
md/raid5: implement pers->bitmap_sector()
md: add a new callback pers->bitmap_sector()
md/md-bitmap: remove the last parameter for bimtap_ops->endwrite()
md/md-bitmap: factor behind write counters out from bitmap_{start/end}write()
md: Replace deprecated kmap_atomic() with kmap_local_page()
md: reintroduce md-linear
partitions: ldm: remove the initial kernel-doc notation
blk-cgroup: rwstat: fix kernel-doc warnings in header file
blk-cgroup: fix kernel-doc warnings in header file
nbd: fix partial sending
...
Pull vfs direct-io updates from Christian Brauner:
"File systems that write out of place usually require different
alignment for direct I/O writes than what they can do for reads.
Add a separate dio read align field to statx, as many out of place
write file systems can easily do reads aligned to the device sector
size, but require bigger alignment for writes.
This is usually papered over by falling back to buffered I/O for
smaller writes and doing read-modify-write cycles, but performance for
this sucks, so applications benefit from knowing the actual write
alignment"
* tag 'vfs-6.14-rc1.statx.dio' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
xfs: report larger dio alignment for COW inodes
xfs: report the correct read/write dio alignment for reflinked inodes
xfs: cleanup xfs_vn_getattr
fs: add STATX_DIO_READ_ALIGN
fs: reformat the statx definition
Pull misc vfs updates from Christian Brauner:
"Features:
- Support caching symlink lengths in inodes
The size is stored in a new union utilizing the same space as
i_devices, thus avoiding growing the struct or taking up any more
space
When utilized it dodges strlen() in vfs_readlink(), giving about
1.5% speed up when issuing readlink on /initrd.img on ext4
- Add RWF_DONTCACHE iocb and FOP_DONTCACHE file_operations flag
If a file system supports uncached buffered IO, it may set
FOP_DONTCACHE and enable support for RWF_DONTCACHE.
If RWF_DONTCACHE is attempted without the file system supporting
it, it'll get errored with -EOPNOTSUPP
- Enable VBOXGUEST and VBOXSF_FS on ARM64
Now that VirtualBox is able to run as a host on arm64 (e.g. the
Apple M3 processors) we can enable VBOXSF_FS (and in turn
VBOXGUEST) for this architecture.
Tested with various runs of bonnie++ and dbench on an Apple MacBook
Pro with the latest Virtualbox 7.1.4 r165100 installed
Cleanups:
- Delay sysctl_nr_open check in expand_files()
- Use kernel-doc includes in fiemap docbook
- Use page->private instead of page->index in watch_queue
- Use a consume fence in mnt_idmap() as it's heavily used in
link_path_walk()
- Replace magic number 7 with ARRAY_SIZE() in fc_log
- Sort out a stale comment about races between fd alloc and dup2()
- Fix return type of do_mount() from long to int
- Various cosmetic cleanups for the lockref code
Fixes:
- Annotate spinning as unlikely() in __read_seqcount_begin
The annotation already used to be there, but got lost in commit
52ac39e5db ("seqlock: seqcount_t: Implement all read APIs as
statement expressions")
- Fix proc_handler for sysctl_nr_open
- Flush delayed work in delayed fput()
- Fix grammar and spelling in propagate_umount()
- Fix ESP not readable during coredump
In /proc/PID/stat, there is the kstkesp field which is the stack
pointer of a thread. While the thread is active, this field reads
zero. But during a coredump, it should have a valid value
However, at the moment, kstkesp is zero even during coredump
- Don't wake up the writer if the pipe is still full
- Fix unbalanced user_access_end() in select code"
* tag 'vfs-6.14-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (28 commits)
gfs2: use lockref_init for qd_lockref
erofs: use lockref_init for pcl->lockref
dcache: use lockref_init for d_lockref
lockref: add a lockref_init helper
lockref: drop superfluous externs
lockref: use bool for false/true returns
lockref: improve the lockref_get_not_zero description
lockref: remove lockref_put_not_zero
fs: Fix return type of do_mount() from long to int
select: Fix unbalanced user_access_end()
vbox: Enable VBOXGUEST and VBOXSF_FS on ARM64
pipe_read: don't wake up the writer if the pipe is still full
selftests: coredump: Add stackdump test
fs/proc: do_task_stat: Fix ESP not readable during coredump
fs: add RWF_DONTCACHE iocb and FOP_DONTCACHE file_operations flag
fs: sort out a stale comment about races between fd alloc and dup2
fs: Fix grammar and spelling in propagate_umount()
fs: fc_log replace magic number 7 with ARRAY_SIZE()
fs: use a consume fence in mnt_idmap()
file: flush delayed work in delayed fput()
...
THe md-linear is removed by commit 849d18e27b ("md: Remove deprecated
CONFIG_MD_LINEAR") because it has been marked as deprecated for a long
time.
However, md-linear is used widely for underlying disks with different size,
sadly we didn't know this until now, and it's true useful to create
partitions and assemble multiple raid and then append one to the other.
People have to use dm-linear in this case now, however, they will prefer
to minimize the number of involved modules.
Fixes: 849d18e27b ("md: Remove deprecated CONFIG_MD_LINEAR")
Cc: stable@vger.kernel.org
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Acked-by: Coly Li <colyli@kernel.org>
Acked-by: Mike Snitzer <snitzer@kernel.org>
Link: https://lore.kernel.org/r/20250102112841.1227111-1-yukuai1@huaweicloud.com
Signed-off-by: Song Liu <song@kernel.org>
Updates for v6.14
MDSS:
- properly described UBWC registers
- added SM6150 (aka QCS615) support
MDP4:
- several small fixes
DPU:
- added SM6150 (aka QCS615) support
- enabled wide planes if virtual planes are enabled (by using two SSPPs for a single plane)
- fixed modes filtering for platforms w/o 3DMux
- fixed DSPP DSPP_2 / _3 links on several platforms
- corrected DSPP definitions on SDM670
- added CWB hardware blocks support
- added VBIF to DPU snapshots
- dropped struct dpu_rm_requirements
DP:
- reworked DP audio support
DSI:
- added SM6150 (aka QCS615) support
GPU:
- Print GMU core fw version
- GMU bandwidth voting for a740 and a750
- Expose uche trap base via uapi
- UAPI error reporting
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rob Clark <robdclark@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGsutUu4ff6OpXNXxqf1xaV0rV6oV23VXNRiF0_OEfe72Q@mail.gmail.com
Add a separate dio read align field, as many out of place write
file systems can easily do reads aligned to the device sector size,
but require bigger alignment for writes.
This is usually papered over by falling back to buffered I/O for smaller
writes and doing read-modify-write cycles, but performance for this
sucks, so applications benefit from knowing the actual write alignment.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20250109083109.1441561-3-hch@lst.de
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Pull networking fixes from Jakub Kicinski:
"Including fixes from wireles and netfilter.
Nothing major here. Over the last two weeks we gathered only around
two-thirds of our normal weekly fix count, but delaying sending these
until -rc7 seemed like a really bad idea.
AFAIK we have no bugs under investigation. One or two reverts for
stuff for which we haven't gotten a proper fix will likely come in the
next PR.
Current release - fix to a fix:
- netfilter: nft_set_hash: unaligned atomic read on struct
nft_set_ext
- eth: gve: trigger RX NAPI instead of TX NAPI in gve_xsk_wakeup
Previous releases - regressions:
- net: reenable NETIF_F_IPV6_CSUM offload for BIG TCP packets
- mptcp:
- fix sleeping rcvmsg sleeping forever after bad recvbuffer adjust
- fix TCP options overflow
- prevent excessive coalescing on receive, fix throughput
- net: fix memory leak in tcp_conn_request() if map insertion fails
- wifi: cw1200: fix potential NULL dereference after conversion to
GPIO descriptors
- phy: micrel: dynamically control external clock of KSZ PHY, fix
suspend behavior
Previous releases - always broken:
- af_packet: fix VLAN handling with MSG_PEEK
- net: restrict SO_REUSEPORT to inet sockets
- netdev-genl: avoid empty messages in NAPI get
- dsa: microchip: fix set_ageing_time function on KSZ9477 and LAN937X
- eth:
- gve: XDP fixes around transmit, queue wakeup etc.
- ti: icssg-prueth: fix firmware load sequence to prevent time
jump which breaks timesync related operations
Misc:
- netlink: specs: mptcp: add missing attr and improve documentation"
* tag 'net-6.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (50 commits)
net: ti: icssg-prueth: Fix clearing of IEP_CMP_CFG registers during iep_init
net: ti: icssg-prueth: Fix firmware load sequence.
mptcp: prevent excessive coalescing on receive
mptcp: don't always assume copied data in mptcp_cleanup_rbuf()
mptcp: fix recvbuffer adjust on sleeping rcvmsg
ila: serialize calls to nf_register_net_hooks()
af_packet: fix vlan_get_protocol_dgram() vs MSG_PEEK
af_packet: fix vlan_get_tci() vs MSG_PEEK
net: wwan: iosm: Properly check for valid exec stage in ipc_mmio_init()
net: restrict SO_REUSEPORT to inet sockets
net: reenable NETIF_F_IPV6_CSUM offload for BIG TCP packets
net: sfc: Correct key_len for efx_tc_ct_zone_ht_params
net: wwan: t7xx: Fix FSM command timeout issue
sky2: Add device ID 11ab:4373 for Marvell 88E8075
mptcp: fix TCP options overflow.
net: mv643xx_eth: fix an OF node reference leak
gve: trigger RX NAPI instead of TX NAPI in gve_xsk_wakeup
eth: bcmsysport: fix call balance of priv->clk handling routines
net: llc: reset skb->transport_header
netlink: specs: mptcp: fix missing doc
...
Pull hardening fix from Kees Cook:
- stddef: make __struct_group() UAPI C++-friendly (Alexander Lobakin)
* tag 'hardening-v6.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
stddef: make __struct_group() UAPI C++-friendly
Add the ability to pass additional attributes along with read/write.
Application can prepare attibute specific information and pass its
address using the SQE field:
__u64 attr_ptr;
Along with setting a mask indicating attributes being passed:
__u64 attr_type_mask;
Overall 64 attributes are allowed and currently one attribute
'IORING_RW_ATTR_FLAG_PI' is supported.
With PI attribute, userspace can pass following information:
- flags: integrity check flags IO_INTEGRITY_CHK_{GUARD/APPTAG/REFTAG}
- len: length of PI/metadata buffer
- addr: address of metadata buffer
- seed: seed value for reftag remapping
- app_tag: application defined 16b value
Process this information to prepare uio_meta_descriptor and pass it down
using kiocb->private.
PI attribute is supported only for direct IO.
Signed-off-by: Anuj Gupta <anuj20.g@samsung.com>
Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
Link: https://lore.kernel.org/r/20241128112240.8867-7-anuj20.g@samsung.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
For the most part of the C++ history, it couldn't have type
declarations inside anonymous unions for different reasons. At the
same time, __struct_group() relies on the latters, so when the @TAG
argument is not empty, C++ code doesn't want to build (even under
`extern "C"`):
../linux/include/uapi/linux/pkt_cls.h:25:24: error:
'struct tc_u32_sel::<unnamed union>::tc_u32_sel_hdr,' invalid;
an anonymous union may only have public non-static data members
[-fpermissive]
The safest way to fix this without trying to switch standards (which
is impossible in UAPI anyway) etc., is to disable tag declaration
for that language. This won't break anything since for now it's not
buildable at all.
Use a separate definition for __struct_group() when __cplusplus is
defined to mitigate the error, including the version from tools/.
Fixes: 50d7bd38c3 ("stddef: Introduce struct_group() helper macro")
Reported-by: Christopher Ferris <cferris@google.com>
Closes: https://lore.kernel.org/linux-hardening/Z1HZpe3WE5As8UAz@google.com
Suggested-by: Kees Cook <kees@kernel.org> # __struct_group_tag()
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20241219135734.2130002-1-aleksander.lobakin@intel.com
Signed-off-by: Kees Cook <kees@kernel.org>
Groups can be killed during a reset even though they did nothing wrong.
That usually happens when the FW is put in a bad state by other groups,
resulting in group suspension failures when the reset happens.
If we end up in that situation, flag the group innocent and report
innocence through a new DRM_PANTHOR_GROUP_STATE flag.
Bump the minor driver version to reflect the uAPI change.
Changes in v4:
- Add an entry to the driver version changelog
- Add R-bs
Changes in v3:
- Actually report innocence to userspace
Changes in v2:
- New patch
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241211080500.2349505-1-boris.brezillon@collabora.com
Add SET_STATE ioctl to configure device power mode for aie2 device.
Three modes are supported initially.
POWER_MODE_DEFAULT: Enable clock gating and set DPM (Dynamic Power
Management) level to value which has been set by resource solver or
maximum DPM level the device supports.
POWER_MODE_HIGH: Enable clock gating and set DPM level to maximum DPM
level the device supports.
POWER_MODE_TURBO: Disable clock gating and set DPM level to maximum DPM
level the device supports.
Disabling clock gating means all clocks always run on full speed. And
the different clock frequency are used based on DPM level been set.
Initially, the driver set the power mode to default mode.
Co-developed-by: Narendra Gutta <VenkataNarendraKumar.Gutta@amd.com>
Signed-off-by: Narendra Gutta <VenkataNarendraKumar.Gutta@amd.com>
Co-developed-by: George Yang <George.Yang@amd.com>
Signed-off-by: George Yang <George.Yang@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241213232933.1545388-4-lizhi.hou@amd.com
The macros giving the direction of the crossing thresholds use the BIT
macro which is not exported to the userspace. Consequently when an
userspace program includes the header, it fails to compile.
Replace the macros by their litteral to allow the compilation of
userspace program using this header.
Fixes: 445936f9e2 ("thermal: core: Add user thresholds support")
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://patch.msgid.link/20241212201311.4143196-1-daniel.lezcano@linaro.org
[ rjw: Add Fixes: ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add a new property called DRM_XE_OA_PROPERTY_OA_BUFFER_SIZE to
allow OA buffer size to be configurable from userspace.
With this OA buffer size can be configured to any power of 2
size between 128KB and 128MB and it would default to 16MB in case
the size is not supplied.
v2:
- Rebase
v3:
- Add oa buffer size to capabilities [Ashutosh]
- Address several nitpicks [Ashutosh]
- Fix commit message/subject [Ashutosh]
BSpec: 61100, 61228
Signed-off-by: Sai Teja Pottumuttu <sai.teja.pottumuttu@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241205041913.883767-2-sai.teja.pottumuttu@intel.com
The v6.13-rc2 release included a bunch of breaking changes,
specifically the MODULE_IMPORT_NS commit.
Backmerge in order to fix them before the next pull-request.
Include the fix from Stephen Roswell.
Caused by commit
25c3fd1183 ("drm/virtio: Add a helper to map and note the dma addrs and lengths")
Interacting with commit
cdd30ebb1b ("module: Convert symbol namespace to string literal")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Link: https://patchwork.freedesktop.org/patch/msgid/20241209121717.2abe8026@canb.auug.org.au
Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
Pull more io_uring updates from Jens Axboe:
- Remove a leftover struct from when the cqwait registered waiting was
transitioned to regions.
- Fix for an issue introduced in this merge window, where nop->fd might
be used uninitialized. Ensure it's always set.
- Add capping of the task_work run in local task_work mode, to prevent
bursty and long chains from adding too much latency.
- Work around xa_store() leaving ->head non-NULL if it encounters an
allocation error during storing. Just a debug trigger, and can go
away once xa_store() behaves in a more expected way for this
condition. Not a major thing as it basically requires fault injection
to trigger it.
- Fix a few mapping corner cases
- Fix KCSAN complaint on reading the table size post unlock. Again not
a "real" issue, but it's easy to silence by just keeping the reading
inside the lock that protects it.
* tag 'io_uring-6.13-20242901' of git://git.kernel.dk/linux:
io_uring/tctx: work around xa_store() allocation error issue
io_uring: fix corner case forgetting to vunmap
io_uring: fix task_work cap overshooting
io_uring: check for overflows in io_pin_pages
io_uring/nop: ensure nop->fd is always initialized
io_uring: limit local tw done
io_uring: add io_local_work_pending()
io_uring/region: return negative -E2BIG in io_create_region()
io_uring: protect register tracing
io_uring: remove io_uring_cqwait_reg_arg
Pull char/misc/IIO/whatever driver subsystem updates from Greg KH:
"Here is the 'big and hairy' char/misc/iio and other small driver
subsystem updates for 6.13-rc1.
Loads of things in here, and even a fun merge conflict!
- rust misc driver bindings and other rust changes to make misc
drivers actually possible.
I think this is the tipping point, expect to see way more rust
drivers going forward now that these bindings are present. Next
merge window hopefully we will have pci and platform drivers
working, which will fully enable almost all driver subsystems to
start accepting (or at least getting) rust drivers.
This is the end result of a lot of work from a lot of people,
congrats to all of them for getting this far, you've proved many of
us wrong in the best way possible, working code :)
- IIO driver updates, too many to list individually, that subsystem
keeps growing and growing...
- Interconnect driver updates
- nvmem driver updates
- pwm driver updates
- platform_driver::remove() fixups, loads of them
- counter driver updates
- misc driver updates (keba?)
- binder driver updates and fixes
- loads of other small char/misc/etc driver updates and additions,
full details in the shortlog.
All of these have been in linux-next for a while, with no other
reported issues other than that merge conflict"
* tag 'char-misc-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (401 commits)
mei: vsc: Fix typo "maintstepping" -> "mainstepping"
firmware: Switch back to struct platform_driver::remove()
misc: isl29020: Fix the wrong format specifier
scripts/tags.sh: Don't tag usages of DEFINE_MUTEX
fpga: Switch back to struct platform_driver::remove()
mei: vsc: Improve error logging in vsc_identify_silicon()
mei: vsc: Do not re-enable interrupt from vsc_tp_reset()
dt-bindings: spmi: qcom,x1e80100-spmi-pmic-arb: Add SAR2130P compatible
dt-bindings: spmi: spmi-mtk-pmif: Add compatible for MT8188
spmi: pmic-arb: fix return path in for_each_available_child_of_node()
iio: Move __private marking before struct element priv in struct iio_dev
docs: iio: ad7380: add adaq4370-4 and adaq4380-4
iio: adc: ad7380: add support for adaq4370-4 and adaq4380-4
iio: adc: ad7380: use local dev variable to shorten long lines
iio: adc: ad7380: fix oversampling formula
dt-bindings: iio: adc: ad7380: add adaq4370-4 and adaq4380-4 compatible parts
bus: mhi: host: pci_generic: Use pcim_iomap_region() to request and map MHI BAR
bus: mhi: host: Switch trace_mhi_gen_tre fields to native endian
misc: atmel-ssc: Use of_property_present() for non-boolean properties
misc: keba: Add hardware dependency
...
Pull USB / Thunderbolt updates from Greg KH:
"Here is the big set of USB and Thunderbolt changes for 6.13-rc1.
Overall, a pretty slow development cycle, the majority of the work
going into the debugfs interface for the thunderbolt (i.e. USB4) code,
to help with debugging the myrad ways that hardware vendors get their
interfaces messed up. Other than that, here's the highlights:
- thunderbolt changes and additions to debugfs interfaces
- lots of device tree updates for new and old hardware
- UVC configfs gadget updates and new apis for features
- xhci driver updates and fixes
- dwc3 driver updates and fixes
- typec driver updates and fixes
- lots of other small updates and fixes, full details in the shortlog
All of these have been in linux-next for a while with no reported
problems"
* tag 'usb-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (148 commits)
usb: typec: tcpm: Add support for sink-bc12-completion-time-ms DT property
dt-bindings: usb: maxim,max33359: add usage of sink bc12 time property
dt-bindings: connector: Add time property for Sink BC12 detection completion
usb: dwc3: gadget: Remove dwc3_request->needs_extra_trb
usb: dwc3: gadget: Cleanup SG handling
usb: dwc3: gadget: Fix looping of queued SG entries
usb: dwc3: gadget: Fix checking for number of TRBs left
usb: dwc3: ep0: Don't clear ep0 DWC3_EP_TRANSFER_STARTED
Revert "usb: gadget: composite: fix OS descriptors w_value logic"
usb: ehci-spear: fix call balance of sehci clk handling routines
USB: make to_usb_device_driver() use container_of_const()
USB: make to_usb_driver() use container_of_const()
USB: properly lock dynamic id list when showing an id
USB: make single lock for all usb dynamic id lists
drivers/usb/storage: refactor min with min_t
drivers/usb/serial: refactor min with min_t
drivers/usb/musb: refactor min/max with min_t/max_t
drivers/usb/mon: refactor min with min_t
drivers/usb/misc: refactor min with min_t
drivers/usb/host: refactor min/max with min_t/max_t
...
Pull VFIO updates from Alex Williamson:
- Constify an unmodified structure used in linking vfio and kvm
(Christophe JAILLET)
- Add ID for an additional hardware SKU supported by the nvgrace-gpu
vfio-pci variant driver (Ankit Agrawal)
- Fix incorrect signed cast in QAT vfio-pci variant driver, negating
test in check_add_overflow(), though still caught by later tests
(Giovanni Cabiddu)
- Additional debugfs attributes exposed in hisi_acc vfio-pci variant
driver for migration debugging (Longfang Liu)
- Migration support is added to the virtio vfio-pci variant driver,
becoming the primary feature of the driver while retaining emulation
of virtio legacy support as a secondary option (Yishai Hadas)
- Fixes to a few unwind flows in the mlx5 vfio-pci driver discovered
through reviews of the virtio variant driver (Yishai Hadas)
- Fix an unlikely issue where a PCI device exposed to userspace with an
unknown capability at the base of the extended capability chain can
overflow an array index (Avihai Horon)
* tag 'vfio-v6.13-rc1' of https://github.com/awilliam/linux-vfio:
vfio/pci: Properly hide first-in-list PCIe extended capability
vfio/mlx5: Fix unwind flows in mlx5vf_pci_save/resume_device_data()
vfio/mlx5: Fix an unwind issue in mlx5vf_add_migration_pages()
vfio/virtio: Enable live migration once VIRTIO_PCI was configured
vfio/virtio: Add PRE_COPY support for live migration
vfio/virtio: Add support for the basic live migration functionality
virtio-pci: Introduce APIs to execute device parts admin commands
virtio: Manage device and driver capabilities via the admin commands
virtio: Extend the admin command to include the result size
virtio_pci: Introduce device parts access commands
Documentation: add debugfs description for hisi migration
hisi_acc_vfio_pci: register debugfs for hisilicon migration driver
hisi_acc_vfio_pci: create subfunction for data reading
hisi_acc_vfio_pci: extract public functions for container_of
vfio/qat: fix overflow check in qat_vf_resume_write()
vfio/nvgrace-gpu: Add a new GH200 SKU to the devid table
kvm/vfio: Constify struct kvm_device_ops
Pull RISC-v updates from Palmer Dabbelt:
- Support for pointer masking in userspace
- Support for probing vector misaligned access performance
- Support for qspinlock on systems with Zacas and Zabha
* tag 'riscv-for-linus-6.13-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (38 commits)
RISC-V: Remove unnecessary include from compat.h
riscv: Fix default misaligned access trap
riscv: Add qspinlock support
dt-bindings: riscv: Add Ziccrse ISA extension description
riscv: Add ISA extension parsing for Ziccrse
asm-generic: ticket-lock: Add separate ticket-lock.h
asm-generic: ticket-lock: Reuse arch_spinlock_t of qspinlock
riscv: Implement xchg8/16() using Zabha
riscv: Implement arch_cmpxchg128() using Zacas
riscv: Improve zacas fully-ordered cmpxchg()
riscv: Implement cmpxchg8/16() using Zabha
dt-bindings: riscv: Add Zabha ISA extension description
riscv: Implement cmpxchg32/64() using Zacas
riscv: Do not fail to build on byte/halfword operations with Zawrs
riscv: Move cpufeature.h macros into their own header
KVM: riscv: selftests: Add Smnpm and Ssnpm to get-reg-list test
RISC-V: KVM: Allow Smnpm and Ssnpm extensions for guests
riscv: hwprobe: Export the Supm ISA extension
riscv: selftests: Add a pointer masking test
riscv: Allow ptrace control of the tagged address ABI
...
Pull PCI updates from Bjorn Helgaas:
"Enumeration:
- Make pci_stop_dev() and pci_destroy_dev() safe so concurrent
callers can't stop a device multiple times, even as we migrate from
the global pci_rescan_remove_lock to finer-grained locking (Keith
Busch)
- Improve pci_walk_bus() implementation by making it recursive and
moving locking up to avoid need for a 'locked' parameter (Keith
Busch)
- Unexport pci_walk_bus_locked(), which is only used internally by
the PCI core (Keith Busch)
- Detect some Thunderbolt chips that are built-in and hence
'trustworthy' by a heuristic since the 'ExternalFacingPort' and
'usb4-host-interface' ACPI properties are not quite enough (Esther
Shimanovich)
Resource management:
- Use PCI bus addresses (not CPU addresses) in 'ranges' properties
when building dynamic DT nodes so systems where PCI and CPU
addresses differ work correctly (Andrea della Porta)
- Tidy resource sizing and assignment with helpers to reduce
redundancy (Ilpo Järvinen)
- Improve pdev_sort_resources() 'bogus alignment' warning to be more
specific (Ilpo Järvinen)
Driver binding:
- Convert driver .remove_new() callbacks to .remove() again to finish
the conversion from returning 'int' to being 'void' (Sergio
Paracuellos)
- Export pcim_request_all_regions(), a managed interface to request
all BARs (Philipp Stanner)
- Replace pcim_iomap_regions_request_all() with
pcim_request_all_regions(), and pcim_iomap_table()[n] with
pcim_iomap(n), in the following drivers: ahci, crypto qat, crypto
octeontx2, intel_th, iwlwifi, ntb idt, serial rp2, ALSA korg1212
(Philipp Stanner)
- Remove the now unused pcim_iomap_regions_request_all() (Philipp
Stanner)
- Export pcim_iounmap_region(), a managed interface to unmap and
release a PCI BAR (Philipp Stanner)
- Replace pcim_iomap_regions(mask) with pcim_iomap_region(n), and
pcim_iounmap_regions(mask) with pcim_iounmap_region(n), in the
following drivers: fpga dfl-pci, block mtip32xx, gpio-merrifield,
cavium (Philipp Stanner)
Error handling:
- Add sysfs 'reset_subordinate' to reset the entire hierarchy below a
bridge; previously Secondary Bus Reset could only be used when
there was a single device below a bridge (Keith Busch)
- Warn if we reset a running device where the driver didn't register
pci_error_handlers notification callbacks (Keith Busch)
ASPM:
- Disable ASPM L1 before touching L1 PM Substates to follow the spec
closer and avoid a CPU load timeout on some platforms (Ajay
Agarwal)
- Set devices below Intel VMD to D0 before enabling ASPM L1 Substates
as required per spec for all L1 Substates changes (Jian-Hong Pan)
Power management:
- Enable starfive controller runtime PM before probing host bridge
(Mayank Rana)
- Enable runtime power management for host bridges (Krishna chaitanya
chundru)
Power control:
- Use of_platform_device_create() instead of of_platform_populate()
to create pwrctl platform devices so we can control it based on the
child nodes (Manivannan Sadhasivam)
- Create pwrctrl platform devices only if there's a relevant power
supply property (Manivannan Sadhasivam)
- Add device link from the pwrctl supplier to the PCI dev to ensure
pwrctl drivers are probed before the PCI dev driver; this avoids a
race where pwrctl could change device power state while the PCI
driver was active (Manivannan Sadhasivam)
- Find pwrctl device for removal with of_find_device_by_node()
instead of searching all children of the parent (Manivannan
Sadhasivam)
- Rename 'pwrctl' to 'pwrctrl' to match new bandwidth controller
('bwctrl') and hotplug files (Bjorn Helgaas)
Bandwidth control:
- Add read/modify/write locking for Link Control 2, which is used to
manage Link speed (Ilpo Järvinen)
- Extract Link Bandwidth Management Status check into
pcie_lbms_seen(), where it can be shared between the bandwidth
controller and quirks that use it to help retrain failed links
(Ilpo Järvinen)
- Re-add Link Bandwidth notification support with updates to address
the reasons it was previously reverted (Alexandru Gagniuc, Ilpo
Järvinen)
- Add pcie_set_target_speed() and related functionality so drivers
can manage PCIe Link speed based on thermal or other constraints
(Ilpo Järvinen)
- Add a thermal cooling driver to throttle PCIe Links via the
existing thermal management framework (Ilpo Järvinen)
- Add a userspace selftest for the PCIe bandwidth controller (Ilpo
Järvinen)
PCI device hotplug:
- Add hotplug controller driver for Marvell OCTEON multi-function
device where function 0 has a management console interface to
enable/disable and provision various personalities for the other
functions (Shijith Thotton)
- Retain a reference to the pci_bus for the lifetime of a pci_slot to
avoid a use-after-free when the thunderbolt driver resets USB4 host
routers on boot, causing hotplug remove/add of downstream docks or
other devices (Lukas Wunner)
- Remove unused cpcihp struct cpci_hp_controller_ops.hardware_test
(Guilherme Giacomo Simoes)
- Remove unused cpqphp struct ctrl_dbg.ctrl (Christophe JAILLET)
- Use pci_bus_read_dev_vendor_id() instead of hand-coded presence
detection in cpqphp (Ilpo Järvinen)
- Simplify cpqphp enumeration, which is already simple-minded and
doesn't handle devices below hot-added bridges (Ilpo Järvinen)
Virtualization:
- Add ACS quirk for Wangxun FF5xxx NICs, which don't advertise an ACS
capability but do isolate functions as though PCI_ACS_RR and
PCI_ACS_CR were set, so the functions can be in independent IOMMU
groups (Mengyuan Lou)
TLP Processing Hints (TPH):
- Add and document TLP Processing Hints (TPH) support so drivers can
enable and disable TPH and the kernel can save/restore TPH
configuration (Wei Huang)
- Add TPH Steering Tag support so drivers can retrieve Steering Tag
values associated with specific CPUs via an ACPI _DSM to improve
performance by directing DMA writes closer to their consumers (Wei
Huang)
Data Object Exchange (DOE):
- Wait up to 1 second for DOE Busy bit to clear before writing a
request to the mailbox to avoid failures if the mailbox is still
busy from a previous transfer (Gregory Price)
Endpoint framework:
- Skip attempts to allocate from endpoint controller memory window if
the requested size is larger than the window (Damien Le Moal)
- Add and document pci_epc_mem_map() and pci_epc_mem_unmap() to
handle controller-specific size and alignment constraints, and add
test cases to the endpoint test driver (Damien Le Moal)
- Implement dwc pci_epc_ops.align_addr() so pci_epc_mem_map() can
observe DWC-specific alignment requirements (Damien Le Moal)
- Synchronously cancel command handler work in endpoint test before
cleaning up DMA and BARs (Damien Le Moal)
- Respect endpoint page size in dw_pcie_ep_align_addr() (Niklas
Cassel)
- Use dw_pcie_ep_align_addr() in dw_pcie_ep_raise_msi_irq() and
dw_pcie_ep_raise_msix_irq() instead of open coding the equivalent
(Niklas Cassel)
- Avoid NULL dereference if Modem Host Interface Endpoint lacks
'mmio' DT property (Zhongqiu Han)
- Release PCI domain ID of Endpoint controller parent (not controller
itself) and before unregistering the controller, to avoid
use-after-free (Zijun Hu)
- Clear secondary (not primary) EPC in pci_epc_remove_epf() when
removing the secondary controller associated with an NTB (Zijun Hu)
Cadence PCIe controller driver:
- Lower severity of 'phy-names' message (Bartosz Wawrzyniak)
Freescale i.MX6 PCIe controller driver:
- Fix suspend/resume support on i.MX6QDL, which has a hardware
erratum that prevents use of L2 (Stefan Eichenberger)
Intel VMD host bridge driver:
- Add 0xb60b and 0xb06f Device IDs for client SKUs (Nirmal Patel)
MediaTek PCIe Gen3 controller driver:
- Update mediatek-gen3 DT binding to require the exact number of
clocks for each SoC (Fei Shao)
- Add support for DT 'max-link-speed' and 'num-lanes' properties to
restrict the link speed and width (AngeloGioacchino Del Regno)
Microchip PolarFlare PCIe controller driver:
- Add DT and driver support for using either of the two PolarFire
Root Ports (Conor Dooley)
NVIDIA Tegra194 PCIe controller driver:
- Move endpoint controller cleanups that depend on refclk from the
host to the notifier that tells us the host has deasserted PERST#,
when refclk should be valid (Manivannan Sadhasivam)
Qualcomm PCIe controller driver:
- Add qcom SAR2130P DT binding with an additional clock (Dmitry
Baryshkov)
- Enable MSI interrupts if 'global' IRQ is supported, since a
previous commit unintentionally masked them (Manivannan Sadhasivam)
- Move endpoint controller cleanups that depend on refclk from the
host to the notifier that tells us the host has deasserted PERST#,
when refclk should be valid (Manivannan Sadhasivam)
- Add DT binding and driver support for IPQ9574, with Synopsys IP
v5.80a and Qcom IP 1.27.0 (devi priya)
- Move the OPP "operating-points-v2" table from the
qcom,pcie-sm8450.yaml DT binding to qcom,pcie-common.yaml, where it
can be used by other Qcom platforms (Qiang Yu)
- Add 'global' SPI interrupt for events like link-up, link-down to
qcom,pcie-x1e80100 DT binding so we can start enumeration when the
link comes up (Qiang Yu)
- Disable ASPM L0s for qcom,pcie-x1e80100 since the PHY is not tuned
to support this (Qiang Yu)
- Add ops_1_21_0 for SC8280X family SoC, which doesn't use the
'iommu-map' DT property and doesn't need BDF-to-SID translation
(Qiang Yu)
Rockchip PCIe controller driver:
- Define ROCKCHIP_PCIE_AT_SIZE_ALIGN to replace magic 256 endpoint
.align value (Damien Le Moal)
- When unmapping an endpoint window, compute the region index instead
of searching for it, and verify that the address was mapped (Damien
Le Moal)
- When mapping an endpoint window, verify that the address hasn't
been mapped already (Damien Le Moal)
- Implement pci_epc_ops.align_addr() for rockchip-ep (Damien Le Moal)
- Fix MSI IRQ data mapping to observe the alignment constraint, which
fixes intermittent page faults in memcpy_toio() and memcpy_fromio()
(Damien Le Moal)
- Rename rockchip_pcie_parse_ep_dt() to
rockchip_pcie_ep_get_resources() for consistency with similar DT
interfaces (Damien Le Moal)
- Skip the unnecessary link train in rockchip_pcie_ep_probe() and do
it only in the endpoint start operation (Damien Le Moal)
- Implement pci_epc_ops.stop_link() to disable link training and
controller configuration (Damien Le Moal)
- Attempt link training at 5 GT/s when both partners support it
(Damien Le Moal)
- Add a handler for PERST# signal so we can detect host-initiated
resets and start link training after PERST# is deasserted (Damien
Le Moal)
Synopsys DesignWare PCIe controller driver:
- Clear outbound address on unmap so dw_pcie_find_index() won't match
an ATU index that was already unmapped (Damien Le Moal)
- Use of_property_present() instead of of_property_read_bool() when
testing for presence of non-boolean DT properties (Rob Herring)
- Advertise 1MB size if endpoint supports Resizable BARs, which was
inadvertently lost in v6.11 (Niklas Cassel)
TI J721E PCIe driver:
- Add PCIe support for J722S SoC (Siddharth Vadapalli)
- Delay PCIE_T_PVPERL_MS (100 ms), not just PCIE_T_PERST_CLK_US (100
us), before deasserting PERST# to ensure power and refclk are
stable (Siddharth Vadapalli)
TI Keystone PCIe controller driver:
- Set the 'ti,keystone-pcie' mode so v3.65a devices work in Root
Complex mode (Kishon Vijay Abraham I)
- Try to avoid unrecoverable SError for attempts to issue config
transactions when the link is down; this is racy but the best we
can do (Kishon Vijay Abraham I)
Miscellaneous:
- Reorganize kerneldoc parameter names to match order in function
signature (Julia Lawall)
- Fix sysfs reset_method_store() memory leak (Todd Kjos)
- Simplify pci_create_slot() (Ilpo Järvinen)
- Fix incorrect printf format specifiers in pcitest (Luo Yifan)"
* tag 'pci-v6.13-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (127 commits)
PCI: rockchip-ep: Handle PERST# signal in EP mode
PCI: rockchip-ep: Improve link training
PCI: rockship-ep: Implement the pci_epc_ops::stop_link() operation
PCI: rockchip-ep: Refactor endpoint link training enable
PCI: rockchip-ep: Refactor rockchip_pcie_ep_probe() MSI-X hiding
PCI: rockchip-ep: Refactor rockchip_pcie_ep_probe() memory allocations
PCI: rockchip-ep: Rename rockchip_pcie_parse_ep_dt()
PCI: rockchip-ep: Fix MSI IRQ data mapping
PCI: rockchip-ep: Implement the pci_epc_ops::align_addr() operation
PCI: rockchip-ep: Improve rockchip_pcie_ep_map_addr()
PCI: rockchip-ep: Improve rockchip_pcie_ep_unmap_addr()
PCI: rockchip-ep: Use a macro to define EP controller .align feature
PCI: rockchip-ep: Fix address translation unit programming
PCI/pwrctrl: Rename pwrctrl functions and structures
PCI/pwrctrl: Rename pwrctl files to pwrctrl
PCI/pwrctl: Remove pwrctl device without iterating over all children of pwrctl parent
PCI/pwrctl: Ensure that pwrctl drivers are probed before PCI client drivers
PCI/pwrctl: Create pwrctl device only if at least one power supply is present
PCI/pwrctl: Use of_platform_device_create() to create pwrctl devices
tools: PCI: Fix incorrect printf format specifiers
...
Pull vfs exportfs updates from Christian Brauner:
"This contains work to bring NFS connectable file handles to userspace
servers.
The name_to_handle_at() system call is extended to encode connectable
file handles. Such file handles can be resolved to an open file with a
connected path. So far userspace NFS servers couldn't make use of this
functionality even though the kernel does already support it. This is
achieved by introducing a new flag for name_to_handle_at().
Similarly, the open_by_handle_at() system call is tought to understand
connectable file handles explicitly created via name_to_handle_at()"
* tag 'vfs-6.13.exportfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
fs: open_by_handle_at() support for decoding "explicit connectable" file handles
fs: name_to_handle_at() support for "explicit connectable" file handles
fs: prepare for "explicit connectable" file handles
Pull f2fs updates from Jaegeuk Kim:
"This series introduces a device aliasing feature where user can carve
out partitions but reclaim the space back by deleting aliased file in
root dir.
In addition to that, there're numerous minor bug fixes in zoned device
support, checkpoint=disable, extent cache management, fiemap, and
lazytime mount option. The full list of noticeable changes can be
found below.
Enhancements:
- introduce device aliasing file
- add stats in debugfs to show multiple devices
- add a sysfs node to limit max read extent count per-inode
- modify f2fs_is_checkpoint_ready logic to allow more data to be
written with the CP disable
- decrease spare area for pinned files for zoned devices
Fixes:
- Revert "f2fs: remove unreachable lazytime mount option parsing"
- adjust unusable cap before checkpoint=disable mode
- fix to drop all discards after creating snapshot on lvm device
- fix to shrink read extent node in batches
- fix changing cursegs if recovery fails on zoned device
- fix to adjust appropriate length for fiemap
- fix fiemap failure issue when page size is 16KB
- fix to avoid forcing direct write to use buffered IO on inline_data
inode
- fix to map blocks correctly for direct write
- fix to account dirty data in __get_secs_required()
- fix null-ptr-deref in f2fs_submit_page_bio()
- fix inconsistent update of i_blocks in release_compress_blocks and
reserve_compress_blocks"
* tag 'f2fs-for-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (40 commits)
f2fs: fix to drop all discards after creating snapshot on lvm device
f2fs: add a sysfs node to limit max read extent count per-inode
f2fs: fix to shrink read extent node in batches
f2fs: print message if fscorrupted was found in f2fs_new_node_page()
f2fs: clear SBI_POR_DOING before initing inmem curseg
f2fs: fix changing cursegs if recovery fails on zoned device
f2fs: adjust unusable cap before checkpoint=disable mode
f2fs: fix to requery extent which cross boundary of inquiry
f2fs: fix to adjust appropriate length for fiemap
f2fs: clean up w/ F2FS_{BLK_TO_BYTES,BTYES_TO_BLK}
f2fs: fix to do cast in F2FS_{BLK_TO_BYTES, BTYES_TO_BLK} to avoid overflow
f2fs: replace deprecated strcpy with strscpy
Revert "f2fs: remove unreachable lazytime mount option parsing"
f2fs: fix to avoid forcing direct write to use buffered IO on inline_data inode
f2fs: fix to map blocks correctly for direct write
f2fs: fix race in concurrent f2fs_stop_gc_thread
f2fs: fix fiemap failure issue when page size is 16KB
f2fs: remove redundant atomic file check in defragment
f2fs: fix to convert log type to segment data type correctly
f2fs: clean up the unused variable additional_reserved_segments
...
- Add and document TLP Processing Hints (TPH) support so drivers can enable
and disable TPH and the kernel can save/restore TPH configuration (Wei
Huang)
- Add TPH Steering Tag support so drivers can retrieve Steering Tag values
associated with specific CPUs via an ACPI _DSM to direct DMA writes
closer to their consumers (Wei Huang)
* pci/tph:
PCI/TPH: Add TPH documentation
PCI/TPH: Add Steering Tag support
PCI: Add TLP Processing Hints (TPH) support
Pull kvm updates from Paolo Bonzini:
"The biggest change here is eliminating the awful idea that KVM had of
essentially guessing which pfns are refcounted pages.
The reason to do so was that KVM needs to map both non-refcounted
pages (for example BARs of VFIO devices) and VM_PFNMAP/VM_MIXMEDMAP
VMAs that contain refcounted pages.
However, the result was security issues in the past, and more recently
the inability to map VM_IO and VM_PFNMAP memory that _is_ backed by
struct page but is not refcounted. In particular this broke virtio-gpu
blob resources (which directly map host graphics buffers into the
guest as "vram" for the virtio-gpu device) with the amdgpu driver,
because amdgpu allocates non-compound higher order pages and the tail
pages could not be mapped into KVM.
This requires adjusting all uses of struct page in the
per-architecture code, to always work on the pfn whenever possible.
The large series that did this, from David Stevens and Sean
Christopherson, also cleaned up substantially the set of functions
that provided arch code with the pfn for a host virtual addresses.
The previous maze of twisty little passages, all different, is
replaced by five functions (__gfn_to_page, __kvm_faultin_pfn, the
non-__ versions of these two, and kvm_prefetch_pages) saving almost
200 lines of code.
ARM:
- Support for stage-1 permission indirection (FEAT_S1PIE) and
permission overlays (FEAT_S1POE), including nested virt + the
emulated page table walker
- Introduce PSCI SYSTEM_OFF2 support to KVM + client driver. This
call was introduced in PSCIv1.3 as a mechanism to request
hibernation, similar to the S4 state in ACPI
- Explicitly trap + hide FEAT_MPAM (QoS controls) from KVM guests. As
part of it, introduce trivial initialization of the host's MPAM
context so KVM can use the corresponding traps
- PMU support under nested virtualization, honoring the guest
hypervisor's trap configuration and event filtering when running a
nested guest
- Fixes to vgic ITS serialization where stale device/interrupt table
entries are not zeroed when the mapping is invalidated by the VM
- Avoid emulated MMIO completion if userspace has requested
synchronous external abort injection
- Various fixes and cleanups affecting pKVM, vCPU initialization, and
selftests
LoongArch:
- Add iocsr and mmio bus simulation in kernel.
- Add in-kernel interrupt controller emulation.
- Add support for virtualization extensions to the eiointc irqchip.
PPC:
- Drop lingering and utterly obsolete references to PPC970 KVM, which
was removed 10 years ago.
- Fix incorrect documentation references to non-existing ioctls
RISC-V:
- Accelerate KVM RISC-V when running as a guest
- Perf support to collect KVM guest statistics from host side
s390:
- New selftests: more ucontrol selftests and CPU model sanity checks
- Support for the gen17 CPU model
- List registers supported by KVM_GET/SET_ONE_REG in the
documentation
x86:
- Cleanup KVM's handling of Accessed and Dirty bits to dedup code,
improve documentation, harden against unexpected changes.
Even if the hardware A/D tracking is disabled, it is possible to
use the hardware-defined A/D bits to track if a PFN is Accessed
and/or Dirty, and that removes a lot of special cases.
- Elide TLB flushes when aging secondary PTEs, as has been done in
x86's primary MMU for over 10 years.
- Recover huge pages in-place in the TDP MMU when dirty page logging
is toggled off, instead of zapping them and waiting until the page
is re-accessed to create a huge mapping. This reduces vCPU jitter.
- Batch TLB flushes when dirty page logging is toggled off. This
reduces the time it takes to disable dirty logging by ~3x.
- Remove the shrinker that was (poorly) attempting to reclaim shadow
page tables in low-memory situations.
- Clean up and optimize KVM's handling of writes to
MSR_IA32_APICBASE.
- Advertise CPUIDs for new instructions in Clearwater Forest
- Quirk KVM's misguided behavior of initialized certain feature MSRs
to their maximum supported feature set, which can result in KVM
creating invalid vCPU state. E.g. initializing PERF_CAPABILITIES to
a non-zero value results in the vCPU having invalid state if
userspace hides PDCM from the guest, which in turn can lead to
save/restore failures.
- Fix KVM's handling of non-canonical checks for vCPUs that support
LA57 to better follow the "architecture", in quotes because the
actual behavior is poorly documented. E.g. most MSR writes and
descriptor table loads ignore CR4.LA57 and operate purely on
whether the CPU supports LA57.
- Bypass the register cache when querying CPL from kvm_sched_out(),
as filling the cache from IRQ context is generally unsafe; harden
the cache accessors to try to prevent similar issues from occuring
in the future. The issue that triggered this change was already
fixed in 6.12, but was still kinda latent.
- Advertise AMD_IBPB_RET to userspace, and fix a related bug where
KVM over-advertises SPEC_CTRL when trying to support cross-vendor
VMs.
- Minor cleanups
- Switch hugepage recovery thread to use vhost_task.
These kthreads can consume significant amounts of CPU time on
behalf of a VM or in response to how the VM behaves (for example
how it accesses its memory); therefore KVM tried to place the
thread in the VM's cgroups and charge the CPU time consumed by that
work to the VM's container.
However the kthreads did not process SIGSTOP/SIGCONT, and therefore
cgroups which had KVM instances inside could not complete freezing.
Fix this by replacing the kthread with a PF_USER_WORKER thread, via
the vhost_task abstraction. Another 100+ lines removed, with
generally better behavior too like having these threads properly
parented in the process tree.
- Revert a workaround for an old CPU erratum (Nehalem/Westmere) that
didn't really work; there was really nothing to work around anyway:
the broken patch was meant to fix nested virtualization, but the
PERF_GLOBAL_CTRL MSR is virtualized and therefore unaffected by the
erratum.
- Fix 6.12 regression where CONFIG_KVM will be built as a module even
if asked to be builtin, as long as neither KVM_INTEL nor KVM_AMD is
'y'.
x86 selftests:
- x86 selftests can now use AVX.
Documentation:
- Use rST internal links
- Reorganize the introduction to the API document
Generic:
- Protect vcpu->pid accesses outside of vcpu->mutex with a rwlock
instead of RCU, so that running a vCPU on a different task doesn't
encounter long due to having to wait for all CPUs become quiescent.
In general both reads and writes are rare, but userspace that
supports confidential computing is introducing the use of "helper"
vCPUs that may jump from one host processor to another. Those will
be very happy to trigger a synchronize_rcu(), and the effect on
performance is quite the disaster"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (298 commits)
KVM: x86: Break CONFIG_KVM_X86's direct dependency on KVM_INTEL || KVM_AMD
KVM: x86: add back X86_LOCAL_APIC dependency
Revert "KVM: VMX: Move LOAD_IA32_PERF_GLOBAL_CTRL errata handling out of setup_vmcs_config()"
KVM: x86: switch hugepage recovery thread to vhost_task
KVM: x86: expose MSR_PLATFORM_INFO as a feature MSR
x86: KVM: Advertise CPUIDs for new instructions in Clearwater Forest
Documentation: KVM: fix malformed table
irqchip/loongson-eiointc: Add virt extension support
LoongArch: KVM: Add irqfd support
LoongArch: KVM: Add PCHPIC user mode read and write functions
LoongArch: KVM: Add PCHPIC read and write functions
LoongArch: KVM: Add PCHPIC device support
LoongArch: KVM: Add EIOINTC user mode read and write functions
LoongArch: KVM: Add EIOINTC read and write functions
LoongArch: KVM: Add EIOINTC device support
LoongArch: KVM: Add IPI user mode read and write function
LoongArch: KVM: Add IPI read and write function
LoongArch: KVM: Add IPI device support
LoongArch: KVM: Add iocsr and mmio bus simulation in kernel
KVM: arm64: Pass on SVE mapping failures
...