Compare commits

...

878 Commits

Author SHA1 Message Date
Linus Torvalds
0ff41df1cb Linux 6.15 2025-05-25 16:09:23 -07:00
Linus Torvalds
478ad02d68 Disable FOP_DONTCACHE for now due to bugs
This is kind of last-minute, but Al Viro reported that the new
FOP_DONTCACHE flag causes memory corruption due to use-after-free
issues.

This was triggered by commit 974c5e6139 ("xfs: flag as supporting
FOP_DONTCACHE"), but that is not the underlying bug - it is just the
first user of the flag.

Vlastimil Babka suspects the underlying problem stems from the
folio_end_writeback() logic introduced in commit fb7d3bc414
("mm/filemap: drop streaming/uncached pages when writeback completes").

The most straightforward fix would be to just revert the commit that
exposed this, but Matthew Wilcox points out that other filesystems are
also starting to enable the FOP_DONTCACHE logic, so this instead
disables that bit globally for now.

The fix will hopefully end up being trivial and we can just re-enable
this logic after more testing, but until such a time we'll have to
disable the new FOP_DONTCACHE flag.

Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Link: https://lore.kernel.org/all/20250525083209.GS2023217@ZenIV/
Triggered-by: 974c5e6139 ("xfs: flag as supporting FOP_DONTCACHE")
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Darrick J. Wong <djwong@kernel.org>
Cc: Christian Brauner <brauner@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-05-25 15:43:36 -07:00
Linus Torvalds
0f8c0258bf Merge tag 'mm-hotfixes-stable-2025-05-25-00-58' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull hotfixes from Andrew Morton:
 "22 hotfixes.

  13 are cc:stable and the remainder address post-6.14 issues or aren't
  considered necessary for -stable kernels. 19 are for MM"

* tag 'mm-hotfixes-stable-2025-05-25-00-58' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (22 commits)
  mailmap: add Jarkko's employer email address
  mm: fix copy_vma() error handling for hugetlb mappings
  memcg: always call cond_resched() after fn()
  mm/hugetlb: fix kernel NULL pointer dereference when replacing free hugetlb folios
  mm: vmalloc: only zero-init on vrealloc shrink
  mm: vmalloc: actually use the in-place vrealloc region
  alloc_tag: allocate percpu counters for module tags dynamically
  module: release codetag section when module load fails
  mm/cma: make detection of highmem_start more robust
  MAINTAINERS: add mm memory policy section
  MAINTAINERS: add mm ksm section
  kasan: avoid sleepable page allocation from atomic context
  highmem: add folio_test_partial_kmap()
  MAINTAINERS: add hung-task detector section
  taskstats: fix struct taskstats breaks backward compatibility since version 15
  mm/truncate: fix out-of-bounds when doing a right-aligned split
  MAINTAINERS: add mm reclaim section
  MAINTAINERS: update page allocator section
  mm: fix VM_UFFD_MINOR == VM_SHADOW_STACK on USERFAULTFD=y && ARM64_GCS=y
  mm: mmap: map MAP_STACK to VM_NOHUGEPAGE only if THP is enabled
  ...
2025-05-25 07:48:35 -07:00
Jarkko Sakkinen
1ec971da1c mailmap: add Jarkko's employer email address
Add the current employer email address to mailmap.

Link: https://lkml.kernel.org/r/20250523121105.15850-1-jarkko@kernel.org
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Cc: Alexander Sverdlin <alexander.sverdlin@gmail.com>
Cc: Antonio Quartulli <antonio@openvpn.net>
Cc: Carlos Bilbao <carlos.bilbao@kernel.org>
Cc: Kees Cook <kees@kernel.org>
Cc: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-25 00:53:49 -07:00
Ricardo Cañuelo Navarro
ee40c9920a mm: fix copy_vma() error handling for hugetlb mappings
If, during a mremap() operation for a hugetlb-backed memory mapping,
copy_vma() fails after the source vma has been duplicated and opened (ie. 
vma_link() fails), the error is handled by closing the new vma.  This
updates the hugetlbfs reservation counter of the reservation map which at
this point is referenced by both the source vma and the new copy.  As a
result, once the new vma has been freed and copy_vma() returns, the
reservation counter for the source vma will be incorrect.

This patch addresses this corner case by clearing the hugetlb private page
reservation reference for the new vma and decrementing the reference
before closing the vma, so that vma_close() won't update the reservation
counter.  This is also what copy_vma_and_data() does with the source vma
if copy_vma() succeeds, so a helper function has been added to do the
fixup in both functions.

The issue was reported by a private syzbot instance and can be reproduced
using the C reproducer in [1].  It's also a possible duplicate of public
syzbot report [2].  The WARNING report is:

============================================================
page_counter underflow: -1024 nr_pages=1024
WARNING: CPU: 0 PID: 3287 at mm/page_counter.c:61 page_counter_cancel+0xf6/0x120
Modules linked in:
CPU: 0 UID: 0 PID: 3287 Comm: repro__WARNING_ Not tainted 6.15.0-rc7+ #54 NONE
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-2-gc13ff2cd-prebuilt.qemu.org 04/01/2014
RIP: 0010:page_counter_cancel+0xf6/0x120
Code: ff 5b 41 5e 41 5f 5d c3 cc cc cc cc e8 f3 4f 8f ff c6 05 64 01 27 06 01 48 c7 c7 60 15 f8 85 48 89 de 4c 89 fa e8 2a a7 51 ff <0f> 0b e9 66 ff ff ff 44 89 f9 80 e1 07 38 c1 7c 9d 4c 81
RSP: 0018:ffffc900025df6a0 EFLAGS: 00010246
RAX: 2edfc409ebb44e00 RBX: fffffffffffffc00 RCX: ffff8880155f0000
RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000
RBP: dffffc0000000000 R08: ffffffff81c4a23c R09: 1ffff1100330482a
R10: dffffc0000000000 R11: ffffed100330482b R12: 0000000000000000
R13: ffff888058a882c0 R14: ffff888058a882c0 R15: 0000000000000400
FS:  0000000000000000(0000) GS:ffff88808fc53000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000004b33e0 CR3: 00000000076d6000 CR4: 00000000000006f0
Call Trace:
 <TASK>
 page_counter_uncharge+0x33/0x80
 hugetlb_cgroup_uncharge_counter+0xcb/0x120
 hugetlb_vm_op_close+0x579/0x960
 ? __pfx_hugetlb_vm_op_close+0x10/0x10
 remove_vma+0x88/0x130
 exit_mmap+0x71e/0xe00
 ? __pfx_exit_mmap+0x10/0x10
 ? __mutex_unlock_slowpath+0x22e/0x7f0
 ? __pfx_exit_aio+0x10/0x10
 ? __up_read+0x256/0x690
 ? uprobe_clear_state+0x274/0x290
 ? mm_update_next_owner+0xa9/0x810
 __mmput+0xc9/0x370
 exit_mm+0x203/0x2f0
 ? __pfx_exit_mm+0x10/0x10
 ? taskstats_exit+0x32b/0xa60
 do_exit+0x921/0x2740
 ? do_raw_spin_lock+0x155/0x3b0
 ? __pfx_do_exit+0x10/0x10
 ? __pfx_do_raw_spin_lock+0x10/0x10
 ? _raw_spin_lock_irq+0xc5/0x100
 do_group_exit+0x20c/0x2c0
 get_signal+0x168c/0x1720
 ? __pfx_get_signal+0x10/0x10
 ? schedule+0x165/0x360
 arch_do_signal_or_restart+0x8e/0x7d0
 ? __pfx_arch_do_signal_or_restart+0x10/0x10
 ? __pfx___se_sys_futex+0x10/0x10
 syscall_exit_to_user_mode+0xb8/0x2c0
 do_syscall_64+0x75/0x120
 entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x422dcd
Code: Unable to access opcode bytes at 0x422da3.
RSP: 002b:00007ff266cdb208 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: 0000000000000001 RBX: 00007ff266cdbcdc RCX: 0000000000422dcd
RDX: 00000000000f4240 RSI: 0000000000000081 RDI: 00000000004c7bec
RBP: 00007ff266cdb220 R08: 203a6362696c6720 R09: 203a6362696c6720
R10: 0000200000c00000 R11: 0000000000000246 R12: ffffffffffffffd0
R13: 0000000000000002 R14: 00007ffe1cb5f520 R15: 00007ff266cbb000
 </TASK>
============================================================

Link: https://lkml.kernel.org/r/20250523-warning_in_page_counter_cancel-v2-1-b6df1a8cfefd@igalia.com
Link: https://people.igalia.com/rcn/kernel_logs/20250422__WARNING_in_page_counter_cancel__repro.c [1]
Link: https://lore.kernel.org/all/67000a50.050a0220.49194.048d.GAE@google.com/ [2]
Signed-off-by: Ricardo Cañuelo Navarro <rcn@igalia.com>
Suggested-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Florent Revest <revest@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-25 00:53:49 -07:00
Breno Leitao
06717a7b6c memcg: always call cond_resched() after fn()
I am seeing soft lockup on certain machine types when a cgroup OOMs.  This
is happening because killing the process in certain machine might be very
slow, which causes the soft lockup and RCU stalls.  This happens usually
when the cgroup has MANY processes and memory.oom.group is set.

Example I am seeing in real production:

       [462012.244552] Memory cgroup out of memory: Killed process 3370438 (crosvm) ....
       ....
       [462037.318059] Memory cgroup out of memory: Killed process 4171372 (adb) ....
       [462037.348314] watchdog: BUG: soft lockup - CPU#64 stuck for 26s! [stat_manager-ag:1618982]
       ....

Quick look at why this is so slow, it seems to be related to serial flush
for certain machine types.  For all the crashes I saw, the target CPU was
at console_flush_all().

In the case above, there are thousands of processes in the cgroup, and it
is soft locking up before it reaches the 1024 limit in the code (which
would call the cond_resched()).  So, cond_resched() in 1024 blocks is not
sufficient.

Remove the counter-based conditional rescheduling logic and call
cond_resched() unconditionally after each task iteration, after fn() is
called.  This avoids the lockup independently of how slow fn() is.

Link: https://lkml.kernel.org/r/20250523-memcg_fix-v1-1-ad3eafb60477@debian.org
Fixes: ade81479c7 ("memcg: fix soft lockup in the OOM process")
Signed-off-by: Breno Leitao <leitao@debian.org>
Suggested-by: Rik van Riel <riel@surriel.com>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Michael van der Westhuizen <rmikey@meta.com>
Cc: Usama Arif <usamaarif642@gmail.com>
Cc: Pavel Begunkov <asml.silence@gmail.com>
Cc: Chen Ridong <chenridong@huawei.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-25 00:53:49 -07:00
Ge Yang
113ed54ad2 mm/hugetlb: fix kernel NULL pointer dereference when replacing free hugetlb folios
A kernel crash was observed when replacing free hugetlb folios:

BUG: kernel NULL pointer dereference, address: 0000000000000028
PGD 0 P4D 0
Oops: Oops: 0000 [#1] SMP NOPTI
CPU: 28 UID: 0 PID: 29639 Comm: test_cma.sh Tainted 6.15.0-rc6-zp #41 PREEMPT(voluntary)
RIP: 0010:alloc_and_dissolve_hugetlb_folio+0x1d/0x1f0
RSP: 0018:ffffc9000b30fa90 EFLAGS: 00010286
RAX: 0000000000000000 RBX: 0000000000342cca RCX: ffffea0043000000
RDX: ffffc9000b30fb08 RSI: ffffea0043000000 RDI: 0000000000000000
RBP: ffffc9000b30fb20 R08: 0000000000001000 R09: 0000000000000000
R10: ffff88886f92eb00 R11: 0000000000000000 R12: ffffea0043000000
R13: 0000000000000000 R14: 00000000010c0200 R15: 0000000000000004
FS:  00007fcda5f14740(0000) GS:ffff8888ec1d8000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000028 CR3: 0000000391402000 CR4: 0000000000350ef0
Call Trace:
<TASK>
 replace_free_hugepage_folios+0xb6/0x100
 alloc_contig_range_noprof+0x18a/0x590
 ? srso_return_thunk+0x5/0x5f
 ? down_read+0x12/0xa0
 ? srso_return_thunk+0x5/0x5f
 cma_range_alloc.constprop.0+0x131/0x290
 __cma_alloc+0xcf/0x2c0
 cma_alloc_write+0x43/0xb0
 simple_attr_write_xsigned.constprop.0.isra.0+0xb2/0x110
 debugfs_attr_write+0x46/0x70
 full_proxy_write+0x62/0xa0
 vfs_write+0xf8/0x420
 ? srso_return_thunk+0x5/0x5f
 ? filp_flush+0x86/0xa0
 ? srso_return_thunk+0x5/0x5f
 ? filp_close+0x1f/0x30
 ? srso_return_thunk+0x5/0x5f
 ? do_dup2+0xaf/0x160
 ? srso_return_thunk+0x5/0x5f
 ksys_write+0x65/0xe0
 do_syscall_64+0x64/0x170
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

There is a potential race between __update_and_free_hugetlb_folio() and
replace_free_hugepage_folios():

CPU1                              CPU2
__update_and_free_hugetlb_folio   replace_free_hugepage_folios
                                    folio_test_hugetlb(folio)
                                    -- It's still hugetlb folio.

  __folio_clear_hugetlb(folio)
  hugetlb_free_folio(folio)
                                    h = folio_hstate(folio)
                                    -- Here, h is NULL pointer

When the above race condition occurs, folio_hstate(folio) returns NULL,
and subsequent access to this NULL pointer will cause the system to crash.
To resolve this issue, execute folio_hstate(folio) under the protection
of the hugetlb_lock lock, ensuring that folio_hstate(folio) does not
return NULL.

Link: https://lkml.kernel.org/r/1747884137-26685-1-git-send-email-yangge1116@126.com
Fixes: 04f13d241b ("mm: replace free hugepage folios after migration")
Signed-off-by: Ge Yang <yangge1116@126.com>
Reviewed-by: Muchun Song <muchun.song@linux.dev>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <21cnbao@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-25 00:53:48 -07:00
Kees Cook
70d1eb031a mm: vmalloc: only zero-init on vrealloc shrink
The common case is to grow reallocations, and since init_on_alloc will
have already zeroed the whole allocation, we only need to zero when
shrinking the allocation.

Link: https://lkml.kernel.org/r/20250515214217.619685-2-kees@kernel.org
Fixes: a0309faf1c ("mm: vmalloc: support more granular vrealloc() sizing")
Signed-off-by: Kees Cook <kees@kernel.org>
Tested-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: "Erhard F." <erhard_f@mailbox.org>
Cc: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Cc: "Uladzislau Rezki (Sony)" <urezki@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-25 00:53:48 -07:00
Kees Cook
f7a35a3c36 mm: vmalloc: actually use the in-place vrealloc region
Patch series "mm: vmalloc: Actually use the in-place vrealloc region".

This fixes a performance regression[1] with vrealloc()[1].


The refactoring to not build a new vmalloc region only actually worked
when shrinking.  Actually return the resized area when it grows.  Ugh.

Link: https://lkml.kernel.org/r/20250515214217.619685-1-kees@kernel.org
Fixes: a0309faf1c ("mm: vmalloc: support more granular vrealloc() sizing")
Signed-off-by: Kees Cook <kees@kernel.org>
Reported-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Closes: https://lore.kernel.org/all/20250515-bpf-verifier-slowdown-vwo2meju4cgp2su5ckj@6gi6ssxbnfqg [1]
Tested-by: Eduard Zingerman <eddyz87@gmail.com>
Tested-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Tested-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Reviewed-by: "Uladzislau Rezki (Sony)" <urezki@gmail.com>
Reviewed-by: Danilo Krummrich <dakr@kernel.org>
Cc: "Erhard F." <erhard_f@mailbox.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-25 00:53:48 -07:00
Suren Baghdasaryan
12ca42c237 alloc_tag: allocate percpu counters for module tags dynamically
When a module gets unloaded it checks whether any of its tags are still in
use and if so, we keep the memory containing module's allocation tags
alive until all tags are unused.  However percpu counters referenced by
the tags are freed by free_module().  This will lead to UAF if the memory
allocated by a module is accessed after module was unloaded.

To fix this we allocate percpu counters for module allocation tags
dynamically and we keep it alive for tags which are still in use after
module unloading.  This also removes the requirement of a larger
PERCPU_MODULE_RESERVE when memory allocation profiling is enabled because
percpu memory for counters does not need to be reserved anymore.

Link: https://lkml.kernel.org/r/20250517000739.5930-1-surenb@google.com
Fixes: 0db6f8d782 ("alloc_tag: load module tags into separate contiguous memory")
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reported-by: David Wang <00107082@163.com>
Closes: https://lore.kernel.org/all/20250516131246.6244-1-00107082@163.com/
Tested-by: David Wang <00107082@163.com>
Cc: Christoph Lameter (Ampere) <cl@gentwo.org>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-25 00:53:48 -07:00
David Wang
221fcbf775 module: release codetag section when module load fails
When module load fails after memory for codetag section is ready, codetag
section memory will not be properly released.  This causes memory leak,
and if next module load happens to get the same module address, codetag
may pick the uninitialized section when manipulating tags during module
unload, and leads to "unable to handle page fault" BUG.

Link: https://lkml.kernel.org/r/20250519163823.7540-1-00107082@163.com
Fixes: 0db6f8d782 ("alloc_tag: load module tags into separate contiguous memory")
Closes: https://lore.kernel.org/all/20250516131246.6244-1-00107082@163.com/
Signed-off-by: David Wang <00107082@163.com>
Acked-by: Suren Baghdasaryan <surenb@google.com>
Cc: Petr Pavlu <petr.pavlu@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-25 00:53:47 -07:00
Mike Rapoport (Microsoft)
07c9214c79 mm/cma: make detection of highmem_start more robust
Pratyush Yadav reports the following crash:

    ------------[ cut here ]------------
    kernel BUG at arch/x86/mm/physaddr.c:23!
    ception 0x06 IP 10:ffffffff812ebbf8 error 0 cr2 0xffff88903ffff000
    CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.15.0-rc6+ #231 PREEMPT(undef)
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
    RIP: 0010:__phys_addr+0x58/0x60
    Code: 01 48 89 c2 48 d3 ea 48 85 d2 75 05 e9 91 52 cf 00 0f 0b 48 3d ff ff ff 1f 77 0f 48 8b 05 20 54 55 01 48 01 d0 e9 78 52 cf 00 <0f> 0b 90 0f 1f 44 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90
    RSP: 0000:ffffffff82803dd8 EFLAGS: 00010006 ORIG_RAX: 0000000000000000
    RAX: 000000007fffffff RBX: 00000000ffffffff RCX: 0000000000000000
    RDX: 000000007fffffff RSI: 0000000280000000 RDI: ffffffffffffffff
    RBP: ffffffff82803e68 R08: 0000000000000000 R09: 0000000000000000
    R10: ffffffff83153180 R11: ffffffff82803e48 R12: ffffffff83c9aed0
    R13: 0000000000000000 R14: 0000001040000000 R15: 0000000000000000
    FS:  0000000000000000(0000) GS:0000000000000000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: ffff88903ffff000 CR3: 0000000002838000 CR4: 00000000000000b0
    Call Trace:
     <TASK>
     ? __cma_declare_contiguous_nid+0x6e/0x340
     ? cma_declare_contiguous_nid+0x33/0x70
     ? dma_contiguous_reserve_area+0x2f/0x70
     ? setup_arch+0x6f1/0x870
     ? start_kernel+0x52/0x4b0
     ? x86_64_start_reservations+0x29/0x30
     ? x86_64_start_kernel+0x7c/0x80
     ? common_startup_64+0x13e/0x141

  The reason is that __cma_declare_contiguous_nid() does:

          highmem_start = __pa(high_memory - 1) + 1;

  If dma_contiguous_reserve_area() (or any other CMA declaration) is
  called before free_area_init(), high_memory is uninitialized. Without
  CONFIG_DEBUG_VIRTUAL, it will likely work but use the wrong value for
  highmem_start.

The issue occurs because commit e120d1bc12 ("arch, mm: set high_memory
in free_area_init()") moved initialization of high_memory after the call
to dma_contiguous_reserve() -> __cma_declare_contiguous_nid() on several
architectures.

In the case CONFIG_HIGHMEM is enabled, some architectures that actually
support HIGHMEM (arm, powerpc and x86) have initialization of high_memory
before a possible call to __cma_declare_contiguous_nid() and some
initialized high_memory late anyway (arc, csky, microblase, mips, sparc,
xtensa) even before the commit e120d1bc12 so they are fine with using
uninitialized value of high_memory.

And in the case CONFIG_HIGHMEM is disabled high_memory essentially becomes
the first address after memory end, so instead of relying on high_memory
to calculate highmem_start use memblock_end_of_DRAM() and eliminate the
dependency of CMA area creation on high_memory in majority of
configurations.

Link: https://lkml.kernel.org/r/20250519171805.1288393-1-rppt@kernel.org
Fixes: e120d1bc12 ("arch, mm: set high_memory in free_area_init()")
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reported-by: Pratyush Yadav <ptyadav@amazon.de>
Tested-by: Pratyush Yadav <ptyadav@amazon.de>
Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-25 00:53:47 -07:00
Linus Torvalds
d0c22de999 Merge tag 'input-for-v6.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input fixes from Dmitry Torokhov:

 - even more Xbox controllers added to xpad driver: Turtle Beach Recon
   Wired Controller, Turtle Beach Stealth Ultra, and PowerA Wired
   Controller

 - a fix to Synaptics RMI driver to not crash if controller reports
   unsupported version of F34 (firmware flash) function

* tag 'input-for-v6.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: synaptics-rmi - fix crash with unsupported versions of F34
  Input: xpad - add more controllers
2025-05-24 18:54:18 -07:00
Linus Torvalds
95a9580d58 Merge tag 'spi-fix-v6.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
 "A few final fixes for v6.15, some driver fixes for the Freescale DSPI
  driver pulled over from their vendor code and another instance of the
  fixes Greg has been sending throughout the kernel for constification
  of the bus_type in driver core match() functions"

* tag 'spi-fix-v6.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: spi-fsl-dspi: Reset SR flags before sending a new message
  spi: spi-fsl-dspi: Halt the module after a new message transfer
  spi: spi-fsl-dspi: restrict register range for regmap access
  spi: use container_of_cont() for to_spi_device()
2025-05-24 18:48:17 -07:00
Linus Torvalds
b1427432d3 Merge tag 'iommu-fixes-v6.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux
Pull iommu fix from Joerg Roedel:

 - core: skip PASID validation for devices without PASID support

* tag 'iommu-fixes-v6.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux:
  iommu: Skip PASID validation for devices without PASID capability
2025-05-24 09:01:41 -07:00
Linus Torvalds
4856ebd997 Merge tag 'drm-fixes-2025-05-24' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
 "Weekly drm fixes pull, on target to be quiet, just one amdgpu, one
  edid and a few minor xe fixes.

  edid:
   - fix HDR metadata reset

  amdgpu:
   - Hibernate fix

  xe:
   - Make sure to check all forcewakes when dumping mocs
   - Fix wrong use of read64 on 32b register
   - Synchronize Panther Lake PCI IDs"

* tag 'drm-fixes-2025-05-24' of https://gitlab.freedesktop.org/drm/kernel:
  drm/xe/ptl: Update the PTL pci id table
  drm/xe: Use xe_mmio_read32() to read mtcfg register
  drm/xe/mocs: Check if all domains awake
  Revert "drm/amd: Keep display off while going into S4"
  drm/edid: fixed the bug that hdr metadata was not reset
2025-05-23 15:17:55 -07:00
Dave Airlie
fe1e5a1f2d Merge tag 'drm-xe-fixes-2025-05-23' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes
Driver Changes:
- Make sure to check all forcewakes when dumping mocs
- Fix wrong use of read64 on 32b register
- Synchronize Panther Lake PCI IDs

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/uixp5cq7emz32lmwwvq4vbujppugfozhyj3cm2aqzx4lcg7ivn@m2khvf4kvz5p
2025-05-24 07:42:23 +10:00
Dave Airlie
4731d5e835 Merge tag 'amd-drm-fixes-6.15-2025-05-22' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-6.15-2025-05-22:

amdgpu:
- Hibernate fix

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20250522183941.9606-1-alexander.deucher@amd.com
2025-05-24 07:42:00 +10:00
Dave Airlie
79061ca8bf Merge tag 'drm-misc-fixes-2025-05-22' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
Short summary of fixes pull:

edid:
- fix HDR metadata reset

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://lore.kernel.org/r/20250522113902.GA7000@localhost.localdomain
2025-05-24 07:37:53 +10:00
Linus Torvalds
7586ac7c34 Merge tag 'thermal-6.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull thermal control fix from Rafael Wysocki:
 "This fixes a coding mistake in the x86_pkg_temp_thermal Intel thermal
  driver that was introduced by an incorrect conflict resolution during
  a merge (Zhang Rui)"

* tag 'thermal-6.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  thermal: intel: x86_pkg_temp_thermal: Fix bogus trip temperature
2025-05-23 09:47:43 -07:00
Linus Torvalds
e0f8e1a7c1 Merge tag 'v6.15-rc8-ksmbd-server-fixes' of git://git.samba.org/ksmbd
Pull smb server fixes from Steve French:

 - Fix for rename regression due to the recent VFS lookup changes

 - Fix write failure

 - locking fix for oplock handling

* tag 'v6.15-rc8-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
  ksmbd: use list_first_entry_or_null for opinfo_get_list()
  ksmbd: fix rename failure
  ksmbd: fix stream write failure
2025-05-23 08:42:29 -07:00
Linus Torvalds
3d0ebc36b0 Merge tag 'soc-fixes-6.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull SoC fixes from Arnd Bergmann:
 "A few last minute fixes:

   - two driver fixes for samsung/google platforms, both addressing
     mistakes in changes from the 6.15 merge window

   - a revert for an allwinner devicetree change that caused problems

   - a fix for an older regression with the LEDs on Marvell Armada 3720

   - a defconfig change to enable chacha20 again after a crypto
     subsystem change in 6.15 inadventently turned it off"

* tag 'soc-fixes-6.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  arm64: defconfig: Ensure CRYPTO_CHACHA20_NEON is selected
  arm64: dts: marvell: uDPU: define pinctrl state for alarm LEDs
  Revert "arm64: dts: allwinner: h6: Use RSB for AXP805 PMIC connection"
  soc: samsung: usi: prevent wrong bits inversion during unconfiguring
  firmware: exynos-acpm: check saved RX before bailing out on empty RX queue
2025-05-23 08:04:13 -07:00
Linus Torvalds
79cd89207c Merge tag 'platform-drivers-x86-v6.15-6' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform drivers fixes from Ilpo Järvinen:

 - dell-wmi-sysman: Avoid buffer overflow in current_password_store()

 - fujitsu-laptop: Support Lifebook S2110 hotkeys

 - intel/pmc: Fix Arrow Lake U/H NPU PCI ID

 - think-lmi: Fix attribute name usage for non-compliant items

 - thinkpad_acpi: Ignore battery threshold change event notification

* tag 'platform-drivers-x86-v6.15-6' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
  platform/x86/intel/pmc: Fix Arrow Lake U/H NPU PCI ID
  platform/x86: think-lmi: Fix attribute name usage for non-compliant items
  platform/x86: thinkpad_acpi: Ignore battery threshold change event notification
  platform/x86: dell-wmi-sysman: Avoid buffer overflow in current_password_store()
  platform/x86: fujitsu-laptop: Support Lifebook S2110 hotkeys
2025-05-23 07:59:39 -07:00
Linus Torvalds
eccf6f2f6a Merge tag 'vfs-6.15-rc8.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:
 "This contains a small set of fixes for the blocking buffer lookup
  conversion done earlier this cycle.

  It adds a missing conversion in the getblk slowpath and a few minor
  optimizations and cleanups"

* tag 'vfs-6.15-rc8.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  fs/buffer: optimize discard_buffer()
  fs/buffer: remove superfluous statements
  fs/buffer: avoid redundant lookup in getblk slowpath
  fs/buffer: use sleeping lookup in __getblk_slowpath()
2025-05-23 07:51:05 -07:00
Todd Brandt
f2eae58c44 platform/x86/intel/pmc: Fix Arrow Lake U/H NPU PCI ID
The ARL requires that the GMA and NPU devices both be in D3Hot in order
for PC10 and S0iX to be achieved in S2idle. The original ARL-H/U addition
to the intel_pmc_core driver attempted to do this by switching them to D3
in the init and resume calls of the intel_pmc_core driver.

The problem is the ARL-H/U have a different NPU device and thus are not
being properly set and thus S0iX does not work properly in ARL-H/U. This
patch creates a new ARL-H specific device id that is correct and also
adds the D3 fixup to the suspend callback. This way if the PCI devies
drop from D3 to D0 after resume they can be corrected for the next
suspend. Thus there is no dropout in S0iX.

Fixes: bd820906ea ("platform/x86/intel/pmc: Add Arrow Lake U/H support to intel_pmc_core driver")
Signed-off-by: Todd Brandt <todd.e.brandt@intel.com>
Link: https://lore.kernel.org/r/a61f78be45c13f39e122dcc684b636f4b21e79a0.1747737446.git.todd.e.brandt@intel.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-05-23 12:04:54 +03:00
Matt Atwood
027a362fb3 drm/xe/ptl: Update the PTL pci id table
Update to current bspec table.

Bspec: 72574

Signed-off-by: Matt Atwood <matthew.s.atwood@intel.com>
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Reviewed-by: Clint Taylor <Clinton.A.Taylor@intel.com>
Link: https://lore.kernel.org/r/20250520195749.371748-1-matthew.s.atwood@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
(cherry picked from commit 49c6dc74b5)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-22 20:54:15 -07:00
Shuicheng Lin
84b6f8503b drm/xe: Use xe_mmio_read32() to read mtcfg register
The mtcfg register is a 32-bit register and should therefore be
accessed using xe_mmio_read32().

Other 3 changes per codestyle suggestion:
"
xe_mmio.c:83: CHECK: Alignment should match open parenthesis
xe_mmio.c:131: CHECK: Comparison to NULL could be written "!xe->mmio.regs"
xe_mmio.c:315: CHECK: line length of 103 exceeds 100 columns
"

Fixes: dd08ebf6c3 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Link: https://lore.kernel.org/r/20250513153010.3464767-1-shuicheng.lin@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
(cherry picked from commit d2662cf8f4)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-22 20:49:55 -07:00
Tejas Upadhyay
57b34cba8e drm/xe/mocs: Check if all domains awake
Check if all domains are awake specially for
LNCF regs

Fixes: 298661cd9c ("drm/xe: Fix MOCS debugfs LNCF readout")
Improvements-suggested-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Badal Nilawar <badal.nilawar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250506142300.1865783-1-tejas.upadhyay@intel.com
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
(cherry picked from commit a383cf218e)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-22 20:48:50 -07:00
Linus Torvalds
040c0f6a18 Merge tag 'bcachefs-2025-05-22' of git://evilpiepirate.org/bcachefs
Pull bcachefs fixes from Kent Overstreet:
 "Small stuff, main ones users will be interested in:

   - Couple more casefolding fixes; we can now detect and repair
     casefolded dirents in non-casefolded dir and vice versa

   - Fix for massive write inflation with mmapped io, which hit certain
     databases"

* tag 'bcachefs-2025-05-22' of git://evilpiepirate.org/bcachefs:
  bcachefs: Check for casefolded dirents in non casefolded dirs
  bcachefs: Fix bch2_dirent_create_snapshot() for casefolding
  bcachefs: Fix casefold opt via xattr interface
  bcachefs: mkwrite() now only dirties one page
  bcachefs: fix extent_has_stripe_ptr()
  bcachefs: Fix bch2_btree_path_traverse_cached() when paths realloced
2025-05-22 19:17:55 -07:00
Linus Torvalds
94305e83ec Merge tag 'pmdomain-v6.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm
Pull pmdomain fixes from Ulf Hansson:
 "Core:

   - Fix error checking in genpd_dev_pm_attach_by_id()

  Providers:

   - renesas: Remove obsolete nullify checks for rcar domains"

* tag 'pmdomain-v6.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm:
  pmdomain: core: Fix error checking in genpd_dev_pm_attach_by_id()
  pmdomain: renesas: rcar: Remove obsolete nullify checks
2025-05-22 16:25:23 -07:00
Linus Torvalds
a5b2c67af7 Merge tag 'mmc-v6.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC host fixes from Ulf Hansson:

 - sdhci_am654: Fix MMC init failures on am62x boards

 - sdhci-of-dwcmshc: Add PD workaround on RK3576 to avoid hang

* tag 'mmc-v6.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: sdhci_am654: Add SDHCI_QUIRK2_SUPPRESS_V1P8_ENA quirk to am62 compatible
  mmc: sdhci-of-dwcmshc: add PD workaround on RK3576
2025-05-22 16:15:59 -07:00
Linus Torvalds
a11a722298 Merge tag 'block-6.15-20250522' of git://git.kernel.dk/linux
Pull block fixes from Jens Axboe:

 - Fix for a regression with setting up loop on a file system
   without ->write_iter()

 - Fix for an nvme sysfs regression

* tag 'block-6.15-20250522' of git://git.kernel.dk/linux:
  nvme: avoid creating multipath sysfs group under namespace path devices
  loop: don't require ->write_iter for writable files in loop_configure
2025-05-22 13:08:21 -07:00
Linus Torvalds
ab719cc7f5 Merge tag 'io_uring-6.15-20250522' of git://git.kernel.dk/linux
Pull io_uring fixes from Jens Axboe:

 - Kill a duplicate function definition, which can cause linking issues
   in certain .config configurations. Introduced in this cycle.

 - Fix for a potential overflow CQE reordering issue if a re-schedule is
   done during posting. Heading to stable.

 - Fix for an issue with recv bundles, where certain conditions can lead
   to gaps in the buffers, where a contiguous buffer range was expected.
   Heading to stable.

* tag 'io_uring-6.15-20250522' of git://git.kernel.dk/linux:
  io_uring/net: only retry recv bundle for a full transfer
  io_uring: fix overflow resched cqe reordering
  io_uring/cmd: axe duplicate io_uring_cmd_import_fixed_vec() declaration
2025-05-22 13:05:00 -07:00
Linus Torvalds
e85dea591f Merge tag '6.15-rc8-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:

 - Two fixes for use after free in readdir code paths

* tag '6.15-rc8-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  smb: client: Reset all search buffer pointers when releasing buffer
  smb: client: Fix use-after-free in cifs_fill_dirent
2025-05-22 12:35:16 -07:00
Linus Torvalds
5cdb2c77c4 Merge tag 'net-6.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
 "This is somewhat larger than what I hoped for, with a few PRs from
  subsystems and follow-ups for the recent netdev locking changes,
  anyhow there are no known pending regressions.

  Including fixes from bluetooth, ipsec and CAN.

  Current release - regressions:

   - eth: team: grab team lock during team_change_rx_flags

   - eth: bnxt_en: fix netdev locking in ULP IRQ functions

  Current release - new code bugs:

   - xfrm: ipcomp: fix truesize computation on receive

   - eth: airoha: fix page recycling in airoha_qdma_rx_process()

  Previous releases - regressions:

   - sched: hfsc: fix qlen accounting bug when using peek in
     hfsc_enqueue()

   - mr: consolidate the ipmr_can_free_table() checks.

   - bridge: netfilter: fix forwarding of fragmented packets

   - xsk: bring back busy polling support in XDP_COPY

   - can:
       - add missing rcu read protection for procfs content
       - kvaser_pciefd: force IRQ edge in case of nested IRQ

  Previous releases - always broken:

   - xfrm: espintcp: remove encap socket caching to avoid reference leak

   - bluetooth: use skb_pull to avoid unsafe access in QCA dump handling

   - eth: idpf:
       - fix null-ptr-deref in idpf_features_check
       - fix idpf_vport_splitq_napi_poll()

   - eth: hibmcge: fix wrong ndo.open() after reset fail issue"

* tag 'net-6.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (40 commits)
  octeontx2-af: Fix APR entry mapping based on APR_LMT_CFG
  octeontx2-af: Set LMT_ENA bit for APR table entries
  net/tipc: fix slab-use-after-free Read in tipc_aead_encrypt_done
  octeontx2-pf: Avoid adding dcbnl_ops for LBK and SDP vf
  selftests/tc-testing: Add an HFSC qlen accounting test
  sch_hfsc: Fix qlen accounting bug when using peek in hfsc_enqueue()
  idpf: fix idpf_vport_splitq_napi_poll()
  net: hibmcge: fix wrong ndo.open() after reset fail issue.
  net: hibmcge: fix incorrect statistics update issue
  xsk: Bring back busy polling support in XDP_COPY
  can: slcan: allow reception of short error messages
  net: lan743x: Restore SGMII CTRL register on resume
  bnxt_en: Fix netdev locking in ULP IRQ functions
  MAINTAINERS: Drop myself to reviewer for ravb driver
  net: dwmac-sun8i: Use parsed internal PHY address instead of 1
  net: ethernet: ti: am65-cpsw: Lower random mac address error print to info
  can: kvaser_pciefd: Continue parsing DMA buf after dropped RX
  can: kvaser_pciefd: Fix echo_skb race
  can: kvaser_pciefd: Force IRQ edge in case of nested IRQ
  idpf: fix null-ptr-deref in idpf_features_check
  ...
2025-05-22 09:15:19 -07:00
Mario Limonciello
7e7cb7a13c Revert "drm/amd: Keep display off while going into S4"
commit 68bfdc8dc0 ("drm/amd: Keep display off while going into S4")
attempted to keep displays off during the S4 sequence by not resuming
display IP.  This however leads to hangs because DRM clients such as the
console can try to access registers and cause a hang.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4155
Fixes: 68bfdc8dc0 ("drm/amd: Keep display off while going into S4")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20250522141328.115095-1-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit e485502c37)
Cc: stable@vger.kernel.org
2025-05-22 12:13:51 -04:00
Linus Torvalds
b1819ae85e Merge tag 'pinctrl-v6.15-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
 "This deals with a crash in the Qualcomm pin controller GPIO
  parts when using hogs.

  The first patch to gpiolib makes gpiochip_line_is_valid()
  NULL-tolerant.

  The second patch fixes the actual problem"

* tag 'pinctrl-v6.15-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: qcom: switch to devm_register_sys_off_handler()
  gpiolib: don't crash on enabling GPIO HOG pins
2025-05-22 09:08:54 -07:00
Linus Torvalds
0e9579507f Merge tag 'sound-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
 "A collection of small fixes for 6.15 final. It became slightly a
  higher amount than expected, but all look easy and safe to apply:

   - A fix for PCM core race spotted by fuzzing

   - ASoC topology fix for single DAI link

   - UAF fix for ASoC SOF Intel HD-audio at reloading

   - ASoC SOF Intel and Mediatek fixes

   - Trivial HD-audio quirks as usual"

* tag 'sound-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda/realtek - Add new HP ZBook laptop with micmute led fixup
  ALSA: hda/realtek: Add support for HP Agusta using CS35L41 HDA
  ALSA: hda/realtek: Add quirk for Lenovo Yoga Pro 7 14ASP10
  ALSA: hda/realtek - restore auto-mute mode for Dell Chrome platform
  ALSA: pcm: Fix race of buffer access at PCM OSS layer
  ASoC: SOF: Intel: hda: Fix UAF when reloading module
  ASoc: SOF: topology: connect DAI to a single DAI link
  ASoC: SOF: Intel: hda-bus: Use PIO mode on ACE2+ platforms
  ASoC: SOF: ipc4-pcm: Delay reporting is only supported for playback direction
  ASoC: SOF: ipc4-control: Use SOF_CTRL_CMD_BINARY as numid for bytes_ext
  ASoC: mediatek: mt8188-mt6359: Depend on MT6359_ACCDET set or disabled
  ASoC: mediatek: mt8188-mt6359: select CONFIG_SND_SOC_MT6359_ACCDET
2025-05-22 09:05:29 -07:00
Jens Axboe
115c011f5d Merge tag 'nvme-6.15-2025-05-22' of git://git.infradead.org/nvme into block-6.15
Pull NVMe fix from Christoph:

"nvme fixes for Linux 6.15

 - do not create the newly added multipath sysfs group for
   non-multipath nodes (Nilay Shroff)"

* tag 'nvme-6.15-2025-05-22' of git://git.infradead.org/nvme:
  nvme: avoid creating multipath sysfs group under namespace path devices
2025-05-22 09:25:47 -06:00
Larisa Grigore
7aba292eb1 spi: spi-fsl-dspi: Reset SR flags before sending a new message
If, in a previous transfer, the controller sends more data than expected
by the DSPI target, SR.RFDF (RX FIFO is not empty) will remain asserted.
When flushing the FIFOs at the beginning of a new transfer (writing 1
into MCR.CLR_TXF and MCR.CLR_RXF), SR.RFDF should also be cleared.
Otherwise, when running in target mode with DMA, if SR.RFDF remains
asserted, the DMA callback will be fired before the controller sends any
data.

Take this opportunity to reset all Status Register fields.

Fixes: 5ce3cc5674 ("spi: spi-fsl-dspi: Provide support for DSPI slave mode operation (Vybryd vf610)")
Signed-off-by: Larisa Grigore <larisa.grigore@nxp.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Link: https://patch.msgid.link/20250522-james-nxp-spi-v2-3-bea884630cfb@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
2025-05-22 16:05:26 +01:00
Bogdan-Gabriel Roman
8a30a6d35a spi: spi-fsl-dspi: Halt the module after a new message transfer
The XSPI mode implementation in this driver still uses the EOQ flag to
signal the last word in a transmission and deassert the PCS signal.
However, at speeds lower than ~200kHZ, the PCS signal seems to remain
asserted even when SR[EOQF] = 1 indicates the end of a transmission.
This is a problem for target devices which require the deassertation of
the PCS signal between transfers.

Hence, this commit 'forces' the deassertation of the PCS by stopping the
module through MCR[HALT] after completing a new transfer. According to
the reference manual, the module stops or transitions from the Running
state to the Stopped state after the current frame, when any one of the
following conditions exist:
- The value of SR[EOQF] = 1.
- The chip is in Debug mode and the value of MCR[FRZ] = 1.
- The value of MCR[HALT] = 1.

This shouldn't be done if the last transfer in the message has cs_change
set.

Fixes: ea93ed4c18 ("spi: spi-fsl-dspi: Use EOQ for last word in buffer even for XSPI mode")
Signed-off-by: Bogdan-Gabriel Roman <bogdan-gabriel.roman@nxp.com>
Signed-off-by: Larisa Grigore <larisa.grigore@nxp.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Link: https://patch.msgid.link/20250522-james-nxp-spi-v2-2-bea884630cfb@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
2025-05-22 16:05:25 +01:00
Larisa Grigore
283ae0c65e spi: spi-fsl-dspi: restrict register range for regmap access
DSPI registers are NOT continuous, some registers are reserved and
accessing them from userspace will trigger external abort, add regmap
register access table to avoid below abort.

  For example on S32G:

  # cat /sys/kernel/debug/regmap/401d8000.spi/registers

  Internal error: synchronous external abort: 96000210 1 PREEMPT SMP
  ...
  Call trace:
  regmap_mmio_read32le+0x24/0x48
  regmap_mmio_read+0x48/0x70
  _regmap_bus_reg_read+0x38/0x48
  _regmap_read+0x68/0x1b0
  regmap_read+0x50/0x78
  regmap_read_debugfs+0x120/0x338

Fixes: 1acbdeb92c ("spi/fsl-dspi: Convert to use regmap and add big-endian support")
Co-developed-by: Xulin Sun <xulin.sun@windriver.com>
Signed-off-by: Xulin Sun <xulin.sun@windriver.com>
Signed-off-by: Larisa Grigore <larisa.grigore@nxp.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Link: https://patch.msgid.link/20250522-james-nxp-spi-v2-1-bea884630cfb@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
2025-05-22 16:05:24 +01:00
Greg Kroah-Hartman
1007ae0d46 spi: use container_of_cont() for to_spi_device()
Some places in the spi core pass in a const pointer to a device and the
default container_of() casts that away, which is not a good idea.
Preserve the proper const attribute by using container_of_const() for
to_spi_device() instead, which is what it was designed for.

Note, this removes the NULL check for a device pointer in the call, but
no one was ever checking for that return value, and a device pointer
should never be NULL overall anyway, so this should be a safe change.

Cc: Mark Brown <broonie@kernel.org>
Fixes: d69d804845 ("driver core: have match() callback in struct bus_type take a const *")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://patch.msgid.link/2025052230-fidgeting-stooge-66f5@gregkh
Signed-off-by: Mark Brown <broonie@kernel.org>
2025-05-22 12:03:28 +01:00
Paolo Abeni
3fab2d2d90 Merge tag 'linux-can-fixes-for-6.15-20250521' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:

====================
pull-request: can 2025-05-22

this is a pull request of 4 patches for net/main.

The first 3 patches are by Axel Forsman and fix a ISR race condition
in the kvaser_pciefd driver.

The last patch is by Carlos Sanchez and fixes the reception of short
error messages in the slcan driver.

* tag 'linux-can-fixes-for-6.15-20250521' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
  can: slcan: allow reception of short error messages
  can: kvaser_pciefd: Continue parsing DMA buf after dropped RX
  can: kvaser_pciefd: Fix echo_skb race
  can: kvaser_pciefd: Force IRQ edge in case of nested IRQ
====================

Link: https://patch.msgid.link/20250522082344.490913-1-mkl@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-22 12:32:38 +02:00
Paolo Abeni
ab94003fcf Merge branch 'octeontx2-af-apr-mapping-fixes'
Geetha sowjanya says:

====================
octeontx2-af: APR Mapping Fixes

This patch series includes fixes related to APR (LMT)
mapping and debugfs support.

Changes include:

Patch 1:Set LMT_ENA bit for APR table entries.
	Enables the LMT line for each PF/VF by setting
	the LMT_ENA bit in the APR_LMT_MAP_ENTRY_S
	structure.

Patch-2:Fix APR entry in debugfs
	The APR table was previously mapped using a fixed size,
	which could lead to incorrect mappings when the number
	of PFs and VFs differed from the assumed value.
	This patch updates the logic to calculate the APR table
	size dynamically, based on values from the APR_LMT_CFG
	register, ensuring correct representation in debugfs.
====================

Link: https://patch.msgid.link/20250521060834.19780-1-gakula@marvell.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-22 12:17:45 +02:00
Geetha sowjanya
a6ae712981 octeontx2-af: Fix APR entry mapping based on APR_LMT_CFG
The current implementation maps the APR table using a fixed size,
which can lead to incorrect mapping when the number of PFs and VFs
varies.
This patch corrects the mapping by calculating the APR table
size dynamically based on the values configured in the
APR_LMT_CFG register, ensuring accurate representation
of APR entries in debugfs.

Fixes: 0daa55d033 ("octeontx2-af: cn10k: debugfs for dumping LMTST map table").
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Link: https://patch.msgid.link/20250521060834.19780-3-gakula@marvell.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-22 12:17:44 +02:00
Subbaraya Sundeep
0eefa27b49 octeontx2-af: Set LMT_ENA bit for APR table entries
This patch enables the LMT line for a PF/VF by setting the
LMT_ENA bit in the APR_LMT_MAP_ENTRY_S structure.

Additionally, it simplifies the logic for calculating the
LMTST table index by consistently using the maximum
number of hw supported VFs (i.e., 256).

Fixes: 873a1e3d20 ("octeontx2-af: cn10k: Setting up lmtst map table").
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://patch.msgid.link/20250521060834.19780-2-gakula@marvell.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-22 12:17:35 +02:00
Paolo Abeni
bd2ec34d00 Merge tag 'ipsec-2025-05-21' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:

====================
pull request (net): ipsec 2025-05-21

1) Fix some missing kfree_skb in the error paths of espintcp.
   From Sabrina Dubroca.

2) Fix a reference leak in espintcp.
   From Sabrina Dubroca.

3) Fix UDP GRO handling for ESPINUDP.
   From Tobias Brunner.

4) Fix ipcomp truesize computation on the receive path.
   From Sabrina Dubroca.

5) Sanitize marks before policy/state insertation.
   From Paul Chaignon.

* tag 'ipsec-2025-05-21' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec:
  xfrm: Sanitize marks before insert
  xfrm: ipcomp: fix truesize computation on receive
  xfrm: Fix UDP GRO handling for some corner cases
  espintcp: remove encap socket caching to avoid reference leak
  espintcp: fix skb leaks
====================

Link: https://patch.msgid.link/20250521054348.4057269-1-steffen.klassert@secunet.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-22 11:49:53 +02:00
Wang Liang
e279024617 net/tipc: fix slab-use-after-free Read in tipc_aead_encrypt_done
Syzbot reported a slab-use-after-free with the following call trace:

  ==================================================================
  BUG: KASAN: slab-use-after-free in tipc_aead_encrypt_done+0x4bd/0x510 net/tipc/crypto.c:840
  Read of size 8 at addr ffff88807a733000 by task kworker/1:0/25

  Call Trace:
   kasan_report+0xd9/0x110 mm/kasan/report.c:601
   tipc_aead_encrypt_done+0x4bd/0x510 net/tipc/crypto.c:840
   crypto_request_complete include/crypto/algapi.h:266
   aead_request_complete include/crypto/internal/aead.h:85
   cryptd_aead_crypt+0x3b8/0x750 crypto/cryptd.c:772
   crypto_request_complete include/crypto/algapi.h:266
   cryptd_queue_worker+0x131/0x200 crypto/cryptd.c:181
   process_one_work+0x9fb/0x1b60 kernel/workqueue.c:3231

  Allocated by task 8355:
   kzalloc_noprof include/linux/slab.h:778
   tipc_crypto_start+0xcc/0x9e0 net/tipc/crypto.c:1466
   tipc_init_net+0x2dd/0x430 net/tipc/core.c:72
   ops_init+0xb9/0x650 net/core/net_namespace.c:139
   setup_net+0x435/0xb40 net/core/net_namespace.c:343
   copy_net_ns+0x2f0/0x670 net/core/net_namespace.c:508
   create_new_namespaces+0x3ea/0xb10 kernel/nsproxy.c:110
   unshare_nsproxy_namespaces+0xc0/0x1f0 kernel/nsproxy.c:228
   ksys_unshare+0x419/0x970 kernel/fork.c:3323
   __do_sys_unshare kernel/fork.c:3394

  Freed by task 63:
   kfree+0x12a/0x3b0 mm/slub.c:4557
   tipc_crypto_stop+0x23c/0x500 net/tipc/crypto.c:1539
   tipc_exit_net+0x8c/0x110 net/tipc/core.c:119
   ops_exit_list+0xb0/0x180 net/core/net_namespace.c:173
   cleanup_net+0x5b7/0xbf0 net/core/net_namespace.c:640
   process_one_work+0x9fb/0x1b60 kernel/workqueue.c:3231

After freed the tipc_crypto tx by delete namespace, tipc_aead_encrypt_done
may still visit it in cryptd_queue_worker workqueue.

I reproduce this issue by:
  ip netns add ns1
  ip link add veth1 type veth peer name veth2
  ip link set veth1 netns ns1
  ip netns exec ns1 tipc bearer enable media eth dev veth1
  ip netns exec ns1 tipc node set key this_is_a_master_key master
  ip netns exec ns1 tipc bearer disable media eth dev veth1
  ip netns del ns1

The key of reproduction is that, simd_aead_encrypt is interrupted, leading
to crypto_simd_usable() return false. Thus, the cryptd_queue_worker is
triggered, and the tipc_crypto tx will be visited.

  tipc_disc_timeout
    tipc_bearer_xmit_skb
      tipc_crypto_xmit
        tipc_aead_encrypt
          crypto_aead_encrypt
            // encrypt()
            simd_aead_encrypt
              // crypto_simd_usable() is false
              child = &ctx->cryptd_tfm->base;

  simd_aead_encrypt
    crypto_aead_encrypt
      // encrypt()
      cryptd_aead_encrypt_enqueue
        cryptd_aead_enqueue
          cryptd_enqueue_request
            // trigger cryptd_queue_worker
            queue_work_on(smp_processor_id(), cryptd_wq, &cpu_queue->work)

Fix this by holding net reference count before encrypt.

Reported-by: syzbot+55c12726619ff85ce1f6@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=55c12726619ff85ce1f6
Fixes: fc1b6d6de2 ("tipc: introduce TIPC encryption & authentication")
Signed-off-by: Wang Liang <wangliang74@huawei.com>
Link: https://patch.msgid.link/20250520101404.1341730-1-wangliang74@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-22 11:33:12 +02:00
Suman Ghosh
184fb40f73 octeontx2-pf: Avoid adding dcbnl_ops for LBK and SDP vf
Priority flow control is not supported for LBK and SDP vf. This patch
adds support to not add dcbnl_ops for LBK and SDP vf.

Fixes: 8e67558177 ("octeontx2-pf: PFC config support with DCBx")
Signed-off-by: Suman Ghosh <sumang@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250519072658.2960851-1-sumang@marvell.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-22 11:22:18 +02:00
Paolo Abeni
625b40dcf4 Merge branch 'net_sched-fix-hfsc-qlen-backlog-accounting-bug-and-add-selftest'
Cong Wang says:

====================
net_sched: Fix HFSC qlen/backlog accounting bug and add selftest

This series addresses a long-standing bug in the HFSC qdisc where queue length
and backlog accounting could become inconsistent if a packet is dropped during
a peek-induced dequeue operation, and adds a corresponding selftest to tc-testing.
====================

Link: https://patch.msgid.link/20250518222038.58538-1-xiyou.wangcong@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-22 11:16:53 +02:00
Cong Wang
c3572acffb selftests/tc-testing: Add an HFSC qlen accounting test
This test reproduces a scenario where HFSC queue length and backlog accounting
can become inconsistent when a peek operation triggers a dequeue and possible
drop before the parent qdisc updates its counters. The test sets up a DRR root
qdisc with an HFSC class, netem, and blackhole children, and uses Scapy to
inject a packet. It helps to verify that HFSC correctly tracks qlen and backlog
even when packets are dropped during peek-induced dequeue.

Cc: Mingi Cho <mincho@theori.io>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250518222038.58538-3-xiyou.wangcong@gmail.com
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-22 11:16:51 +02:00
Cong Wang
3f98113810 sch_hfsc: Fix qlen accounting bug when using peek in hfsc_enqueue()
When enqueuing the first packet to an HFSC class, hfsc_enqueue() calls the
child qdisc's peek() operation before incrementing sch->q.qlen and
sch->qstats.backlog. If the child qdisc uses qdisc_peek_dequeued(), this may
trigger an immediate dequeue and potential packet drop. In such cases,
qdisc_tree_reduce_backlog() is called, but the HFSC qdisc's qlen and backlog
have not yet been updated, leading to inconsistent queue accounting. This
can leave an empty HFSC class in the active list, causing further
consequences like use-after-free.

This patch fixes the bug by moving the increment of sch->q.qlen and
sch->qstats.backlog before the call to the child qdisc's peek() operation.
This ensures that queue length and backlog are always accurate when packet
drops or dequeues are triggered during the peek.

Fixes: 12d0ad3be9 ("net/sched/sch_hfsc.c: handle corner cases where head may change invalidating calculated deadline")
Reported-by: Mingi Cho <mincho@theori.io>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250518222038.58538-2-xiyou.wangcong@gmail.com
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-22 11:16:44 +02:00
Tushar Dave
b3f6fcd840 iommu: Skip PASID validation for devices without PASID capability
Generally PASID support requires ACS settings that usually create
single device groups, but there are some niche cases where we can get
multi-device groups and still have working PASID support. The primary
issue is that PCI switches are not required to treat PASID tagged TLPs
specially so appropriate ACS settings are required to route all TLPs to
the host bridge if PASID is going to work properly.

pci_enable_pasid() does check that each device that will use PASID has
the proper ACS settings to achieve this routing.

However, no-PASID devices can be combined with PASID capable devices
within the same topology using non-uniform ACS settings. In this case
the no-PASID devices may not have strict route to host ACS flags and
end up being grouped with the PASID devices.

This configuration fails to allow use of the PASID within the iommu
core code which wrongly checks if the no-PASID device supports PASID.

Fix this by ignoring no-PASID devices during the PASID validation. They
will never issue a PASID TLP anyhow so they can be ignored.

Fixes: c404f55c26 ("iommu: Validate the PASID in iommu_attach_device_pasid()")
Cc: stable@vger.kernel.org
Signed-off-by: Tushar Dave <tdave@nvidia.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Link: https://lore.kernel.org/r/20250520011937.3230557-1-tdave@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-05-22 09:10:00 +02:00
Eric Dumazet
407e0efdf8 idpf: fix idpf_vport_splitq_napi_poll()
idpf_vport_splitq_napi_poll() can incorrectly return @budget
after napi_complete_done() has been called.

This violates NAPI rules, because after napi_complete_done(),
current thread lost napi ownership.

Move the test against POLL_MODE before the napi_complete_done().

Fixes: c2d548cad1 ("idpf: add TX splitq napi poll support")
Reported-by: Peter Newman <peternewman@google.com>
Closes: https://lore.kernel.org/netdev/20250520121908.1805732-1-edumazet@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Joshua Hay <joshua.a.hay@intel.com>
Cc: Alan Brady <alan.brady@intel.com>
Cc: Madhu Chittim <madhu.chittim@intel.com>
Cc: Phani Burra <phani.r.burra@intel.com>
Cc: Pavan Kumar Linga <pavan.kumar.linga@intel.com>
Link: https://patch.msgid.link/20250520124030.1983936-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-21 20:43:15 -07:00
Namjae Jeon
10379171f3 ksmbd: use list_first_entry_or_null for opinfo_get_list()
The list_first_entry() macro never returns NULL.  If the list is
empty then it returns an invalid pointer.  Use list_first_entry_or_null()
to check if the list is empty.

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/r/202505080231.7OXwq4Te-lkp@intel.com/
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-05-21 22:30:39 -05:00
Namjae Jeon
68477b5dc5 ksmbd: fix rename failure
I found that rename fails after cifs mount due to update of
lookup_one_qstr_excl().

 mv a/c b/
mv: cannot move 'a/c' to 'b/c': No such file or directory

In order to rename to a new name regardless of whether the dentry is
negative, we need to get the dentry through lookup_one_qstr_excl().
So It will not return error if the name doesn't exist.

Fixes: 204a575e91 ("VFS: add common error checks to lookup_one_qstr_excl()")
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-05-21 22:30:30 -05:00
Jens Axboe
3a08988123 io_uring/net: only retry recv bundle for a full transfer
If a shorter than assumed transfer was seen, a partial buffer will have
been filled. For that case it isn't sane to attempt to fill more into
the bundle before posting a completion, as that will cause a gap in
the received data.

Check if the iterator has hit zero and only allow to continue a bundle
operation if that is the case.

Also ensure that for putting finished buffers, only the current transfer
is accounted. Otherwise too many buffers may be put for a short transfer.

Link: https://github.com/axboe/liburing/issues/1409
Cc: stable@vger.kernel.org
Fixes: 7c71a0af81 ("io_uring/net: improve recv bundles")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-21 19:24:18 -06:00
Linus Torvalds
d608703fcd Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
 "Fixes for some SoC clk drivers:

   - Define the gate clk for the OTG PHY on Rockchip RK3576 so the nvmem
     driver actually works

   - Initialize clk_hw_onecell_data::num before accessing the 'hws'
     array to keep UBSAN happy

   - Fix a perf degradation on the Allwinner D1 MMC clk that was making
     things half bad

   - Fix the Allwinner SNXI_CCU_MP_DATA_WITH_MUX_GATE_FEAT macro to have
     proper order of arguments"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: sunxi-ng: d1: Add missing divider for MMC mod clocks
  clk: s2mps11: initialise clk_hw_onecell_data::num before accessing ::hws[] in probe()
  clk: sunxi-ng: fix order of arguments in clock macro
  clk: rockchip: rk3576: define clk_otp_phy_g
2025-05-21 17:24:18 -07:00
Kent Overstreet
010c894681 bcachefs: Check for casefolded dirents in non casefolded dirs
Check for mismatches between casefold dirents and casefold directories.

A mismatch will cause lookups to fail, as we'll be doing the lookup with
the casefolded name, which won't match the non-casefolded dirent, and
vice versa.

Reported-by: Christopher Snowhill <chris@kode54.net>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:13:14 -04:00
Kent Overstreet
ecd76c5f10 bcachefs: Fix bch2_dirent_create_snapshot() for casefolding
bch2_dirent_create_snapshot(), used in fsck, neglected to create a
casefolded dirent.

Just move this into dirent_create_key().

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:13:13 -04:00
Kent Overstreet
8d5ac187da bcachefs: Fix casefold opt via xattr interface
Changing the casefold option requires extra checks/work - factor out a
helper from bch2_fileattr_set() for the xattr code to use.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:13:09 -04:00
Jakub Kicinski
d83ec0b284 Merge branch 'there-are-some-bugfix-for-hibmcge-driver'
Jijie Shao says:

====================
There are some bugfix for hibmcge driver

v1: https://lore.kernel.org/20250430093127.2400813-1-shaojijie@huawei.com
====================

Link: https://patch.msgid.link/20250517095828.1763126-1-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-21 15:53:54 -07:00
Jijie Shao
1b45443b84 net: hibmcge: fix wrong ndo.open() after reset fail issue.
If the driver reset fails, it may not work properly.
Therefore, the ndo.open() operation should be rejected.

In this patch, the driver calls netif_device_detach()
before the reset and calls netif_device_attach()
after the reset succeeds. If the reset fails,
netif_device_attach() is not called. Therefore,
netdev does not present and cannot be opened.

If reset fails, only the PCI reset (via sysfs)
can be used to attempt recovery.

Fixes: 3f5a61f6d5 ("net: hibmcge: Add reset supported in this module")
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250517095828.1763126-3-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-21 15:53:51 -07:00
Jijie Shao
009970506c net: hibmcge: fix incorrect statistics update issue
When the user dumps statistics, the hibmcge driver automatically
updates all statistics. If the driver is performing the reset operation,
the error data of 0xFFFFFFFF is updated.

Therefore, if the driver is resetting, the hbg_update_stats_by_info()
needs to return directly.

Fixes: c0bf9bf31e ("net: hibmcge: Add support for dump statistics")
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250517095828.1763126-2-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-21 15:53:51 -07:00
Arnd Bergmann
070d04f002 Merge tag 'mvebu-fixes-6.15-1' of https://git.kernel.org/pub/scm/linux/kernel/git/gclement/mvebu into arm/fixes
mvebu fixes for 6.15 (part 1)

Fix uDPU board LEDs by correcting pinctrl state

* tag 'mvebu-fixes-6.15-1' of https://git.kernel.org/pub/scm/linux/kernel/git/gclement/mvebu:
  arm64: dts: marvell: uDPU: define pinctrl state for alarm LEDs

Link: https://lore.kernel.org/r/87wmagpr0a.fsf@BLaptop.bootlin.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-05-21 23:26:54 +02:00
Arnd Bergmann
4e3d2c4b7f Merge tag 'sunxi-fixes-for-6.15' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into arm/fixes
Allwinner fixes for 6.15

Only one fix:

Switch back to I2C for PMICs on Allwinner H6 devices. Apparently using
Allwinner's proprietary bus ended up causing issues when the PMIC was
sharing the bus with other devices.

* tag 'sunxi-fixes-for-6.15' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux:
  Revert "arm64: dts: allwinner: h6: Use RSB for AXP805 PMIC connection"

Link: https://lore.kernel.org/r/aCaeLgjZllV7bauX@wens.tw
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-05-21 23:26:39 +02:00
Fabio Estevam
6c9ab81187 arm64: defconfig: Ensure CRYPTO_CHACHA20_NEON is selected
Since commit 17ec3e71ba ("crypto: lib/Kconfig - Hide arch options from
user"), the CRYPTO_CHACHA20_NEON option is no longer selected by default
due to changes in how crypto library options are exposed and selected.

To restore the previous behavior and ensure CRYPTO_CHACHA20_NEON is
enabled, explicitly select CONFIG_CRYPTO_CHACHA20 in the defconfig. This
pulls in CRYPTO_LIB_CHACHA_INTERNAL and allows CRYPTO_CHACHA20_NEON to be
selected automatically as before.

Fixes: 17ec3e71ba ("crypto: lib/Kconfig - Hide arch options from user")
Signed-off-by: Fabio Estevam <festevam@denx.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-05-21 23:25:49 +02:00
Arnd Bergmann
7c1bda0993 Merge tag 'samsung-fixes-6.15' of https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux into arm/fixes
Samsung SoC driver fixes for v6.15

1. Exynos ACPM driver (used on Google GS101): Fix timeout due to missing
   responses from the firmware part.

2. Samsung USI (serial engines) driver: Correct ineffective
   unconfiguring of the interface during probe removal.

* tag 'samsung-fixes-6.15' of https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux:
  soc: samsung: usi: prevent wrong bits inversion during unconfiguring
  firmware: exynos-acpm: check saved RX before bailing out on empty RX queue

Link: https://lore.kernel.org/r/20250513101023.21552-5-krzysztof.kozlowski@linaro.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-05-21 19:02:32 +02:00
Pavel Begunkov
a7d755ed9c io_uring: fix overflow resched cqe reordering
Leaving the CQ critical section in the middle of a overflow flushing
can cause cqe reordering since the cache cq pointers are reset and any
new cqe emitters that might get called in between are not going to be
forced into io_cqe_cache_refill().

Fixes: eac2ca2d68 ("io_uring: check if we need to reschedule during overflow flush")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/90ba817f1a458f091f355f407de1c911d2b93bbf.1747483784.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-21 07:01:54 -06:00
Nilay Shroff
49b9f86a59 nvme: avoid creating multipath sysfs group under namespace path devices
Commit 4dbd2b2ebe ("nvme-multipath: Add visibility for round-robin
io-policy") introduced the creation of the multipath sysfs group under
the NVMe head gendisk device node. However, it also inadvertently added
the same sysfs group under each namespace path device which head node
refers to and that is incorrect.

The multipath sysfs group should only be exposed through the namespace
head gendisk node. This is sufficient, as the head device already
provides symbolic links to the individual namespace paths it manages.

This patch fixes the issue by preventing the creation of the multipath
sysfs group under namespace path devices, ensuring it only appears under
the head disk node.

Fixes: 4dbd2b2ebe ("nvme-multipath: Add visibility for round-robin io-policy")
Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-21 14:55:46 +02:00
Samiullah Khawaja
b95ed55173 xsk: Bring back busy polling support in XDP_COPY
Commit 5ef44b3cb4 ("xsk: Bring back busy polling support") fixed the
busy polling support in xsk for XDP_ZEROCOPY after it was broken in
commit 86e25f40aa ("net: napi: Add napi_config"). The busy polling
support with XDP_COPY remained broken since the napi_id setup in
xsk_rcv_check was removed.

Bring back the setup of napi_id for XDP_COPY so socket level SO_BUSYPOLL
can be used to poll the underlying napi.

Do the setup of napi_id for XDP_COPY in xsk_bind, as it is done
currently for XDP_ZEROCOPY. The setup of napi_id for XDP_COPY in
xsk_bind is safe because xsk_rcv_check checks that the rx queue at which
the packet arrives is equal to the queue_id that was supplied in bind.
This is done for both XDP_COPY and XDP_ZEROCOPY mode.

Tested using AF_XDP support in virtio-net by running the xsk_rr AF_XDP
benchmarking tool shared here:
https://lore.kernel.org/all/20250320163523.3501305-1-skhawaja@google.com/T/

Enabled socket busy polling using following commands in qemu,

```
sudo ethtool -L eth0 combined 1
echo 400 | sudo tee /proc/sys/net/core/busy_read
echo 100 | sudo tee /sys/class/net/eth0/napi_defer_hard_irqs
echo 15000   | sudo tee /sys/class/net/eth0/gro_flush_timeout
```

Fixes: 5ef44b3cb4 ("xsk: Bring back busy polling support")
Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-05-21 10:28:23 +01:00
Carlos Sanchez
ef0841e4cb can: slcan: allow reception of short error messages
Allows slcan to receive short messages (typically errors) from the serial
interface.

When error support was added to slcan protocol in
b32ff46685 ("can: slcan: extend the protocol
with error info") the minimum valid message size changed from 5 (minimum
standard can frame tIII0) to 3 ("e1a" is a valid protocol message, it is
one of the examples given in the comments for slcan_bump_err() ), but the
check for minimum message length prodicating all decoding was not adjusted.
This makes short error messages discarded and error frames not being
generated.

This patch changes the minimum length to the new minimum (3 characters,
excluding terminator, is now a valid message).

Signed-off-by: Carlos Sanchez <carlossanchez@geotab.com>
Fixes: b32ff46685 ("can: slcan: extend the protocol with error info")
Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Link: https://patch.msgid.link/20250520102305.1097494-1-carlossanchez@geotab.com
Cc: stable@vger.kernel.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-05-21 10:12:05 +02:00
Christian Brauner
7e69dd62bc Merge patch series "fs/buffer: misc optimizations"
Davidlohr Bueso <dave@stgolabs.net> says:

Four small patches - the first could be sent to Linus for v6.15
considering it is a missing nonblocking lookup conversion in the getblk
slowpath I had missed. The other two patches are small optimizations
found while reading the code, and one rocket science cleanup patch.

* patches from https://lore.kernel.org/20250515173925.147823-1-dave@stgolabs.net:
  fs/buffer: optimize discard_buffer()
  fs/buffer: remove superfluous statements
  fs/buffer: avoid redundant lookup in getblk slowpath
  fs/buffer: use sleeping lookup in __getblk_slowpath()

Link: https://lore.kernel.org/20250515173925.147823-1-dave@stgolabs.net
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-05-21 09:34:31 +02:00
Davidlohr Bueso
8e184bf1cd fs/buffer: optimize discard_buffer()
While invalidating, the clearing of the bits in discard_buffer()
is done in one fully ordered CAS operation. In the past this was
done via individual clear_bit(), until e7470ee89f (fs: buffer:
do not use unnecessary atomic operations when discarding buffers).
This implies that there were never strong ordering requirements
outside of being serialized by the buffer lock.

As such relax the ordering for archs that can benefit. Further,
the implied ordering in buffer_unlock() makes current cmpxchg
implied barrier redundant due to release semantics. And while in
theory the unlock could be part of the bulk clearing, it is
best to leave it explicit, but without the double barriers.

Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Link: https://lore.kernel.org/20250515173925.147823-5-dave@stgolabs.net
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-05-21 09:34:29 +02:00
Davidlohr Bueso
d11a249996 fs/buffer: remove superfluous statements
Get rid of those unnecessary return statements.

Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Link: https://lore.kernel.org/20250515173925.147823-4-dave@stgolabs.net
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-05-21 09:34:29 +02:00
Davidlohr Bueso
98a6ca1633 fs/buffer: avoid redundant lookup in getblk slowpath
__getblk_slow() already implies failing a first lookup
as the fastpath, so try to create the buffers immediately
and avoid the redundant lookup. This saves 5-10% of the
total cost/latency of the slowpath.

Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Link: https://lore.kernel.org/20250515173925.147823-3-dave@stgolabs.net
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-05-21 09:34:29 +02:00
Davidlohr Bueso
fb27226c38 fs/buffer: use sleeping lookup in __getblk_slowpath()
Just as with the fast path, call the lookup variant depending
on the gfp flags.

Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Link: https://lore.kernel.org/20250515173925.147823-2-dave@stgolabs.net
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-05-21 09:34:28 +02:00
Lorenzo Stoakes
bdc3f7e9e1 MAINTAINERS: add mm memory policy section
As part of the ongoing efforts to sub-divide memory management
maintainership and reviewership, establish a section for memory policy and
migration and add appropriate maintainers and reviewers.

[lorenzo.stoakes@oracle.com: add Ying as reviewer]
  Link: https://lkml.kernel.org/r/ed6f0fc2-5608-4eea-b1be-07e3e19be263@lucifer.local
Link: https://lkml.kernel.org/r/20250515191358.205684-1-lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Rakie Kim <rakie.kim@sk.com>
Acked-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Joshua Hahn <joshua.hahnjy@gmail.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Zi Yan <ziy@nvidia.com>
Acked-by: Huang Ying <ying.huang@linux.alibaba.com>
Acked-by: Byungchul Park <byungchul@sk.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Gregory Price <gourry@gourry.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-20 22:49:40 -07:00
Lorenzo Stoakes
62bec60be2 MAINTAINERS: add mm ksm section
As part of the ongoing efforts to sub-divide memory management
maintainership and reviewership, establish a section for Kernel Samepage
Merging (KSM) and add appropriate maintainers and reviewers.

Link: https://lkml.kernel.org/r/20250515190404.203596-1-lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Chengming Zhou <chengming.zhou@linux.dev>
Acked-by: Xu Xin <xu.xin16@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-20 22:49:40 -07:00
Alexander Gordeev
b6ea95a34c kasan: avoid sleepable page allocation from atomic context
apply_to_pte_range() enters the lazy MMU mode and then invokes
kasan_populate_vmalloc_pte() callback on each page table walk iteration. 
However, the callback can go into sleep when trying to allocate a single
page, e.g.  if an architecutre disables preemption on lazy MMU mode enter.

On s390 if make arch_enter_lazy_mmu_mode() -> preempt_enable() and
arch_leave_lazy_mmu_mode() -> preempt_disable(), such crash occurs:

[    0.663336] BUG: sleeping function called from invalid context at ./include/linux/sched/mm.h:321
[    0.663348] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 2, name: kthreadd
[    0.663358] preempt_count: 1, expected: 0
[    0.663366] RCU nest depth: 0, expected: 0
[    0.663375] no locks held by kthreadd/2.
[    0.663383] Preemption disabled at:
[    0.663386] [<0002f3284cbb4eda>] apply_to_pte_range+0xfa/0x4a0
[    0.663405] CPU: 0 UID: 0 PID: 2 Comm: kthreadd Not tainted 6.15.0-rc5-gcc-kasan-00043-gd76bb1ebb558-dirty #162 PREEMPT
[    0.663408] Hardware name: IBM 3931 A01 701 (KVM/Linux)
[    0.663409] Call Trace:
[    0.663410]  [<0002f3284c385f58>] dump_stack_lvl+0xe8/0x140
[    0.663413]  [<0002f3284c507b9e>] __might_resched+0x66e/0x700
[    0.663415]  [<0002f3284cc4f6c0>] __alloc_frozen_pages_noprof+0x370/0x4b0
[    0.663419]  [<0002f3284ccc73c0>] alloc_pages_mpol+0x1a0/0x4a0
[    0.663421]  [<0002f3284ccc8518>] alloc_frozen_pages_noprof+0x88/0xc0
[    0.663424]  [<0002f3284ccc8572>] alloc_pages_noprof+0x22/0x120
[    0.663427]  [<0002f3284cc341ac>] get_free_pages_noprof+0x2c/0xc0
[    0.663429]  [<0002f3284cceba70>] kasan_populate_vmalloc_pte+0x50/0x120
[    0.663433]  [<0002f3284cbb4ef8>] apply_to_pte_range+0x118/0x4a0
[    0.663435]  [<0002f3284cbc7c14>] apply_to_pmd_range+0x194/0x3e0
[    0.663437]  [<0002f3284cbc99be>] __apply_to_page_range+0x2fe/0x7a0
[    0.663440]  [<0002f3284cbc9e88>] apply_to_page_range+0x28/0x40
[    0.663442]  [<0002f3284ccebf12>] kasan_populate_vmalloc+0x82/0xa0
[    0.663445]  [<0002f3284cc1578c>] alloc_vmap_area+0x34c/0xc10
[    0.663448]  [<0002f3284cc1c2a6>] __get_vm_area_node+0x186/0x2a0
[    0.663451]  [<0002f3284cc1e696>] __vmalloc_node_range_noprof+0x116/0x310
[    0.663454]  [<0002f3284cc1d950>] __vmalloc_node_noprof+0xd0/0x110
[    0.663457]  [<0002f3284c454b88>] alloc_thread_stack_node+0xf8/0x330
[    0.663460]  [<0002f3284c458d56>] dup_task_struct+0x66/0x4d0
[    0.663463]  [<0002f3284c45be90>] copy_process+0x280/0x4b90
[    0.663465]  [<0002f3284c460940>] kernel_clone+0xd0/0x4b0
[    0.663467]  [<0002f3284c46115e>] kernel_thread+0xbe/0xe0
[    0.663469]  [<0002f3284c4e440e>] kthreadd+0x50e/0x7f0
[    0.663472]  [<0002f3284c38c04a>] __ret_from_fork+0x8a/0xf0
[    0.663475]  [<0002f3284ed57ff2>] ret_from_fork+0xa/0x38

Instead of allocating single pages per-PTE, bulk-allocate the shadow
memory prior to applying kasan_populate_vmalloc_pte() callback on a page
range.

Link: https://lkml.kernel.org/r/c61d3560297c93ed044f0b1af085610353a06a58.1747316918.git.agordeev@linux.ibm.com
Fixes: 3c5c3cfb9e ("kasan: support backing vmalloc space with real shadow memory")
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Suggested-by: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
Cc: Daniel Axtens <dja@axtens.net>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-20 22:49:40 -07:00
Matthew Wilcox (Oracle)
97dfbbd135 highmem: add folio_test_partial_kmap()
In commit c749d9b7eb ("iov_iter: fix copy_page_from_iter_atomic() if
KMAP_LOCAL_FORCE_MAP"), Hugh correctly noted that if KMAP_LOCAL_FORCE_MAP
is enabled, we must limit ourselves to PAGE_SIZE bytes per call to
kmap_local().  The same problem exists in memcpy_from_folio(),
memcpy_to_folio(), folio_zero_tail(), folio_fill_tail() and
memcpy_from_file_folio(), so add folio_test_partial_kmap() to do this more
succinctly.

Link: https://lkml.kernel.org/r/20250514170607.3000994-2-willy@infradead.org
Fixes: 00cdf76012 ("mm: add memcpy_from_file_folio()")
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-20 22:49:39 -07:00
Lance Yang
6fa04511f1 MAINTAINERS: add hung-task detector section
The hung-task detector is missing in MAINTAINERS.  While it's been quiet
recently, I'm actively working on it and volunteering to review patches. 
Adding this section will make it easier for contributors to know who to
contact.

Link: https://lkml.kernel.org/r/20250513052234.46463-1-lance.yang@linux.dev
Signed-off-by: Lance Yang <lance.yang@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-20 22:49:39 -07:00
Wang Yaxin
0bf2d838de taskstats: fix struct taskstats breaks backward compatibility since version 15
Problem
========
commit 658eb5ab91 ("delayacct: add delay max to record delay peak")
  - adding more fields
commit f65c64f311 ("delayacct: add delay min to record delay peak")
  - adding more fields
commit b016d08737 ("taskstats: modify taskstats version")
 - version bump to 15

Since version 15 (TASKSTATS_VERSION=15) the new layout of the structure
adds fields in the middle of the structure, rendering all old software
incompatible with newer kernels and software compiled against the new
kernel headers incompatible with older kernels.

Solution
=========
move delay max and delay min to the end of taskstat, and bump
the version to 16 after the change

[wang.yaxin@zte.com.cn: adjust indentation]
  Link: https://lkml.kernel.org/r/202505192131489882NSciXV4EGd8zzjLuwoOK@zte.com.cn
Link: https://lkml.kernel.org/r/20250510155413259V4JNRXxukdDgzsaL0Fo6a@zte.com.cn
Fixes: f65c64f311 ("delayacct: add delay min to record delay peak")
Signed-off-by: Wang Yaxin <wang.yaxin@zte.com.cn>
Signed-off-by: xu xin <xu.xin16@zte.com.cn>
Signed-off-by: Kun Jiang <jiang.kun2@zte.com.cn>
Reviewed-by: Yang Yang <yang.yang29@zte.com.cn>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-20 22:49:39 -07:00
Zhang Yi
66f28ffb38 mm/truncate: fix out-of-bounds when doing a right-aligned split
When performing a right split on a folio, the split_at2 may point to a
not-present page if the offset + length equals the original folio size,
which will trigger the following error:

 BUG: unable to handle page fault for address: ffffea0006000008
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 143ffb9067 P4D 143ffb9067 PUD 143ffb8067 PMD 0
 Oops: Oops: 0000 [#1] SMP PTI
 CPU: 0 UID: 0 PID: 502640 Comm: fsx Not tainted 6.15.0-rc3-gc6156189fc6b #889 PR
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-2.fc40 04/01/4
 RIP: 0010:truncate_inode_partial_folio+0x208/0x620
 Code: ff 03 48 01 da e8 78 7e 13 00 48 83 05 10 b5 5a 0c 01 85 c0 0f 85 1c 02 001
 RSP: 0018:ffffc90005bafab0 EFLAGS: 00010286
 RAX: 0000000000000000 RBX: ffffea0005ffff00 RCX: 0000000000000002
 RDX: 000000000000000c RSI: 0000000000013975 RDI: ffffc90005bafa30
 RBP: ffffea0006000000 R08: 0000000000000000 R09: 00000000000009bf
 R10: 00000000000007e0 R11: 0000000000000000 R12: 0000000000001633
 R13: 0000000000000000 R14: ffffea0005ffff00 R15: fffffffffffffffe
 FS:  00007f9f9a161740(0000) GS:ffff8894971fd000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: ffffea0006000008 CR3: 000000017c2ae000 CR4: 00000000000006f0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Call Trace:
  <TASK>
  truncate_inode_pages_range+0x226/0x720
  truncate_pagecache+0x57/0x90
  ...

Fix this issue by skipping the split if truncation aligns with the folio
size, make sure the split page number lies within the folio.

Link: https://lkml.kernel.org/r/20250512062825.3533342-1-yi.zhang@huaweicloud.com
Fixes: 7460b470a1 ("mm/truncate: use folio_split() in truncate operation")
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Cc: ErKun Yang <yangerkun@huawei.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-20 22:49:39 -07:00
Lorenzo Stoakes
41f36b3912 MAINTAINERS: add mm reclaim section
In furtherance of ongoing efforts to ensure people are aware of who
de-facto maintains/has an interest in specific parts of mm, as well trying
to avoid get_maintainers.pl listing only Andrew and the mailing list for
mm files - establish a reclaim memory management section and add relevant
maintainers/reviewers.

This is a key part of memory management so sensibly deserves its own
section.

This encompasses both 'classical' reclaim and MGLRU and thus reflects this
in the reviewers from both, as well as those who have contributed
specifically on the memcg side of things.

Link: https://lkml.kernel.org/r/20250512143122.87740-1-lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Acked-by: Qi Zheng <zhengqi.arch@bytedance.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-20 22:49:38 -07:00
Lorenzo Stoakes
e27126e0bc MAINTAINERS: update page allocator section
Make Vlastimil maintainer of this section (with thanks to Vlastimil for
agreeing to this!) and add page isolation files for which this section
seem most appropriate.

We may wish to, in future, refactor/rename some of these files to more
logically fit what is actually being performed, but for the time being
this seems the most sensible place.

Additionally, fix the alphabetical ordering of files.

Link: https://lkml.kernel.org/r/20250512144603.90379-1-lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Zi Yan <ziy@nvidia.com>
Cc: Brendan Jackman <jackmanb@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-20 22:49:38 -07:00
Florent Revest
0f518255bd mm: fix VM_UFFD_MINOR == VM_SHADOW_STACK on USERFAULTFD=y && ARM64_GCS=y
On configs with CONFIG_ARM64_GCS=y, VM_SHADOW_STACK is bit 38.  On configs
with CONFIG_HAVE_ARCH_USERFAULTFD_MINOR=y (selected by CONFIG_ARM64 when
CONFIG_USERFAULTFD=y), VM_UFFD_MINOR is _also_ bit 38.

This bit being shared by two different VMA flags could lead to all sorts
of unintended behaviors.  Presumably, a process could maybe call into
userfaultfd in a way that disables the shadow stack vma flag.  I can't
think of any attack where this would help (presumably, if an attacker
tries to disable shadow stacks, they are trying to hijack control flow so
can't arbitrarily call into userfaultfd yet anyway) but this still feels
somewhat scary.

Link: https://lkml.kernel.org/r/20250507131000.1204175-2-revest@chromium.org
Fixes: ae80e1629a ("mm: Define VM_SHADOW_STACK for arm64 when we support GCS")
Signed-off-by: Florent Revest <revest@chromium.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Cc: Borislav Betkov <bp@alien8.de>
Cc: Brendan Jackman <jackmanb@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Florent Revest <revest@chromium.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Cc: Thomas Gleinxer <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-20 22:49:38 -07:00
Ignacio Moreno Gonzalez
7190b3c8bd mm: mmap: map MAP_STACK to VM_NOHUGEPAGE only if THP is enabled
commit c4608d1bf7 ("mm: mmap: map MAP_STACK to VM_NOHUGEPAGE") maps the
mmap option MAP_STACK to VM_NOHUGEPAGE.  This is also done if
CONFIG_TRANSPARENT_HUGEPAGE is not defined.  But in that case, the
VM_NOHUGEPAGE does not make sense.

I discovered this issue when trying to use the tool CRIU to checkpoint and
restore a container.  Our running kernel is compiled without
CONFIG_TRANSPARENT_HUGEPAGE.  CRIU parses the output of /proc/<pid>/smaps
and saves the "nh" flag.  When trying to restore the container, CRIU fails
to restore the "nh" mappings, since madvise() MADV_NOHUGEPAGE always
returns an error because CONFIG_TRANSPARENT_HUGEPAGE is not defined.

Link: https://lkml.kernel.org/r/20250507-map-map_stack-to-vm_nohugepage-only-if-thp-is-enabled-v5-1-c6c38cfefd6e@kuka.com
Fixes: c4608d1bf7 ("mm: mmap: map MAP_STACK to VM_NOHUGEPAGE")
Signed-off-by: Ignacio Moreno Gonzalez <Ignacio.MorenoGonzalez@kuka.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Yang Shi <yang@os.amperecomputing.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-20 22:49:38 -07:00
Uladzislau Rezki (Sony)
910224c783 MAINTAINERS: add myself as vmalloc co-maintainer
I have been working on the vmalloc code for several years, contributing to
improvements and fixes.  Add myself as co-maintainer ("M") alongside
Andrew Morton.

Link: https://lkml.kernel.org/r/20250507150257.61485-1-urezki@gmail.com
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Christop Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-20 22:49:37 -07:00
Tianyang Zhang
e05741fb10 mm/page_alloc.c: avoid infinite retries caused by cpuset race
__alloc_pages_slowpath has no change detection for ac->nodemask in the
part of retry path, while cpuset can modify it in parallel.  For some
processes that set mempolicy as MPOL_BIND, this results ac->nodemask
changes, and then the should_reclaim_retry will judge based on the latest
nodemask and jump to retry, while the get_page_from_freelist only
traverses the zonelist from ac->preferred_zoneref, which selected by a
expired nodemask and may cause infinite retries in some cases

cpu 64:
__alloc_pages_slowpath {
        /* ..... */
retry:
        /* ac->nodemask = 0x1, ac->preferred->zone->nid = 1 */
        if (alloc_flags & ALLOC_KSWAPD)
                wake_all_kswapds(order, gfp_mask, ac);
        /* cpu 1:
        cpuset_write_resmask
            update_nodemask
                update_nodemasks_hier
                    update_tasks_nodemask
                        mpol_rebind_task
                         mpol_rebind_policy
                          mpol_rebind_nodemask
		// mempolicy->nodes has been modified,
		// which ac->nodemask point to

        */
        /* ac->nodemask = 0x3, ac->preferred->zone->nid = 1 */
        if (should_reclaim_retry(gfp_mask, order, ac, alloc_flags,
                                 did_some_progress > 0, &no_progress_loops))
                goto retry;
}

Simultaneously starting multiple cpuset01 from LTP can quickly reproduce
this issue on a multi node server when the maximum memory pressure is
reached and the swap is enabled

Link: https://lkml.kernel.org/r/20250416082405.20988-1-zhangtianyang@loongson.cn
Fixes: c33d6c06f6 ("mm, page_alloc: avoid looking up the first zone in a zonelist twice")
Signed-off-by: Tianyang Zhang <zhangtianyang@loongson.cn>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Brendan Jackman <jackmanb@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Zi Yan <ziy@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-20 22:49:37 -07:00
Linus Torvalds
4a95bc121c Merge tag 'rproc-v6.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux
Pull remoteproc fix from Bjorn Andersson:
 "Address a regression preventing the wireless subsystem remoteproc on
  some Qualcomm platforms (e.g. SDM632) from working"

* tag 'rproc-v6.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux:
  remoteproc: qcom_wcnss: Fix on platforms without fallback regulators
2025-05-20 20:12:20 -07:00
Linus Torvalds
5da472ae1f Merge tag 'v6.15-p7' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
 "This fixes a regression in padata as well as an ancient double-free
  bug in af_alg"

* tag 'v6.15-p7' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: algif_hash - fix double free in hash_accept
  padata: do not leak refcount in reorder_work
2025-05-20 20:10:01 -07:00
Thangaraj Samynathan
293e38ff4e net: lan743x: Restore SGMII CTRL register on resume
SGMII_CTRL register, which specifies the active interface, was not
properly restored when resuming from suspend. This led to incorrect
interface selection after resume particularly in scenarios involving
the FPGA.

To fix this:
- Move the SGMII_CTRL setup out of the probe function.
- Initialize the register in the hardware initialization helper function,
which is called during both device initialization and resume.

This ensures the interface configuration is consistently restored after
suspend/resume cycles.

Fixes: a46d9d37c4 ("net: lan743x: Add support for SGMII interface")
Signed-off-by: Thangaraj Samynathan <thangaraj.s@microchip.com>
Link: https://patch.msgid.link/20250516035719.117960-1-thangaraj.s@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-20 19:44:17 -07:00
Michael Chan
aed031da7e bnxt_en: Fix netdev locking in ULP IRQ functions
netdev_lock is already held when calling bnxt_ulp_irq_stop() and
bnxt_ulp_irq_restart().  When converting rtnl_lock to netdev_lock,
the original code was rtnl_dereference() to indicate that rtnl_lock
was already held.  rcu_dereference_protected() is the correct
conversion after replacing rtnl_lock with netdev_lock.

Add a new helper netdev_lock_dereference() similar to
rtnl_dereference().

Fixes: 004b500801 ("eth: bnxt: remove most dependencies on RTNL")
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250519204130.3097027-2-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-20 18:52:11 -07:00
Paul Barker
48a6285507 MAINTAINERS: Drop myself to reviewer for ravb driver
Maintenance of the ravb driver will be handled by Niklas for now. I
still intend to review patches, and will be using my own email address
going forward.

Signed-off-by: Paul Barker <paul.barker.ct@bp.renesas.com>
Acked-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Link: https://patch.msgid.link/20250519133354.6564-1-paul.barker.ct@bp.renesas.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-20 18:37:21 -07:00
Paul Kocialkowski
47653e4243 net: dwmac-sun8i: Use parsed internal PHY address instead of 1
While the MDIO address of the internal PHY on Allwinner sun8i chips is
generally 1, of_mdio_parse_addr is used to cleanly parse the address
from the device-tree instead of hardcoding it.

A commit reworking the code ditched the parsed value and hardcoded the
value 1 instead, which didn't really break anything but is more fragile
and not future-proof.

Restore the initial behavior using the parsed address returned from the
helper.

Fixes: 634db83b82 ("net: stmmac: dwmac-sun8i: Handle integrated/external MDIOs")
Signed-off-by: Paul Kocialkowski <paulk@sys-base.io>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Corentin LABBE <clabbe.montjoie@gmail.com>
Tested-by: Corentin LABBE <clabbe.montjoie@gmail.com>
Link: https://patch.msgid.link/20250519164936.4172658-1-paulk@sys-base.io
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-20 18:34:21 -07:00
Jakub Kicinski
b80b43fea2 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2025-05-19 (ice, idpf)

For ice:
Jake removes incorrect incrementing of MAC filter count.

Dave adds check for, prerequisite, switchdev mode before setting up LAG.

For idpf:
Pavan stores max_tx_hdr_size to prevent NULL pointer dereference during
reset.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  idpf: fix null-ptr-deref in idpf_features_check
  ice: Fix LACP bonds without SRIOV environment
  ice: fix vf->num_mac count with port representors
====================

Link: https://patch.msgid.link/20250519210523.1866503-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-20 18:29:50 -07:00
Nishanth Menon
50980d8da7 net: ethernet: ti: am65-cpsw: Lower random mac address error print to info
Using random mac address is not an error since the driver continues to
function, it should be informative that the system has not assigned
a MAC address. This is inline with other drivers such as ax88796c,
dm9051 etc. Drop the error level to info level.

Signed-off-by: Nishanth Menon <nm@ti.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Link: https://patch.msgid.link/20250516122655.442808-1-nm@ti.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-20 18:12:06 -07:00
Dmitry Baryshkov
41e452e693 pinctrl: qcom: switch to devm_register_sys_off_handler()
Error-handling paths in msm_pinctrl_probe() don't call
a function required to unroll restart handler registration,
unregister_restart_handler(). Instead of adding calls to this function,
switch the msm pinctrl code into using devm_register_sys_off_handler().

Fixes: cf1fc18762 ("pinctrl: qcom: use restart_notifier mechanism for ps_hold")
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Link: https://lore.kernel.org/20250513-pinctrl-msm-fix-v2-2-249999af0fc1@oss.qualcomm.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2025-05-20 23:41:29 +02:00
Dmitry Baryshkov
fdf6cab17f gpiolib: don't crash on enabling GPIO HOG pins
On Qualcomm platforms if the board uses GPIO hogs msm_pinmux_request()
calls gpiochip_line_is_valid(). After commit 8015443e24 ("gpio: Hide
valid_mask from direct assignments") gpiochip_line_is_valid() uses
gc->gpiodev, which is NULL when GPIO hog pins are being processed.
Thus after this commit using GPIO hogs causes the following crash. In
order to fix this, verify that gc->gpiodev is not NULL.

Note: it is not possible to reorder calls (e.g. by calling
msm_gpio_init() before pinctrl registration or by splitting
pinctrl_register() into _and_init() and pinctrl_enable() and calling the
latter function after msm_gpio_init()) because GPIO chip registration
would fail with EPROBE_DEFER if pinctrl is not enabled at the time of
registration.

pc : gpiochip_line_is_valid+0x4/0x28
lr : msm_pinmux_request+0x24/0x40
sp : ffff8000808eb870
x29: ffff8000808eb870 x28: 0000000000000000 x27: 0000000000000000
x26: 0000000000000000 x25: ffff726240f9d040 x24: 0000000000000000
x23: ffff7262438c0510 x22: 0000000000000080 x21: ffff726243ea7000
x20: ffffab13f2c4e698 x19: 0000000000000080 x18: 00000000ffffffff
x17: ffff726242ba6000 x16: 0000000000000100 x15: 0000000000000028
x14: 0000000000000000 x13: 0000000000002948 x12: 0000000000000003
x11: 0000000000000078 x10: 0000000000002948 x9 : ffffab13f50eb5e8
x8 : 0000000003ecb21b x7 : 000000000000002d x6 : 0000000000000b68
x5 : 0000007fffffffff x4 : ffffab13f52f84a8 x3 : ffff8000808eb804
x2 : ffffab13f1de8190 x1 : 0000000000000080 x0 : 0000000000000000
Call trace:
 gpiochip_line_is_valid+0x4/0x28 (P)
 pin_request+0x208/0x2c0
 pinmux_enable_setting+0xa0/0x2e0
 pinctrl_commit_state+0x150/0x26c
 pinctrl_enable+0x6c/0x2a4
 pinctrl_register+0x3c/0xb0
 devm_pinctrl_register+0x58/0xa0
 msm_pinctrl_probe+0x2a8/0x584
 sdm845_pinctrl_probe+0x20/0x88
 platform_probe+0x68/0xc0
 really_probe+0xbc/0x298
 __driver_probe_device+0x78/0x12c
 driver_probe_device+0x3c/0x160
 __device_attach_driver+0xb8/0x138
 bus_for_each_drv+0x84/0xe0
 __device_attach+0x9c/0x188
 device_initial_probe+0x14/0x20
 bus_probe_device+0xac/0xb0
 deferred_probe_work_func+0x8c/0xc8
 process_one_work+0x208/0x5e8
 worker_thread+0x1b4/0x35c
 kthread+0x144/0x220
 ret_from_fork+0x10/0x20
Code: b5fffba0 17fffff2 9432ec27 f9400400 (f9428800)

Fixes: 8015443e24 ("gpio: Hide valid_mask from direct assignments")
Reported-by: Doug Anderson <dianders@chromium.org>
Closes: https://lore.kernel.org/r/CAD=FV=Vg8_ZOLgLoC4WhFPzhVsxXFC19NrF38W6cW_W_3nFjbw@mail.gmail.com
Tested-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Reviewed-by: Matti Vaittinen <mazziesaccount@gmail.com>
Acked-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Link: https://lore.kernel.org/20250513-pinctrl-msm-fix-v2-1-249999af0fc1@oss.qualcomm.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2025-05-20 23:41:29 +02:00
Caleb Sander Mateos
f1774d9d4e io_uring/cmd: axe duplicate io_uring_cmd_import_fixed_vec() declaration
io_uring_cmd_import_fixed_vec() is declared in both
include/linux/io_uring/cmd.h and io_uring/uring_cmd.h. The declarations
are identical (if redundant) for CONFIG_IO_URING=y. But if
CONFIG_IO_URING=N, include/linux/io_uring/cmd.h declares the function as
static inline while io_uring/uring_cmd.h declares it as extern. This
causes linker errors if the declaration in io_uring/uring_cmd.h is used.

Remove the declaration in io_uring/uring_cmd.h to avoid linker errors
and prevent the declarations getting out of sync.

Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Fixes: ef49027529 ("io_uring/cmd: introduce io_uring_cmd_import_fixed_vec")
Link: https://lore.kernel.org/r/20250520193337.1374509-1-csander@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-20 14:36:41 -06:00
Marc Kleine-Budde
be19a9228d Merge patch series "can: kvaser_pciefd: Fix ISR race conditions"
Axel Forsman <axfo@kvaser.com> says:

This patch series fixes a couple of race conditions in the
kvaser_pciefd driver surfaced by enabling MSI interrupts and the new
Kvaser PCIe 8xCAN.

Changes since version 2:
* Rebase onto linux-can/main to resolve del_timer()/timer_delete()
  merge conflict.
* Reword 2nd commit message slightly.

Changes since version 1:
* Change type of srb_cmd_reg from "__le32 __iomem *" to
  "void __iomem *".
* Maintain TX FIFO count in driver instead of querying HW.
* Stop queue at end of .start_xmit() if full.

Link: https://patch.msgid.link/20250520114332.8961-1-axfo@kvaser.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-05-20 21:35:41 +02:00
Axel Forsman
6d820b81c4 can: kvaser_pciefd: Continue parsing DMA buf after dropped RX
Going bus-off on a channel doing RX could result in dropped packets.

As netif_running() gets cleared before the channel abort procedure,
the handling of any last RDATA packets would see netif_rx() return
non-zero to signal a dropped packet. kvaser_pciefd_read_buffer() dealt
with this "error" by breaking out of processing the remaining DMA RX
buffer.

Only return an error from kvaser_pciefd_read_buffer() due to packet
corruption, otherwise handle it internally.

Cc: stable@vger.kernel.org
Signed-off-by: Axel Forsman <axfo@kvaser.com>
Tested-by: Jimmy Assarsson <extja@kvaser.com>
Reviewed-by: Jimmy Assarsson <extja@kvaser.com>
Link: https://patch.msgid.link/20250520114332.8961-4-axfo@kvaser.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-05-20 21:35:39 +02:00
Axel Forsman
8256e0ca60 can: kvaser_pciefd: Fix echo_skb race
The functions kvaser_pciefd_start_xmit() and
kvaser_pciefd_handle_ack_packet() raced to stop/wake TX queues and
get/put echo skbs, as kvaser_pciefd_can->echo_lock was only ever taken
when transmitting and KCAN_TX_NR_PACKETS_CURRENT gets decremented
prior to handling of ACKs. E.g., this caused the following error:

    can_put_echo_skb: BUG! echo_skb 5 is occupied!

Instead, use the synchronization helpers in netdev_queues.h. As those
piggyback on BQL barriers, start updating in-flight packets and bytes
counts as well.

Cc: stable@vger.kernel.org
Signed-off-by: Axel Forsman <axfo@kvaser.com>
Tested-by: Jimmy Assarsson <extja@kvaser.com>
Reviewed-by: Jimmy Assarsson <extja@kvaser.com>
Link: https://patch.msgid.link/20250520114332.8961-3-axfo@kvaser.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-05-20 21:35:39 +02:00
Axel Forsman
9176bd205e can: kvaser_pciefd: Force IRQ edge in case of nested IRQ
Avoid the driver missing IRQs by temporarily masking IRQs in the ISR
to enforce an edge even if a different IRQ is signalled before handled
IRQs are cleared.

Fixes: 48f827d4f4 ("can: kvaser_pciefd: Move reset of DMA RX buffers to the end of the ISR")
Cc: stable@vger.kernel.org
Signed-off-by: Axel Forsman <axfo@kvaser.com>
Tested-by: Jimmy Assarsson <extja@kvaser.com>
Reviewed-by: Jimmy Assarsson <extja@kvaser.com>
Link: https://patch.msgid.link/20250520114332.8961-2-axfo@kvaser.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-05-20 21:35:39 +02:00
Linus Torvalds
b36ddb9210 Merge tag 'for-linus-6.15-ofs2' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux
Pull orangefs fix from Mike Marshall:
 "Fix for orangefs page writeout counting"

* tag 'for-linus-6.15-ofs2' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux:
  orangefs: adjust counting code to recover from 665575cf
2025-05-20 09:03:34 -07:00
Christoph Hellwig
355341e435 loop: don't require ->write_iter for writable files in loop_configure
Block devices can be opened read-write even if they can't be written to
for historic reasons.  Remove the check requiring file->f_op->write_iter
when the block devices was opened in loop_configure. The call to
loop_check_backing_file just below ensures the ->write_iter is present
for backing files opened for writing, which is the only check that is
actually needed.

Fixes: f5c84eff63 ("loop: Add sanity check for read/write_iter")
Reported-by: Christian Hesse <mail@eworm.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20250520135420.1177312-1-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-20 09:16:23 -06:00
Mike Marshall
219bf6edd7 orangefs: adjust counting code to recover from 665575cf
A late commit to 6.14-rc7! broke orangefs. 665575cf seems like a
good change, but maybe should have been introduced during the merge
window. This patch adjusts the counting code associated with
writing out pages so that orangefs works in a 665575cf world.

Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2025-05-20 11:07:00 -04:00
Chris Chiu
f709b78aec ALSA: hda/realtek - Add new HP ZBook laptop with micmute led fixup
New HP ZBook with Realtek HDA codec ALC3247 needs the quirk
ALC236_FIXUP_HP_GPIO_LED to fix the micmute LED.

Signed-off-by: Chris Chiu <chris.chiu@canonical.com>
Cc: <stable@vger.kernel.org>
Link: https://patch.msgid.link/20250520132101.120685-1-chris.chiu@canonical.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-05-20 16:09:06 +02:00
Stefan Binding
7150d57c37 ALSA: hda/realtek: Add support for HP Agusta using CS35L41 HDA
Add support for HP Agusta.

Laptops use 2 CS35L41 Amps with HDA, using Internal boost, with I2C

Signed-off-by: Stefan Binding <sbinding@opensource.cirrus.com>
Cc: <stable@vger.kernel.org>
Link: https://patch.msgid.link/20250520124757.12597-1-sbinding@opensource.cirrus.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-05-20 16:08:36 +02:00
Paolo Abeni
9e89db3d84 Merge tag 'linux-can-fixes-for-6.15-20250520' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:

====================
pull-request: can 2025-05-20

this is a pull request of 3 patches for net/main.

The 1st patch is by Rob Herring, and fixes the $id path in the
microchip,mcp2510.yaml device tree bindinds documentation.

The last 2 patches are from Oliver Hartkopp and fix a use-after-free
read and an out-of-bounds read in the CAN Broadcast Manager (BCM)
protocol.

linux-can-fixes-for-6.15-20250520

* tag 'linux-can-fixes-for-6.15-20250520' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
  can: bcm: add missing rcu read protection for procfs content
  can: bcm: add locking for bcm_op runtime updates
  dt-bindings: can: microchip,mcp2510: Fix $id path
====================

Link: https://patch.msgid.link/20250520091424.142121-1-mkl@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-20 15:54:37 +02:00
Mark Pearson
8508427a6e platform/x86: think-lmi: Fix attribute name usage for non-compliant items
A few, quite rare, WMI attributes have names that are not compatible with
filenames, e.g. "Intel VT for Directed I/O (VT-d)".
For these cases the '/' gets replaced with '\' for display, but doesn't
get switched again when doing the WMI access.

Fix this by keeping the original attribute name and using that for sending
commands to the BIOS

Fixes: a40cd7ef22 ("platform/x86: think-lmi: Add WMI interface support on Lenovo platforms")
Signed-off-by: Mark Pearson <mpearson-lenovo@squebb.ca>
Link: https://lore.kernel.org/r/20250520005027.3840705-1-mpearson-lenovo@squebb.ca
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-05-20 12:40:03 +03:00
Mark Pearson
29e4e6b423 platform/x86: thinkpad_acpi: Ignore battery threshold change event notification
If user modifies the battery charge threshold an ACPI event is generated.
Confirmed with Lenovo FW team this is only generated on user event. As no
action is needed, ignore the event and prevent spurious kernel logs.

Reported-by: Derek Barbosa <debarbos@redhat.com>
Closes: https://lore.kernel.org/platform-driver-x86/7e9a1c47-5d9c-4978-af20-3949d53fb5dc@app.fastmail.com/T/#m5f5b9ae31d3fbf30d7d9a9d76c15fb3502dfd903
Signed-off-by: Mark Pearson <mpearson-lenovo@squebb.ca>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Armin Wolf <W_Armin@gmx.de>
Link: https://lore.kernel.org/r/20250517023348.2962591-1-mpearson-lenovo@squebb.ca
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-05-20 12:35:10 +03:00
Ed Burcher
8d70503068 ALSA: hda/realtek: Add quirk for Lenovo Yoga Pro 7 14ASP10
Lenovo Yoga Pro 7 (gen 10) with Realtek ALC3306 and combined CS35L56
amplifiers need quirk ALC287_FIXUP_YOGA9_14IAP7_BASS_SPK_PIN to
enable bass

Signed-off-by: Ed Burcher <git@edburcher.com>
Cc: <stable@vger.kernel.org>
Link: https://patch.msgid.link/20250519224907.31265-2-git@edburcher.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-05-20 08:01:19 +02:00
Namjae Jeon
1f4bbedd4e ksmbd: fix stream write failure
If there is no stream data in file, v_len is zero.
So, If position(*pos) is zero, stream write will fail
due to stream write position validation check.
This patch reorganize stream write position validation.

Fixes: 0ca6df4f40 ("ksmbd: prevent out-of-bounds stream writes by validating *pos")
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-05-19 20:35:08 -05:00
Wang Zhaolong
e48f9d849b smb: client: Reset all search buffer pointers when releasing buffer
Multiple pointers in struct cifs_search_info (ntwrk_buf_start,
srch_entries_start, and last_entry) point to the same allocated buffer.
However, when freeing this buffer, only ntwrk_buf_start was set to NULL,
while the other pointers remained pointing to freed memory.

This is defensive programming to prevent potential issues with stale
pointers. While the active UAF vulnerability is fixed by the previous
patch, this change ensures consistent pointer state and more robust error
handling.

Signed-off-by: Wang Zhaolong <wangzhaolong1@huawei.com>
Cc: stable@vger.kernel.org
Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-05-19 20:29:06 -05:00
Zhang Rui
cf948c8e27 thermal: intel: x86_pkg_temp_thermal: Fix bogus trip temperature
The tj_max value obtained from the Intel TCC library are in Celsius,
whereas the thermal subsystem operates in milli-Celsius.

This discrepancy leads to incorrect trip temperature calculations.

Fix bogus trip temperature by converting tj_max to milli-Celsius Unit.

Fixes: 8ef0ca4a17 ("Merge back other thermal control material for 6.3.")
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reported-by: zhang ning <zhangn1985@outlook.com>
Closes: https://lore.kernel.org/all/TY2PR01MB3786EF0FE24353026293F5ACCD97A@TY2PR01MB3786.jpnprd01.prod.outlook.com/
Tested-by: zhang ning <zhangn1985@outlook.com>
Cc: 6.3+ <stable@vger.kernel.org> # 6.3+
Link: https://patch.msgid.link/20250519070901.1031233-1-rui.zhang@intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-05-19 21:08:39 +02:00
Dmitry Torokhov
ca39500f6a Input: synaptics-rmi - fix crash with unsupported versions of F34
Sysfs interface for updating firmware for RMI devices is available even
when F34 probe fails. The code checks for presence of F34 "container"
pointer and then tries to use the function data attached to the
sub-device. F34 assigns the function data early, before it knows if
probe will succeed, leaving behind a stale pointer.

Fix this by expanding checks to not only test for presence of F34
"container" but also check if there is driver data assigned to the
sub-device, and call dev_set_drvdata() only after we are certain that
probe is successful.

This is not a complete fix, since F34 will be freed during firmware
update, so there is still a race when fetching and accessing this
pointer. This race will be addressed in follow-up changes.

Reported-by: Hanno Böck <hanno@hboeck.de>
Fixes: 29fd0ec2bd ("Input: synaptics-rmi4 - add support for F34 device reflash")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/aBlAl6sGulam-Qcx@google.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-05-19 11:58:36 -07:00
Pavan Kumar Linga
2dabe349f7 idpf: fix null-ptr-deref in idpf_features_check
idpf_features_check is used to validate the TX packet. skb header
length is compared with the hardware supported value received from
the device control plane. The value is stored in the adapter structure
and to access it, vport pointer is used. During reset all the vports
are released and the vport pointer that the netdev private structure
points to is NULL.

To avoid null-ptr-deref, store the max header length value in netdev
private structure. This also helps to cache the value and avoid
accessing adapter pointer in hot path.

BUG: kernel NULL pointer dereference, address: 0000000000000068
...
RIP: 0010:idpf_features_check+0x6d/0xe0 [idpf]
Call Trace:
 <TASK>
 ? __die+0x23/0x70
 ? page_fault_oops+0x154/0x520
 ? exc_page_fault+0x76/0x190
 ? asm_exc_page_fault+0x26/0x30
 ? idpf_features_check+0x6d/0xe0 [idpf]
 netif_skb_features+0x88/0x310
 validate_xmit_skb+0x2a/0x2b0
 validate_xmit_skb_list+0x4c/0x70
 sch_direct_xmit+0x19d/0x3a0
 __dev_queue_xmit+0xb74/0xe70
 ...

Fixes: a251eee621 ("idpf: add SRIOV support and other ndo_ops")
Reviewed-by: Madhu Chititm <madhu.chittim@intel.com>
Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Samuel Salin <Samuel.salin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-05-19 08:38:22 -07:00
Dave Ertman
6c778f1b83 ice: Fix LACP bonds without SRIOV environment
If an aggregate has the following conditions:
- The SRIOV LAG DDP package has been enabled
- The bond is in 802.3ad LACP mode
- The bond is disqualified from supporting SRIOV VF LAG
- Both interfaces were added simultaneously to the bond (same command)

Then there is a chance that the two interfaces will be assigned different
LACP Aggregator ID's.  This will cause a failure of the LACP control over
the bond.

To fix this, we can detect if the primary interface for the bond (as
defined by the driver) is not in switchdev mode, and exit the setup flow
if so.

Reproduction steps:

%> ip link add bond0 type bond mode 802.3ad miimon 100
%> ip link set bond0 up
%> ifenslave bond0 eth0 eth1
%> cat /proc/net/bonding/bond0 | grep Agg

Check for Aggregator IDs that differ.

Fixes: ec5a6c5f79 ("ice: process events created by lag netdev event handler")
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-05-19 08:38:22 -07:00
Jacob Keller
bbd95160a0 ice: fix vf->num_mac count with port representors
The ice_vc_repr_add_mac() function indicates that it does not store the MAC
address filters in the firmware. However, it still increments vf->num_mac.
This is incorrect, as vf->num_mac should represent the number of MAC
filters currently programmed to firmware.

Indeed, we only perform this increment if the requested filter is a unicast
address that doesn't match the existing vf->hw_lan_addr. In addition,
ice_vc_repr_del_mac() does not decrement the vf->num_mac counter. This
results in the counter becoming out of sync with the actual count.

As it turns out, vf->num_mac is currently only used in legacy made without
port representors. The single place where the value is checked is for
enforcing a filter limit on untrusted VFs.

Upcoming patches to support VF Live Migration will use this value when
determining the size of the TLV for MAC address filters. Fix the
representor mode function to stop incrementing the counter incorrectly.

Fixes: ac19e03ef7 ("ice: allow process VF opcodes in different ways")
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-05-19 08:38:22 -07:00
Marc Kleine-Budde
8283fd51e6 Merge patch series "can: bcm: add locking for bcm_op runtime updates"
This series fixes an use-after-free read, and an out-of-bounds read in
the CAN Broadcast Manager (BCM) protocol found by Anderson Nascimento.

Link: https://patch.msgid.link/20250519125027.11900-1-socketcan@hartkopp.net
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-05-19 17:09:33 +02:00
Oliver Hartkopp
dac5e62491 can: bcm: add missing rcu read protection for procfs content
When the procfs content is generated for a bcm_op which is in the process
to be removed the procfs output might show unreliable data (UAF).

As the removal of bcm_op's is already implemented with rcu handling this
patch adds the missing rcu_read_lock() and makes sure the list entries
are properly removed under rcu protection.

Fixes: f1b4e32aca ("can: bcm: use call_rcu() instead of costly synchronize_rcu()")
Reported-by: Anderson Nascimento <anderson@allelesecurity.com>
Suggested-by: Anderson Nascimento <anderson@allelesecurity.com>
Tested-by: Anderson Nascimento <anderson@allelesecurity.com>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Link: https://patch.msgid.link/20250519125027.11900-2-socketcan@hartkopp.net
Cc: stable@vger.kernel.org # >= 5.4
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-05-19 16:58:19 +02:00
Oliver Hartkopp
c2aba69d0c can: bcm: add locking for bcm_op runtime updates
The CAN broadcast manager (CAN BCM) can send a sequence of CAN frames via
hrtimer. The content and also the length of the sequence can be changed
resp reduced at runtime where the 'currframe' counter is then set to zero.

Although this appeared to be a safe operation the updates of 'currframe'
can be triggered from user space and hrtimer context in bcm_can_tx().
Anderson Nascimento created a proof of concept that triggered a KASAN
slab-out-of-bounds read access which can be prevented with a spin_lock_bh.

At the rework of bcm_can_tx() the 'count' variable has been moved into
the protected section as this variable can be modified from both contexts
too.

Fixes: ffd980f976 ("[CAN]: Add broadcast manager (bcm) protocol")
Reported-by: Anderson Nascimento <anderson@allelesecurity.com>
Tested-by: Anderson Nascimento <anderson@allelesecurity.com>
Reviewed-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Link: https://patch.msgid.link/20250519125027.11900-1-socketcan@hartkopp.net
Cc: stable@vger.kernel.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-05-19 16:58:12 +02:00
Kent Overstreet
cbed8287e5 bcachefs: mkwrite() now only dirties one page
Don't dirty the whole folio - fixes write amplification with
applications doing mmaped writes.

https://www.reddit.com/r/bcachefs/comments/1klzcg1/incredible_amounts_of_write_amplification_when/

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-19 08:28:41 -04:00
Judith Mendez
71c9475b1e mmc: sdhci_am654: Add SDHCI_QUIRK2_SUPPRESS_V1P8_ENA quirk to am62 compatible
Add a new struct for platform data for the ti,am62-sdhci compatible to
apply additional quirks, namely "SDHCI_QUIRK2_SUPPRESS_V1P8_ENA", to
host controllers with am62 compatible.

Note, the fix was originally introduced by commit 941a7abd46
("mmc: sdhci_am654: Add sdhci_am654_start_signal_voltage_switch") but was
found to be applied too broadly and had to be reverted.

This fixes MMC init failures seen across am62x boards.

Fixes: ac5a41b472 ("Revert "mmc: sdhci_am654: Add sdhci_am654_start_signal_voltage_switch"")
Fixes: 941a7abd46 ("mmc: sdhci_am654: Add sdhci_am654_start_signal_voltage_switch")
Cc: stable@vger.kernel.org
Suggested-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Judith Mendez <jm@ti.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/20250516203121.3736379-1-jm@ti.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2025-05-19 14:26:12 +02:00
Rob Herring (Arm)
69c6d83d71 dt-bindings: can: microchip,mcp2510: Fix $id path
The "$id" value must match the relative path under bindings/ and is
missing the "net" sub-directory.

Fixes: 09328600c2 ("dt-bindings: can: convert microchip,mcp251x.txt to yaml")
Signed-off-by: "Rob Herring (Arm)" <robh@kernel.org>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://patch.msgid.link/20250507154201.1589542-1-robh@kernel.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-05-19 14:06:42 +02:00
Ilia Gavrilov
239af1970b llc: fix data loss when reading from a socket in llc_ui_recvmsg()
For SOCK_STREAM sockets, if user buffer size (len) is less
than skb size (skb->len), the remaining data from skb
will be lost after calling kfree_skb().

To fix this, move the statement for partial reading
above skb deletion.

Found by InfoTeCS on behalf of Linux Verification Center (linuxtesting.org)

Fixes: 30a584d944 ("[LLX]: SOCK_DGRAM interface fixes")
Cc: stable@vger.kernel.org
Signed-off-by: Ilia Gavrilov <Ilia.Gavrilov@infotecs.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-05-19 12:12:54 +01:00
feijuan.li
6692dbc15e drm/edid: fixed the bug that hdr metadata was not reset
When DP connected to a device with HDR capability,
the hdr structure was filled.Then connected to another
sink device without hdr capability, but the hdr info
still exist.

Fixes: e85959d6cb ("drm: Parse HDR metadata info from EDID")
Cc: <stable@vger.kernel.org> # v5.3+
Signed-off-by: "feijuan.li" <feijuan.li@samsung.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://lore.kernel.org/r/20250514063511.4151780-1-feijuan.li@samsung.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-05-19 12:26:08 +03:00
Ivan Pravdin
b2df03ed40 crypto: algif_hash - fix double free in hash_accept
If accept(2) is called on socket type algif_hash with
MSG_MORE flag set and crypto_ahash_import fails,
sk2 is freed. However, it is also freed in af_alg_release,
leading to slab-use-after-free error.

Fixes: fe869cdb89 ("crypto: algif_hash - User-space interface for hash operations")
Cc: <stable@vger.kernel.org>
Signed-off-by: Ivan Pravdin <ipravdin.official@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-05-19 13:44:16 +08:00
Dominik Grzegorzek
d6ebcde6d4 padata: do not leak refcount in reorder_work
A recent patch that addressed a UAF introduced a reference count leak:
the parallel_data refcount is incremented unconditionally, regardless
of the return value of queue_work(). If the work item is already queued,
the incremented refcount is never decremented.

Fix this by checking the return value of queue_work() and decrementing
the refcount when necessary.

Resolves:

Unreferenced object 0xffff9d9f421e3d80 (size 192):
  comm "cryptomgr_probe", pid 157, jiffies 4294694003
  hex dump (first 32 bytes):
    80 8b cf 41 9f 9d ff ff b8 97 e0 89 ff ff ff ff  ...A............
    d0 97 e0 89 ff ff ff ff 19 00 00 00 1f 88 23 00  ..............#.
  backtrace (crc 838fb36):
    __kmalloc_cache_noprof+0x284/0x320
    padata_alloc_pd+0x20/0x1e0
    padata_alloc_shell+0x3b/0xa0
    0xffffffffc040a54d
    cryptomgr_probe+0x43/0xc0
    kthread+0xf6/0x1f0
    ret_from_fork+0x2f/0x50
    ret_from_fork_asm+0x1a/0x30

Fixes: dd7d37ccf6 ("padata: avoid UAF for reorder_work")
Cc: <stable@vger.kernel.org>
Signed-off-by: Dominik Grzegorzek <dominik.grzegorzek@oracle.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-05-19 13:44:16 +08:00
Kent Overstreet
494d458cfa bcachefs: fix extent_has_stripe_ptr()
This wasn't checking indirect extents.

Fixes: https://github.com/koverstreet/bcachefs/issues/887
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-18 22:35:33 -04:00
Wang Zhaolong
a7a8fe56e9 smb: client: Fix use-after-free in cifs_fill_dirent
There is a race condition in the readdir concurrency process, which may
access the rsp buffer after it has been released, triggering the
following KASAN warning.

 ==================================================================
 BUG: KASAN: slab-use-after-free in cifs_fill_dirent+0xb03/0xb60 [cifs]
 Read of size 4 at addr ffff8880099b819c by task a.out/342975

 CPU: 2 UID: 0 PID: 342975 Comm: a.out Not tainted 6.15.0-rc6+ #240 PREEMPT(full)
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.1-2.fc37 04/01/2014
 Call Trace:
  <TASK>
  dump_stack_lvl+0x53/0x70
  print_report+0xce/0x640
  kasan_report+0xb8/0xf0
  cifs_fill_dirent+0xb03/0xb60 [cifs]
  cifs_readdir+0x12cb/0x3190 [cifs]
  iterate_dir+0x1a1/0x520
  __x64_sys_getdents+0x134/0x220
  do_syscall_64+0x4b/0x110
  entry_SYSCALL_64_after_hwframe+0x76/0x7e
 RIP: 0033:0x7f996f64b9f9
 Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89
 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01
 f0 ff ff  0d f7 c3 0c 00 f7 d8 64 89 8
 RSP: 002b:00007f996f53de78 EFLAGS: 00000207 ORIG_RAX: 000000000000004e
 RAX: ffffffffffffffda RBX: 00007f996f53ecdc RCX: 00007f996f64b9f9
 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000003
 RBP: 00007f996f53dea0 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000000 R11: 0000000000000207 R12: ffffffffffffff88
 R13: 0000000000000000 R14: 00007ffc8cd9a500 R15: 00007f996f51e000
  </TASK>

 Allocated by task 408:
  kasan_save_stack+0x20/0x40
  kasan_save_track+0x14/0x30
  __kasan_slab_alloc+0x6e/0x70
  kmem_cache_alloc_noprof+0x117/0x3d0
  mempool_alloc_noprof+0xf2/0x2c0
  cifs_buf_get+0x36/0x80 [cifs]
  allocate_buffers+0x1d2/0x330 [cifs]
  cifs_demultiplex_thread+0x22b/0x2690 [cifs]
  kthread+0x394/0x720
  ret_from_fork+0x34/0x70
  ret_from_fork_asm+0x1a/0x30

 Freed by task 342979:
  kasan_save_stack+0x20/0x40
  kasan_save_track+0x14/0x30
  kasan_save_free_info+0x3b/0x60
  __kasan_slab_free+0x37/0x50
  kmem_cache_free+0x2b8/0x500
  cifs_buf_release+0x3c/0x70 [cifs]
  cifs_readdir+0x1c97/0x3190 [cifs]
  iterate_dir+0x1a1/0x520
  __x64_sys_getdents64+0x134/0x220
  do_syscall_64+0x4b/0x110
  entry_SYSCALL_64_after_hwframe+0x76/0x7e

 The buggy address belongs to the object at ffff8880099b8000
  which belongs to the cache cifs_request of size 16588
 The buggy address is located 412 bytes inside of
  freed 16588-byte region [ffff8880099b8000, ffff8880099bc0cc)

 The buggy address belongs to the physical page:
 page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x99b8
 head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
 anon flags: 0x80000000000040(head|node=0|zone=1)
 page_type: f5(slab)
 raw: 0080000000000040 ffff888001e03400 0000000000000000 dead000000000001
 raw: 0000000000000000 0000000000010001 00000000f5000000 0000000000000000
 head: 0080000000000040 ffff888001e03400 0000000000000000 dead000000000001
 head: 0000000000000000 0000000000010001 00000000f5000000 0000000000000000
 head: 0080000000000003 ffffea0000266e01 00000000ffffffff 00000000ffffffff
 head: ffffffffffffffff 0000000000000000 00000000ffffffff 0000000000000008
 page dumped because: kasan: bad access detected

 Memory state around the buggy address:
  ffff8880099b8080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  ffff8880099b8100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 >ffff8880099b8180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                             ^
  ffff8880099b8200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  ffff8880099b8280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ==================================================================

POC is available in the link [1].

The problem triggering process is as follows:

Process 1                       Process 2
-----------------------------------------------------------------
cifs_readdir
  /* file->private_data == NULL */
  initiate_cifs_search
    cifsFile = kzalloc(sizeof(struct cifsFileInfo), GFP_KERNEL);
    smb2_query_dir_first ->query_dir_first()
      SMB2_query_directory
        SMB2_query_directory_init
        cifs_send_recv
        smb2_parse_query_directory
          srch_inf->ntwrk_buf_start = (char *)rsp;
          srch_inf->srch_entries_start = (char *)rsp + ...
          srch_inf->last_entry = (char *)rsp + ...
          srch_inf->smallBuf = true;
  find_cifs_entry
    /* if (cfile->srch_inf.ntwrk_buf_start) */
    cifs_small_buf_release(cfile->srch_inf // free

                        cifs_readdir  ->iterate_shared()
                          /* file->private_data != NULL */
                          find_cifs_entry
                            /* in while (...) loop */
                            smb2_query_dir_next  ->query_dir_next()
                              SMB2_query_directory
                                SMB2_query_directory_init
                                cifs_send_recv
                                  compound_send_recv
                                    smb_send_rqst
                                    __smb_send_rqst
                                      rc = -ERESTARTSYS;
                                      /* if (fatal_signal_pending()) */
                                      goto out;
                                      return rc
                            /* if (cfile->srch_inf.last_entry) */
                            cifs_save_resume_key()
                              cifs_fill_dirent // UAF
                            /* if (rc) */
                            return -ENOENT;

Fix this by ensuring the return code is checked before using pointers
from the srch_inf.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=220131 [1]
Fixes: a364bc0b37 ("[CIFS] fix saving of resume key before CIFSFindNext")
Cc: stable@vger.kernel.org
Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
Signed-off-by: Wang Zhaolong <wangzhaolong1@huawei.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-05-18 16:53:06 -05:00
Linus Torvalds
a5806cd506 Linux 6.15-rc7 2025-05-18 13:57:29 -07:00
Stephen Boyd
6a56880562 Merge tag 'sunxi-clk-fixes-for-6.15' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into clk-fixes
Pull Allwinner clk driver fixes from Chen-Yu Tsai:

Only two changes:

- Fix the order of arguments in clk macro for
  SUNXI_CCU_MP_DATA_WITH_MUX_GATE_FEAT that was recently introduced in
  v6.15-rc1

- Add missing post-divider for D1 MMC clocks to correct halved
  performance

* tag 'sunxi-clk-fixes-for-6.15' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux:
  clk: sunxi-ng: d1: Add missing divider for MMC mod clocks
  clk: sunxi-ng: fix order of arguments in clock macro
2025-05-17 20:46:38 -07:00
Kent Overstreet
49771a7578 bcachefs: Fix bch2_btree_path_traverse_cached() when paths realloced
btree_key_cache_fill() will allocate and traverse another path (for the
underlying btree), so we can't hold pointers to paths across a call - we
have to pass indices.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-17 18:46:17 -04:00
Linus Torvalds
5723cc3450 Merge tag 'dmaengine-fix-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine
Pull dmaengine fixes from Vinod Koul:
 "This has a bunch of idxd driver fixes, dmatest revert and bunch of
  smaller driver fixes:

   - a bunch of idxd potential mem leak fixes

   - dmatest revert for waiting for interrupt fix as that causes issue

   - a couple of ti k3 udma fixes for locking and cap_mask

   - mediatek deadlock fix and unused variable cleanup fix"

* tag 'dmaengine-fix-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine:
  dmaengine: mediatek: drop unused variable
  dmaengine: fsl-edma: Fix return code for unhandled interrupts
  dmaengine: mediatek: Fix a possible deadlock error in mtk_cqdma_tx_status()
  dmaengine: idxd: Fix ->poll() return value
  dmaengine: idxd: Refactor remove call with idxd_cleanup() helper
  dmaengine: idxd: Add missing idxd cleanup to fix memory leak in remove call
  dmaengine: idxd: fix memory leak in error handling path of idxd_pci_probe
  dmaengine: idxd: fix memory leak in error handling path of idxd_alloc
  dmaengine: idxd: Add missing cleanups in cleanup internals
  dmaengine: idxd: Add missing cleanup for early error out in idxd_setup_internals
  dmaengine: idxd: fix memory leak in error handling path of idxd_setup_groups
  dmaengine: idxd: fix memory leak in error handling path of idxd_setup_engines
  dmaengine: idxd: fix memory leak in error handling path of idxd_setup_wqs
  dmaengine: ptdma: Move variable condition check to the first place and remove redundancy
  dmaengine: idxd: Fix allowing write() from different address spaces
  dmaengine: ti: k3-udma: Add missing locking
  dmaengine: ti: k3-udma: Use cap_mask directly from dma_device structure instead of a local copy
  dmaengine: Revert "dmaengine: dmatest: Fix dmatest waiting less when interrupted"
  dmaengine: idxd: cdev: Fix uninitialized use of sva in idxd_cdev_open
2025-05-17 12:53:39 -07:00
Linus Torvalds
21eeefe769 Merge tag 'phy-fixes-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy
Pull phy fixes from Vinod Koul:
 "A bunch of renesas fixes and few smaller fixes in other drivers:

   - Rensas fixes for unbind ole detection, irq, locking etc

   - tegra fixes for error handling at init and UTMI power states and
     stray unlock fix

   - rockchip missing assignment and pll output fixes

   - startfive usb host detection fixes"

* tag 'phy-fixes-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy:
  phy: Fix error handling in tegra_xusb_port_init
  phy: renesas: rcar-gen3-usb2: Set timing registers only once
  phy: renesas: rcar-gen3-usb2: Assert PLL reset on PHY power off
  phy: renesas: rcar-gen3-usb2: Lock around hardware registers and driver data
  phy: renesas: rcar-gen3-usb2: Move IRQ request in probe
  phy: renesas: rcar-gen3-usb2: Fix role detection on unbind/bind
  phy: tegra: xusb: remove a stray unlock
  phy: phy-rockchip-samsung-hdptx: Fix PHY PLL output 50.25MHz error
  phy: starfive: jh7110-usb: Fix USB 2.0 host occasional detection failure
  phy: rockchip-samsung-dcphy: Add missing assignment
  phy: can-transceiver: Re-instate "mux-states" property presence check
  phy: qcom-qmp-ufs: check for mode type for phy setting
  phy: tegra: xusb: Use a bitmask for UTMI pad power state tracking
2025-05-17 12:49:47 -07:00
Linus Torvalds
6aa6f8ca08 Merge tag 'soundwire-6.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire
Pull soundwire fix from Vinod Koul:

 - Fix for irq domain creation race in the core

* tag 'soundwire-6.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire:
  soundwire: bus: Fix race on the creation of the IRQ domain
2025-05-17 12:04:19 -07:00
Linus Torvalds
e72e784fb1 Merge tag 'mm-hotfixes-stable-2025-05-17-09-41' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull hotfixes from Andrew Morton:
 "Nine singleton hotfixes, all MM.  Four are cc:stable"

* tag 'mm-hotfixes-stable-2025-05-17-09-41' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mm: userfaultfd: correct dirty flags set for both present and swap pte
  zsmalloc: don't underflow size calculation in zs_obj_write()
  mm/page_alloc: fix race condition in unaccepted memory handling
  mm/page_alloc: ensure try_alloc_pages() plays well with unaccepted memory
  MAINTAINERS: add mm GUP section
  mm/codetag: move tag retrieval back upfront in __free_pages()
  mm/memory: fix mapcount / refcount sanity check for mTHP reuse
  kernel/fork: only call untrack_pfn_clear() on VMAs duplicated for fork()
  mm: hugetlb: fix incorrect fallback for subpool
2025-05-17 10:56:52 -07:00
Linus Torvalds
205b2bd793 Merge tag 'irq-urgent-2025-05-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc irqchip driver fixes from Ingo Molnar:

 - Remove the MSI_CHIP_FLAG_SET_ACK flag from 5 irqchip drivers
   that did not require it

 - Fix IRQ handling delays in the riscv-imsic irqchip driver

* tag 'irq-urgent-2025-05-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/riscv-imsic: Start local sync timer on correct CPU
  irqchip: Drop MSI_CHIP_FLAG_SET_ACK from unsuspecting MSI drivers
2025-05-17 09:02:11 -07:00
Linus Torvalds
56b2b1fc90 Merge tag 'x86-urgent-2025-05-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc x86 fixes from Ingo Molnar:

 - Fix SEV-SNP kdump bugs

 - Update the email address of Alexey Makhalov in MAINTAINERS

 - Add the CPU feature flag for the Zen6 microarchitecture

 - Fix typo in system message

* tag 'x86-urgent-2025-05-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm: Remove duplicated word in warning message
  x86/CPU/AMD: Add X86_FEATURE_ZEN6
  x86/sev: Make sure pages are not skipped during kdump
  x86/sev: Do not touch VMSA pages during SNP guest memory kdump
  MAINTAINERS: Update Alexey Makhalov's email address
  x86/sev: Fix operator precedence in GHCB_MSR_VMPL_REQ_LEVEL macro
2025-05-17 08:43:51 -07:00
Linus Torvalds
4bcaa59009 Merge tag 'perf-urgent-2025-05-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 perf event fix from Ingo Molnar:
 "Fix PEBS-via-PT crash"

* tag 'perf-urgent-2025-05-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel: Fix segfault with PEBS-via-PT with sample_freq
2025-05-17 08:38:11 -07:00
Linus Torvalds
c586c97dd8 Merge tag 'loongarch-fixes-6.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch fixes from Huacai Chen:
 "Fix some bugs in kernel-fpu, cpu idle function, hibernation and
  uprobes"

* tag 'loongarch-fixes-6.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
  LoongArch: uprobes: Remove redundant code about resume_era
  LoongArch: uprobes: Remove user_{en,dis}able_single_step()
  LoongArch: Save and restore CSR.CNTC for hibernation
  LoongArch: Move __arch_cpu_idle() to .cpuidle.text section
  LoongArch: Fix MAX_REG_OFFSET calculation
  LoongArch: Prevent cond_resched() occurring within kernel-fpu
2025-05-17 07:48:25 -07:00
Linus Torvalds
a1317e1cb2 Merge tag 'i2c-for-6.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fix from Wolfram Sang:

 - designware: cleanup properly on probe failure

* tag 'i2c-for-6.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: designware: Fix an error handling path in i2c_dw_pci_probe()
2025-05-17 07:46:42 -07:00
Wolfram Sang
6c72fc56ab Merge tag 'i2c-host-fixes-6.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux into i2c/for-current
i2c-host-fixes for v6.15-rc7

- designware: cleanup properly on probe failure
2025-05-17 10:22:34 +02:00
Linus Torvalds
172a9d9433 Merge tag '6.15-rc6-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:

 - Fix memory leak in mkdir error path

 - Fix max rsize miscalculation after channel reconnect

* tag '6.15-rc6-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  smb: client: fix zero rsize error messages
  smb: client: fix memory leak during error handling for POSIX mkdir
2025-05-16 18:02:41 -07:00
Paolo Abeni
c46286fdd6 mr: consolidate the ipmr_can_free_table() checks.
Guoyu Yin reported a splat in the ipmr netns cleanup path:

WARNING: CPU: 2 PID: 14564 at net/ipv4/ipmr.c:440 ipmr_free_table net/ipv4/ipmr.c:440 [inline]
WARNING: CPU: 2 PID: 14564 at net/ipv4/ipmr.c:440 ipmr_rules_exit+0x135/0x1c0 net/ipv4/ipmr.c:361
Modules linked in:
CPU: 2 UID: 0 PID: 14564 Comm: syz.4.838 Not tainted 6.14.0 #1
Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
RIP: 0010:ipmr_free_table net/ipv4/ipmr.c:440 [inline]
RIP: 0010:ipmr_rules_exit+0x135/0x1c0 net/ipv4/ipmr.c:361
Code: ff df 48 c1 ea 03 80 3c 02 00 75 7d 48 c7 83 60 05 00 00 00 00 00 00 5b 5d 41 5c 41 5d 41 5e e9 71 67 7f 00 e8 4c 2d 8a fd 90 <0f> 0b 90 eb 93 e8 41 2d 8a fd 0f b6 2d 80 54 ea 01 31 ff 89 ee e8
RSP: 0018:ffff888109547c58 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffff888108c12dc0 RCX: ffffffff83e09868
RDX: ffff8881022b3300 RSI: ffffffff83e098d4 RDI: 0000000000000005
RBP: ffff888104288000 R08: 0000000000000000 R09: ffffed10211825c9
R10: 0000000000000001 R11: ffff88801816c4a0 R12: 0000000000000001
R13: ffff888108c13320 R14: ffff888108c12dc0 R15: fffffbfff0b74058
FS:  00007f84f39316c0(0000) GS:ffff88811b100000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f84f3930f98 CR3: 0000000113b56000 CR4: 0000000000350ef0
Call Trace:
 <TASK>
 ipmr_net_exit_batch+0x50/0x90 net/ipv4/ipmr.c:3160
 ops_exit_list+0x10c/0x160 net/core/net_namespace.c:177
 setup_net+0x47d/0x8e0 net/core/net_namespace.c:394
 copy_net_ns+0x25d/0x410 net/core/net_namespace.c:516
 create_new_namespaces+0x3f6/0xaf0 kernel/nsproxy.c:110
 unshare_nsproxy_namespaces+0xc3/0x180 kernel/nsproxy.c:228
 ksys_unshare+0x78d/0x9a0 kernel/fork.c:3342
 __do_sys_unshare kernel/fork.c:3413 [inline]
 __se_sys_unshare kernel/fork.c:3411 [inline]
 __x64_sys_unshare+0x31/0x40 kernel/fork.c:3411
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xa6/0x1a0 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f84f532cc29
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f84f3931038 EFLAGS: 00000246 ORIG_RAX: 0000000000000110
RAX: ffffffffffffffda RBX: 00007f84f5615fa0 RCX: 00007f84f532cc29
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000040000400
RBP: 00007f84f53fba18 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 00007f84f5615fa0 R15: 00007fff51c5f328
 </TASK>

The running kernel has CONFIG_IP_MROUTE_MULTIPLE_TABLES disabled, and
the sanity check for such build is still too loose.

Address the issue consolidating the relevant sanity check in a single
helper regardless of the kernel configuration. Also share it between
the ipv4 and ipv6 code.

Reported-by: Guoyu Yin <y04609127@gmail.com>
Fixes: 50b9420444 ("ipmr: tune the ipmr_can_free_table() checks.")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Link: https://patch.msgid.link/372dc261e1bf12742276e1b984fc5a071b7fc5a8.1747321903.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-16 17:53:48 -07:00
Lorenzo Bianconi
d6d2b0e153 net: airoha: Fix page recycling in airoha_qdma_rx_process()
Do not recycle the page twice in airoha_qdma_rx_process routine in case
of error. Just run dev_kfree_skb() if the skb has been allocated and marked
for recycling. Run page_pool_put_full_page() directly if the skb has not
been allocated yet.
Moreover, rely on DMA address from queue entry element instead of reading
it from the DMA descriptor for DMA syncing in airoha_qdma_rx_process().

Fixes: e12182ddb6 ("net: airoha: Enable Rx Scatter-Gather")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20250515-airoha-fix-rx-process-error-condition-v2-1-657e92c894b9@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-16 16:35:45 -07:00
Linus Torvalds
12b6c62c03 Merge tag 'drm-fixes-2025-05-17' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
 "Weekly drm fixes, I'll be honest and say I think this is larger than
  I'd prefer at this point, the main blow out point is that xe has two
  larger fixes.

  One is a fix for active context utilisation reporting, it's for a
  reported regression and will end up in stable anyways, so I don't see
  any point in holding it up.

  The second is a fix for mixed cpu/gpu atomics, which are currently
  broken, but are also not something your average desktop/laptop user is
  going to hit in normal operation, and having them fixed now is better
  than threading them through stable later.

  Other than those, it's mostly the usual, a bunch of amdgpu randoms and
  a few other minor fixes.

  dma-buf:
   - Avoid memory reordering in fence handling

  meson:
   - Avoid integer overflow in mode-clock calculations

  panel-mipi-dbi:
   - Fix output with drm_client_setup_with_fourcc()

  amdgpu:
   - Fix CSA unmap
   - Fix MALL size reporting on GFX11.5
   - AUX fix
   - DCN 3.5 fix
   - VRR fix
   - DP MST fix
   - DML 2.1 fixes
   - Silence DP AUX spam
   - DCN 4.0.1 cursor fix
   - VCN 4.0.5 fix

  ivpu:
   - Fix buffer size in debugfs code

  gpuvm:
   - Add timeslicing and allocation restriction for SVM

  xe:
   - Fix shrinker debugfs name
   - Add HW workaround to Xe2
   - Fix SVM when mixing GPU and CPU atomics
   - Fix per client engine utilization due to active contexts not saving
     timestamp with lite restore enabled"

* tag 'drm-fixes-2025-05-17' of https://gitlab.freedesktop.org/drm/kernel: (24 commits)
  drm/xe: Add WA BB to capture active context utilization
  drm/xe: Save the gt pointer in lrc and drop the tile
  drm/xe: Save CTX_TIMESTAMP mmio value instead of LRC value
  drm/xe: Timeslice GPU on atomic SVM fault
  drm/gpusvm: Add timeslicing support to GPU SVM
  drm/xe: Strict migration policy for atomic SVM faults
  drm/gpusvm: Introduce devmem_only flag for allocation
  drm/xe/xe2hpg: Add Wa_22021007897
  drm/amdgpu: read back register after written for VCN v4.0.5
  Revert "drm/amd/display: Hardware cursor changes color when switched to software cursor"
  dma-buf: insert memory barrier before updating num_fences
  drm/xe: Fix the gem shrinker name
  drm/amd/display: Avoid flooding unnecessary info messages
  drm/amd/display: Fix null check of pipe_ctx->plane_state for update_dchubp_dpp
  drm/amd/display: check stream id dml21 wrapper to get plane_id
  drm/amd/display: fix link_set_dpms_off multi-display MST corner case
  drm/amd/display: Defer BW-optimization-blocked DRR adjustments
  Revert: "drm/amd/display: Enable urgent latency adjustment on DCN35"
  drm/amd/display: Correct the reply value when AUX write incomplete
  drm/amdgpu: fix incorrect MALL size for GFX1151
  ...
2025-05-16 16:07:39 -07:00
Ronak Doshi
43f0999af0 vmxnet3: update MTU after device quiesce
Currently, when device mtu is updated, vmxnet3 updates netdev mtu, quiesces
the device and then reactivates it for the ESXi to know about the new mtu.
So, technically the OS stack can start using the new mtu before ESXi knows
about the new mtu.

This can lead to issues for TSO packets which use mss as per the new mtu
configured. This patch fixes this issue by moving the mtu write after
device quiesce.

Cc: stable@vger.kernel.org
Fixes: d1a890fa37 ("net: VMware virtual Ethernet NIC driver: vmxnet3")
Signed-off-by: Ronak Doshi <ronak.doshi@broadcom.com>
Acked-by: Guolin Yang <guolin.yang@broadcom.com>
Changes v1-> v2:
  Moved MTU write after destroy of rx rings
Link: https://patch.msgid.link/20250515190457.8597-1-ronak.doshi@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-16 16:04:27 -07:00
Ido Schimmel
91b6dbced0 bridge: netfilter: Fix forwarding of fragmented packets
When netfilter defrag hooks are loaded (due to the presence of conntrack
rules, for example), fragmented packets entering the bridge will be
defragged by the bridge's pre-routing hook (br_nf_pre_routing() ->
ipv4_conntrack_defrag()).

Later on, in the bridge's post-routing hook, the defragged packet will
be fragmented again. If the size of the largest fragment is larger than
what the kernel has determined as the destination MTU (using
ip_skb_dst_mtu()), the defragged packet will be dropped.

Before commit ac6627a28d ("net: ipv4: Consolidate ipv4_mtu and
ip_dst_mtu_maybe_forward"), ip_skb_dst_mtu() would return dst_mtu() as
the destination MTU. Assuming the dst entry attached to the packet is
the bridge's fake rtable one, this would simply be the bridge's MTU (see
fake_mtu()).

However, after above mentioned commit, ip_skb_dst_mtu() ends up
returning the route's MTU stored in the dst entry's metrics. Ideally, in
case the dst entry is the bridge's fake rtable one, this should be the
bridge's MTU as the bridge takes care of updating this metric when its
MTU changes (see br_change_mtu()).

Unfortunately, the last operation is a no-op given the metrics attached
to the fake rtable entry are marked as read-only. Therefore,
ip_skb_dst_mtu() ends up returning 1500 (the initial MTU value) and
defragged packets are dropped during fragmentation when dealing with
large fragments and high MTU (e.g., 9k).

Fix by moving the fake rtable entry's metrics to be per-bridge (in a
similar fashion to the fake rtable entry itself) and marking them as
writable, thereby allowing MTU changes to be reflected.

Fixes: 62fa8a846d ("net: Implement read-only protection and COW'ing of metrics.")
Fixes: 33eb9873a2 ("bridge: initialize fake_rtable metrics")
Reported-by: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
Closes: https://lore.kernel.org/netdev/PH0PR10MB4504888284FF4CBA648197D0ACB82@PH0PR10MB4504.namprd10.prod.outlook.com/
Tested-by: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/20250515084848.727706-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-16 16:02:06 -07:00
Jakob Unterwurzacher
ba54bce747 net: dsa: microchip: linearize skb for tail-tagging switches
The pointer arithmentic for accessing the tail tag only works
for linear skbs.

For nonlinear skbs, it reads uninitialized memory inside the
skb headroom, essentially randomizing the tag. I have observed
it gets set to 6 most of the time.

Example where ksz9477_rcv thinks that the packet from port 1 comes from port 6
(which does not exist for the ksz9896 that's in use), dropping the packet.
Debug prints added by me (not included in this patch):

	[  256.645337] ksz9477_rcv:323 tag0=6
	[  256.645349] skb len=47 headroom=78 headlen=0 tailroom=0
	               mac=(64,14) mac_len=14 net=(78,0) trans=78
	               shinfo(txflags=0 nr_frags=1 gso(size=0 type=0 segs=0))
	               csum(0x0 start=0 offset=0 ip_summed=0 complete_sw=0 valid=0 level=0)
	               hash(0x0 sw=0 l4=0) proto=0x00f8 pkttype=1 iif=3
	               priority=0x0 mark=0x0 alloc_cpu=0 vlan_all=0x0
	               encapsulation=0 inner(proto=0x0000, mac=0, net=0, trans=0)
	[  256.645377] dev name=end1 feat=0x0002e10200114bb3
	[  256.645386] skb headroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
	[  256.645395] skb headroom: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
	[  256.645403] skb headroom: 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
	[  256.645411] skb headroom: 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
	[  256.645420] skb headroom: 00000040: ff ff ff ff ff ff 00 1c 19 f2 e2 db 08 06
	[  256.645428] skb frag:     00000000: 00 01 08 00 06 04 00 01 00 1c 19 f2 e2 db 0a 02
	[  256.645436] skb frag:     00000010: 00 83 00 00 00 00 00 00 0a 02 a0 2f 00 00 00 00
	[  256.645444] skb frag:     00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01
	[  256.645452] ksz_common_rcv:92 dsa_conduit_find_user returned NULL

Call skb_linearize before trying to access the tag.

This patch fixes ksz9477_rcv which is used by the ksz9896 I have at
hand, and also applies the same fix to ksz8795_rcv which seems to have
the same problem.

Signed-off-by: Jakob Unterwurzacher <jakob.unterwurzacher@cherry.de>
CC: stable@vger.kernel.org
Fixes: 016e43a26b ("net: dsa: ksz: Add KSZ8795 tag code")
Fixes: 8b8010fb78 ("dsa: add support for Microchip KSZ tail tagging")
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://patch.msgid.link/20250515072920.2313014-1-jakob.unterwurzacher@cherry.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-16 16:00:17 -07:00
Stanislav Fomichev
6b1d3c5f67 team: grab team lock during team_change_rx_flags
Syzkaller reports the following issue:
BUG: sleeping function called from invalid context at kernel/locking/mutex.c:578

 netdev_lock include/linux/netdevice.h:2751 [inline]
 netdev_lock_ops include/net/netdev_lock.h:42 [inline]
 dev_set_promiscuity+0x10e/0x260 net/core/dev_api.c:285
 bond_set_promiscuity drivers/net/bonding/bond_main.c:922 [inline]
 bond_change_rx_flags+0x219/0x690 drivers/net/bonding/bond_main.c:4732
 dev_change_rx_flags net/core/dev.c:9145 [inline]
 __dev_set_promiscuity+0x3f5/0x590 net/core/dev.c:9189
 netif_set_promiscuity+0x50/0xe0 net/core/dev.c:9201
 dev_set_promiscuity+0x126/0x260 net/core/dev_api.c:286
 ^^ all of the above is under rcu lock
 team_change_rx_flags+0x1b3/0x330 drivers/net/team/team_core.c:1785
 dev_change_rx_flags net/core/dev.c:9145 [inline]
 __dev_set_promiscuity+0x3f5/0x590 net/core/dev.c:9189
 netif_set_promiscuity+0x50/0xe0 net/core/dev.c:9201
 dev_set_promiscuity+0x126/0x260 net/core/dev_api.c:286
 hsr_del_port+0x25e/0x2d0 net/hsr/hsr_slave.c:233
 hsr_netdev_notify+0x827/0xb60 net/hsr/hsr_main.c:104
 notifier_call_chain+0x1b3/0x3e0 kernel/notifier.c:85
 call_netdevice_notifiers_extack net/core/dev.c:2214 [inline]
 call_netdevice_notifiers net/core/dev.c:2228 [inline]
 unregister_netdevice_many_notify+0x15d8/0x2330 net/core/dev.c:11970
 rtnl_delete_link net/core/rtnetlink.c:3522 [inline]
 rtnl_dellink+0x488/0x710 net/core/rtnetlink.c:3564
 rtnetlink_rcv_msg+0x7cc/0xb70 net/core/rtnetlink.c:6955
 netlink_rcv_skb+0x219/0x490 net/netlink/af_netlink.c:2534
 netlink_unicast_kernel net/netlink/af_netlink.c:1313 [inline]
 netlink_unicast+0x758/0x8d0 net/netlink/af_netlink.c:1339
 netlink_sendmsg+0x805/0xb30 net/netlink/af_netlink.c:1883

team_change_rx_flags runs under rcu lock which means we can't grab
instance lock for the lower devices. Switch to team->lock, similar
to what we already do for team_set_mac_address and team_change_mtu.

Fixes: 78cd408356 ("net: add missing instance lock to dev_set_promiscuity")
Reported-by: syzbot+53485086a41dbb43270a@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=53485086a41dbb43270a
Link: https://lore.kernel.org/netdev/6822cc81.050a0220.f2294.00e8.GAE@google.com
Signed-off-by: Stanislav Fomichev <stfomichev@gmail.com>
Link: https://patch.msgid.link/20250514220319.3505158-1-stfomichev@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-16 15:47:48 -07:00
Sagi Maimon
c9e455581e ptp: ocp: Limit signal/freq counts in summary output functions
The debugfs summary output could access uninitialized elements in
the freq_in[] and signal_out[] arrays, causing NULL pointer
dereferences and triggering a kernel Oops (page_fault_oops).
This patch adds u8 fields (nr_freq_in, nr_signal_out) to track the
number of initialized elements, with a maximum of 4 per array.
The summary output functions are updated to respect these limits,
preventing out-of-bounds access and ensuring safe array handling.

Widen the label variables because the change confuses GCC about
max length of the strings.

Fixes: ef61f5528f ("ptp: ocp: add Adva timecard support")
Signed-off-by: Sagi Maimon <maimon.sagi@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20250514073541.35817-1-maimon.sagi@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-16 15:43:44 -07:00
Linus Torvalds
450d2f6e88 Merge tag 'nfs-for-6.15-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes from Trond Myklebust:

 - NFS: Fix a couple of missed handlers for the ENETDOWN and ENETUNREACH
   transport errors

 - NFS: Handle Oopsable failure of nfs_get_lock_context in the unlock
   path

 - NFSv4: Fix a race in nfs_local_open_fh()

 - NFSv4/pNFS: Fix a couple of layout segment leaks in layoutreturn

 - NFSv4/pNFS Avoid sharing pNFS DS connections between net namespaces
   since IP addresses are not guaranteed to refer to the same nodes

 - NFS: Don't flush file data while holding multiple directory locks in
   nfs_rename()

* tag 'nfs-for-6.15-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  NFS: Avoid flushing data while holding directory locks in nfs_rename()
  NFS/pnfs: Fix the error path in pnfs_layoutreturn_retry_later_locked()
  NFSv4/pnfs: Reset the layout state after a layoutreturn
  NFS/localio: Fix a race in nfs_local_open_fh()
  nfs: nfs3acl: drop useless assignment in nfs3_get_acl()
  nfs: direct: drop useless initializer in nfs_direct_write_completion()
  nfs: move the nfs4_data_server_cache into struct nfs_net
  nfs: don't share pNFS DS connections between net namespaces
  nfs: handle failure of nfs_get_lock_context in unlock path
  pNFS/flexfiles: Record the RPC errors in the I/O tracepoints
  NFSv4/pnfs: Layoutreturn on close must handle fatal networking errors
  NFSv4: Handle fatal ENETDOWN and ENETUNREACH errors
2025-05-16 14:29:12 -07:00
Trond Myklebust
dcd21b609d NFS: Avoid flushing data while holding directory locks in nfs_rename()
The Linux client assumes that all filehandles are non-volatile for
renames within the same directory (otherwise sillyrename cannot work).
However, the existence of the Linux 'subtree_check' export option has
meant that nfs_rename() has always assumed it needs to flush writes
before attempting to rename.

Since NFSv4 does allow the client to query whether or not the server
exhibits this behaviour, and since knfsd does actually set the
appropriate flag when 'subtree_check' is enabled on an export, it
should be OK to optimise away the write flushing behaviour in the cases
where it is clearly not needed.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
2025-05-16 22:31:35 +02:00
Trond Myklebust
28511504f3 NFS/pnfs: Fix the error path in pnfs_layoutreturn_retry_later_locked()
If there isn't a valid layout, or the layout stateid has changed, the
cleanup after a layout return should clear out the old data.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-05-16 22:31:35 +02:00
Trond Myklebust
6d6d7f91cc NFSv4/pnfs: Reset the layout state after a layoutreturn
If there are still layout segments in the layout plh_return_lsegs list
after a layout return, we should be resetting the state to ensure they
eventually get returned as well.

Fixes: 68f744797e ("pNFS: Do not free layout segments that are marked for return")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-05-16 22:31:35 +02:00
Linus Torvalds
83a896549f Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fix from James Bottomley:
 "Fix to zone block devices to make the maximum segment count match what
  the block layer is capable of"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: sd_zbc: block: Respect bio vector limits for REPORT ZONES buffer
2025-05-16 10:28:22 -07:00
Linus Torvalds
6462c247b2 Merge tag 'block-6.15-20250515' of git://git.kernel.dk/linux
Pull block fixes from Jens Axboe:

 - NVMe pull request via Christoph:
      - fixes for atomic writes (Alan Adamson)
      - fixes for polled CQs in nvmet-epf (Damien Le Moal)
      - fix for polled CQs in nvme-pci (Keith Busch)
      - fix compile on odd configs that need to be forced to inline
        (Kees Cook)
      - one more quirk (Ilya Guterman)

 - Fix for missing allocation of an integrity buffer for some cases

 - Fix for a regression with ublk command cancelation

* tag 'block-6.15-20250515' of git://git.kernel.dk/linux:
  ublk: fix dead loop when canceling io command
  nvme-pci: add NVME_QUIRK_NO_DEEPEST_PS quirk for SOLIDIGM P44 Pro
  nvme: all namespaces in a subsystem must adhere to a common atomic write size
  nvme: multipath: enable BLK_FEAT_ATOMIC_WRITES for multipathing
  nvmet: pci-epf: remove NVMET_PCI_EPF_Q_IS_SQ
  nvmet: pci-epf: improve debug message
  nvmet: pci-epf: cleanup nvmet_pci_epf_raise_irq()
  nvmet: pci-epf: do not fall back to using INTX if not supported
  nvmet: pci-epf: clear completion queue IRQ flag on delete
  nvme-pci: acquire cq_poll_lock in nvme_poll_irqdisable
  nvme-pci: make nvme_pci_npages_prp() __always_inline
  block: always allocate integrity buffer when required
2025-05-16 10:21:25 -07:00
Linus Torvalds
e2661da1b3 Merge tag 'io_uring-6.15-20250515' of git://git.kernel.dk/linux
Pull io_uring fixes from Jens Axboe:

 - Fix a regression with highmem and mapping of regions, where
   the coalescing code assumes any page is directly mapped

 - Fix an issue with HYBRID_IOPOLL and passthrough commands,
   where the timer wasn't always setup correctly

 - Fix an issue with fdinfo not correctly locking around reading
   the rings, which can be an issue if the ring is being resized
   at the same time

* tag 'io_uring-6.15-20250515' of git://git.kernel.dk/linux:
  io_uring/fdinfo: grab ctx->uring_lock around io_uring_show_fdinfo()
  io_uring/memmap: don't use page_address() on a highmem page
  io_uring/uring_cmd: fix hybrid polling initialization issue
2025-05-16 09:59:24 -07:00
Linus Torvalds
1524cb2830 Merge tag 'xfs-fixes-6.15-rc7' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs fixes from Carlos Maiolino:
 "This includes a bug fix for a possible data corruption vector on the
  zoned allocator garbage collector"

* tag 'xfs-fixes-6.15-rc7' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: Fix comment on xfs_trans_ail_update_bulk()
  xfs: Fix a comment on xfs_ail_delete
  xfs: Fail remount with noattr2 on a v5 with v4 enabled
  xfs: fix zoned GC data corruption due to wrong bv_offset
  xfs: free up mp->m_free[0].count in error case
2025-05-16 09:51:49 -07:00
Linus Torvalds
3c21441eef Merge tag 'acpi-6.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fix from Rafael Wysocki:
 "Fix ACPI PPTT parsing code to address a regression introduced recently
  and add more sanity checking of data supplied by the platform firmware
  to avoid using invalid data (Jeremy Linton)"

* tag 'acpi-6.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: PPTT: Fix processor subtable walk
2025-05-16 09:40:07 -07:00
Linus Torvalds
bd8bb9f38a Merge tag 'spi-fix-v6.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
 "A few small driver specific fixes, the most substantial one being the
  Tegra one which fixes spurious errors with default delays for chip
  select hold times"

* tag 'spi-fix-v6.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: spi-sun4i: fix early activation
  spi: tegra114: Use value to check for invalid delays
  spi: loopback-test: Do not split 1024-byte hexdumps
2025-05-16 09:24:54 -07:00
Linus Torvalds
8303d1ad77 Merge tag 'regulator-fix-v6.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator fix from Mark Brown:
 "This fixes an invalid memory access in the MAX20086 driver which could
  occur during error handling for failed probe due to a hidden use of
  devres in the core DT parsing code"

* tag 'regulator-fix-v6.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: max20086: fix invalid memory access
2025-05-16 09:19:37 -07:00
Linus Torvalds
7dc774fde7 Merge tag 'gpio-fixes-for-v6.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:

 - fix an interrupt storm on system wake-up in gpio-pca953x

 - fix an out-of-bounds write in gpio-virtuser

 - update MAINTAINERS with an entry for the sloppy logic analyzer

* tag 'gpio-fixes-for-v6.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpio: virtuser: fix potential out-of-bound write
  gpio: pca953x: fix IRQ storm on system wake up
  MAINTAINERS: add me as maintainer for the gpio sloppy logic analyzer
2025-05-16 09:13:51 -07:00
Linus Torvalds
04811c3eb6 Merge tag 'sound-6.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
 "A handful small fixes. The only significant change is the fix for MIDI
  2.0 UMP handling in ALSA sequencer, but as MIDI 2.0 stuff is still new
  and rarely used, the impact should be pretty limited.

  Other than that, quirks for USB-audio and a few cosmetic fixes and
  changes in drivers that should be safe to apply"

* tag 'sound-6.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: usb-audio: Add sample rate quirk for Microdia JP001 USB Camera
  ALSA: es1968: Add error handling for snd_pcm_hw_constraint_pow2()
  ALSA: sh: SND_AICA should depend on SH_DMA_API
  ALSA: usb-audio: Add sample rate quirk for Audioengine D1
  ALSA: ump: Fix a typo of snd_ump_stream_msg_device_info
  ALSA/hda: intel-sdw-acpi: Correct sdw_intel_acpi_scan() function parameter
  ALSA: seq: Fix delivery of UMP events to group ports
2025-05-16 09:06:12 -07:00
Takashi Iwai
7672135ddb Merge tag 'asoc-fix-v6.15-rc6' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v6.15

A collection of driver specific fixes that built up over the past few
weeks, plus one core patch which fixes handling of topology files where
some DAI names are substrings of others.
2025-05-16 14:46:53 +02:00
Kailang Yang
5ad8a4ddc4 ALSA: hda/realtek - restore auto-mute mode for Dell Chrome platform
This board need to shutdown Class-D amp to avoid EMI issue.
Restore the Auto-Mute mode item will off pin control when Auto-mute mode was enable.

Signed-off-by: Kailang Yang <kailang@realtek.com>
Links: https://lore.kernel.org/ee8bbe5236464c369719d96269ba8ef8@realtek.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-05-16 11:31:49 +02:00
Lukas Bulwahn
0368091374 x86/mm: Remove duplicated word in warning message
Commit bbeb69ce30 ("x86/mm: Remove CONFIG_HIGHMEM64G support") introduces
a new warning message MSG_HIGHMEM_TRIMMED, which accidentally introduces a
duplicated 'for for' in the warning message.

Remove this duplicated word.

This was noticed while reviewing for references to obsolete kernel build
config options.

Fixes: bbeb69ce30 ("x86/mm: Remove CONFIG_HIGHMEM64G support")
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-janitors@vger.kernel.org
Link: https://lore.kernel.org/r/20250516090810.556623-1-lukas.bulwahn@redhat.com
2025-05-16 11:16:52 +02:00
Takashi Iwai
93a81ca065 ALSA: pcm: Fix race of buffer access at PCM OSS layer
The PCM OSS layer tries to clear the buffer with the silence data at
initialization (or reconfiguration) of a stream with the explicit call
of snd_pcm_format_set_silence() with runtime->dma_area.  But this may
lead to a UAF because the accessed runtime->dma_area might be freed
concurrently, as it's performed outside the PCM ops.

For avoiding it, move the code into the PCM core and perform it inside
the buffer access lock, so that it won't be changed during the
operation.

Reported-by: syzbot+32d4647f551007595173@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/68164d8e.050a0220.11da1b.0019.GAE@google.com
Cc: <stable@vger.kernel.org>
Link: https://patch.msgid.link/20250516080817.20068-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-05-16 10:09:02 +02:00
Dave Airlie
c81dbc490b Merge tag 'drm-xe-fixes-2025-05-15-1' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes
Core Changes:
- Add timeslicing and allocation restriction for SVM

Driver Changes:
- Fix shrinker debugfs name
- Add HW workaround to Xe2
- Fix SVM when mixing GPU and CPU atomics
- Fix per client engine utilization due to active contexts
  not saving timestamp with lite restore enabled.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/qil4scyn6ucnt43u5ju64bi7r7n5r36k4pz5rsh2maz7isle6g@lac3jpsjrrvs
2025-05-16 11:21:35 +10:00
Dave Airlie
f7bf6bdbf0 Merge tag 'drm-misc-fixes-2025-05-15' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
Short summary of fixes pull:

dma-buf:
- Avoid memory reordering in fence handling

ivpu:
- Fix buffer size in debugfs code

meson:
- Avoid integer overflow in mode-clock calculations

panel-mipi-dbi:
- Fix output with drm_client_setup_with_fourcc()

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://lore.kernel.org/r/20250515125534.GA41174@linux.fritz.box
2025-05-16 10:57:28 +10:00
Jakub Kicinski
c39b1bb5bc Merge tag 'for-net-2025-05-15' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth
Luiz Augusto von Dentz says:

====================
bluetooth pull request for net:

 - btusb: use skb_pull to avoid unsafe access in QCA dump handling
 - L2CAP: Fix not checking l2cap_chan security level

* tag 'for-net-2025-05-15' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
  Bluetooth: btusb: use skb_pull to avoid unsafe access in QCA dump handling
  Bluetooth: L2CAP: Fix not checking l2cap_chan security level
====================

Link: https://patch.msgid.link/20250515171909.1606243-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-15 17:37:12 -07:00
Dave Airlie
34f7151d99 Merge tag 'amd-drm-fixes-6.15-2025-05-14' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-6.15-2025-05-14:

amdgpu:
- Fix CSA unmap
- Fix MALL size reporting on GFX11.5
- AUX fix
- DCN 3.5 fix
- VRR fix
- DP MST fix
- DML 2.1 fixes
- Silence DP AUX spam
- DCN 4.0.1 cursor fix
- VCN 4.0.5 fix

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20250514185117.758496-1-alexander.deucher@amd.com
2025-05-16 09:07:40 +10:00
Linus Torvalds
fee3e843b3 Merge tag 'bcachefs-2025-05-15' of git://evilpiepirate.org/bcachefs
Pull bcachefs fixes from Kent Overstreet:
 "The main user reported ones are:

   - Fix a btree iterator locking inconsistency that's been causing us
     to go emergency read-only in evacuate: "Fix broken btree_path lock
     invariants in next_node()"

   - Minor btree node cache reclaim tweak that should help with OOMs:
     don't set btree nodes as accessed on fill

   - Fix a bch2_bkey_clear_rebalance() issue that was causing rebalance
     to do needless work"

* tag 'bcachefs-2025-05-15' of git://evilpiepirate.org/bcachefs:
  bcachefs: fix wrong arg to fsck_err()
  bcachefs: Fix missing commit in backpointer to missing target
  bcachefs: Fix accidental O(n^2) in fiemap
  bcachefs: Fix set_should_be_locked() call in peek_slot()
  bcachefs: Fix self deadlock
  bcachefs: Don't set btree nodes as accessed on fill
  bcachefs: Fix livelock in journal_entry_open()
  bcachefs: Fix broken btree_path lock invariants in next_node()
  bcachefs: Don't strip rebalance_opts from indirect extents
2025-05-15 14:20:48 -07:00
Linus Torvalds
4d0be1aa26 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma fixes from Jason Gunthorpe:
 "Four small fixes for crashes:

   - Double free in rxe

   - UAF in irdma from early freeing the rf

   - Off by one undoing the IRQ allocations during error unwind in irdma

   - Another race with device rename and uevent generation. uevents
     accesses the struct device name and UAF when it is changed"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  RDMA/core: Fix "KASAN: slab-use-after-free Read in ib_register_device" problem
  ice, irdma: fix an off by one in error handling code
  irdma: free iwdev->rf after removing MSI-X
  RDMA/rxe: Fix slab-use-after-free Read in rxe_queue_cleanup bug
2025-05-15 12:26:09 -07:00
Linus Torvalds
f1f2797f8a Merge tag 'landlock-6.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux
Pull landlock fixes from Mickaël Salaün:
 "This fixes a KUnit issue, simplifies code, and adds new tests"

* tag 'landlock-6.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux:
  landlock: Improve bit operations in audit code
  landlock: Remove KUnit test that triggers a warning
2025-05-15 11:04:36 -07:00
Linus Torvalds
fe661d01ab Merge tag 'hid-for-linus-2025051501' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
Pull HID fixes from Benjamin Tissoires:

 - fix a few potential memory leaks in the wacom driver (Qasim Ijaz)

 - AMD SFH fixes when there is only one SRA sensor (Mario Limonciello)

 - HID-BPF dispatch UAF fix that happens on removal of the Logitech DJ
   receiver (Rong Zhang)

 - various minor fixes and usual device ID additions

* tag 'hid-for-linus-2025051501' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
  HID: bpf: abort dispatch if device destroyed
  HID: quirks: Add ADATA XPG alpha wireless mouse support
  HID: hid-steam: Remove the unused variable connected
  HID: amd_sfh: Avoid clearing reports for SRA sensor
  HID: amd_sfh: Fix SRA sensor when it's the only sensor
  HID: wacom: fix shift OOB in kfifo allocation for zero pktlen
  HID: uclogic: Add NULL check in uclogic_input_configured()
  HID: wacom: fix memory leak on size mismatch in wacom_wac_queue_flush()
  HID: wacom: handle kzalloc() allocation failure in wacom_wac_queue_flush()
  HID: thrustmaster: fix memory leak in thrustmaster_interrupts()
  HID: hid-appletb-kbd: Fix wrong date and kernel version in sysfs interface docs
  HID: bpf: fix BTN_STYLUS for the XP Pen ACK05 remote
2025-05-15 10:57:24 -07:00
Linus Torvalds
ef935650e0 Merge tag 'net-6.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
 "Including fixes from Bluetooth and wireless.

  A few more fixes for the locking changes trickling in. Nothing too
  alarming, I suspect those will continue for another release. Other
  than that things are slowing down nicely.

  Current release - fix to a fix:

   - Bluetooth: hci_event: use key encryption size when its known

   - tools: ynl-gen: allow multi-attr without nested-attributes again

  Current release - regressions:

   - locking fixes:
      - lock lower level devices when updating features
      - eth: bnxt_en: bring back rtnl_lock() in the bnxt_open() path
      - devmem: fix panic when Netlink socket closes after module unload

  Current release - new code bugs:

   - eth: txgbe: fixes for FW communication on new AML devices

  Previous releases - always broken:

   - sched: flush gso_skb list too during ->change(), avoid potential
     null-deref on reconfig

   - wifi: mt76: disable NAPI on driver removal

   - hv_netvsc: fix error 'nvsp_rndis_pkt_complete error status: 2'"

* tag 'net-6.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (44 commits)
  net: devmem: fix kernel panic when netlink socket close after module unload
  tsnep: fix timestamping with a stacked DSA driver
  net/tls: fix kernel panic when alloc_page failed
  bnxt_en: bring back rtnl_lock() in the bnxt_open() path
  mlxsw: spectrum_router: Fix use-after-free when deleting GRE net devices
  wifi: mac80211: Set n_channels after allocating struct cfg80211_scan_request
  octeontx2-pf: Do not reallocate all ntuple filters
  wifi: mt76: mt7925: fix missing hdr_trans_tlv command for broadcast wtbl
  wifi: mt76: disable napi on driver removal
  Drivers: hv: vmbus: Remove vmbus_sendpacket_pagebuffer()
  hv_netvsc: Remove rmsg_pgcnt
  hv_netvsc: Preserve contiguous PFN grouping in the page buffer array
  hv_netvsc: Use vmbus_sendpacket_mpb_desc() to send VMBus messages
  Drivers: hv: Allow vmbus_sendpacket_mpb_desc() to create multiple ranges
  octeontx2-af: Fix CGX Receive counters
  net: ethernet: mtk_eth_soc: fix typo for declaration MT7988 ESW capability
  net: libwx: Fix FW mailbox unknown command
  net: libwx: Fix FW mailbox reply timeout
  net: txgbe: Fix to calculate EEPROM checksum for AML devices
  octeontx2-pf: macsec: Fix incorrect max transmit size in TX secy
  ...
2025-05-15 10:40:49 -07:00
En-Wei Wu
4bcb0c7dc2 Bluetooth: btusb: use skb_pull to avoid unsafe access in QCA dump handling
Use skb_pull() and skb_pull_data() to safely parse QCA dump packets.

This avoids direct pointer math on skb->data, which could lead to
invalid access if the packet is shorter than expected.

Fixes: 20981ce2d5 ("Bluetooth: btusb: Add WCN6855 devcoredump support")
Signed-off-by: En-Wei Wu <en-wei.wu@canonical.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2025-05-15 13:12:54 -04:00
Luiz Augusto von Dentz
7af8479d9e Bluetooth: L2CAP: Fix not checking l2cap_chan security level
l2cap_check_enc_key_size shall check the security level of the
l2cap_chan rather than the hci_conn since for incoming connection
request that may be different as hci_conn may already been
encrypted using a different security level.

Fixes: 522e9ed157 ("Bluetooth: l2cap: Check encryption key size on incoming connection")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2025-05-15 13:09:46 -04:00
Ming Lei
dd24f87f65 ublk: fix dead loop when canceling io command
Commit:

f40139fde5 ("ublk: fix race between io_uring_cmd_complete_in_task and
		ublk_cancel_cmd")

adds a request state check in ublk_cancel_cmd(), and if the request is
started, skips canceling this uring_cmd.

However, the current uring_cmd may be in ACTIVE state, without block
request coming to the uring command. Meantime, if the cached request in
tag_set.tags[tag] has been delivered to ublk server and reycycled, then
this uring_cmd can't be canceled.

ublk requests are aborted in ublk char device release handler, which
depends on canceling all ACTIVE uring_cmd. So it causes a dead loop.

Fix this issue by not taking a stale request into account when canceling
uring_cmd in ublk_cancel_cmd().

Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Closes: https://lore.kernel.org/linux-block/mruqwpf4tqenkbtgezv5oxwq7ngyq24jzeyqy4ixzvivatbbxv@4oh2wzz4e6qn/
Fixes: f40139fde5 ("ublk: fix race between io_uring_cmd_complete_in_task and ublk_cancel_cmd")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250515162601.77346-1-ming.lei@redhat.com
[axboe: rewording of commit message]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-15 10:53:41 -06:00
Adrian Hunter
99bcd91fab perf/x86/intel: Fix segfault with PEBS-via-PT with sample_freq
Currently, using PEBS-via-PT with a sample frequency instead of a sample
period, causes a segfault.  For example:

    BUG: kernel NULL pointer dereference, address: 0000000000000195
    <NMI>
    ? __die_body.cold+0x19/0x27
    ? page_fault_oops+0xca/0x290
    ? exc_page_fault+0x7e/0x1b0
    ? asm_exc_page_fault+0x26/0x30
    ? intel_pmu_pebs_event_update_no_drain+0x40/0x60
    ? intel_pmu_pebs_event_update_no_drain+0x32/0x60
    intel_pmu_drain_pebs_icl+0x333/0x350
    handle_pmi_common+0x272/0x3c0
    intel_pmu_handle_irq+0x10a/0x2e0
    perf_event_nmi_handler+0x2a/0x50

That happens because intel_pmu_pebs_event_update_no_drain() assumes all the
pebs_enabled bits represent counter indexes, which is not always the case.
In this particular case, bits 60 and 61 are set for PEBS-via-PT purposes.

The behaviour of PEBS-via-PT with sample frequency is questionable because
although a PMI is generated (PEBS_PMI_AFTER_EACH_RECORD), the period is not
adjusted anyway.

Putting that aside, fix intel_pmu_pebs_event_update_no_drain() by passing
the mask of counter bits instead of 'size'.  Note, prior to the Fixes
commit, 'size' would be limited to the maximum counter index, so the issue
was not hit.

Fixes: 722e42e45c ("perf/x86: Support counter mask")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: linux-perf-users@vger.kernel.org
Link: https://lore.kernel.org/r/20250508134452.73960-1-adrian.hunter@intel.com
2025-05-15 18:15:54 +02:00
Taehee Yoo
0afc44d8cd net: devmem: fix kernel panic when netlink socket close after module unload
Kernel panic occurs when a devmem TCP socket is closed after NIC module
is unloaded.

This is Devmem TCP unregistration scenarios. number is an order.
(a)netlink socket close    (b)pp destroy    (c)uninstall    result
1                          2                3               OK
1                          3                2               (d)Impossible
2                          1                3               OK
3                          1                2               (e)Kernel panic
2                          3                1               (d)Impossible
3                          2                1               (d)Impossible

(a) netdev_nl_sock_priv_destroy() is called when devmem TCP socket is
    closed.
(b) page_pool_destroy() is called when the interface is down.
(c) mp_ops->uninstall() is called when an interface is unregistered.
(d) There is no scenario in mp_ops->uninstall() is called before
    page_pool_destroy().
    Because unregister_netdevice_many_notify() closes interfaces first
    and then calls mp_ops->uninstall().
(e) netdev_nl_sock_priv_destroy() accesses struct net_device to acquire
    netdev_lock().
    But if the interface module has already been removed, net_device
    pointer is invalid, so it causes kernel panic.

In summary, there are only 3 possible scenarios.
 A. sk close -> pp destroy -> uninstall.
 B. pp destroy -> sk close -> uninstall.
 C. pp destroy -> uninstall -> sk close.

Case C is a kernel panic scenario.

In order to fix this problem, It makes mp_dmabuf_devmem_uninstall() set
binding->dev to NULL.
It indicates an bound net_device was unregistered.

It makes netdev_nl_sock_priv_destroy() do not acquire netdev_lock()
if binding->dev is NULL.

A new binding->lock is added to protect a dev of a binding.
So, lock ordering is like below.
 priv->lock
 netdev_lock(dev)
 binding->lock

Tests:
Scenario A:
    ./ncdevmem -s 192.168.1.4 -c 192.168.1.2 -f $interface -l -p 8000 \
        -v 7 -t 1 -q 1 &
    pid=$!
    sleep 10
    kill $pid
    ip link set $interface down
    modprobe -rv $module

Scenario B:
    ./ncdevmem -s 192.168.1.4 -c 192.168.1.2 -f $interface -l -p 8000 \
        -v 7 -t 1 -q 1 &
    pid=$!
    sleep 10
    ip link set $interface down
    kill $pid
    modprobe -rv $module

Scenario C:
    ./ncdevmem -s 192.168.1.4 -c 192.168.1.2 -f $interface -l -p 8000 \
        -v 7 -t 1 -q 1 &
    pid=$!
    sleep 10
    modprobe -rv $module
    sleep 5
    kill $pid

Splat looks like:
Oops: general protection fault, probably for non-canonical address 0xdffffc001fffa9f7: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN NOPTI
KASAN: probably user-memory-access in range [0x00000000fffd4fb8-0x00000000fffd4fbf]
CPU: 0 UID: 0 PID: 2041 Comm: ncdevmem Tainted: G    B   W           6.15.0-rc1+ #2 PREEMPT(undef)  0947ec89efa0fd68838b78e36aa1617e97ff5d7f
Tainted: [B]=BAD_PAGE, [W]=WARN
RIP: 0010:__mutex_lock (./include/linux/sched.h:2244 kernel/locking/mutex.c:400 kernel/locking/mutex.c:443 kernel/locking/mutex.c:605 kernel/locking/mutex.c:746)
Code: ea 03 80 3c 02 00 0f 85 4f 13 00 00 49 8b 1e 48 83 e3 f8 74 6a 48 b8 00 00 00 00 00 fc ff df 48 8d 7b 34 48 89 fa 48 c1 ea 03 <0f> b6 f
RSP: 0018:ffff88826f7ef730 EFLAGS: 00010203
RAX: dffffc0000000000 RBX: 00000000fffd4f88 RCX: ffffffffaa9bc811
RDX: 000000001fffa9f7 RSI: 0000000000000008 RDI: 00000000fffd4fbc
RBP: ffff88826f7ef8b0 R08: 0000000000000000 R09: ffffed103e6aa1a4
R10: 0000000000000007 R11: ffff88826f7ef442 R12: fffffbfff669f65e
R13: ffff88812a830040 R14: ffff8881f3550d20 R15: 00000000fffd4f88
FS:  0000000000000000(0000) GS:ffff888866c05000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000563bed0cb288 CR3: 00000001a7c98000 CR4: 00000000007506f0
PKRU: 55555554
Call Trace:
<TASK>
 ...
 netdev_nl_sock_priv_destroy (net/core/netdev-genl.c:953 (discriminator 3))
 genl_release (net/netlink/genetlink.c:653 net/netlink/genetlink.c:694 net/netlink/genetlink.c:705)
 ...
 netlink_release (net/netlink/af_netlink.c:737)
 ...
 __sock_release (net/socket.c:647)
 sock_close (net/socket.c:1393)

Fixes: 1d22d3060b ("net: drop rtnl_lock for queue_mgmt operations")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250514154028.1062909-1-ap420073@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-15 08:05:32 -07:00
Gerhard Engleder
b3ca9eef66 tsnep: fix timestamping with a stacked DSA driver
This driver is susceptible to a form of the bug explained in commit
c26a2c2ddc ("gianfar: Fix TX timestamping with a stacked DSA driver")
and in Documentation/networking/timestamping.rst section "Other caveats
for MAC drivers", specifically it timestamps any skb which has
SKBTX_HW_TSTAMP, and does not consider if timestamping has been enabled
in adapter->hwtstamp_config.tx_type.

Evaluate the proper TX timestamping condition only once on the TX
path (in tsnep_xmit_frame_ring()) and store the result in an additional
TX entry flag. Evaluate the new TX entry flag in the TX confirmation path
(in tsnep_tx_poll()).

This way SKBTX_IN_PROGRESS is set by the driver as required, but never
evaluated. SKBTX_IN_PROGRESS shall not be evaluated as it can be set
by a stacked DSA driver and evaluating it would lead to unwanted
timestamps.

Fixes: 403f69bbdb ("tsnep: Add TSN endpoint Ethernet MAC driver")
Suggested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20250514195657.25874-1-gerhard@engleder-embedded.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-15 07:58:18 -07:00
Pengtao He
491deb9b8c net/tls: fix kernel panic when alloc_page failed
We cannot set frag_list to NULL pointer when alloc_page failed.
It will be used in tls_strp_check_queue_ok when the next time
tls_strp_read_sock is called.

This is because we don't reset full_len in tls_strp_flush_anchor_copy()
so the recv path will try to continue handling the partial record
on the next call but we dettached the rcvq from the frag list.
Alternative fix would be to reset full_len.

Unable to handle kernel NULL pointer dereference
at virtual address 0000000000000028
 Call trace:
 tls_strp_check_rcv+0x128/0x27c
 tls_strp_data_ready+0x34/0x44
 tls_data_ready+0x3c/0x1f0
 tcp_data_ready+0x9c/0xe4
 tcp_data_queue+0xf6c/0x12d0
 tcp_rcv_established+0x52c/0x798

Fixes: 84c61fe1a7 ("tls: rx: do not use the standard strparser")
Signed-off-by: Pengtao He <hept.hept.hept@gmail.com>
Link: https://patch.msgid.link/20250514132013.17274-1-hept.hept.hept@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-15 07:40:51 -07:00
Jakub Kicinski
3933536c87 Merge tag 'wireless-2025-05-15' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless
Johannes Berg says:

====================
Couple of stragglers:
 - mac80211: fix syzbot/ubsan in scan counted-by
 - mt76: fix NAPI handling on driver remove
 - mt67: fix multicast/ipv6 receive

* tag 'wireless-2025-05-15' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
  wifi: mac80211: Set n_channels after allocating struct cfg80211_scan_request
  wifi: mt76: mt7925: fix missing hdr_trans_tlv command for broadcast wtbl
  wifi: mt76: disable napi on driver removal
====================

Link: https://patch.msgid.link/20250515121749.61912-4-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-15 07:19:50 -07:00
Michael Chan
325eb217e4 bnxt_en: bring back rtnl_lock() in the bnxt_open() path
Error recovery, PCIe AER, resume, and TX timeout will invoke bnxt_open()
with netdev_lock only.  This will cause RTNL assert failure in
netif_set_real_num_tx_queues(), netif_set_real_num_tx_queues(),
and netif_set_real_num_tx_queues().

Example error recovery assert:

RTNL: assertion failed at net/core/dev.c (3178)
WARNING: CPU: 3 PID: 3392 at net/core/dev.c:3178 netif_set_real_num_tx_queues+0x1fd/0x210

Call Trace:
 <TASK>
 ? __pfx_bnxt_msix+0x10/0x10 [bnxt_en]
 __bnxt_open_nic+0x1ef/0xb20 [bnxt_en]
 bnxt_open+0xda/0x130 [bnxt_en]
 bnxt_fw_reset_task+0x21f/0x780 [bnxt_en]
 process_scheduled_works+0x9d/0x400

For now, bring back rtnl_lock() in all these code paths that can invoke
bnxt_open().  In the bnxt_queue_start() error path, we don't have
rtnl_lock held so we just change it to call netif_close() instead of
bnxt_reset_task() for simplicity.  This error path is unlikely so it
should be fine.

Fixes: 004b500801 ("eth: bnxt: remove most dependencies on RTNL")
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250514062908.2766677-1-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-15 07:14:15 -07:00
Ido Schimmel
92ec485503 mlxsw: spectrum_router: Fix use-after-free when deleting GRE net devices
The driver only offloads neighbors that are constructed on top of net
devices registered by it or their uppers (which are all Ethernet). The
device supports GRE encapsulation and decapsulation of forwarded
traffic, but the driver will not offload dummy neighbors constructed on
top of GRE net devices as they are not uppers of its net devices:

 # ip link add name gre1 up type gre tos inherit local 192.0.2.1 remote 198.51.100.1
 # ip neigh add 0.0.0.0 lladdr 0.0.0.0 nud noarp dev gre1
 $ ip neigh show dev gre1 nud noarp
 0.0.0.0 lladdr 0.0.0.0 NOARP

(Note that the neighbor is not marked with 'offload')

When the driver is reloaded and the existing configuration is replayed,
the driver does not perform the same check regarding existing neighbors
and offloads the previously added one:

 # devlink dev reload pci/0000:01:00.0
 $ ip neigh show dev gre1 nud noarp
 0.0.0.0 lladdr 0.0.0.0 offload NOARP

If the neighbor is later deleted, the driver will ignore the
notification (given the GRE net device is not its upper) and will
therefore keep referencing freed memory, resulting in a use-after-free
[1] when the net device is deleted:

 # ip neigh del 0.0.0.0 lladdr 0.0.0.0 dev gre1
 # ip link del dev gre1

Fix by skipping neighbor replay if the net device for which the replay
is performed is not our upper.

[1]
BUG: KASAN: slab-use-after-free in mlxsw_sp_neigh_entry_update+0x1ea/0x200
Read of size 8 at addr ffff888155b0e420 by task ip/2282
[...]
Call Trace:
 <TASK>
 dump_stack_lvl+0x6f/0xa0
 print_address_description.constprop.0+0x6f/0x350
 print_report+0x108/0x205
 kasan_report+0xdf/0x110
 mlxsw_sp_neigh_entry_update+0x1ea/0x200
 mlxsw_sp_router_rif_gone_sync+0x2a8/0x440
 mlxsw_sp_rif_destroy+0x1e9/0x750
 mlxsw_sp_netdevice_ipip_ol_event+0x3c9/0xdc0
 mlxsw_sp_router_netdevice_event+0x3ac/0x15e0
 notifier_call_chain+0xca/0x150
 call_netdevice_notifiers_info+0x7f/0x100
 unregister_netdevice_many_notify+0xc8c/0x1d90
 rtnl_dellink+0x34e/0xa50
 rtnetlink_rcv_msg+0x6fb/0xb70
 netlink_rcv_skb+0x131/0x360
 netlink_unicast+0x426/0x710
 netlink_sendmsg+0x75a/0xc20
 __sock_sendmsg+0xc1/0x150
 ____sys_sendmsg+0x5aa/0x7b0
 ___sys_sendmsg+0xfc/0x180
 __sys_sendmsg+0x121/0x1b0
 do_syscall_64+0xbb/0x1d0
 entry_SYSCALL_64_after_hwframe+0x4b/0x53

Fixes: 8fdb09a767 ("mlxsw: spectrum_router: Replay neighbours when RIF is made")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/c53c02c904fde32dad484657be3b1477884e9ad6.1747225701.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-15 07:14:15 -07:00
Andrew Bresticker
08fb624802 irqchip/riscv-imsic: Start local sync timer on correct CPU
When starting the local sync timer to synchronize the state of a remote
CPU it should be added on the CPU to be synchronized, not the initiating
CPU. This results in interrupt delivery being delayed until the timer
eventually runs (due to another mask/unmask/migrate operation) on the
target CPU.

Fixes: 0f67911e82 ("irqchip/riscv-imsic: Separate next and previous pointers in IMSIC vector")
Signed-off-by: Andrew Bresticker <abrestic@rivosinc.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/all/20250514171320.3494917-1-abrestic@rivosinc.com
2025-05-15 16:01:50 +02:00
Jens Axboe
c9bb139db4 Merge tag 'nvme-6.15-2025-05-15' of git://git.infradead.org/nvme into block-6.15
Pull NVMe fixes from Christoph:

"nvme fixes for linux 6.15

 - fixes for atomic writes (Alan Adamson)
 - fixes for polled CQs in nvmet-epf (Damien Le Moal)
 - fix for polled CQs in nvme-pci (Keith Busch)
 - fix compile on odd configs that need to be forced to inline
   (Kees Cook)
 - one more quirk (Ilya Guterman)"

* tag 'nvme-6.15-2025-05-15' of git://git.infradead.org/nvme:
  nvme-pci: add NVME_QUIRK_NO_DEEPEST_PS quirk for SOLIDIGM P44 Pro
  nvme: all namespaces in a subsystem must adhere to a common atomic write size
  nvme: multipath: enable BLK_FEAT_ATOMIC_WRITES for multipathing
  nvmet: pci-epf: remove NVMET_PCI_EPF_Q_IS_SQ
  nvmet: pci-epf: improve debug message
  nvmet: pci-epf: cleanup nvmet_pci_epf_raise_irq()
  nvmet: pci-epf: do not fall back to using INTX if not supported
  nvmet: pci-epf: clear completion queue IRQ flag on delete
  nvme-pci: acquire cq_poll_lock in nvme_poll_irqdisable
  nvme-pci: make nvme_pci_npages_prp() __always_inline
2025-05-15 06:20:56 -06:00
Vladimir Moskovkin
4e89a40774 platform/x86: dell-wmi-sysman: Avoid buffer overflow in current_password_store()
If the 'buf' array received from the user contains an empty string, the
'length' variable will be zero. Accessing the 'buf' array element with
index 'length - 1' will result in a buffer overflow.

Add a check for an empty string.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: e8a60aa740 ("platform/x86: Introduce support for Systems Management Driver over WMI for Dell Systems")
Cc: stable@vger.kernel.org
Signed-off-by: Vladimir Moskovkin <Vladimir.Moskovkin@kaspersky.com>
Link: https://lore.kernel.org/r/39973642a4f24295b4a8fad9109c5b08@kaspersky.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-05-15 14:52:09 +03:00
Valtteri Koskivuori
a7e255ff9f platform/x86: fujitsu-laptop: Support Lifebook S2110 hotkeys
The S2110 has an additional set of media playback control keys enabled
by a hardware toggle button that switches the keys between "Application"
and "Player" modes. Toggling "Player" mode just shifts the scancode of
each hotkey up by 4.

Add defines for new scancodes, and a keymap and dmi id for the S2110.

Tested on a Fujitsu Lifebook S2110.

Signed-off-by: Valtteri Koskivuori <vkoskiv@gmail.com>
Acked-by: Jonathan Woithe <jwoithe@just42.net>
Link: https://lore.kernel.org/r/20250509184251.713003-1-vkoskiv@gmail.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-05-15 14:45:08 +03:00
Johannes Berg
943aeda0d7 Merge tag 'mt76-fixes-2025-05-15' of https://github.com/nbd168/wireless
Felix Fietkau says:
===================
mt76 fix for 6.15

- disable napi on driver removal to fix warning
- fix multicast rx regression on mt7925
===================

Link: https://patch.msgid.link/3b526d06-b717-4d47-817c-a9f47b796a31@nbd.name/
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-05-15 13:44:57 +02:00
Kees Cook
82bbe02b25 wifi: mac80211: Set n_channels after allocating struct cfg80211_scan_request
Make sure that n_channels is set after allocating the
struct cfg80211_registered_device::int_scan_req member. Seen with
syzkaller:

UBSAN: array-index-out-of-bounds in net/mac80211/scan.c:1208:5
index 0 is out of range for type 'struct ieee80211_channel *[] __counted_by(n_channels)' (aka 'struct ieee80211_channel *[]')

This was missed in the initial conversions because I failed to locate
the allocation likely due to the "sizeof(void *)" not matching the
"channels" array type.

Reported-by: syzbot+4bcdddd48bb6f0be0da1@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/lkml/680fd171.050a0220.2b69d1.045e.GAE@google.com/
Fixes: e3eac9f32e ("wifi: cfg80211: Annotate struct cfg80211_scan_request with __counted_by")
Signed-off-by: Kees Cook <kees@kernel.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://patch.msgid.link/20250509184641.work.542-kees@kernel.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-05-15 13:20:33 +02:00
Nicolas Chauvet
7b9938a144 ALSA: usb-audio: Add sample rate quirk for Microdia JP001 USB Camera
Microdia JP001 does not support reading the sample rate which leads to
many lines of "cannot get freq at ep 0x84".
This patch adds the USB ID to quirks.c and avoids those error messages.

usb 7-4: New USB device found, idVendor=0c45, idProduct=636b, bcdDevice= 1.00
usb 7-4: New USB device strings: Mfr=2, Product=1, SerialNumber=3
usb 7-4: Product: JP001
usb 7-4: Manufacturer: JP001
usb 7-4: SerialNumber: JP001
usb 7-4: 3:1: cannot get freq at ep 0x84

Cc: <stable@vger.kernel.org>
Signed-off-by: Nicolas Chauvet <kwizart@gmail.com>
Link: https://patch.msgid.link/20250515102132.73062-1-kwizart@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-05-15 12:45:48 +02:00
Vinod Koul
811d6a923b dmaengine: mediatek: drop unused variable
Commit 157ae5ffd7 dmaengine: mediatek: Fix a possible deadlock error
in mtk_cqdma_tx_status() fixed locks but kept unused varibale leading to
warning and build failure (due to warning treated as errors)

drivers/dma/mediatek/mtk-cqdma.c: In function 'mtk_cqdma_find_active_desc':
drivers/dma/mediatek/mtk-cqdma.c:423:23: error: unused variable 'flags' [-Werror=unused-variable]
  423 |         unsigned long flags;
      |                       ^~~~~

Fix by dropping this unused flag

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: 157ae5ffd7 ("dmaengine: mediatek: Fix a possible deadlock error in mtk_cqdma_tx_status()")
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-05-15 11:42:13 +01:00
Subbaraya Sundeep
dcb479fde0 octeontx2-pf: Do not reallocate all ntuple filters
If ntuple filters count is modified followed by
unicast filters count using devlink then the ntuple count
set by user is ignored and all the ntuple filters are
being reallocated. Fix this by storing the ntuple count
set by user. Without this patch, say if user tries
to modify ntuple count as 8 followed by ucast filter count as 4
using devlink commands then ntuple count is being reverted to
default value 16 i.e, not retaining user set value 8.

Fixes: 39c469188b ("octeontx2-pf: Add ucast filter count configurability via devlink.")
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/1747054357-5850-1-git-send-email-sbhatta@marvell.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-15 12:37:59 +02:00
Ming Yen Hsieh
0aa8496add wifi: mt76: mt7925: fix missing hdr_trans_tlv command for broadcast wtbl
Ensure that the hdr_trans_tlv command is included in the broadcast wtbl to
prevent the IPv6 and multicast packet from being dropped by the chip.

Cc: stable@vger.kernel.org
Fixes: cb1353ef34 ("wifi: mt76: mt7925: integrate *mlo_sta_cmd and *sta_cmd")
Reported-by: Benjamin Xiao <fossben@pm.me>
Tested-by: Niklas Schnelle <niks@kernel.org>
Signed-off-by: Ming Yen Hsieh <mingyen.hsieh@mediatek.com>
Link: https://lore.kernel.org/lkml/EmWnO5b-acRH1TXbGnkx41eJw654vmCR-8_xMBaPMwexCnfkvKCdlU5u19CGbaapJ3KRu-l3B-tSUhf8CCQwL0odjo6Cd5YG5lvNeB-vfdg=@pm.me/
Link: https://patch.msgid.link/20250509010421.403022-1-mingyen.hsieh@mediatek.com
Signed-off-by: Felix Fietkau <nbd@nbd.name>
2025-05-15 11:42:12 +02:00
Fedor Pchelkin
78ab4be549 wifi: mt76: disable napi on driver removal
A warning on driver removal started occurring after commit 9dd05df840
("net: warn if NAPI instance wasn't shut down"). Disable tx napi before
deleting it in mt76_dma_cleanup().

 WARNING: CPU: 4 PID: 18828 at net/core/dev.c:7288 __netif_napi_del_locked+0xf0/0x100
 CPU: 4 UID: 0 PID: 18828 Comm: modprobe Not tainted 6.15.0-rc4 #4 PREEMPT(lazy)
 Hardware name: ASUS System Product Name/PRIME X670E-PRO WIFI, BIOS 3035 09/05/2024
 RIP: 0010:__netif_napi_del_locked+0xf0/0x100
 Call Trace:
 <TASK>
 mt76_dma_cleanup+0x54/0x2f0 [mt76]
 mt7921_pci_remove+0xd5/0x190 [mt7921e]
 pci_device_remove+0x47/0xc0
 device_release_driver_internal+0x19e/0x200
 driver_detach+0x48/0x90
 bus_remove_driver+0x6d/0xf0
 pci_unregister_driver+0x2e/0xb0
 __do_sys_delete_module.isra.0+0x197/0x2e0
 do_syscall_64+0x7b/0x160
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Tested with mt7921e but the same pattern can be actually applied to other
mt76 drivers calling mt76_dma_cleanup() during removal. Tx napi is enabled
in their *_dma_init() functions and only toggled off and on again inside
their suspend/resume/reset paths. So it should be okay to disable tx
napi in such a generic way.

Found by Linux Verification Center (linuxtesting.org).

Fixes: 2ac515a5d7 ("mt76: mt76x02: use napi polling for tx cleanup")
Cc: stable@vger.kernel.org
Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
Tested-by: Ming Yen Hsieh <mingyen.hsieh@mediatek.com>
Link: https://patch.msgid.link/20250506115540.19045-1-pchelkin@ispras.ru
Signed-off-by: Felix Fietkau <nbd@nbd.name>
2025-05-15 11:22:41 +02:00
Linus Torvalds
088d13246a Merge tag 'kbuild-fixes-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:

 - Add proper pahole version dependency to CONFIG_GENDWARFKSYMS to avoid
   module loading errors

 - Fix UAPI header tests for the OpenRISC architecture

 - Add dependency on the libdw package in Debian and RPM packages

 - Disable -Wdefault-const-init-unsafe warnings on Clang

 - Make "make clean ARCH=um" also clean the arch/x86/ directory

 - Revert the use of -fmacro-prefix-map=, which causes issues with
   debugger usability

* tag 'kbuild-fixes-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kbuild: fix typos "module.builtin" to "modules.builtin"
  Revert "kbuild, rust: use -fremap-path-prefix to make paths relative"
  Revert "kbuild: make all file references relative to source root"
  kbuild: fix dependency on sorttable
  init: remove unused CONFIG_CC_CAN_LINK_STATIC
  um: let 'make clean' properly clean underlying SUBARCH as well
  kbuild: Disable -Wdefault-const-init-unsafe
  kbuild: rpm-pkg: Add (elfutils-devel or libdw-devel) to BuildRequires
  kbuild: deb-pkg: Add libdw-dev:native to Build-Depends-Arch
  usr/include: openrisc: don't HDRTEST bpf_perf_event.h
  kbuild: Require pahole <v1.28 or >v1.29 with GENDWARFKSYMS on X86
2025-05-14 22:24:17 -07:00
Jakub Kicinski
09db7a4d28 Merge branch 'hv_netvsc-fix-error-nvsp_rndis_pkt_complete-error-status-2'
Michael Kelley says:

====================
hv_netvsc: Fix error "nvsp_rndis_pkt_complete error status: 2"

Starting with commit dca5161f9b in the 6.3 kernel, the Linux driver
for Hyper-V synthetic networking (netvsc) occasionally reports
"nvsp_rndis_pkt_complete error status: 2".[1] This error indicates
that Hyper-V has rejected a network packet transmit request from the
guest, and the outgoing network packet is dropped. Higher level
network protocols presumably recover and resend the packet so there is
no functional error, but performance is slightly impacted. Commit
dca5161f9b is not the cause of the error -- it only added reporting
of an error that was already happening without any notice. The error
has presumably been present since the netvsc driver was originally
introduced into Linux.

This patch set fixes the root cause of the problem, which is that the
netvsc driver in Linux may send an incorrectly formatted VMBus message
to Hyper-V when transmitting the network packet. The incorrect
formatting occurs when the rndis header of the VMBus message crosses a
page boundary due to how the Linux skb head memory is aligned. In such
a case, two PFNs are required to describe the location of the rndis
header, even though they are contiguous in guest physical address
(GPA) space. Hyper-V requires that two PFNs be in a single "GPA range"
data struture, but current netvsc code puts each PFN in its own GPA
range, which Hyper-V rejects as an error in the case of the rndis
header.

The incorrect formatting occurs only for larger packets that netvsc
must transmit via a VMBus "GPA Direct" message. There's no problem
when netvsc transmits a smaller packet by copying it into a pre-
allocated send buffer slot because the pre-allocated slots don't have
page crossing issues.

After commit 14ad6ed30a in the 6.14 kernel, the error occurs much
more frequently in VMs with 16 or more vCPUs. It may occur every few
seconds, or even more frequently, in a ssh session that outputs a lot
of text. Commit 14ad6ed30a subtly changes how skb head memory is
allocated, making it much more likely that the rndis header will cross
a page boundary when the vCPU count is 16 or more.  The changes in
commit 14ad6ed30a are perfectly valid -- they just had the side
effect of making the netvsc bug more prominent.

One fix is to check for adjacent PFNs in vmbus_sendpacket_pagebuffer()
and just combine them into a single GPA range. Such a fix is very
contained. But conceptually it is fixing the problem at the wrong
level. So this patch set takes the broader approach of maintaining
the already known grouping of contiguous PFNs at a higher level in
the netvsc driver code, and propagating that grouping down to the
creation of the VMBus message to send to Hyper-V. Maintaining the
grouping fixes this problem, and has the added benefit of allowing
netvsc_dma_map() to make fewer calls to dma_map_single() to do bounce
buffering in CoCo VMs.

Patch 1 is a preparatory change to allow vmbus_sendpacket_mpb_desc()
to specify multiple GPA ranges. In current code
vmbus_sendpacket_mpb_desc() is used only by the storvsc synthetic SCSI
driver, and it always creates a single GPA range.

Patch 2 updates the netvsc driver to use vmbus_sendpacket_mpb_desc()
instead of vmbus_sendpacket_pagebuffer(). Because the higher levels of
netvsc still don't group contiguous PFNs, this patch is functionally
neutral. The VMBus message to Hyper-V still has many GPA ranges, each
with a single PFN. But it lays the groundwork for the next patch.

Patch 3 changes the higher levels of netvsc to preserve the already
known grouping of contiguous PFNs. When the contiguous groupings are
passed to vmbus_sendpacket_mpb_desc(), GPA ranges containing multiple
PFNs are produced, as expected by Hyper-V. This is point at which the
core problem is fixed.

Patches 4 and 5 remove code that is no longer necessary after the
previous patches.

These changes provide a net reduction of about 65 lines of code, which
is an added benefit.

These changes have been tested in normal VMs, in SEV-SNP and TDX CoCo
VMs, and in Dv6-series VMs where the netvsp implementation is in the
OpenHCL paravisor instead of the Hyper-V host.

These changes are built against kernel version 6.15-rc6.

[1] https://bugzilla.kernel.org/show_bug.cgi?id=217503
====================

Link: https://patch.msgid.link/20250513000604.1396-1-mhklinux@outlook.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14 19:45:27 -07:00
Michael Kelley
45a442fe36 Drivers: hv: vmbus: Remove vmbus_sendpacket_pagebuffer()
With the netvsc driver changed to use vmbus_sendpacket_mpb_desc()
instead of vmbus_sendpacket_pagebuffer(), the latter has no remaining
callers. Remove it.

Cc: <stable@vger.kernel.org> # 6.1.x
Signed-off-by: Michael Kelley <mhklinux@outlook.com>
Link: https://patch.msgid.link/20250513000604.1396-6-mhklinux@outlook.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14 19:45:24 -07:00
Michael Kelley
5bbc644bbf hv_netvsc: Remove rmsg_pgcnt
init_page_array() now always creates a single page buffer array entry
for the rndis message, even if the rndis message crosses a page
boundary. As such, the number of page buffer array entries used for
the rndis message must no longer be tracked -- it is always just 1.
Remove the rmsg_pgcnt field and use "1" where the value is needed.

Cc: <stable@vger.kernel.org> # 6.1.x
Signed-off-by: Michael Kelley <mhklinux@outlook.com>
Link: https://patch.msgid.link/20250513000604.1396-5-mhklinux@outlook.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14 19:45:24 -07:00
Michael Kelley
41a6328b2c hv_netvsc: Preserve contiguous PFN grouping in the page buffer array
Starting with commit dca5161f9b ("hv_netvsc: Check status in
SEND_RNDIS_PKT completion message") in the 6.3 kernel, the Linux
driver for Hyper-V synthetic networking (netvsc) occasionally reports
"nvsp_rndis_pkt_complete error status: 2".[1] This error indicates
that Hyper-V has rejected a network packet transmit request from the
guest, and the outgoing network packet is dropped. Higher level
network protocols presumably recover and resend the packet so there is
no functional error, but performance is slightly impacted. Commit
dca5161f9b is not the cause of the error -- it only added reporting
of an error that was already happening without any notice. The error
has presumably been present since the netvsc driver was originally
introduced into Linux.

The root cause of the problem is that the netvsc driver in Linux may
send an incorrectly formatted VMBus message to Hyper-V when
transmitting the network packet. The incorrect formatting occurs when
the rndis header of the VMBus message crosses a page boundary due to
how the Linux skb head memory is aligned. In such a case, two PFNs are
required to describe the location of the rndis header, even though
they are contiguous in guest physical address (GPA) space. Hyper-V
requires that two rndis header PFNs be in a single "GPA range" data
struture, but current netvsc code puts each PFN in its own GPA range,
which Hyper-V rejects as an error.

The incorrect formatting occurs only for larger packets that netvsc
must transmit via a VMBus "GPA Direct" message. There's no problem
when netvsc transmits a smaller packet by copying it into a pre-
allocated send buffer slot because the pre-allocated slots don't have
page crossing issues.

After commit 14ad6ed30a ("net: allow small head cache usage with
large MAX_SKB_FRAGS values") in the 6.14-rc4 kernel, the error occurs
much more frequently in VMs with 16 or more vCPUs. It may occur every
few seconds, or even more frequently, in an ssh session that outputs a
lot of text. Commit 14ad6ed30a subtly changes how skb head memory is
allocated, making it much more likely that the rndis header will cross
a page boundary when the vCPU count is 16 or more. The changes in
commit 14ad6ed30a are perfectly valid -- they just had the side
effect of making the netvsc bug more prominent.

Current code in init_page_array() creates a separate page buffer array
entry for each PFN required to identify the data to be transmitted.
Contiguous PFNs get separate entries in the page buffer array, and any
information about contiguity is lost.

Fix the core issue by having init_page_array() construct the page
buffer array to represent contiguous ranges rather than individual
pages. When these ranges are subsequently passed to
netvsc_build_mpb_array(), it can build GPA ranges that contain
multiple PFNs, as required to avoid the error "nvsp_rndis_pkt_complete
error status: 2". If instead the network packet is sent by copying
into a pre-allocated send buffer slot, the copy proceeds using the
contiguous ranges rather than individual pages, but the result of the
copying is the same. Also fix rndis_filter_send_request() to construct
a contiguous range, since it has its own page buffer array.

This change has a side benefit in CoCo VMs in that netvsc_dma_map()
calls dma_map_single() on each contiguous range instead of on each
page. This results in fewer calls to dma_map_single() but on larger
chunks of memory, which should reduce contention on the swiotlb.

Since the page buffer array now contains one entry for each contiguous
range instead of for each individual page, the number of entries in
the array can be reduced, saving 208 bytes of stack space in
netvsc_xmit() when MAX_SKG_FRAGS has the default value of 17.

[1] https://bugzilla.kernel.org/show_bug.cgi?id=217503

Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217503
Cc: <stable@vger.kernel.org> # 6.1.x
Signed-off-by: Michael Kelley <mhklinux@outlook.com>
Link: https://patch.msgid.link/20250513000604.1396-4-mhklinux@outlook.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14 19:45:24 -07:00
Michael Kelley
4f98616b85 hv_netvsc: Use vmbus_sendpacket_mpb_desc() to send VMBus messages
netvsc currently uses vmbus_sendpacket_pagebuffer() to send VMBus
messages. This function creates a series of GPA ranges, each of which
contains a single PFN. However, if the rndis header in the VMBus
message crosses a page boundary, the netvsc protocol with the host
requires that both PFNs for the rndis header must be in a single "GPA
range" data structure, which isn't possible with
vmbus_sendpacket_pagebuffer(). As the first step in fixing this, add a
new function netvsc_build_mpb_array() to build a VMBus message with
multiple GPA ranges, each of which may contain multiple PFNs. Use
vmbus_sendpacket_mpb_desc() to send this VMBus message to the host.

There's no functional change since higher levels of netvsc don't
maintain or propagate knowledge of contiguous PFNs. Based on its
input, netvsc_build_mpb_array() still produces a separate GPA range
for each PFN and the behavior is the same as with
vmbus_sendpacket_pagebuffer(). But the groundwork is laid for a
subsequent patch to provide the necessary grouping.

Cc: <stable@vger.kernel.org> # 6.1.x
Signed-off-by: Michael Kelley <mhklinux@outlook.com>
Link: https://patch.msgid.link/20250513000604.1396-3-mhklinux@outlook.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14 19:45:23 -07:00
Michael Kelley
380b75d307 Drivers: hv: Allow vmbus_sendpacket_mpb_desc() to create multiple ranges
vmbus_sendpacket_mpb_desc() is currently used only by the storvsc driver
and is hardcoded to create a single GPA range. To allow it to also be
used by the netvsc driver to create multiple GPA ranges, no longer
hardcode as having a single GPA range. Allow the calling driver to
specify the rangecount in the supplied descriptor.

Update the storvsc driver to reflect this new approach.

Cc: <stable@vger.kernel.org> # 6.1.x
Signed-off-by: Michael Kelley <mhklinux@outlook.com>
Link: https://patch.msgid.link/20250513000604.1396-2-mhklinux@outlook.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14 19:45:23 -07:00
Linus Torvalds
546bce5792 Merge tag 'tpmdd-next-6.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd
Pull tpm fixes from Jarkko Sakkinen:
 "A few last minute fixes for v6.15"

* tag 'tpmdd-next-6.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
  tpm: tis: Double the timeout B to 4s
  char: tpm: tpm-buf: Add sanity check fallback in read helpers
  tpm: Mask TPM RC in tpm2_start_auth_session()
2025-05-14 19:33:18 -07:00
Hariprasad Kelam
bf449f35e7 octeontx2-af: Fix CGX Receive counters
Each CGX block supports 4 logical MACs (LMACS). Receive
counters CGX_CMR_RX_STAT0-8 are per LMAC and CGX_CMR_RX_STAT9-12
are per CGX.

Due a bug in previous patch, stale Per CGX counters values observed.

Fixes: 66208910e5 ("octeontx2-af: Support to retrieve CGX LMAC stats")
Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Link: https://patch.msgid.link/20250513071554.728922-1-hkelam@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14 19:29:44 -07:00
Bo-Cun Chen
1bdea6fad6 net: ethernet: mtk_eth_soc: fix typo for declaration MT7988 ESW capability
Since MTK_ESW_BIT is a bit number rather than a bitmap, it causes
MTK_HAS_CAPS to produce incorrect results. This leads to the ETH
driver not declaring MAC capabilities correctly for the MT7988 ESW.

Fixes: 445eb6448e ("net: ethernet: mtk_eth_soc: add basic support for MT7988 SoC")
Signed-off-by: Bo-Cun Chen <bc-bocun.chen@mediatek.com>
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://patch.msgid.link/b8b37f409d1280fad9c4d32521e6207f63cd3213.1747110258.git.daniel@makrotopia.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14 19:28:21 -07:00
Jiawen Wu
09e76365ba net: libwx: Fix FW mailbox unknown command
For the new SW-FW interaction, missing the error return if there is an
unknown command. It causes the driver to mistakenly believe that the
interaction is complete. This problem occurs when new driver is paired
with old firmware, which does not support the new mailbox commands.

Fixes: 2e5af6b2ae ("net: txgbe: Add basic support for new AML devices")
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/64DBB705D35A0016+20250513021009.145708-4-jiawenwu@trustnetic.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14 19:26:55 -07:00
Jiawen Wu
42efa358f0 net: libwx: Fix FW mailbox reply timeout
For the new SW-FW interaction, the timeout waiting for the firmware to
return is too short. So that some mailbox commands cannot be completed.
Use the 'timeout' parameter instead of fixed timeout value for flexible
configuration.

Fixes: 2e5af6b2ae ("net: txgbe: Add basic support for new AML devices")
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/5D5BDE3EA501BDB8+20250513021009.145708-3-jiawenwu@trustnetic.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14 19:26:55 -07:00
Jiawen Wu
141a8dec88 net: txgbe: Fix to calculate EEPROM checksum for AML devices
In the new firmware version, the shadow ram reserves some space to store
I2C information, so the checksum calculation needs to skip this section.
Otherwise, the driver will fail to probe because the invalid EEPROM
checksum.

Fixes: 2e5af6b2ae ("net: txgbe: Add basic support for new AML devices")
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/1C6BF7A937237F5A+20250513021009.145708-2-jiawenwu@trustnetic.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14 19:26:55 -07:00
Subbaraya Sundeep
865ab24613 octeontx2-pf: macsec: Fix incorrect max transmit size in TX secy
MASCEC hardware block has a field called maximum transmit size for
TX secy. Max packet size going out of MCS block has be programmed
taking into account full packet size which has L2 header,SecTag
and ICV. MACSEC offload driver is configuring max transmit size as
macsec interface MTU which is incorrect. Say with 1500 MTU of real
device, macsec interface created on top of real device will have MTU of
1468(1500 - (SecTag + ICV)). This is causing packets from macsec
interface of size greater than or equal to 1468 are not getting
transmitted out because driver programmed max transmit size as 1468
instead of 1514(1500 + ETH_HDR_LEN).

Fixes: c54ffc7360 ("octeontx2-pf: mcs: Introduce MACSEC hardware offloading")
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/1747053756-4529-1-git-send-email-sbhatta@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14 19:14:31 -07:00
Jakub Kicinski
f3dd5fb2fa netlink: specs: tc: all actions are indexed arrays
Some TC filters have actions listed as indexed arrays of nests
and some as just nests. They are all indexed arrays, the handling
is common across filters.

Fixes: 2267672a61 ("doc/netlink/specs: Update the tc spec")
Link: https://patch.msgid.link/20250513221638.842532-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14 19:11:15 -07:00
Jakub Kicinski
a9fb87b8b8 netlink: specs: tc: fix a couple of attribute names
Fix up spelling of two attribute names. These are clearly typoes
and will prevent C codegen from working. Let's treat this as
a fix to get the correction into users' hands ASAP, and prevent
anyone depending on the wrong names.

Fixes: a1bcfde836 ("doc/netlink/specs: Add a spec for tc")
Link: https://patch.msgid.link/20250513221316.841700-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14 19:11:09 -07:00
Michal Suchanek
2f661f71fd tpm: tis: Double the timeout B to 4s
With some Infineon chips the timeouts in tpm_tis_send_data (both B and
C) can reach up to about 2250 ms.

Timeout C is retried since
commit de9e33df77 ("tpm, tpm_tis: Workaround failed command reception on Infineon devices")

Timeout B still needs to be extended.

The problem is most commonly encountered with context related operation
such as load context/save context. These are issued directly by the
kernel, and there is no retry logic for them.

When a filesystem is set up to use the TPM for unlocking the boot fails,
and restarting the userspace service is ineffective. This is likely
because ignoring a load context/save context result puts the real TPM
state and the TPM state expected by the kernel out of sync.

Chips known to be affected:
tpm_tis IFX1522:00: 2.0 TPM (device-id 0x1D, rev-id 54)
Description: SLB9672
Firmware Revision: 15.22

tpm_tis MSFT0101:00: 2.0 TPM (device-id 0x1B, rev-id 22)
Firmware Revision: 7.83

tpm_tis MSFT0101:00: 2.0 TPM (device-id 0x1A, rev-id 16)
Firmware Revision: 5.63

Link: https://lore.kernel.org/linux-integrity/Z5pI07m0Muapyu9w@kitsune.suse.cz/
Signed-off-by: Michal Suchanek <msuchanek@suse.de>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2025-05-15 04:49:15 +03:00
Purva Yeshi
32d495b384 char: tpm: tpm-buf: Add sanity check fallback in read helpers
Fix Smatch-detected issue:

drivers/char/tpm/tpm-buf.c:208 tpm_buf_read_u8() error:
uninitialized symbol 'value'.
drivers/char/tpm/tpm-buf.c:225 tpm_buf_read_u16() error:
uninitialized symbol 'value'.
drivers/char/tpm/tpm-buf.c:242 tpm_buf_read_u32() error:
uninitialized symbol 'value'.

Zero-initialize the return values in tpm_buf_read_u8(), tpm_buf_read_u16(),
and tpm_buf_read_u32() to guard against uninitialized data in case of a
boundary overflow.

Add defensive initialization ensures the return values are always defined,
preventing undefined behavior if the unexpected happens.

Signed-off-by: Purva Yeshi <purvayeshi550@gmail.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2025-05-15 04:47:14 +03:00
Jarkko Sakkinen
539fbab378 tpm: Mask TPM RC in tpm2_start_auth_session()
tpm2_start_auth_session() does not mask TPM RC correctly from the callers:

[   28.766528] tpm tpm0: A TPM error (2307) occurred start auth session

Process TPM RCs inside tpm2_start_auth_session(), and map them to POSIX
error codes.

Cc: stable@vger.kernel.org # v6.10+
Fixes: 699e3efd6c ("tpm: Add HMAC session start and end functions")
Reported-by: Herbert Xu <herbert@gondor.apana.org.au>
Closes: https://lore.kernel.org/linux-integrity/Z_NgdRHuTKP6JK--@gondor.apana.org.au/
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2025-05-15 04:47:14 +03:00
Linus Torvalds
74a6325597 Merge tag 'for-6.15-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:

 - fix potential endless loop when discarding a block group when
   disabling discard

 - reinstate message when setting a large value of mount option 'commit'

 - fix a folio leak when async extent submission fails

* tag 'for-6.15-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: add back warning for mount option commit values exceeding 300
  btrfs: fix folio leak in submit_one_async_extent()
  btrfs: fix discard worker infinite loop after disabling discard
2025-05-14 18:39:12 -07:00
Paulo Alcantara
3965c23773 smb: client: fix zero rsize error messages
cifs_prepare_read() might be called with a disconnected channel, where
TCP_Server_Info::max_read is set to zero due to reconnect, so calling
->negotiate_rize() will set @rsize to default min IO size (64KiB) and
then logging

	CIFS: VFS: SMB: Zero rsize calculated, using minimum value
	65536

If the reconnect happens in cifsd thread, cifs_renegotiate_iosize()
will end up being called and then @rsize set to the expected value.

Since we can't rely on the value of @server->max_read by the time we
call cifs_prepare_read(), try to ->negotiate_rize() only if
@cifs_sb->ctx->rsize is zero.

Reported-by: Steve French <stfrench@microsoft.com>
Fixes: c59f7c9661 ("smb: client: ensure aligned IO sizes")
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-05-14 19:26:38 -05:00
Jethro Donaldson
1fe4a44b7f smb: client: fix memory leak during error handling for POSIX mkdir
The response buffer for the CREATE request handled by smb311_posix_mkdir()
is leaked on the error path (goto err_free_rsp_buf) because the structure
pointer *rsp passed to free_rsp_buf() is not assigned until *after* the
error condition is checked.

As *rsp is initialised to NULL, free_rsp_buf() becomes a no-op and the leak
is instead reported by __kmem_cache_shutdown() upon subsequent rmmod of
cifs.ko if (and only if) the error path has been hit.

Pass rsp_iov.iov_base to free_rsp_buf() instead, similar to the code in
other functions in smb2pdu.c for which *rsp is assigned late.

Cc: stable@vger.kernel.org
Signed-off-by: Jethro Donaldson <devel@jro.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-05-14 19:26:15 -05:00
Vicki Pfau
f0d17942ea Input: xpad - add more controllers
Adds support for a revision of the Turtle Beach Recon Wired Controller,
the Turtle Beach Stealth Ultra, and the PowerA Wired Controller.

Signed-off-by: Vicki Pfau <vi@endrift.com>
Link: https://lore.kernel.org/r/20250513225950.2719387-1-vi@endrift.com
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-05-14 16:15:23 -07:00
Kent Overstreet
9c09e59cc5 bcachefs: fix wrong arg to fsck_err()
fsck_err() needs the btree transaction passed to it if there is one - so
that it can unlock/relock around prompting userspace for fixing the
error.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-14 18:59:15 -04:00
Kent Overstreet
d1041d8eab bcachefs: Fix missing commit in backpointer to missing target
Fsck wants to do transaction commits from an outer context; it may have
other repair to do (i.e. duplicate backpointers).

But when calling backpointer_not_found() from runtime code, i.e. runtime
self healing, we should be doing the commit - the outer context expects
to just be doing lookups.

This fixes bugs where we get stuck spinning, reported as "RCU lock hold
time warnings.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-14 17:05:19 -04:00
Kent Overstreet
a12cb6f758 bcachefs: Fix accidental O(n^2) in fiemap
Since bch2_seek_pagecache_data() searches for dirty data, we only want
to call it for holes in the extents btree - otherwise we have an
accidental O(n^2), as we repeatedly search the same range.

Reported-by: Marcin Mirosław <marcin@mejor.pl>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-14 17:05:19 -04:00
Kent Overstreet
43b9fece2d bcachefs: Fix set_should_be_locked() call in peek_slot()
set_should_be_locked() needs to be called before peek_key_cache(), which
traverses other paths and may do a trans unlock/relock.

This fixes an assertion pop in path_peek_slot(), when the path we're
using is unexpectedly not uptodate.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-14 17:05:19 -04:00
Alan Huang
61198e6287 bcachefs: Fix self deadlock
Before invoking bch2_accounting_mem_mod_locked in
bch2_gc_accounting_done, we already write locked mark_lock,
in bch2_accounting_mem_insert, we lock mark_lock again.

Signed-off-by: Alan Huang <mmpgouride@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-14 17:05:19 -04:00
Kent Overstreet
19b22d04cd bcachefs: Don't set btree nodes as accessed on fill
Prevent jobs that do lots of scanning (i.e. evacuatee, scrub) from
causing OOMs.

The shrinker code seems to be having issues when it doesn't do any
freeing because it's just flipping off the acccessed bit - and the
accessed bit shouldn't be set on first use anyways.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-14 17:05:19 -04:00
Kent Overstreet
7b6759b199 bcachefs: Fix livelock in journal_entry_open()
When the journal is low on space, we might do discards from
journal_res_get() -> journal_entry_open().

Make sure we set j->can_discard correctly, so that if we're low on space
but not because discards aren't keeping up we don't livelock.

Fixes: 8e4d28036c ("bcachefs: Don't aggressively discard the journal")
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-14 17:05:19 -04:00
Kent Overstreet
b1c71cb492 bcachefs: Fix broken btree_path lock invariants in next_node()
This fixes btree locking assert pops users were seeing during evacuate:

https://github.com/koverstreet/bcachefs/issues/878

May 09 22:45:02 sharon kernel: bcachefs (68116e25-fa2d-4c6f-86c7-e8b431d792ae):   bch2_btree_insert_node(): node not locked at level 1
May 09 22:45:02 sharon kernel:   bch2_btree_node_rewrite [bcachefs]: watermark=btree no_check_rw alloc l=0-1 mode=none nodes_written=0 cl.remaining=2 journal_seq=0
May 09 22:45:02 sharon kernel:   path: idx   1 ref 1:0   S B btree=alloc level=0 pos 0:3699637:0 0:3698012:1-0:3699637:0 bch2_move_btree.isra.0+0x1db/0x490 [bcachefs] uptodate 0 locks_want 2
May 09 22:45:02 sharon kernel:     l=0 locks intent seq 4 node ffff8bd700c93600
May 09 22:45:02 sharon kernel:     l=1 locks unlocked seq 1712 node ffff8bd6fd5e7a00
May 09 22:45:02 sharon kernel:     l=2 locks unlocked seq 2295 node ffff8bd6cc725400
May 09 22:45:02 sharon kernel:     l=3 locks unlocked seq 0 node 0000000000000000

Evacuate walks btree nodes with bch2_btree_iter_next_node() and rewrites
them, bch2_btree_update_start() upgrades the path to take intent locks
as far as it needs to.

But next_node() does low level unlock/relock calls on individual nodes,
and didn't handle the case where a path is supposed to be holding
multiple intent locks. If a path has locks_want > 1, it needs to be
either holding locks on all the btree nodes (at each level) requested,
or none of them.

Fix this with a bch2_btree_path_downgrade().

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-14 17:05:19 -04:00
Kent Overstreet
cd52cc3544 bcachefs: Don't strip rebalance_opts from indirect extents
Fix bch2_bkey_clear_needs_rebalance(): indirect extents are never
supposed to have bch_extent_rebalance stripped off, because that's how
we get the IO path options when we don't have the original inode it
belonged to.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-14 17:05:19 -04:00
Linus Torvalds
c94d59a126 Merge tag 'trace-v6.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:

 - Fix sample code that uses trace_array_printk()

   The sample code for in kernel use of trace_array (that creates an
   instance for use within the kernel) and shows how to use
   trace_array_printk() that writes into the created instance, used
   trace_printk_init_buffers(). But that function is used to initialize
   normal trace_printk() and produces the NOTICE banner which is not
   needed for use of trace_array_printk(). The function to initialize
   that is trace_array_init_printk() that takes the created trace array
   instance as a parameter.

   Update the sample code to reflect the proper usage.

 - Fix preemption count output for stacktrace event

   The tracing buffer shows the preempt count level when an event
   executes. Because writing the event itself disables preemption, this
   needs to be accounted for when recording. The stacktrace event did
   not account for this so the output of the stacktrace event showed
   preemption was disabled while the event that triggered the stacktrace
   shows preemption is enabled and this leads to confusion. Account for
   preemption being disabled for the stacktrace event.

   The same happened for stack traces triggered by function tracer.

 - Fix persistent ring buffer when trace_pipe is used

   The ring buffer swaps the reader page with the next page to read from
   the write buffer when trace_pipe is used. If there's only a page of
   data in the ring buffer, this swap will cause the "commit" pointer
   (last data written) to be on the reader page. If more data is written
   to the buffer, it is added to the reader page until it falls off back
   into the write buffer.

   If the system reboots and the commit pointer is still on the reader
   page, even if new data was written, the persistent buffer validator
   will miss finding the commit pointer because it only checks the write
   buffer and does not check the reader page. This causes the validator
   to fail the validation and clear the buffer, where the new data is
   lost.

   There was a check for this, but it checked the "head pointer", which
   was incorrect, because the "head pointer" always stays on the write
   buffer and is the next page to swap out for the reader page. Fix the
   logic to catch this case and allow the user to still read the data
   after reboot.

* tag 'trace-v6.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  ring-buffer: Fix persistent buffer when commit page is the reader page
  ftrace: Fix preemption accounting for stacktrace filter command
  ftrace: Fix preemption accounting for stacktrace trigger command
  tracing: samples: Initialize trace_array_printk() with the correct function
2025-05-14 11:24:19 -07:00
Steven Rostedt
1d6c39c89f ring-buffer: Fix persistent buffer when commit page is the reader page
The ring buffer is made up of sub buffers (sometimes called pages as they
are by default PAGE_SIZE). It has the following "pages":

  "tail page" - this is the page that the next write will write to
  "head page" - this is the page that the reader will swap the reader page with.
  "reader page" - This belongs to the reader, where it will swap the head
                  page from the ring buffer so that the reader does not
                  race with the writer.

The writer may end up on the "reader page" if the ring buffer hasn't
written more than one page, where the "tail page" and the "head page" are
the same.

The persistent ring buffer has meta data that points to where these pages
exist so on reboot it can re-create the pointers to the cpu_buffer
descriptor. But when the commit page is on the reader page, the logic is
incorrect.

The check to see if the commit page is on the reader page checked if the
head page was the reader page, which would never happen, as the head page
is always in the ring buffer. The correct check would be to test if the
commit page is on the reader page. If that's the case, then it can exit
out early as the commit page is only on the reader page when there's only
one page of data in the buffer. There's no reason to iterate the ring
buffer pages to find the "commit page" as it is already found.

To trigger this bug:

  # echo 1 > /sys/kernel/tracing/instances/boot_mapped/events/syscalls/sys_enter_fchownat/enable
  # touch /tmp/x
  # chown sshd /tmp/x
  # reboot

On boot up, the dmesg will have:
 Ring buffer meta [0] is from previous boot!
 Ring buffer meta [1] is from previous boot!
 Ring buffer meta [2] is from previous boot!
 Ring buffer meta [3] is from previous boot!
 Ring buffer meta [4] commit page not found
 Ring buffer meta [5] is from previous boot!
 Ring buffer meta [6] is from previous boot!
 Ring buffer meta [7] is from previous boot!

Where the buffer on CPU 4 had a "commit page not found" error and that
buffer is cleared and reset causing the output to be empty and the data lost.

When it works correctly, it has:

  # cat /sys/kernel/tracing/instances/boot_mapped/trace_pipe
        <...>-1137    [004] .....   998.205323: sys_enter_fchownat: __syscall_nr=0x104 (260) dfd=0xffffff9c (4294967196) filename=(0xffffc90000a0002c) user=0x3e8 (1000) group=0xffffffff (4294967295) flag=0x0 (0

Cc: stable@vger.kernel.org
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://lore.kernel.org/20250513115032.3e0b97f7@gandalf.local.home
Fixes: 5f3b6e839f ("ring-buffer: Validate boot range memory events")
Reported-by: Tasos Sahanidis <tasos@tasossah.com>
Tested-by: Tasos Sahanidis <tasos@tasossah.com>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-05-14 13:53:23 -04:00
pengdonglin
11aff32439 ftrace: Fix preemption accounting for stacktrace filter command
The preemption count of the stacktrace filter command to trace ksys_read
is consistently incorrect:

$ echo ksys_read:stacktrace > set_ftrace_filter

   <...>-453     [004] ...1.    38.308956: <stack trace>
=> ksys_read
=> do_syscall_64
=> entry_SYSCALL_64_after_hwframe

The root cause is that the trace framework disables preemption when
invoking the filter command callback in function_trace_probe_call:

   preempt_disable_notrace();
   probe_ops->func(ip, parent_ip, probe_opsbe->tr, probe_ops, probe->data);
   preempt_enable_notrace();

Use tracing_gen_ctx_dec() to account for the preempt_disable_notrace(),
which will output the correct preemption count:

$ echo ksys_read:stacktrace > set_ftrace_filter

   <...>-410     [006] .....    31.420396: <stack trace>
=> ksys_read
=> do_syscall_64
=> entry_SYSCALL_64_after_hwframe

Cc: stable@vger.kernel.org
Fixes: 36590c50b2 ("tracing: Merge irqflags + preempt counter.")
Link: https://lore.kernel.org/20250512094246.1167956-2-dolinux.peng@gmail.com
Signed-off-by: pengdonglin <dolinux.peng@gmail.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-05-14 13:53:23 -04:00
pengdonglin
e333332657 ftrace: Fix preemption accounting for stacktrace trigger command
When using the stacktrace trigger command to trace syscalls, the
preemption count was consistently reported as 1 when the system call
event itself had 0 (".").

For example:

root@ubuntu22-vm:/sys/kernel/tracing/events/syscalls/sys_enter_read
$ echo stacktrace > trigger
$ echo 1 > enable

    sshd-416     [002] .....   232.864910: sys_read(fd: a, buf: 556b1f3221d0, count: 8000)
    sshd-416     [002] ...1.   232.864913: <stack trace>
 => ftrace_syscall_enter
 => syscall_trace_enter
 => do_syscall_64
 => entry_SYSCALL_64_after_hwframe

The root cause is that the trace framework disables preemption in __DO_TRACE before
invoking the trigger callback.

Use the tracing_gen_ctx_dec() that will accommodate for the increase of
the preemption count in __DO_TRACE when calling the callback. The result
is the accurate reporting of:

    sshd-410     [004] .....   210.117660: sys_read(fd: 4, buf: 559b725ba130, count: 40000)
    sshd-410     [004] .....   210.117662: <stack trace>
 => ftrace_syscall_enter
 => syscall_trace_enter
 => do_syscall_64
 => entry_SYSCALL_64_after_hwframe

Cc: stable@vger.kernel.org
Fixes: ce33c845b0 ("tracing: Dump stacktrace trigger to the corresponding instance")
Link: https://lore.kernel.org/20250512094246.1167956-1-dolinux.peng@gmail.com
Signed-off-by: pengdonglin <dolinux.peng@gmail.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-05-14 13:53:09 -04:00
Linus Torvalds
1a80a098c6 Merge tag 'execve-v6.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull execve fix from Kees Cook:
 "This fixes a corner case for ASLR-disabled static-PIE brk collision
  with vdso allocations:

   - binfmt_elf: Move brk for static PIE even if ASLR disabled"

* tag 'execve-v6.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  binfmt_elf: Move brk for static PIE even if ASLR disabled
2025-05-14 09:15:16 -07:00
Linus Torvalds
00f281fd9d Merge tag 'soc-fixes-6.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull SoC fixes from Arnd Bergmann:
 "These all address issues in devicetree files:

   - The Rockchip rk3588j are now limited the same way as the vendor
     kernel, to allow room for the industrial-grade temperature ranges.

   - Seven more Rockchip fixes address minor issues with specific boards

   - Invalid clk controller references in multiple amlogic chips, plus
     one accidentally disabled audio on clock

   - Two devicetree fixes for i.MX8MP boards, both for incorrect
     regulator settings

   - A power domain change for apple laptop touchbar, fixing
     suspend/resume problems

   - An incorrect DMA controller setting for sophgo cv18xx chips"

* tag 'soc-fixes-6.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  arm64: dts: amazon: Fix simple-bus node name schema warnings
  MAINTAINERS: delete email for Shiraz Hashim
  arm64: dts: imx8mp-var-som: Fix LDO5 shutdown causing SD card timeout
  arm64: dts: imx8mp: use 800MHz NoC OPP for nominal drive mode
  arm64: dts: amlogic: dreambox: fix missing clkc_audio node
  riscv: dts: sophgo: fix DMA data-width configuration for CV18xx
  arm64: dts: rockchip: fix Sige5 RTC interrupt pin
  arm64: dts: rockchip: Assign RT5616 MCLK rate on rk3588-friendlyelec-cm3588
  arm64: dts: rockchip: Align wifi node name with bindings in CB2
  arm64: dts: amlogic: g12: fix reference to unknown/untested PWM clock
  arm64: dts: amlogic: gx: fix reference to unknown/untested PWM clock
  ARM: dts: amlogic: meson8b: fix reference to unknown/untested PWM clock
  ARM: dts: amlogic: meson8: fix reference to unknown/untested PWM clock
  arm64: dts: apple: touchbar: Mark ps_dispdfr_be as always-on
  mailmap: Update email for Asahi Lina
  arm64: dts: rockchip: Fix mmc-pwrseq clock name on rock-pi-4
  arm64: dts: rockchip: Use "regulator-fixed" for btreg on px30-engicam for vcc3v3-btreg
  arm64: dts: rockchip: Add pinmuxing for eMMC on QNAP TS433
  arm64: dts: rockchip: Remove overdrive-mode OPPs from RK3588J SoC dtsi
  arm64: dts: rockchip: Allow Turing RK1 cooling fan to spin down
2025-05-14 09:11:05 -07:00
Umesh Nerlige Ramappa
617d824c53 drm/xe: Add WA BB to capture active context utilization
Context Timestamp (CTX_TIMESTAMP) in the LRC accumulates the run ticks
of the context, but only gets updated when the context switches out. In
order to check how long a context has been active before it switches
out, two things are required:

(1) Determine if the context is running:

To do so, we program the WA BB to set an initial value for CTX_TIMESTAMP
in the LRC. The value chosen is 1 since 0 is the initial value when the
LRC is initialized. During a query, we just check for this value to
determine if the context is active. If the context switched out, it
would overwrite this location with the actual CTX_TIMESTAMP MMIO value.
Note that WA BB runs as the last part of the context restore, so reusing
this LRC location will not clobber anything.

(2) Calculate the time that the context has been active for:

The CTX_TIMESTAMP ticks only when the context is active. If a context is
active, we just use the CTX_TIMESTAMP MMIO as the new value of
utilization. While doing so, we need to read the CTX_TIMESTAMP MMIO
for the specific engine instance. Since we do not know which instance
the context is running on until it is scheduled, we also read the
ENGINE_ID MMIO in the WA BB and store it in the PPHSWP.

Using the above 2 instructions in a WA BB, capture active context
utilization.

v2: (Matt Brost)
- This breaks TDR, fix it by saving the CTX_TIMESTAMP register
  "drm/xe: Save CTX_TIMESTAMP mmio value instead of LRC value"
- Drop tile from LRC if using gt
  "drm/xe: Save the gt pointer in LRC and drop the tile"

v3:
- Remove helpers for bb_per_ctx_ptr (Matt)
- Add define for context active value (Matt)
- Use 64 bit CTX TIMESTAMP for platforms that support it. For platforms
  that don't, live with the rare race. (Matt, Lucas)
- Convert engine id to hwe and get the MMIO value (Lucas)
- Correct commit message on when WA BB runs (Lucas)

v4:
- s/GRAPHICS_VER(...)/xe->info.has_64bit_timestamp/ (Matt)
- Drop support for active utilization on a VF (CI failure)
- In xe_lrc_init ensure the lrc value is 0 to begin with (CI regression)

v5:
- Minor checkpatch fix
- Squash into previous commit and make TDR use 32-bit time
- Update code comment to match commit msg

Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4532
Cc: <stable@vger.kernel.org> # v6.13+
Suggested-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20250509161159.2173069-8-umesh.nerlige.ramappa@intel.com
(cherry picked from commit 82b98cadb0)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-14 09:03:29 -07:00
Umesh Nerlige Ramappa
ce15563e49 drm/xe: Save the gt pointer in lrc and drop the tile
Save the gt pointer in the lrc so that it can used for gt based helpers.

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20250509161159.2173069-7-umesh.nerlige.ramappa@intel.com
(cherry picked from commit 741d3ef8b8)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-14 09:03:29 -07:00
Umesh Nerlige Ramappa
66c8f7b435 drm/xe: Save CTX_TIMESTAMP mmio value instead of LRC value
For determining actual job execution time, save the current value of the
CTX_TIMESTAMP register rather than the value saved in LRC since the
current register value is the closest to the start time of the job.

v2: Define MI_STORE_REGISTER_MEM to fix compile error
v3: Place MI_STORE_REGISTER_MEM sorted by MI_INSTR (Lucas)

Fixes: 65921374c4 ("drm/xe: Emit ctx timestamp copy in ring ops")
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20250509161159.2173069-6-umesh.nerlige.ramappa@intel.com
(cherry picked from commit 38b14233e5)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-14 09:03:29 -07:00
Matthew Brost
1b36ea2fc6 drm/xe: Timeslice GPU on atomic SVM fault
Ensure GPU can make forward progress on an atomic SVM GPU fault by
giving the GPU a timeslice of 5ms

v2:
 - Reduce timeslice to 5ms
 - Double timeslice on retry
 - Split out GPU SVM changes into independent patch
v5:
 - Double timeslice in a few more places

Fixes: 2f118c9491 ("drm/xe: Add SVM VRAM migration")
Cc: stable@vger.kernel.org
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://lore.kernel.org/r/20250512135500.1405019-5-matthew.brost@intel.com
(cherry picked from commit a5d8d3be1d)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-14 09:03:29 -07:00
Matthew Brost
df8c37810b drm/gpusvm: Add timeslicing support to GPU SVM
Add timeslicing support to GPU SVM which will guarantee the GPU a
minimum execution time on piece of physical memory before migration back
to CPU. Intended to implement strict migration policies which require
memory to be in a certain placement for correct execution.

Required for shared CPU and GPU atomics on certain devices.

Fixes: 99624bdff8 ("drm/gpusvm: Add support for GPU Shared Virtual Memory")
Cc: stable@vger.kernel.org
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://lore.kernel.org/r/20250512135500.1405019-4-matthew.brost@intel.com
(cherry picked from commit 8dc1812b5b)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-14 09:03:29 -07:00
Matthew Brost
794f5493f5 drm/xe: Strict migration policy for atomic SVM faults
Mixing GPU and CPU atomics does not work unless a strict migration
policy of GPU atomics must be device memory. Enforce a policy of must be
in VRAM with a retry loop of 3 attempts, if retry loop fails abort
fault.

Removing always_migrate_to_vram modparam as we now have real migration
policy.

v2:
 - Only retry migration on atomics
 - Drop alway migrate modparam
v3:
 - Only set vram_only on DGFX (Himal)
 - Bail on get_pages failure if vram_only and retry count exceeded (Himal)
 - s/vram_only/devmem_only
 - Update xe_svm_range_is_valid to accept devmem_only argument
v4:
 - Fix logic bug get_pages failure
v5:
 - Fix commit message (Himal)
 - Mention removing always_migrate_to_vram in commit message (Lucas)
 - Fix xe_svm_range_is_valid to check for devmem pages
 - Bail on devmem_only && !migrate_devmem (Thomas)
v6:
 - Add READ_ONCE barriers for opportunistic checks (Thomas)
 - Pair READ_ONCE with WRITE_ONCE (Thomas)
v7:
 - Adjust comments (Thomas)

Fixes: 2f118c9491 ("drm/xe: Add SVM VRAM migration")
Cc: stable@vger.kernel.org
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Acked-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://lore.kernel.org/r/20250512135500.1405019-3-matthew.brost@intel.com
(cherry picked from commit a9ac0fa455)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-14 09:03:29 -07:00
Himal Prasad Ghimiray
7bd68ce21d drm/gpusvm: Introduce devmem_only flag for allocation
This commit adds a new flag, devmem_only, to the drm_gpusvm structure. The
purpose of this flag is to ensure that the get_pages function allocates
memory exclusively from the device's memory. If the allocation from
device memory fails, the function will return an -EFAULT error.

Required for shared CPU and GPU atomics on certain devices.

v3:
 - s/vram_only/devmem_only/

Fixes: 99624bdff8 ("drm/gpusvm: Add support for GPU Shared Virtual Memory")
Cc: stable@vger.kernel.org
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250512135500.1405019-2-matthew.brost@intel.com
(cherry picked from commit 8a9b978ebd)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-14 09:03:29 -07:00
Aradhya Bhatia
b1f704107c drm/xe/xe2hpg: Add Wa_22021007897
Add Wa_22021007897 for the Xe2_HPG (graphics version: 20.01) IP. It is
a permanent workaround, and applicable on all the steppings.

Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Signed-off-by: Aradhya Bhatia <aradhya.bhatia@intel.com>
Link: https://lore.kernel.org/r/20250512065004.2576-1-aradhya.bhatia@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
(cherry picked from commit e5c13e2c50)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-14 09:03:29 -07:00
David (Ming Qiang) Wu
ee7360fc27 drm/amdgpu: read back register after written for VCN v4.0.5
On VCN v4.0.5 there is a race condition where the WPTR is not
updated after starting from idle when doorbell is used. Adding
register read-back after written at function end is to ensure
all register writes are done before they can be used.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12528
Signed-off-by: David (Ming Qiang) Wu <David.Wu3@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Tested-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 07c9db090b)
Cc: stable@vger.kernel.org
2025-05-14 11:51:31 -04:00
Melissa Wen
fe14c0f096 Revert "drm/amd/display: Hardware cursor changes color when switched to software cursor"
This reverts commit 272e6aab14.

Applying degamma curve to the cursor by default breaks Linux userspace
expectation.

On Linux, AMD display manager enables cursor degamma ROM just for
implict sRGB on HW versions where degamma is split into two blocks:
degamma ROM for pre-defined TFs and `gamma correction` for user/custom
curves, and degamma ROM settings doesn't apply to cursor plane.

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1513
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2803
Reported-by: Michel Dänzer <michel.daenzer@mailbox.org>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4144
Signed-off-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit f6a305d474)
Cc: stable@vger.kernel.org
2025-05-14 11:51:04 -04:00
Christophe JAILLET
1cfe51ef07 i2c: designware: Fix an error handling path in i2c_dw_pci_probe()
If navi_amd_register_client() fails, the previous i2c_dw_probe() call
should be undone by a corresponding i2c_del_adapter() call, as already done
in the remove function.

Fixes: 17631e8ca2 ("i2c: designware: Add driver support for AMD NAVI GPU")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: <stable@vger.kernel.org> # v5.13+
Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Link: https://lore.kernel.org/r/fcd9651835a32979df8802b2db9504c523a8ebbb.1747158983.git.christophe.jaillet@wanadoo.fr
2025-05-14 17:28:24 +02:00
Ilya Guterman
e765bf89f4 nvme-pci: add NVME_QUIRK_NO_DEEPEST_PS quirk for SOLIDIGM P44 Pro
This commit adds the NVME_QUIRK_NO_DEEPEST_PS quirk for device
[126f:2262], which belongs to device SOLIDIGM P44 Pro SSDPFKKW020X7

The device frequently have trouble exiting the deepest power state (5),
resulting in the entire disk being unresponsive.

Verified by setting nvme_core.default_ps_max_latency_us=10000 and
observing the expected behavior.

Signed-off-by: Ilya Guterman <amfernusus@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-14 17:16:16 +02:00
Nicolas Frattaroli
08f959759e mmc: sdhci-of-dwcmshc: add PD workaround on RK3576
RK3576's power domains have a peculiar design where the PD_NVM power
domain, of which the sdhci controller is a part, seemingly does not have
idempotent runtime disable/enable. The end effect is that if PD_NVM gets
turned off by the generic power domain logic because all the devices
depending on it are suspended, then the next time the sdhci device is
unsuspended, it'll hang the SoC as soon as it tries accessing the CQHCI
registers.

RK3576's UFS support needed a new dev_pm_genpd_rpm_always_on function
added to the generic power domains API to handle what appears to be a
similar hardware design.

Use this new function to ask for the same treatment in the sdhci
controller by giving rk3576 its own platform data with its own postinit
function. The benefit of doing this instead of marking the power domains
always on in the power domain core is that we only do this if we know
the platform we're running on actually uses the sdhci controller. For
others, keeping PD_NVM always on would be a waste, as they won't run
into this specific issue. The only other IP in PD_NVM that could be
affected is FSPI0. If it gets a mainline driver, it will probably want
to do the same thing.

Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
Reviewed-by: Shawn Lin <shawn.lin@rock-chips.com>
Fixes: cfee1b5077 ("pmdomain: rockchip: Add support for RK3576 SoC")
Cc: <stable@vger.kernel.org> # v6.15+
Link: https://lore.kernel.org/r/20250423-rk3576-emmc-fix-v3-1-0bf80e29967f@collabora.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2025-05-14 17:01:42 +02:00
Tavian Barnes
7dd7f39fce ASoC: SOF: Intel: hda: Fix UAF when reloading module
hda_generic_machine_select() appends -idisp to the tplg filename by
allocating a new string with devm_kasprintf(), then stores the string
right back into the global variable snd_soc_acpi_intel_hda_machines.
When the module is unloaded, this memory is freed, resulting in a global
variable pointing to freed memory.  Reloading the module then triggers
a use-after-free:

BUG: KFENCE: use-after-free read in string+0x48/0xe0

Use-after-free read at 0x00000000967e0109 (in kfence-#99):
 string+0x48/0xe0
 vsnprintf+0x329/0x6e0
 devm_kvasprintf+0x54/0xb0
 devm_kasprintf+0x58/0x80
 hda_machine_select.cold+0x198/0x17a2 [snd_sof_intel_hda_generic]
 sof_probe_work+0x7f/0x600 [snd_sof]
 process_one_work+0x17b/0x330
 worker_thread+0x2ce/0x3f0
 kthread+0xcf/0x100
 ret_from_fork+0x31/0x50
 ret_from_fork_asm+0x1a/0x30

kfence-#99: 0x00000000198a940f-0x00000000ace47d9d, size=64, cache=kmalloc-64

allocated by task 333 on cpu 8 at 17.798069s (130.453553s ago):
 devm_kmalloc+0x52/0x120
 devm_kvasprintf+0x66/0xb0
 devm_kasprintf+0x58/0x80
 hda_machine_select.cold+0x198/0x17a2 [snd_sof_intel_hda_generic]
 sof_probe_work+0x7f/0x600 [snd_sof]
 process_one_work+0x17b/0x330
 worker_thread+0x2ce/0x3f0
 kthread+0xcf/0x100
 ret_from_fork+0x31/0x50
 ret_from_fork_asm+0x1a/0x30

freed by task 1543 on cpu 4 at 141.586686s (6.665010s ago):
 release_nodes+0x43/0xb0
 devres_release_all+0x90/0xf0
 device_unbind_cleanup+0xe/0x70
 device_release_driver_internal+0x1c1/0x200
 driver_detach+0x48/0x90
 bus_remove_driver+0x6d/0xf0
 pci_unregister_driver+0x42/0xb0
 __do_sys_delete_module+0x1d1/0x310
 do_syscall_64+0x82/0x190
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Fix it by copying the match array with devm_kmemdup_array() before we
modify it.

Fixes: 5458411d75 ("ASoC: SOF: Intel: hda: refactoring topology name fixup for HDA mach")
Suggested-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Acked-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Signed-off-by: Tavian Barnes <tavianator@tavianator.com>
Link: https://patch.msgid.link/570b15570b274520a0d9052f4e0f064a29c950ef.1747229716.git.tavianator@tavianator.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2025-05-14 16:34:48 +02:00
Marc Zyngier
fb0ea6e487 irqchip: Drop MSI_CHIP_FLAG_SET_ACK from unsuspecting MSI drivers
Commit 1c000dcaad ("irqchip/irq-msi-lib: Optionally set default
irq_eoi()/irq_ack()") added blanket MSI_CHIP_FLAG_SET_ACK flags,
irrespective of whether the underlying irqchip required it or not.

Drop it from a number of drivers that do not require it.

Fixes: 1c000dcaad ("irqchip/irq-msi-lib: Optionally set default irq_eoi()/irq_ack()")
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250513172819.2216709-6-maz@kernel.org
2025-05-14 16:24:27 +02:00
Tiezhu Yang
12614f7942 LoongArch: uprobes: Remove redundant code about resume_era
arch_uprobe_skip_sstep() returns true if instruction was emulated, that
is to say, there is no need to single step for the emulated instructions.
regs->csr_era will point to the destination address directly after the
exception, so the resume_era related code is redundant, just remove them.

Cc: stable@vger.kernel.org
Fixes: 19bc6cb640 ("LoongArch: Add uprobes support")
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-05-14 22:18:10 +08:00
Tiezhu Yang
0b326b2371 LoongArch: uprobes: Remove user_{en,dis}able_single_step()
When executing the "perf probe" and "perf stat" test cases about some
cryptographic algorithm, the output shows that "Trace/breakpoint trap".
This is because it uses the software singlestep breakpoint for uprobes
on LoongArch, and no need to use the hardware singlestep. So just remove
the related function call to user_{en,dis}able_single_step() for uprobes
on LoongArch.

How to reproduce:

Please make sure CONFIG_UPROBE_EVENTS is set and openssl supports sm2
algorithm, then execute the following command.

cd tools/perf && make
./perf probe -x /usr/lib64/libcrypto.so BN_mod_mul_montgomery
./perf stat -e probe_libcrypto:BN_mod_mul_montgomery openssl speed sm2

Cc: stable@vger.kernel.org
Fixes: 19bc6cb640 ("LoongArch: Add uprobes support")
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-05-14 22:18:10 +08:00
Huacai Chen
ceb9155d05 LoongArch: Save and restore CSR.CNTC for hibernation
Save and restore CSR.CNTC for hibernation which is similar to suspend.

For host this is unnecessary because sched clock is ensured continuous,
but for kvm guest sched clock isn't enough because rdtime.d should also
be continuous.

Host::rdtime.d = Host::CSR.CNTC + counter
Guest::rdtime.d = Host::CSR.CNTC + Host::CSR.GCNTC + Guest::CSR.CNTC + counter

so,

Guest::rdtime.d = Host::rdtime.d + Host::CSR.GCNTC + Guest::CSR.CNTC

To ensure Guest::rdtime.d continuous, Host::rdtime.d should be at first
continuous, while Host::CSR.GCNTC / Guest::CSR.CNTC is maintained by KVM.

Cc: stable@vger.kernel.org
Signed-off-by: Xianglai Li <lixianglai@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-05-14 22:17:52 +08:00
Huacai Chen
3e245b7b74 LoongArch: Move __arch_cpu_idle() to .cpuidle.text section
Now arch_cpu_idle() is annotated with __cpuidle which means it is in
the .cpuidle.text section, but __arch_cpu_idle() isn't. Thus, fix the
missing .cpuidle.text section assignment for __arch_cpu_idle() in order
to correct backtracing with nmi_backtrace().

The principle is similar to the commit 97c8580e85 ("MIPS: Annotate
cpu_wait implementations with __cpuidle")

Cc: stable@vger.kernel.org
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-05-14 22:17:52 +08:00
Huacai Chen
90436d2342 LoongArch: Fix MAX_REG_OFFSET calculation
Fix MAX_REG_OFFSET calculation, make it point to the last register
in 'struct pt_regs' and not to the marker itself, which could allow
regs_get_register() to return an invalid offset.

Cc: stable@vger.kernel.org
Fixes: 803b0fc5c3 ("LoongArch: Add process management")
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-05-14 22:17:43 +08:00
Tianyang Zhang
2468b0e3d5 LoongArch: Prevent cond_resched() occurring within kernel-fpu
When CONFIG_PREEMPT_COUNT is not configured (i.e. CONFIG_PREEMPT_NONE/
CONFIG_PREEMPT_VOLUNTARY), preempt_disable() / preempt_enable() merely
acts as a barrier(). However, in these cases cond_resched() can still
trigger a context switch and modify the CSR.EUEN, resulting in do_fpu()
exception being activated within the kernel-fpu critical sections, as
demonstrated in the following path:

dcn32_calculate_wm_and_dlg()
    DC_FP_START()
	dcn32_calculate_wm_and_dlg_fpu()
	    dcn32_find_dummy_latency_index_for_fw_based_mclk_switch()
		dcn32_internal_validate_bw()
		    dcn32_enable_phantom_stream()
			dc_create_stream_for_sink()
			   kzalloc(GFP_KERNEL)
				__kmem_cache_alloc_node()
				    __cond_resched()
    DC_FP_END()

This patch is similar to commit d021985504 (x86/fpu: Improve crypto
performance by making kernel-mode FPU reliably usable in softirqs).  It
uses local_bh_disable() instead of preempt_disable() for non-RT kernels
so it can avoid the cond_resched() issue, and also extend the kernel-fpu
application scenarios to the softirq context.

Cc: stable@vger.kernel.org
Signed-off-by: Tianyang Zhang <zhangtianyang@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-05-14 22:17:43 +08:00
Stefan Wahren
5e27af0514 dmaengine: fsl-edma: Fix return code for unhandled interrupts
For fsl,imx93-edma4 two DMA channels share the same interrupt.
So in case fsl_edma3_tx_handler is called for the "wrong"
channel, the return code must be IRQ_NONE. This signalize that
the interrupt wasn't handled.

Fixes: 72f5801a4e ("dmaengine: fsl-edma: integrate v3 support")
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Reviewed-by: Joy Zou <joy.zou@nxp.com>
Link: https://lore.kernel.org/r/20250424114829.9055-1-wahrenst@gmx.net
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-05-14 14:56:22 +01:00
Qiu-ji Chen
157ae5ffd7 dmaengine: mediatek: Fix a possible deadlock error in mtk_cqdma_tx_status()
Fix a potential deadlock bug. Observe that in the mtk-cqdma.c
file, functions like mtk_cqdma_issue_pending() and
mtk_cqdma_free_active_desc() properly acquire the pc lock before the vc
lock when handling pc and vc fields. However, mtk_cqdma_tx_status()
violates this order by first acquiring the vc lock before invoking
mtk_cqdma_find_active_desc(), which subsequently takes the pc lock. This
reversed locking sequence (vc → pc) contradicts the established
pc → vc order and creates deadlock risks.

Fix the issue by moving the vc lock acquisition code from
mtk_cqdma_find_active_desc() to mtk_cqdma_tx_status(). Ensure the pc lock
is acquired before the vc lock in the calling function to maintain correct
locking hierarchy. Note that since mtk_cqdma_find_active_desc() is a
static function with only one caller (mtk_cqdma_tx_status()), this
modification safely eliminates the deadlock possibility without affecting
other components.

This possible bug is found by an experimental static analysis tool
developed by our team. This tool analyzes the locking APIs to extract
function pairs that can be concurrently executed, and then analyzes the
instructions in the paired functions to identify possible concurrency bugs
including deadlocks, data races and atomicity violations.

Fixes: b1f01e48df ("dmaengine: mediatek: Add MediaTek Command-Queue DMA controller for MT6765 SoC")
Cc: stable@vger.kernel.org
Signed-off-by: Qiu-ji Chen <chenqiuji666@gmail.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://lore.kernel.org/r/20250508073634.3719-1-chenqiuji666@gmail.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-05-14 14:54:31 +01:00
Dave Jiang
ae74cd15ad dmaengine: idxd: Fix ->poll() return value
The fix to block access from different address space did not return a
correct value for ->poll() change.  kernel test bot reported that a
return value of type __poll_t is expected rather than int. Fix to return
POLLNVAL to indicate invalid request.

Fixes: 8dfa57aabf ("dmaengine: idxd: Fix allowing write() from different address spaces")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202505081851.rwD7jVxg-lkp@intel.com/
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20250508170548.2747425-1-dave.jiang@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-05-14 14:52:55 +01:00
Shuai Xue
a409e919ca dmaengine: idxd: Refactor remove call with idxd_cleanup() helper
The idxd_cleanup() helper cleans up perfmon, interrupts, internals and
so on. Refactor remove call with the idxd_cleanup() helper to avoid code
duplication. Note, this also fixes the missing put_device() for idxd
groups, enginces and wqs.

Fixes: bfe1d56091 ("dmaengine: idxd: Init and probe for Intel data accelerators")
Cc: stable@vger.kernel.org
Suggested-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Reviewed-by: Fenghua Yu <fenghuay@nvidia.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20250404120217.48772-10-xueshuai@linux.alibaba.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-05-14 14:40:40 +01:00
Shuai Xue
d5449ff1b0 dmaengine: idxd: Add missing idxd cleanup to fix memory leak in remove call
The remove call stack is missing idxd cleanup to free bitmap, ida and
the idxd_device. Call idxd_free() helper routines to make sure we exit
gracefully.

Fixes: bfe1d56091 ("dmaengine: idxd: Init and probe for Intel data accelerators")
Cc: stable@vger.kernel.org
Suggested-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Reviewed-by: Fenghua Yu <fenghuay@nvidia.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20250404120217.48772-9-xueshuai@linux.alibaba.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-05-14 14:40:40 +01:00
Shuai Xue
90022b3a69 dmaengine: idxd: fix memory leak in error handling path of idxd_pci_probe
Memory allocated for idxd is not freed if an error occurs during
idxd_pci_probe(). To fix it, free the allocated memory in the reverse
order of allocation before exiting the function in case of an error.

Fixes: bfe1d56091 ("dmaengine: idxd: Init and probe for Intel data accelerators")
Cc: stable@vger.kernel.org
Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Fenghua Yu <fenghuay@nvidia.com>
Link: https://lore.kernel.org/r/20250404120217.48772-8-xueshuai@linux.alibaba.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-05-14 14:40:40 +01:00
Shuai Xue
46a5cca76c dmaengine: idxd: fix memory leak in error handling path of idxd_alloc
Memory allocated for idxd is not freed if an error occurs during
idxd_alloc(). To fix it, free the allocated memory in the reverse order
of allocation before exiting the function in case of an error.

Fixes: a8563a33a5 ("dmanegine: idxd: reformat opcap output to match bitmap_parse() input")
Cc: stable@vger.kernel.org
Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Fenghua Yu <fenghuay@nvidia.com>
Link: https://lore.kernel.org/r/20250404120217.48772-7-xueshuai@linux.alibaba.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-05-14 14:40:39 +01:00
Shuai Xue
61d651572b dmaengine: idxd: Add missing cleanups in cleanup internals
The idxd_cleanup_internals() function only decreases the reference count
of groups, engines, and wqs but is missing the step to release memory
resources.

To fix this, use the cleanup helper to properly release the memory
resources.

Fixes: ddf742d4f3 ("dmaengine: idxd: Add missing cleanup for early error out in probe call")
Cc: stable@vger.kernel.org
Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Reviewed-by: Fenghua Yu <fenghuay@nvidia.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20250404120217.48772-6-xueshuai@linux.alibaba.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-05-14 14:40:39 +01:00
Shuai Xue
61259fb96e dmaengine: idxd: Add missing cleanup for early error out in idxd_setup_internals
The idxd_setup_internals() is missing some cleanup when things fail in
the middle.

Add the appropriate cleanup routines:

- cleanup groups
- cleanup enginces
- cleanup wqs

to make sure it exits gracefully.

Fixes: defe49f960 ("dmaengine: idxd: fix group conf_dev lifetime")
Cc: stable@vger.kernel.org
Suggested-by: Fenghua Yu <fenghuay@nvidia.com>
Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Reviewed-by: Fenghua Yu <fenghuay@nvidia.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20250404120217.48772-5-xueshuai@linux.alibaba.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-05-14 14:40:39 +01:00
Shuai Xue
aa6f4f945b dmaengine: idxd: fix memory leak in error handling path of idxd_setup_groups
Memory allocated for groups is not freed if an error occurs during
idxd_setup_groups(). To fix it, free the allocated memory in the reverse
order of allocation before exiting the function in case of an error.

Fixes: defe49f960 ("dmaengine: idxd: fix group conf_dev lifetime")
Cc: stable@vger.kernel.org
Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Fenghua Yu <fenghuay@nvidia.com>
Link: https://lore.kernel.org/r/20250404120217.48772-4-xueshuai@linux.alibaba.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-05-14 14:40:39 +01:00
Shuai Xue
817bced19d dmaengine: idxd: fix memory leak in error handling path of idxd_setup_engines
Memory allocated for engines is not freed if an error occurs during
idxd_setup_engines(). To fix it, free the allocated memory in the
reverse order of allocation before exiting the function in case of an
error.

Fixes: 75b9113090 ("dmaengine: idxd: fix engine conf_dev lifetime")
Cc: stable@vger.kernel.org
Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Fenghua Yu <fenghuay@nvidia.com>
Link: https://lore.kernel.org/r/20250404120217.48772-3-xueshuai@linux.alibaba.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-05-14 14:40:39 +01:00
Shuai Xue
3fd2f4bc01 dmaengine: idxd: fix memory leak in error handling path of idxd_setup_wqs
Memory allocated for wqs is not freed if an error occurs during
idxd_setup_wqs(). To fix it, free the allocated memory in the reverse
order of allocation before exiting the function in case of an error.

Fixes: 7c5dd23e57 ("dmaengine: idxd: fix wq conf_dev 'struct device' lifetime")
Fixes: 700af3a0a2 ("dmaengine: idxd: add 'struct idxd_dev' as wrapper for conf_dev")
Fixes: de5819b994 ("dmaengine: idxd: track enabled workqueues in bitmap")
Fixes: b0325aefd3 ("dmaengine: idxd: add WQ operation cap restriction support")
Cc: stable@vger.kernel.org
Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Fenghua Yu <fenghuay@nvidia.com>
Link: https://lore.kernel.org/r/20250404120217.48772-2-xueshuai@linux.alibaba.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-05-14 14:40:39 +01:00
Carlos Maiolino
08c73a4b2e xfs: Fix comment on xfs_trans_ail_update_bulk()
This function doesn't take the AIL lock, but should be called
with AIL lock held. Also (hopefuly) simplify the comment.

Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chandan Babu R <chandanbabu@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-05-14 15:37:50 +02:00
Carlos Maiolino
fa8deae92f xfs: Fix a comment on xfs_ail_delete
It doesn't return anything.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Chandan Babu R <chandanbabu@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-05-14 15:37:50 +02:00
Nirjhar Roy (IBM)
95b613339c xfs: Fail remount with noattr2 on a v5 with v4 enabled
Bug: When we compile the kernel with CONFIG_XFS_SUPPORT_V4=y,
remount with "-o remount,noattr2" on a v5 XFS does not
fail explicitly.

Reproduction:
mkfs.xfs -f /dev/loop0
mount /dev/loop0 /mnt/scratch
mount -o remount,noattr2 /dev/loop0 /mnt/scratch

However, with CONFIG_XFS_SUPPORT_V4=n, the remount
correctly fails explicitly. This is because the way the
following 2 functions are defined:

static inline bool xfs_has_attr2 (struct xfs_mount *mp)
{
	return !IS_ENABLED(CONFIG_XFS_SUPPORT_V4) ||
		(mp->m_features & XFS_FEAT_ATTR2);
}
static inline bool xfs_has_noattr2 (const struct xfs_mount *mp)
{
	return mp->m_features & XFS_FEAT_NOATTR2;
}

xfs_has_attr2() returns true when CONFIG_XFS_SUPPORT_V4=n
and hence, the following if condition in
xfs_fs_validate_params() succeeds and returns -EINVAL:

/*
 * We have not read the superblock at this point, so only the attr2
 * mount option can set the attr2 feature by this stage.
 */

if (xfs_has_attr2(mp) && xfs_has_noattr2(mp)) {
	xfs_warn(mp, "attr2 and noattr2 cannot both be specified.");
	return -EINVAL;
}

With CONFIG_XFS_SUPPORT_V4=y, xfs_has_attr2() always return
false and hence no error is returned.

Fix: Check if the existing mount has crc enabled(i.e, of
type v5 and has attr2 enabled) and the
remount has noattr2, if yes, return -EINVAL.

I have tested xfs/{189,539} in fstests with v4
and v5 XFS with both CONFIG_XFS_SUPPORT_V4=y/n and
they both behave as expected.

This patch also fixes remount from noattr2 -> attr2 (on a v4 xfs).

Related discussion in [1]

[1] https://lore.kernel.org/all/Z65o6nWxT00MaUrW@dread.disaster.area/

Signed-off-by: Nirjhar Roy (IBM) <nirjhar.roy.lists@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-05-14 15:37:50 +02:00
Christoph Hellwig
fbecd731de xfs: fix zoned GC data corruption due to wrong bv_offset
xfs_zone_gc_write_chunk writes out the data buffer read in earlier using
the same bio, and currenly looks at bv_offset for the offset into the
scratch folio for that.  But commit 26064d3e2b ("block: fix adding
folio to bio") changed how bv_page and bv_offset are calculated for
adding larger folios, breaking this fragile logic.

Switch to extracting the full physical address from the old bio_vec,
and calculate the offset into the folio from that instead.

This fixes data corruption during garbage collection with heavy rockdsb
workloads.  Thanks to Hans for tracking down the culprit commit during
long bisection sessions.

Fixes: 26064d3e2b ("block: fix adding folio to bio")
Fixes: 080d01c41d ("xfs: implement zoned garbage collection")
Reported-by: Hans Holmberg <Hans.Holmberg@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hans Holmberg <Hans.Holmberg@wdc.com>
Tested-by: Hans Holmberg <Hans.Holmberg@wdc.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-05-14 15:37:49 +02:00
Wengang Wang
09dab6ce02 xfs: free up mp->m_free[0].count in error case
In xfs_init_percpu_counters(), memory for mp->m_free[0].count wasn't freed
in error case. Free it up in this patch.

Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
Fixes: 712bae9663 ("xfs: generalize the freespace and reserved blocks handling")
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-05-14 15:37:49 +02:00
Hyejeong Choi
72c7d62583 dma-buf: insert memory barrier before updating num_fences
smp_store_mb() inserts memory barrier after storing operation.
It is different with what the comment is originally aiming so Null
pointer dereference can be happened if memory update is reordered.

Signed-off-by: Hyejeong Choi <hjeong.choi@samsung.com>
Fixes: a590d0fdba ("dma-buf: Update reservation shared_count after adding the new fence")
CC: stable@vger.kernel.org
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/r/20250513020638.GA2329653@au1-maretx-p37.eng.sarc.samsung.com
Signed-off-by: Christian König <christian.koenig@amd.com>
2025-05-14 15:31:23 +02:00
Alan Adamson
8695f060a0 nvme: all namespaces in a subsystem must adhere to a common atomic write size
The first namespace configured in a subsystem sets the subsystem's
atomic write size based on its AWUPF or NAWUPF. Subsequent namespaces
must have an atomic write size (per their AWUPF or NAWUPF) less than or
equal to the subsystem's atomic write size, or their probing will be
rejected.

Signed-off-by: Alan Adamson <alan.adamson@oracle.com>
[hch: fold in review comments from John Garry]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Garry <john.g.garry@oracle.com>
2025-05-14 15:18:39 +02:00
Jens Axboe
d871198ee4 io_uring/fdinfo: grab ctx->uring_lock around io_uring_show_fdinfo()
Not everything requires locking in there, which is why the 'has_lock'
variable exists. But enough does that it's a bit unwieldy to manage.
Wrap the whole thing in a ->uring_lock trylock, and just return
with no output if we fail to grab it. The existing trylock() will
already have greatly diminished utility/output for the failure case.

This fixes an issue with reading the SQE fields, if the ring is being
actively resized at the same time.

Reported-by: Jann Horn <jannh@google.com>
Fixes: 79cfe9e59c ("io_uring/register: add IORING_REGISTER_RESIZE_RINGS")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-14 07:15:28 -06:00
Ma Ke
b2ea5f4958 phy: Fix error handling in tegra_xusb_port_init
If device_add() fails, do not use device_unregister() for error
handling. device_unregister() consists two functions: device_del() and
put_device(). device_unregister() should only be called after
device_add() succeeded because device_del() undoes what device_add()
does if successful. Change device_unregister() to put_device() call
before returning from the function.

As comment of device_add() says, 'if device_add() succeeds, you should
call device_del() when you want to get rid of it. If device_add() has
not succeeded, use only put_device() to drop the reference count'.

Found by code review.

Cc: stable@vger.kernel.org
Fixes: 53d2a715c2 ("phy: Add Tegra XUSB pad controller support")
Signed-off-by: Ma Ke <make24@iscas.ac.cn>
Acked-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20250303072739.3874987-1-make24@iscas.ac.cn
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-05-14 12:28:43 +01:00
Claudiu Beznea
86e70849f4 phy: renesas: rcar-gen3-usb2: Set timing registers only once
phy-rcar-gen3-usb2 driver exports 4 PHYs. The timing registers are common
to all PHYs. There is no need to set them every time a PHY is initialized.
Set timing register only when the 1st PHY is initialized.

Fixes: f3b5a8d9b5 ("phy: rcar-gen3-usb2: Add R-Car Gen3 USB2 PHY driver")
Cc: stable@vger.kernel.org
Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Link: https://lore.kernel.org/r/20250507125032.565017-6-claudiu.beznea.uj@bp.renesas.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-05-14 12:25:09 +01:00
Claudiu Beznea
9ce71e85b2 phy: renesas: rcar-gen3-usb2: Assert PLL reset on PHY power off
Assert PLL reset on PHY power off. This saves power.

Fixes: f3b5a8d9b5 ("phy: rcar-gen3-usb2: Add R-Car Gen3 USB2 PHY driver")
Cc: stable@vger.kernel.org
Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Link: https://lore.kernel.org/r/20250507125032.565017-5-claudiu.beznea.uj@bp.renesas.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-05-14 12:25:09 +01:00
Claudiu Beznea
55a387ebb9 phy: renesas: rcar-gen3-usb2: Lock around hardware registers and driver data
The phy-rcar-gen3-usb2 driver exposes four individual PHYs that are
requested and configured by PHY users. The struct phy_ops APIs access the
same set of registers to configure all PHYs. Additionally, PHY settings can
be modified through sysfs or an IRQ handler. While some struct phy_ops APIs
are protected by a driver-wide mutex, others rely on individual
PHY-specific mutexes.

This approach can lead to various issues, including:
1/ the IRQ handler may interrupt PHY settings in progress, racing with
   hardware configuration protected by a mutex lock
2/ due to msleep(20) in rcar_gen3_init_otg(), while a configuration thread
   suspends to wait for the delay, another thread may try to configure
   another PHY (with phy_init() + phy_power_on()); re-running the
   phy_init() goes to the exact same configuration code, re-running the
   same hardware configuration on the same set of registers (and bits)
   which might impact the result of the msleep for the 1st configuring
   thread
3/ sysfs can configure the hardware (though role_store()) and it can
   still race with the phy_init()/phy_power_on() APIs calling into the
   drivers struct phy_ops

To address these issues, add a spinlock to protect hardware register access
and driver private data structures (e.g., calls to
rcar_gen3_is_any_rphy_initialized()). Checking driver-specific data remains
necessary as all PHY instances share common settings. With this change,
the existing mutex protection is removed and the cleanup.h helpers are
used.

While at it, to keep the code simpler, do not skip
regulator_enable()/regulator_disable() APIs in
rcar_gen3_phy_usb2_power_on()/rcar_gen3_phy_usb2_power_off() as the
regulators enable/disable operations are reference counted anyway.

Fixes: f3b5a8d9b5 ("phy: rcar-gen3-usb2: Add R-Car Gen3 USB2 PHY driver")
Cc: stable@vger.kernel.org
Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Link: https://lore.kernel.org/r/20250507125032.565017-4-claudiu.beznea.uj@bp.renesas.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-05-14 12:25:09 +01:00
Claudiu Beznea
de76809f60 phy: renesas: rcar-gen3-usb2: Move IRQ request in probe
Commit 08b0ad375c ("phy: renesas: rcar-gen3-usb2: move IRQ registration
to init") moved the IRQ request operation from probe to
struct phy_ops::phy_init API to avoid triggering interrupts (which lead to
register accesses) while the PHY clocks (enabled through runtime PM APIs)
are not active. If this happens, it results in a synchronous abort.

One way to reproduce this issue is by enabling CONFIG_DEBUG_SHIRQ, which
calls free_irq() on driver removal.

Move the IRQ request and free operations back to probe, and take the
runtime PM state into account in IRQ handler. This commit is preparatory
for the subsequent fixes in this series.

Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Link: https://lore.kernel.org/r/20250507125032.565017-3-claudiu.beznea.uj@bp.renesas.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-05-14 12:25:09 +01:00
Claudiu Beznea
54c4c58713 phy: renesas: rcar-gen3-usb2: Fix role detection on unbind/bind
It has been observed on the Renesas RZ/G3S SoC that unbinding and binding
the PHY driver leads to role autodetection failures. This issue occurs when
PHY 3 is the first initialized PHY. PHY 3 does not have an interrupt
associated with the USB2_INT_ENABLE register (as
rcar_gen3_int_enable[3] = 0). As a result, rcar_gen3_init_otg() is called
to initialize OTG without enabling PHY interrupts.

To resolve this, add rcar_gen3_is_any_otg_rphy_initialized() and call it in
role_store(), role_show(), and rcar_gen3_init_otg(). At the same time,
rcar_gen3_init_otg() is only called when initialization for a PHY with
interrupt bits is in progress. As a result, the
struct rcar_gen3_phy::otg_initialized is no longer needed.

Fixes: 549b6b55b0 ("phy: renesas: rcar-gen3-usb2: enable/disable independent irqs")
Cc: stable@vger.kernel.org
Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Link: https://lore.kernel.org/r/20250507125032.565017-2-claudiu.beznea.uj@bp.renesas.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-05-14 12:25:09 +01:00
Dan Carpenter
83c178470e phy: tegra: xusb: remove a stray unlock
We used to take a lock in tegra186_utmi_bias_pad_power_on() but now we
have moved the lock into the caller.  Unfortunately, when we moved the
lock this unlock was left behind and it results in a double unlock.
Delete it now.

Fixes: b47158fb42 ("phy: tegra: xusb: Use a bitmask for UTMI pad power state tracking")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Link: https://lore.kernel.org/r/aAjmR6To4EnvRl4G@stanley.mountain
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-05-14 12:21:18 +01:00
Gabor Juhos
b04f0d89e8 arm64: dts: marvell: uDPU: define pinctrl state for alarm LEDs
The two alarm LEDs of on the uDPU board are stopped working since
commit 78efa53e71 ("leds: Init leds class earlier").

The LEDs are driven by the GPIO{15,16} pins of the North Bridge
GPIO controller. These pins are part of the 'spi_quad' pin group
for which the 'spi' function is selected via the default pinctrl
state of the 'spi' node. This is wrong however, since in order to
allow controlling the LEDs, the pins should use the 'gpio' function.

Before the commit mentined above, the 'spi' function is selected
first by the pinctrl core before probing the spi driver, but then
it gets overridden to 'gpio' implicitly via the
devm_gpiod_get_index_optional() call from the 'leds-gpio' driver.

After the commit, the LED subsystem gets initialized before the
SPI subsystem, so the function of the pin group remains 'spi'
which in turn prevents controlling of the LEDs.

Despite the change of the initialization order, the root cause is
that the pinctrl state definition is wrong since its initial commit
0d45062cfc ("arm64: dts: marvell: Add device tree for uDPU board"),

To fix the problem, override the function in the 'spi_quad_pins'
node to 'gpio' and move the pinctrl state definition from the
'spi' node into the 'leds' node.

Cc: stable@vger.kernel.org # needs adjustment for < 6.1
Fixes: 0d45062cfc ("arm64: dts: marvell: Add device tree for uDPU board")
Signed-off-by: Gabor Juhos <j4g8y7@gmail.com>
Signed-off-by: Imre Kaloz <kaloz@openwrt.org>
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
2025-05-14 13:01:47 +02:00
Wentao Liang
9e000f1b7f ALSA: es1968: Add error handling for snd_pcm_hw_constraint_pow2()
The function snd_es1968_capture_open() calls the function
snd_pcm_hw_constraint_pow2(), but does not check its return
value. A proper implementation can be found in snd_cx25821_pcm_open().

Add error handling for snd_pcm_hw_constraint_pow2() and propagate its
error code.

Fixes: b942cf815b ("[ALSA] es1968 - Fix stuttering capture")
Cc: stable@vger.kernel.org # v2.6.22
Signed-off-by: Wentao Liang <vulab@iscas.ac.cn>
Link: https://patch.msgid.link/20250514092444.331-1-vulab@iscas.ac.cn
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-05-14 12:13:03 +02:00
Hariprasad Kelam
314007549d octeontx2-pf: Fix ethtool support for SDP representors
The hardware supports multiple MAC types, including RPM, SDP, and LBK.
However, features such as link settings and pause frames are only available
on RPM MAC, and not supported on SDP or LBK.

This patch updates the ethtool operations logic accordingly to reflect
this behavior.

Fixes: 2f7f33a095 ("octeontx2-pf: Add representors for sdp MAC")
Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-05-14 10:37:20 +01:00
Cosmin Tanislav
6b0cd72757 regulator: max20086: fix invalid memory access
max20086_parse_regulators_dt() calls of_regulator_match() using an
array of struct of_regulator_match allocated on the stack for the
matches argument.

of_regulator_match() calls devm_of_regulator_put_matches(), which calls
devres_alloc() to allocate a struct devm_of_regulator_matches which will
be de-allocated using devm_of_regulator_put_matches().

struct devm_of_regulator_matches is populated with the stack allocated
matches array.

If the device fails to probe, devm_of_regulator_put_matches() will be
called and will try to call of_node_put() on that stack pointer,
generating the following dmesg entries:

max20086 6-0028: Failed to read DEVICE_ID reg: -121
kobject: '\xc0$\xa5\x03' (000000002cebcb7a): is not initialized, yet
kobject_put() is being called.

Followed by a stack trace matching the call flow described above.

Switch to allocating the matches array using devm_kcalloc() to
avoid accessing the stack pointer long after it's out of scope.

This also has the advantage of allowing multiple max20086 to probe
without overriding the data stored inside the global of_regulator_match.

Fixes: bfff546aae ("regulator: Add MAX20086-MAX20089 driver")
Signed-off-by: Cosmin Tanislav <demonsingur@gmail.com>
Link: https://patch.msgid.link/20250508064947.2567255-1-demonsingur@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2025-05-14 11:16:52 +02:00
Alessandro Grassi
fb98bd0a13 spi: spi-sun4i: fix early activation
The SPI interface is activated before the CPOL setting is applied. In
that moment, the clock idles high and CS goes low. After a short delay,
CPOL and other settings are applied, which may cause the clock to change
state and idle low. This transition is not part of a clock cycle, and it
can confuse the receiving device.

To prevent this unexpected transition, activate the interface while CPOL
and the other settings are being applied.

Signed-off-by: Alessandro Grassi <alessandro.grassi@mailbox.org>
Link: https://patch.msgid.link/20250502095520.13825-1-alessandro.grassi@mailbox.org
Signed-off-by: Mark Brown <broonie@kernel.org>
2025-05-14 10:56:43 +02:00
Algea Cao
f9475055b1 phy: phy-rockchip-samsung-hdptx: Fix PHY PLL output 50.25MHz error
When using HDMI PLL frequency division coefficient at 50.25MHz
that is calculated by rk_hdptx_phy_clk_pll_calc(), it fails to
get PHY LANE lock. Although the calculated values are within the
allowable range of PHY PLL configuration.

In order to fix the PHY LANE lock error and provide the expected
50.25MHz output, manually compute the required PHY PLL frequency
division coefficient and add it to ropll_tmds_cfg configuration
table.

Signed-off-by: Algea Cao <algea.cao@rock-chips.com>
Reviewed-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Acked-by: Heiko Stuebner <heiko@sntech.de>
Link: https://lore.kernel.org/r/20250427095124.3354439-1-algea.cao@rock-chips.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-05-14 09:55:09 +01:00
Hal Feng
3f097adb9b phy: starfive: jh7110-usb: Fix USB 2.0 host occasional detection failure
JH7110 USB 2.0 host fails to detect USB 2.0 devices occasionally. With a
long time of debugging and testing, we found that setting Rx clock gating
control signal to normal power consumption mode can solve this problem.

Signed-off-by: Hal Feng <hal.feng@starfivetech.com>
Link: https://lore.kernel.org/r/20250422101244.51686-1-hal.feng@starfivetech.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-05-14 09:45:12 +01:00
Alan Adamson
a21675ee3b nvme: multipath: enable BLK_FEAT_ATOMIC_WRITES for multipathing
A change to QEMU resulted in all nvme controllers (single and
multi-controller subsystems) to have its CMIC.MCTRS bit set which
indicates the subsystem supports multiple controllers and it is possible
a namespace can be shared between those multiple controllers in a
multipath configuration.

When a namespace of a CMIC.MCTRS enabled subsystem is allocated, a
multipath node is created.  The queue limits for this node are inherited
from the namespace being allocated. When inheriting queue limits, the
features being inherited need to be specified. The atomic write feature
(BLK_FEAT_ATOMIC_WRITES) was not specified so the atomic queue limits
were not inherited by the multipath disk node which resulted in the sysfs
atomic write attributes being zeroed. The fix is to include
BLK_FEAT_ATOMIC_WRITES in the list of features to be inherited.

Signed-off-by: Alan Adamson <alan.adamson@oracle.com>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-14 07:33:10 +02:00
Paul Chaignon
0b91fda3a1 xfrm: Sanitize marks before insert
Prior to this patch, the mark is sanitized (applying the state's mask to
the state's value) only on inserts when checking if a conflicting XFRM
state or policy exists.

We discovered in Cilium that this same sanitization does not occur
in the hot-path __xfrm_state_lookup. In the hot-path, the sk_buff's mark
is simply compared to the state's value:

    if ((mark & x->mark.m) != x->mark.v)
        continue;

Therefore, users can define unsanitized marks (ex. 0xf42/0xf00) which will
never match any packet.

This commit updates __xfrm_state_insert and xfrm_policy_insert to store
the sanitized marks, thus removing this footgun.

This has the side effect of changing the ip output, as the
returned mark will have the mask applied to it when printed.

Fixes: 3d6acfa764 ("xfrm: SA lookups with mark")
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Signed-off-by: Louis DeLosSantos <louis.delos.devel@gmail.com>
Co-developed-by: Louis DeLosSantos <louis.delos.devel@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2025-05-14 07:18:58 +02:00
Abdun Nihaal
9d8a99c5a7 qlcnic: fix memory leak in qlcnic_sriov_channel_cfg_cmd()
In one of the error paths in qlcnic_sriov_channel_cfg_cmd(), the memory
allocated in qlcnic_sriov_alloc_bc_mbx_args() for mailbox arguments is
not freed. Fix that by jumping to the error path that frees them, by
calling qlcnic_free_mbx_args(). This was found using static analysis.

Fixes: f197a7aa62 ("qlcnic: VF-PF communication channel implementation")
Signed-off-by: Abdun Nihaal <abdun.nihaal@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250512044829.36400-1-abdun.nihaal@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-13 17:09:53 -07:00
Vladimir Oltean
d5c17e3654 docs: networking: timestamping: improve stacked PHC sentence
The first paragraph makes no grammatical sense. I suppose a portion of
the intended sentece is missing: "[The challenge with ] stacked PHCs
(...) is that they uncover bugs".

Rephrase, and at the same time simplify the structure of the sentence a
little bit, it is not easy to follow.

Fixes: 94d9f78f4d ("docs: networking: timestamping: add section for stacked PHC devices")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Link: https://patch.msgid.link/20250512131751.320283-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-13 15:40:54 -07:00
Carolina Jubran
588431474e net/mlx5e: Disable MACsec offload for uplink representor profile
MACsec offload is not supported in switchdev mode for uplink
representors. When switching to the uplink representor profile, the
MACsec offload feature must be cleared from the netdevice's features.

If left enabled, attempts to add offloads result in a null pointer
dereference, as the uplink representor does not support MACsec offload
even though the feature bit remains set.

Clear NETIF_F_HW_MACSEC in mlx5e_fix_uplink_rep_features().

Kernel log:

Oops: general protection fault, probably for non-canonical address 0xdffffc000000000f: 0000 [#1] SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000078-0x000000000000007f]
CPU: 29 UID: 0 PID: 4714 Comm: ip Not tainted 6.14.0-rc4_for_upstream_debug_2025_03_02_17_35 #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:__mutex_lock+0x128/0x1dd0
Code: d0 7c 08 84 d2 0f 85 ad 15 00 00 8b 35 91 5c fe 03 85 f6 75 29 49 8d 7e 60 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 a6 15 00 00 4d 3b 76 60 0f 85 fd 0b 00 00 65 ff
RSP: 0018:ffff888147a4f160 EFLAGS: 00010206
RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000001
RDX: 000000000000000f RSI: 0000000000000000 RDI: 0000000000000078
RBP: ffff888147a4f2e0 R08: ffffffffa05d2c19 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
R13: dffffc0000000000 R14: 0000000000000018 R15: ffff888152de0000
FS:  00007f855e27d800(0000) GS:ffff88881ee80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000004e5768 CR3: 000000013ae7c005 CR4: 0000000000372eb0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
Call Trace:
 <TASK>
 ? die_addr+0x3d/0xa0
 ? exc_general_protection+0x144/0x220
 ? asm_exc_general_protection+0x22/0x30
 ? mlx5e_macsec_add_secy+0xf9/0x700 [mlx5_core]
 ? __mutex_lock+0x128/0x1dd0
 ? lockdep_set_lock_cmp_fn+0x190/0x190
 ? mlx5e_macsec_add_secy+0xf9/0x700 [mlx5_core]
 ? mutex_lock_io_nested+0x1ae0/0x1ae0
 ? lock_acquire+0x1c2/0x530
 ? macsec_upd_offload+0x145/0x380
 ? lockdep_hardirqs_on_prepare+0x400/0x400
 ? kasan_save_stack+0x30/0x40
 ? kasan_save_stack+0x20/0x40
 ? kasan_save_track+0x10/0x30
 ? __kasan_kmalloc+0x77/0x90
 ? __kmalloc_noprof+0x249/0x6b0
 ? genl_family_rcv_msg_attrs_parse.constprop.0+0xb5/0x240
 ? mlx5e_macsec_add_secy+0xf9/0x700 [mlx5_core]
 mlx5e_macsec_add_secy+0xf9/0x700 [mlx5_core]
 ? mlx5e_macsec_add_rxsa+0x11a0/0x11a0 [mlx5_core]
 macsec_update_offload+0x26c/0x820
 ? macsec_set_mac_address+0x4b0/0x4b0
 ? lockdep_hardirqs_on_prepare+0x284/0x400
 ? _raw_spin_unlock_irqrestore+0x47/0x50
 macsec_upd_offload+0x2c8/0x380
 ? macsec_update_offload+0x820/0x820
 ? __nla_parse+0x22/0x30
 ? genl_family_rcv_msg_attrs_parse.constprop.0+0x15e/0x240
 genl_family_rcv_msg_doit+0x1cc/0x2a0
 ? genl_family_rcv_msg_attrs_parse.constprop.0+0x240/0x240
 ? cap_capable+0xd4/0x330
 genl_rcv_msg+0x3ea/0x670
 ? genl_family_rcv_msg_dumpit+0x2a0/0x2a0
 ? lockdep_set_lock_cmp_fn+0x190/0x190
 ? macsec_update_offload+0x820/0x820
 netlink_rcv_skb+0x12b/0x390
 ? genl_family_rcv_msg_dumpit+0x2a0/0x2a0
 ? netlink_ack+0xd80/0xd80
 ? rwsem_down_read_slowpath+0xf90/0xf90
 ? netlink_deliver_tap+0xcd/0xac0
 ? netlink_deliver_tap+0x155/0xac0
 ? _copy_from_iter+0x1bb/0x12c0
 genl_rcv+0x24/0x40
 netlink_unicast+0x440/0x700
 ? netlink_attachskb+0x760/0x760
 ? lock_acquire+0x1c2/0x530
 ? __might_fault+0xbb/0x170
 netlink_sendmsg+0x749/0xc10
 ? netlink_unicast+0x700/0x700
 ? __might_fault+0xbb/0x170
 ? netlink_unicast+0x700/0x700
 __sock_sendmsg+0xc5/0x190
 ____sys_sendmsg+0x53f/0x760
 ? import_iovec+0x7/0x10
 ? kernel_sendmsg+0x30/0x30
 ? __copy_msghdr+0x3c0/0x3c0
 ? filter_irq_stacks+0x90/0x90
 ? stack_depot_save_flags+0x28/0xa30
 ___sys_sendmsg+0xeb/0x170
 ? kasan_save_stack+0x30/0x40
 ? copy_msghdr_from_user+0x110/0x110
 ? do_syscall_64+0x6d/0x140
 ? lock_acquire+0x1c2/0x530
 ? __virt_addr_valid+0x116/0x3b0
 ? __virt_addr_valid+0x1da/0x3b0
 ? lock_downgrade+0x680/0x680
 ? __delete_object+0x21/0x50
 __sys_sendmsg+0xf7/0x180
 ? __sys_sendmsg_sock+0x20/0x20
 ? kmem_cache_free+0x14c/0x4e0
 ? __x64_sys_close+0x78/0xd0
 do_syscall_64+0x6d/0x140
 entry_SYSCALL_64_after_hwframe+0x4b/0x53
RIP: 0033:0x7f855e113367
Code: 0e 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10
RSP: 002b:00007ffd15e90c88 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f855e113367
RDX: 0000000000000000 RSI: 00007ffd15e90cf0 RDI: 0000000000000004
RBP: 00007ffd15e90dbc R08: 0000000000000028 R09: 000000000045d100
R10: 00007f855e011dd8 R11: 0000000000000246 R12: 0000000000000019
R13: 0000000067c6b785 R14: 00000000004a1e80 R15: 0000000000000000
 </TASK>
Modules linked in: 8021q garp mrp sch_ingress openvswitch nsh mlx5_ib mlx5_fwctl mlx5_dpll mlx5_core rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm ib_uverbs ib_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter rpcsec_gss_krb5 auth_rpcgss oid_registry overlay zram zsmalloc fuse [last unloaded: mlx5_core]
---[ end trace 0000000000000000 ]---

Fixes: 8ff0ac5be1 ("net/mlx5: Add MACsec offload Tx command support")
Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/1746958552-561295-1-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-13 15:31:37 -07:00
Konstantin Shkolnyy
7fd7ad6f36 vsock/test: Fix occasional failure in SIOCOUTQ tests
These tests:
    "SOCK_STREAM ioctl(SIOCOUTQ) 0 unsent bytes"
    "SOCK_SEQPACKET ioctl(SIOCOUTQ) 0 unsent bytes"
output: "Unexpected 'SIOCOUTQ' value, expected 0, got 64 (CLIENT)".

They test that the SIOCOUTQ ioctl reports 0 unsent bytes after the data
have been received by the other side. However, sometimes there is a delay
in updating this "unsent bytes" counter, and the test fails even though
the counter properly goes to 0 several milliseconds later.

The delay occurs in the kernel because the used buffer notification
callback virtio_vsock_tx_done(), called upon receipt of the data by the
other side, doesn't update the counter itself. It delegates that to
a kernel thread (via vsock->tx_work). Sometimes that thread is delayed
more than the test expects.

Change the test to poll SIOCOUTQ until it returns 0 or a timeout occurs.

Signed-off-by: Konstantin Shkolnyy <kshk@linux.ibm.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Fixes: 18ee44ce97 ("test/vsock: add ioctl unsent bytes test")
Link: https://patch.msgid.link/20250507151456.2577061-1-kshk@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-13 15:01:50 -07:00
Lukas Wunner
396786af1c tools: ynl-gen: Allow multi-attr without nested-attributes again
Since commit ce6cb8113c ("tools: ynl-gen: individually free previous
values on double set"), specifying the "multi-attr" property raises an
error unless the "nested-attributes" property is specified as well:

  File "tools/net/ynl/./pyynl/ynl_gen_c.py", line 1147, in _load_nested_sets
    child = self.pure_nested_structs.get(nested)
                                         ^^^^^^
  UnboundLocalError: cannot access local variable 'nested' where it is not associated with a value

This appears to be a bug since there are existing specs which omit
"nested-attributes" on "multi-attr" attributes.  Also, according to
Documentation/userspace-api/netlink/specs.rst, multi-attr "is the
recommended way of implementing arrays (no extra nesting)", suggesting
that nesting should even be avoided in favor of multi-attr.

Fix the indentation of the if-block introduced by the commit to avoid
the error.

Fixes: ce6cb8113c ("tools: ynl-gen: individually free previous values on double set")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Link: https://patch.msgid.link/d6b58684b7e5bfb628f7313e6893d0097904e1d1.1746940107.git.lukas@wunner.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-13 15:01:50 -07:00
Eric Biggers
9f35e33144 x86/its: Fix build errors when CONFIG_MODULES=n
Fix several build errors when CONFIG_MODULES=n, including the following:

../arch/x86/kernel/alternative.c:195:25: error: incomplete definition of type 'struct module'
  195 |         for (int i = 0; i < mod->its_num_pages; i++) {

Fixes: 872df34d7c ("x86/its: Use dynamic thunks for indirect branches")
Cc: stable@vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Tested-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-05-13 14:36:08 -07:00
Yazen Ghannam
24ee8d9432 x86/CPU/AMD: Add X86_FEATURE_ZEN6
Add a synthetic feature flag for Zen6.

  [  bp: Move the feature flag to a free slot and avoid future merge
     conflicts from incoming stuff. ]

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250513204857.3376577-1-yazen.ghannam@amd.com
2025-05-13 22:59:11 +02:00
Thomas Hellström
2d2f82e1a8 drm/xe: Fix the gem shrinker name
The xe buffer object shrinker name is visible in the
<debugfs>/shrinker directory and most if not all other shinkers
follow a naming convention that looks like
<subsystem>-<driver>_<objects>:<unique>
Follow the same convention for xe, changing the name to
drm-xe_gem:<unique>.

Other shrinkers typically use the device node for <unique> but
since drm drivers typically don't have a single unique device-
node, instead use the unique name in the drm device.

Fixes: 00c8efc318 ("drm/xe: Add a shrinker for xe bos")
Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Francois Dugast <francois.dugast@intel.com>
Link: https://lore.kernel.org/r/20250508112931.3347-1-thomas.hellstrom@linux.intel.com
(cherry picked from commit 243bf99e2f)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-13 13:17:05 -07:00
Linus Torvalds
405e6c37c8 Merge tag 'probes-fixes-v6.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull probes fixes from Masami Hiramatsu:

 - fprobe: Fix RCU warning message in list traversal

   fprobe_module_callback() using hlist_for_each_entry_rcu() traverse
   the fprobe list but it locks fprobe_mutex() instead of rcu lock
   because it is enough. So add lockdep_is_held() to avoid warning.

 - tracing: eprobe: Add missing trace_probe_log_clear for eprobe

   __trace_eprobe_create() uses trace_probe_log but forgot to clear it
   at exit. Add trace_probe_log_clear() calls.

 - tracing: probes: Fix possible race in trace_probe_log APIs

   trace_probe_log APIs are used in probe event (dynamic_events,
   kprobe_events and uprobe_events) creation. Only dynamic_events uses
   the dyn_event_ops_mutex mutex to serialize it. This makes kprobe and
   uprobe events to lock the same mutex to serialize its creation to
   avoid race in trace_probe_log APIs.

* tag 'probes-fixes-v6.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing: probes: Fix a possible race in trace_probe_log APIs
  tracing: add missing trace_probe_log_clear for eprobes
  tracing: fprobe: Fix RCU warning message in list traversal
2025-05-13 12:20:07 -07:00
Wayne Lin
d33724ffb7 drm/amd/display: Avoid flooding unnecessary info messages
It's expected that we'll encounter temporary exceptions
during aux transactions. Adjust logging from drm_info to
drm_dbg_dp to prevent flooding with unnecessary log messages.

Fixes: 3637e457eb ("drm/amd/display: Fix wrong handling for AUX_DEFER case")
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20250513032026.838036-1-Wayne.Lin@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 9a9c3e1fe5)
Cc: stable@vger.kernel.org
2025-05-13 14:36:27 -04:00
Melissa Wen
a3b7e65b6b drm/amd/display: Fix null check of pipe_ctx->plane_state for update_dchubp_dpp
Similar to commit 6a057072dd ("drm/amd/display: Fix null check for
pipe_ctx->plane_state in dcn20_program_pipe") that addresses a null
pointer dereference on dcn20_update_dchubp_dpp. This is the same
function hooked for update_dchubp_dpp in dcn401, with the same issue.
Fix possible null pointer deference on dcn401_program_pipe too.

Fixes: 63ab80d9ac ("drm/amd/display: DML2.1 Post-Si Cleanup")
Signed-off-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit d8d47f7397)
2025-05-13 14:35:10 -04:00
Aurabindo Pillai
2ddac70fed drm/amd/display: check stream id dml21 wrapper to get plane_id
[Why & How]
Fix a false positive warning which occurs due to lack of correct checks
when querying plane_id in DML21. This fixes the warning when performing a
mode1 reset (cat /sys/kernel/debug/dri/1/amdgpu_gpu_recover):

[   35.751250] WARNING: CPU: 11 PID: 326 at /tmp/amd.PHpyAl7v/amd/amdgpu/../display/dc/dml2/dml2_dc_resource_mgmt.c:91 dml2_map_dc_pipes+0x243d/0x3f40 [amdgpu]
[   35.751434] Modules linked in: amdgpu(OE) amddrm_ttm_helper(OE) amdttm(OE) amddrm_buddy(OE) amdxcp(OE) amddrm_exec(OE) amd_sched(OE) amdkcl(OE) drm_suballoc_helper drm_ttm_helper ttm drm_display_helper cec rc_core i2c_algo_bit rfcomm qrtr cmac algif_hash algif_skcipher af_alg bnep amd_atl intel_rapl_msr intel_rapl_common snd_hda_codec_hdmi snd_hda_intel edac_mce_amd snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec kvm_amd snd_hda_core snd_hwdep snd_pcm kvm snd_seq_midi snd_seq_midi_event snd_rawmidi crct10dif_pclmul polyval_clmulni polyval_generic btusb ghash_clmulni_intel sha256_ssse3 btrtl sha1_ssse3 snd_seq btintel aesni_intel btbcm btmtk snd_seq_device crypto_simd sunrpc cryptd bluetooth snd_timer ccp binfmt_misc rapl snd i2c_piix4 wmi_bmof gigabyte_wmi k10temp i2c_smbus soundcore gpio_amdpt mac_hid sch_fq_codel msr parport_pc ppdev lp parport efi_pstore nfnetlink dmi_sysfs ip_tables x_tables autofs4 hid_generic usbhid hid crc32_pclmul igc ahci xhci_pci libahci xhci_pci_renesas video wmi
[   35.751501] CPU: 11 UID: 0 PID: 326 Comm: kworker/u64:9 Tainted: G           OE      6.11.0-21-generic #21~24.04.1-Ubuntu
[   35.751504] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
[   35.751505] Hardware name: Gigabyte Technology Co., Ltd. X670E AORUS PRO X/X670E AORUS PRO X, BIOS F30 05/22/2024
[   35.751506] Workqueue: amdgpu-reset-dev amdgpu_debugfs_reset_work [amdgpu]
[   35.751638] RIP: 0010:dml2_map_dc_pipes+0x243d/0x3f40 [amdgpu]
[   35.751794] Code: 6d 0c 00 00 8b 84 24 88 00 00 00 41 3b 44 9c 20 0f 84 fc 07 00 00 48 83 c3 01 48 83 fb 06 75 b3 4c 8b 64 24 68 4c 8b 6c 24 40 <0f> 0b b8 06 00 00 00 49 8b 94 24 a0 49 00 00 89 c3 83 f8 07 0f 87
[   35.751796] RSP: 0018:ffffbfa3805d7680 EFLAGS: 00010246
[   35.751798] RAX: 0000000000010000 RBX: 0000000000000006 RCX: 0000000000000000
[   35.751799] RDX: 0000000000000000 RSI: 0000000000000005 RDI: 0000000000000000
[   35.751800] RBP: ffffbfa3805d78f0 R08: 0000000000000000 R09: 0000000000000000
[   35.751801] R10: 0000000000000000 R11: 0000000000000000 R12: ffffbfa383249000
[   35.751802] R13: ffffa0e68f280000 R14: ffffbfa383249658 R15: 0000000000000000
[   35.751803] FS:  0000000000000000(0000) GS:ffffa0edbe580000(0000) knlGS:0000000000000000
[   35.751804] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   35.751805] CR2: 00005d847ef96c58 CR3: 000000041de3e000 CR4: 0000000000f50ef0
[   35.751806] PKRU: 55555554
[   35.751807] Call Trace:
[   35.751810]  <TASK>
[   35.751816]  ? show_regs+0x6c/0x80
[   35.751820]  ? __warn+0x88/0x140
[   35.751822]  ? dml2_map_dc_pipes+0x243d/0x3f40 [amdgpu]
[   35.751964]  ? report_bug+0x182/0x1b0
[   35.751969]  ? handle_bug+0x6e/0xb0
[   35.751972]  ? exc_invalid_op+0x18/0x80
[   35.751974]  ? asm_exc_invalid_op+0x1b/0x20
[   35.751978]  ? dml2_map_dc_pipes+0x243d/0x3f40 [amdgpu]
[   35.752117]  ? math_pow+0x48/0xa0 [amdgpu]
[   35.752256]  ? srso_alias_return_thunk+0x5/0xfbef5
[   35.752260]  ? math_pow+0x48/0xa0 [amdgpu]
[   35.752400]  ? srso_alias_return_thunk+0x5/0xfbef5
[   35.752403]  ? math_pow+0x11/0xa0 [amdgpu]
[   35.752524]  ? srso_alias_return_thunk+0x5/0xfbef5
[   35.752526]  ? core_dcn4_mode_programming+0xe4d/0x20d0 [amdgpu]
[   35.752663]  ? srso_alias_return_thunk+0x5/0xfbef5
[   35.752669]  dml21_validate+0x3d4/0x980 [amdgpu]

Reviewed-by: Austin Zheng <austin.zheng@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit f8ad62c0a9)
2025-05-13 14:24:12 -04:00
George Shen
3c1a467372 drm/amd/display: fix link_set_dpms_off multi-display MST corner case
[Why & How]
When MST config is unplugged/replugged too quickly, it can potentially
result in a scenario where previous DC state has not been reset before
the HPD link detection sequence begins. In this case, driver will
disable the streams/link prior to re-enabling the link for link
training.

There is a bug in the current logic that does not account for the fact
that current_state can be released and cleared prior to swapping to a
new state (resulting in the pipe_ctx stream pointers to be cleared) in
between disabling streams.

To resolve this, cache the original streams prior to committing any
stream updates.

Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: George Shen <george.shen@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 1561782686)
2025-05-13 14:22:03 -04:00
John Olender
874697e127 drm/amd/display: Defer BW-optimization-blocked DRR adjustments
[Why & How]
Instead of dropping DRR updates, defer them. This fixes issues where
monitor continues to see incorrect refresh rate after VRR was turned off
by userspace.

Fixes: 32953485c5 ("drm/amd/display: Do not update DRR while BW optimizations pending")
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3546
Reviewed-by: Sun peng Li <sunpeng.li@amd.com>
Signed-off-by: John Olender <john.olender@gmail.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 53761b7ecd)
Cc: stable@vger.kernel.org
2025-05-13 14:21:24 -04:00
Gabe Teeger
190818d1b6 Revert: "drm/amd/display: Enable urgent latency adjustment on DCN35"
This reverts commit 756c85e4d0 ("drm/amd/display: Enable urgent latency adjustment on DCN35")

Reason for revert: Negative power impact.

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Gabe Teeger <Gabe.Teeger@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 9334c491cd)
2025-05-13 14:20:55 -04:00
Wayne Lin
d433981385 drm/amd/display: Correct the reply value when AUX write incomplete
[Why]
Now forcing aux->transfer to return 0 when incomplete AUX write is
inappropriate. It should return bytes have been transferred.

[How]
aux->transfer is asked not to change original msg except reply field of
drm_dp_aux_msg structure. Copy the msg->buffer when it's write request,
and overwrite the first byte when sink reply 1 byte indicating partially
written byte number. Then we can return the correct value without
changing the original msg.

Fixes: 3637e457eb ("drm/amd/display: Fix wrong handling for AUX_DEFER case")
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Ray Wu <ray.wu@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 7ac37f0dcd)
Cc: stable@vger.kernel.org
2025-05-13 14:18:09 -04:00
Tim Huang
2d73b0845a drm/amdgpu: fix incorrect MALL size for GFX1151
On GFX1151, the reported MALL cache size reflects only
half of its actual size; this adjustment corrects the discrepancy.

Signed-off-by: Tim Huang <tim.huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 0a5c060b59)
Cc: stable@vger.kernel.org
2025-05-13 14:16:43 -04:00
Philip Yang
a0fa7873f2 drm/amdgpu: csa unmap use uninterruptible lock
After process exit to unmap csa and free GPU vm, if signal is accepted
and then waiting to take vm lock is interrupted and return, it causes
memory leaking and below warning backtrace.

Change to use uninterruptible wait lock fix the issue.

WARNING: CPU: 69 PID: 167800 at amd/amdgpu/amdgpu_kms.c:1525
 amdgpu_driver_postclose_kms+0x294/0x2a0 [amdgpu]
 Call Trace:
  <TASK>
  drm_file_free.part.0+0x1da/0x230 [drm]
  drm_close_helper.isra.0+0x65/0x70 [drm]
  drm_release+0x6a/0x120 [drm]
  amdgpu_drm_release+0x51/0x60 [amdgpu]
  __fput+0x9f/0x280
  ____fput+0xe/0x20
  task_work_run+0x67/0xa0
  do_exit+0x217/0x3c0
  do_group_exit+0x3b/0xb0
  get_signal+0x14a/0x8d0
  arch_do_signal_or_restart+0xde/0x100
  exit_to_user_mode_loop+0xc1/0x1a0
  exit_to_user_mode_prepare+0xf4/0x100
  syscall_exit_to_user_mode+0x17/0x40
  do_syscall_64+0x69/0xc0

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 7dbbfb3c17)
Cc: stable@vger.kernel.org
2025-05-13 14:16:30 -04:00
Ashish Kalra
82b7f88f23 x86/sev: Make sure pages are not skipped during kdump
When shared pages are being converted to private during kdump, additional
checks are performed. They include handling the case of a GHCB page being
contained within a huge page.

Currently, this check incorrectly skips a page just below the GHCB page from
being transitioned back to private during kdump preparation.

This skipped page causes a 0x404 #VC exception when it is accessed later while
dumping guest memory for vmcore generation.

Correct the range to be checked for GHCB contained in a huge page.  Also,
ensure that the skipped huge page containing the GHCB page is transitioned
back to private by applying the correct address mask later when changing GHCBs
to private at end of kdump preparation.

  [ bp: Massage commit message. ]

Fixes: 3074152e56 ("x86/sev: Convert shared memory back to private on kexec")
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Tested-by: Srikanth Aithal <sraithal@amd.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/20250506183529.289549-1-Ashish.Kalra@amd.com
2025-05-13 19:47:48 +02:00
Ashish Kalra
d2062cc1b1 x86/sev: Do not touch VMSA pages during SNP guest memory kdump
When kdump is running makedumpfile to generate vmcore and dump SNP guest
memory it touches the VMSA page of the vCPU executing kdump.

It then results in unrecoverable #NPF/RMP faults as the VMSA page is
marked busy/in-use when the vCPU is running and subsequently a causes
guest softlockup/hang.

Additionally, other APs may be halted in guest mode and their VMSA pages
are marked busy and touching these VMSA pages during guest memory dump
will also cause #NPF.

Issue AP_DESTROY GHCB calls on other APs to ensure they are kicked out
of guest mode and then clear the VMSA bit on their VMSA pages.

If the vCPU running kdump is an AP, mark it's VMSA page as offline to
ensure that makedumpfile excludes that page while dumping guest memory.

Fixes: 3074152e56 ("x86/sev: Convert shared memory back to private on kexec")
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Pankaj Gupta <pankaj.gupta@amd.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Tested-by: Srikanth Aithal <sraithal@amd.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/20250428214151.155464-1-Ashish.Kalra@amd.com
2025-05-13 19:40:44 +02:00
Andre Przywara
98e6da673c clk: sunxi-ng: d1: Add missing divider for MMC mod clocks
The D1/R528/T113 SoCs have a hidden divider of 2 in the MMC mod clocks,
just as other recent SoCs. So far we did not describe that, which led
to the resulting MMC clock rate to be only half of its intended value.

Use a macro that allows to describe a fixed post-divider, to compensate
for that divisor.

This brings the MMC performance on those SoCs to its expected level,
so about 23 MB/s for SD cards, instead of the 11 MB/s measured so far.

Fixes: 35b97bb941 ("clk: sunxi-ng: Add support for the D1 SoC clocks")
Reported-by: Kuba Szczodrzyński <kuba@szczodrzynski.pl>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Link: https://patch.msgid.link/20250501120631.837186-1-andre.przywara@arm.com
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
2025-05-14 00:40:13 +08:00
Matti Lehtimäki
4ca45af0a5 remoteproc: qcom_wcnss: Fix on platforms without fallback regulators
Recent change to handle platforms with only single power domain broke
pronto-v3 which requires power domains and doesn't have fallback voltage
regulators in case power domains are missing. Add a check to verify
the number of fallback voltage regulators before using the code which
handles single power domain situation.

Fixes: 65991ea8a6 ("remoteproc: qcom_wcnss: Handle platforms with only single power domain")
Signed-off-by: Matti Lehtimäki <matti.lehtimaki@gmail.com>
Tested-by: Luca Weiss <luca.weiss@fairphone.com> # sdm632-fairphone-fp3
Link: https://lore.kernel.org/r/20250511234026.94735-1-matti.lehtimaki@gmail.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2025-05-13 10:34:33 -05:00
Rong Zhang
578e1b96fa HID: bpf: abort dispatch if device destroyed
The current HID bpf implementation assumes no output report/request will
go through it after hid_bpf_destroy_device() has been called. This leads
to a bug that unplugging certain types of HID devices causes a cleaned-
up SRCU to be accessed. The bug was previously a hidden failure until a
recent x86 percpu change [1] made it access not-present pages.

The bug will be triggered if the conditions below are met:

A) a device under the driver has some LEDs on
B) hid_ll_driver->request() is uninplemented (e.g., logitech-djreceiver)

If condition A is met, hidinput_led_worker() is always scheduled *after*
hid_bpf_destroy_device().

hid_destroy_device
` hid_bpf_destroy_device
  ` cleanup_srcu_struct(&hdev->bpf.srcu)
` hid_remove_device
  ` ...
    ` led_classdev_unregister
      ` led_trigger_set(led_cdev, NULL)
        ` led_set_brightness(led_cdev, LED_OFF)
          ` ...
            ` input_inject_event
              ` input_event_dispose
                ` hidinput_input_event
                  ` schedule_work(&hid->led_work) [hidinput_led_worker]

This is fine when condition B is not met, where hidinput_led_worker()
calls hid_ll_driver->request(). This is the case for most HID drivers,
which implement it or use the generic one from usbhid. The driver itself
or an underlying driver will then abort processing the request.

Otherwise, hidinput_led_worker() tries hid_hw_output_report() and leads
to the bug.

hidinput_led_worker
` hid_hw_output_report
  ` dispatch_hid_bpf_output_report
    ` srcu_read_lock(&hdev->bpf.srcu)
    ` srcu_read_unlock(&hdev->bpf.srcu, idx)

The bug has existed since the introduction [2] of
dispatch_hid_bpf_output_report(). However, the same bug also exists in
dispatch_hid_bpf_raw_requests(), and I've reproduced (no visible effect
because of the lack of [1], but confirmed bpf.destroyed == 1) the bug
against the commit (i.e., the Fixes:) introducing the function. This is
because hidinput_led_worker() falls back to hid_hw_raw_request() when
hid_ll_driver->output_report() is uninplemented (e.g., logitech-
djreceiver).

hidinput_led_worker
` hid_hw_output_report: -ENOSYS
` hid_hw_raw_request
  ` dispatch_hid_bpf_raw_requests
    ` srcu_read_lock(&hdev->bpf.srcu)
    ` srcu_read_unlock(&hdev->bpf.srcu, idx)

Fix the issue by returning early in the two mentioned functions if
hid_bpf has been marked as destroyed. Though
dispatch_hid_bpf_device_event() handles input events, and there is no
evidence that it may be called after the destruction, the same check, as
a safety net, is also added to it to maintain the consistency among all
dispatch functions.

The impact of the bug on other architectures is unclear. Even if it acts
as a hidden failure, this is still dangerous because it corrupts
whatever is on the address calculated by SRCU. Thus, CC'ing the stable
list.

[1]: commit 9d7de2aa8b ("x86/percpu/64: Use relative percpu offsets")
[2]: commit 9286675a2a ("HID: bpf: add HID-BPF hooks for
hid_hw_output_report")

Closes: https://lore.kernel.org/all/20250506145548.GGaBoi9Jzp3aeJizTR@fat_crate.local/
Fixes: 8bd0488b5e ("HID: bpf: add HID-BPF hooks for hid_hw_raw_requests")
Cc: stable@vger.kernel.org
Signed-off-by: Rong Zhang <i@rong.moe>
Tested-by: Petr Tesarik <petr@tesarici.cz>
Link: https://patch.msgid.link/20250512152420.87441-1-i@rong.moe
Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
2025-05-13 15:53:50 +02:00
Jonas Gorski
4227ea91e2 net: dsa: b53: prevent standalone from trying to forward to other ports
When bridged ports and standalone ports share a VLAN, e.g. via VLAN
uppers, or untagged traffic with a vlan unaware bridge, the ASIC will
still try to forward traffic to known FDB entries on standalone ports.
But since the port VLAN masks prevent forwarding to bridged ports, this
traffic will be dropped.

This e.g. can be observed in the bridge_vlan_unaware ping tests, where
this breaks pinging with learning on.

Work around this by enabling the simplified EAP mode on switches
supporting it for standalone ports, which causes the ASIC to redirect
traffic of unknown source MAC addresses to the CPU port.

Since standalone ports do not learn, there are no known source MAC
addresses, so effectively this redirects all incoming traffic to the CPU
port.

Fixes: ff39c2d686 ("net: dsa: b53: Add bridge support")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://patch.msgid.link/20250508091424.26870-1-jonas.gorski@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-13 15:39:02 +02:00
Masami Hiramatsu (Google)
fd837de3c9 tracing: probes: Fix a possible race in trace_probe_log APIs
Since the shared trace_probe_log variable can be accessed and
modified via probe event create operation of kprobe_events,
uprobe_events, and dynamic_events, it should be protected.
In the dynamic_events, all operations are serialized by
`dyn_event_ops_mutex`. But kprobe_events and uprobe_events
interfaces are not serialized.

To solve this issue, introduces dyn_event_create(), which runs
create() operation under the mutex, for kprobe_events and
uprobe_events. This also uses lockdep to check the mutex is
held when using trace_probe_log* APIs.

Link: https://lore.kernel.org/all/174684868120.551552.3068655787654268804.stgit@devnote2/

Reported-by: Paul Cacheux <paulcacheux@gmail.com>
Closes: https://lore.kernel.org/all/20250510074456.805a16872b591e2971a4d221@kernel.org/
Fixes: ab105a4fb8 ("tracing: Use tracing error_log with probe events")
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2025-05-13 22:23:34 +09:00
Markus Burri
7118be7c60 gpio: virtuser: fix potential out-of-bound write
If the caller wrote more characters, count is truncated to the max
available space in "simple_write_to_buffer". Check that the input
size does not exceed the buffer size. Write a zero termination
afterwards.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202505091754.285hHbr2-lkp@intel.com/
Signed-off-by: Markus Burri <markus.burri@mt.com>
Link: https://lore.kernel.org/r/20250509150459.115489-1-markus.burri@mt.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2025-05-13 12:59:30 +02:00
Emanuele Ghidoli
3e38f94606 gpio: pca953x: fix IRQ storm on system wake up
If an input changes state during wake-up and is used as an interrupt
source, the IRQ handler reads the volatile input register to clear the
interrupt mask and deassert the IRQ line. However, the IRQ handler is
triggered before access to the register is granted, causing the read
operation to fail.

As a result, the IRQ handler enters a loop, repeatedly printing the
"failed reading register" message, until `pca953x_resume()` is eventually
called, which restores the driver context and enables access to
registers.

Fix by disabling the IRQ line before entering suspend mode, and
re-enabling it after the driver context is restored in `pca953x_resume()`.

An IRQ can be disabled with disable_irq() and still wake the system as
long as the IRQ has wake enabled, so the wake-up functionality is
preserved.

Fixes: b765743005 ("gpio: pca953x: Restore registers after suspend/resume cycle")
Cc: stable@vger.kernel.org
Signed-off-by: Emanuele Ghidoli <emanuele.ghidoli@toradex.com>
Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20250512095441.31645-1-francesco@dolcini.it
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2025-05-13 12:53:00 +02:00
Geert Uytterhoeven
66e48ef6ef ALSA: sh: SND_AICA should depend on SH_DMA_API
If CONFIG_SH_DMA_API=n:

    WARNING: unmet direct dependencies detected for G2_DMA
      Depends on [n]: SH_DREAMCAST [=y] && SH_DMA_API [=n]
      Selected by [y]:
      - SND_AICA [=y] && SOUND [=y] && SND [=y] && SND_SUPERH [=y] && SH_DREAMCAST [=y]

SND_AICA selects G2_DMA.  As the latter depends on SH_DMA_API, the
former should depend on SH_DMA_API, too.

Fixes: f477a538c1 ("sh: dma: fix kconfig dependency for G2_DMA")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202505131320.PzgTtl9H-lkp@intel.com/
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/b90625f8a9078d0d304bafe862cbe3a3fab40082.1747121335.git.geert+renesas@glider.be
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-05-13 10:59:41 +02:00
Christian Heusel
2b24eb060c ALSA: usb-audio: Add sample rate quirk for Audioengine D1
A user reported on the Arch Linux Forums that their device is emitting
the following message in the kernel journal, which is fixed by adding
the quirk as submitted in this patch:

    > kernel: usb 1-2: current rate 8436480 is different from the runtime rate 48000

There also is an entry for this product line added long time ago.
Their specific device has the following ID:

    $ lsusb | grep Audio
    Bus 001 Device 002: ID 1101:0003 EasyPass Industrial Co., Ltd Audioengine D1

Link: https://bbs.archlinux.org/viewtopic.php?id=305494
Fixes: 93f9d1a4ac ("ALSA: usb-audio: Apply sample rate quirk for Audioengine D1")
Cc: stable@vger.kernel.org
Signed-off-by: Christian Heusel <christian@heusel.eu>
Link: https://patch.msgid.link/20250512-audioengine-quirk-addition-v1-1-4c370af6eff7@heusel.eu
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-05-13 10:58:44 +02:00
Paolo Abeni
6eeceb3f20 Merge branch 'address-eee-regressions-on-ksz-switches-since-v6-9-v6-14'
Oleksij Rempel says:

====================
address EEE regressions on KSZ switches since v6.9 (v6.14+)

This patch series addresses a regression in Energy Efficient Ethernet
(EEE) handling for KSZ switches with integrated PHYs, introduced in
kernel v6.9 by commit fe0d4fd928 ("net: phy: Keep track of EEE
configuration").

The first patch updates the DSA driver to allow phylink to properly
manage PHY EEE configuration. Since integrated PHYs handle LPI
internally and ports without integrated PHYs do not document MAC-level
LPI support, dummy MAC LPI callbacks are provided.

The second patch removes outdated EEE workarounds from the micrel PHY
driver, as they are no longer needed with correct phylink handling.

This series addresses the regression for mainline and kernels starting
from v6.14. It is not easily possible to fully fix older kernels due
to missing infrastructure changes.

Tested on KSZ9893 hardware.
====================

Link: https://patch.msgid.link/20250504081434.424489-1-o.rempel@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-13 10:23:27 +02:00
Oleksij Rempel
8c619eb21b net: phy: micrel: remove KSZ9477 EEE quirks now handled by phylink
The KSZ9477 PHY driver contained workarounds for broken EEE capability
advertisements by manually masking supported EEE modes and forcibly
disabling EEE if MICREL_NO_EEE was set.

With proper MAC-side EEE handling implemented via phylink, these quirks
are no longer necessary. Remove MICREL_NO_EEE handling and the use of
ksz9477_get_features().

This simplifies the PHY driver and avoids duplicated EEE management logic.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Cc: stable@vger.kernel.org # v6.14+
Link: https://patch.msgid.link/20250504081434.424489-3-o.rempel@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-13 10:23:24 +02:00
Oleksij Rempel
76ca05e0ab net: dsa: microchip: let phylink manage PHY EEE configuration on KSZ switches
Phylink expects MAC drivers to provide LPI callbacks to properly manage
Energy Efficient Ethernet (EEE) configuration. On KSZ switches with
integrated PHYs, LPI is internally handled by hardware, while ports
without integrated PHYs have no documented MAC-level LPI support.

Provide dummy mac_disable_tx_lpi() and mac_enable_tx_lpi() callbacks to
satisfy phylink requirements. Also, set default EEE capabilities during
phylink initialization where applicable.

Since phylink can now gracefully handle optional EEE configuration,
remove the need for the MICREL_NO_EEE PHY flag.

This change addresses issues caused by incomplete EEE refactoring
introduced in commit fe0d4fd928 ("net: phy: Keep track of EEE
configuration"). It is not easily possible to fix all older kernels, but
this patch ensures proper behavior on latest kernels and can be
considered for backporting to stable kernels starting from v6.14.

Fixes: fe0d4fd928 ("net: phy: Keep track of EEE configuration")
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Cc: stable@vger.kernel.org # v6.14+
Link: https://patch.msgid.link/20250504081434.424489-2-o.rempel@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-13 10:23:24 +02:00
Damien Le Moal
8113d610a7 nvmet: pci-epf: remove NVMET_PCI_EPF_Q_IS_SQ
The flag NVMET_PCI_EPF_Q_IS_SQ is set but never used. Remove it.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-13 07:57:12 +02:00
Damien Le Moal
4f6f3f4fe3 nvmet: pci-epf: improve debug message
Improve the debug message of nvmet_pci_epf_create_cq() to indicate if a
completion queue IRQ is disabled.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-13 07:57:11 +02:00
Damien Le Moal
4236e600bf nvmet: pci-epf: cleanup nvmet_pci_epf_raise_irq()
There is no point in taking the controller irq_lock and calling
nvmet_pci_epf_should_raise_irq() for a completion queue which does not
have IRQ enabled (NVMET_PCI_EPF_Q_IRQ_ENABLED flag is not set).
Move the test for the NVMET_PCI_EPF_Q_IRQ_ENABLED flag out of
nvmet_pci_epf_should_raise_irq() to the top of nvmet_pci_epf_raise_irq()
to return early when no IRQ should be raised.

Also, use dev_err_ratelimited() to avoid a message storm under load when
raising IRQs is failing.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-13 07:57:11 +02:00
Damien Le Moal
2c3a6f6a28 nvmet: pci-epf: do not fall back to using INTX if not supported
Some endpoint PCIe controllers do not support raising legacy INTX
interrupts. This support is indicated by the intx_capable field of
struct pci_epc_features. Modify nvmet_pci_epf_raise_irq() to not
automatically fallback to trying raising an INTX interrupt after an MSI
or MSI-X error if the controller does not support INTX.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-13 07:57:11 +02:00
Damien Le Moal
85adf2094a nvmet: pci-epf: clear completion queue IRQ flag on delete
The function nvmet_pci_epf_delete_cq() unconditionally calls
nvmet_pci_epf_remove_irq_vector() even for completion queues that do not
have interrupts enabled. Furthermore, for completion queues that do
have IRQ enabled, deleting and re-creating the completion queue leaves
the flag NVMET_PCI_EPF_Q_IRQ_ENABLED set, even if the completion queue
is being re-created with IRQ disabled.

Fix these issues by calling nvmet_pci_epf_remove_irq_vector() only if
NVMET_PCI_EPF_Q_IRQ_ENABLED is set and make sure to always clear that
flag.

Fixes: 0faa0fe6f9 ("nvmet: New NVMe PCI endpoint function target driver")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-13 07:57:11 +02:00
Keith Busch
3d8932133d nvme-pci: acquire cq_poll_lock in nvme_poll_irqdisable
We need to lock this queue for that condition because the timeout work
executes per-namespace and can poll the poll CQ.

Reported-by: Hannes Reinecke <hare@kernel.org>
Closes: https://lore.kernel.org/all/20240902130728.1999-1-hare@kernel.org/
Fixes: a0fa9647a5 ("NVMe: add blk polling support")
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Daniel Wagner <wagi@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-13 07:57:11 +02:00
Kees Cook
40696426b8 nvme-pci: make nvme_pci_npages_prp() __always_inline
The only reason nvme_pci_npages_prp() could be used as a compile-time
known result in BUILD_BUG_ON() is because the compiler was always choosing
to inline the function. Under special circumstances (sanitizer coverage
functions disabled for __init functions on ARCH=um), the compiler decided
to stop inlining it:

   drivers/nvme/host/pci.c: In function 'nvme_init':
   include/linux/compiler_types.h:557:45: error: call to '__compiletime_assert_678' declared with attribute error: BUILD_BUG_ON failed: nvme_pci_npages_prp() > NVME_MAX_NR_ALLOCATIONS
     557 |         _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
         |                                             ^
   include/linux/compiler_types.h:538:25: note: in definition of macro '__compiletime_assert'
     538 |                         prefix ## suffix();                             \
         |                         ^~~~~~
   include/linux/compiler_types.h:557:9: note: in expansion of macro '_compiletime_assert'
     557 |         _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
         |         ^~~~~~~~~~~~~~~~~~~
   include/linux/build_bug.h:39:37: note: in expansion of macro 'compiletime_assert'
      39 | #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
         |                                     ^~~~~~~~~~~~~~~~~~
   include/linux/build_bug.h:50:9: note: in expansion of macro 'BUILD_BUG_ON_MSG'
      50 |         BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition)
         |         ^~~~~~~~~~~~~~~~
   drivers/nvme/host/pci.c:3804:9: note: in expansion of macro 'BUILD_BUG_ON'
    3804 |         BUILD_BUG_ON(nvme_pci_npages_prp() > NVME_MAX_NR_ALLOCATIONS);
         |         ^~~~~~~~~~~~

Force it to be __always_inline to make sure it is always available for
use with BUILD_BUG_ON().

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202505061846.12FMyRjj-lkp@intel.com/
Fixes: c372cdd1ef ("nvme-pci: iod npages fits in s8")
Signed-off-by: Kees Cook <kees@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-13 07:57:11 +02:00
Steve Siwinski
e8007fad54 scsi: sd_zbc: block: Respect bio vector limits for REPORT ZONES buffer
The REPORT ZONES buffer size is currently limited by the HBA's maximum
segment count to ensure the buffer can be mapped. However, the block
layer further limits the number of iovec entries to 1024 when allocating
a bio.

To avoid allocation of buffers too large to be mapped, further restrict
the maximum buffer size to BIO_MAX_INLINE_VECS.

Replace the UIO_MAXIOV symbolic name with the more contextually
appropriate BIO_MAX_INLINE_VECS.

Fixes: b091ac6168 ("sd_zbc: Fix report zones buffer allocation")
Cc: stable@vger.kernel.org
Signed-off-by: Steve Siwinski <ssiwinski@atto.com>
Link: https://lore.kernel.org/r/20250508200122.243129-1-ssiwinski@atto.com
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-05-12 22:35:48 -04:00
Vladimir Oltean
498625a8ab net: dsa: sja1105: discard incoming frames in BR_STATE_LISTENING
It has been reported that when under a bridge with stp_state=1, the logs
get spammed with this message:

[  251.734607] fsl_dpaa2_eth dpni.5 eth0: Couldn't decode source port

Further debugging shows the following info associated with packets:
source_port=-1, switch_id=-1, vid=-1, vbid=1

In other words, they are data plane packets which are supposed to be
decoded by dsa_tag_8021q_find_port_by_vbid(), but the latter (correctly)
refuses to do so, because no switch port is currently in
BR_STATE_LEARNING or BR_STATE_FORWARDING - so the packet is effectively
unexpected.

The error goes away after the port progresses to BR_STATE_LEARNING in 15
seconds (the default forward_time of the bridge), because then,
dsa_tag_8021q_find_port_by_vbid() can correctly associate the data plane
packets with a plausible bridge port in a plausible STP state.

Re-reading IEEE 802.1D-1990, I see the following:

"4.4.2 Learning: (...) The Forwarding Process shall discard received
frames."

IEEE 802.1D-2004 further clarifies:

"DISABLED, BLOCKING, LISTENING, and BROKEN all correspond to the
DISCARDING port state. While those dot1dStpPortStates serve to
distinguish reasons for discarding frames, the operation of the
Forwarding and Learning processes is the same for all of them. (...)
LISTENING represents a port that the spanning tree algorithm has
selected to be part of the active topology (computing a Root Port or
Designated Port role) but is temporarily discarding frames to guard
against loops or incorrect learning."

Well, this is not what the driver does - instead it sets
mac[port].ingress = true.

To get rid of the log spam, prevent unexpected data plane packets to
be received by software by discarding them on ingress in the LISTENING
state.

In terms of blame attribution: the prints only date back to commit
d7f9787a76 ("net: dsa: tag_8021q: add support for imprecise RX based
on the VBID"). However, the settings would permit a LISTENING port to
forward to a FORWARDING port, and the standard suggests that's not OK.

Fixes: 640f763f98 ("net: dsa: sja1105: Add support for Spanning Tree Protocol")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20250509113816.2221992-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-12 18:44:50 -07:00
Cosmin Ratiu
af5f54b0ef net: Lock lower level devices when updating features
__netdev_update_features() expects the netdevice to be ops-locked, but
it gets called recursively on the lower level netdevices to sync their
features, and nothing locks those.

This commit fixes that, with the assumption that it shouldn't be possible
for both higher-level and lover-level netdevices to require the instance
lock, because that would lead to lock dependency warnings.

Without this, playing with higher level (e.g. vxlan) netdevices on top
of netdevices with instance locking enabled can run into issues:

WARNING: CPU: 59 PID: 206496 at ./include/net/netdev_lock.h:17 netif_napi_add_weight_locked+0x753/0xa60
[...]
Call Trace:
 <TASK>
 mlx5e_open_channel+0xc09/0x3740 [mlx5_core]
 mlx5e_open_channels+0x1f0/0x770 [mlx5_core]
 mlx5e_safe_switch_params+0x1b5/0x2e0 [mlx5_core]
 set_feature_lro+0x1c2/0x330 [mlx5_core]
 mlx5e_handle_feature+0xc8/0x140 [mlx5_core]
 mlx5e_set_features+0x233/0x2e0 [mlx5_core]
 __netdev_update_features+0x5be/0x1670
 __netdev_update_features+0x71f/0x1670
 dev_ethtool+0x21c5/0x4aa0
 dev_ioctl+0x438/0xae0
 sock_ioctl+0x2ba/0x690
 __x64_sys_ioctl+0xa78/0x1700
 do_syscall_64+0x6d/0x140
 entry_SYSCALL_64_after_hwframe+0x4b/0x53
 </TASK>

Fixes: 7e4d784f58 ("net: hold netdev instance lock during rtnetlink operations")
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250509072850.2002821-1-cratiu@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-12 18:44:33 -07:00
Mathieu Othacehe
c92d6089d8 net: cadence: macb: Fix a possible deadlock in macb_halt_tx.
There is a situation where after THALT is set high, TGO stays high as
well. Because jiffies are never updated, as we are in a context with
interrupts disabled, we never exit that loop and have a deadlock.

That deadlock was noticed on a sama5d4 device that stayed locked for days.

Use retries instead of jiffies so that the timeout really works and we do
not have a deadlock anymore.

Fixes: e86cd53afc ("net/macb: better manage tx errors")
Signed-off-by: Mathieu Othacehe <othacehe@gnu.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250509121935.16282-1-othacehe@gnu.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-12 18:39:38 -07:00
Linus Torvalds
e9565e23cd Merge tag 'sched_ext-for-6.15-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext
Pull sched_ext fixes from Tejun Heo:
 "A little bit invasive for rc6 but they're important fixes, pass tests
  fine and won't break anything outside sched_ext:

   - scx_bpf_cpuperf_set() calls internal functions that require the rq
     to be locked. It assumed that the BPF caller has rq locked but
     that's not always true. Fix it by tracking whether rq is currently
     held by the CPU and grabbing it if necessary

   - bpf_iter_scx_dsq_new() was leaving the DSQ iterator in an
     uninitialized state after an error. However, next() and destroy()
     can be called on an iterator which failed initialization and thus
     they always need to be initialized even after an init error. Fix by
     always initializing the iterator

   - Remove duplicate BTF_ID_FLAGS() entries"

* tag 'sched_ext-for-6.15-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext:
  sched_ext: bpf_iter_scx_dsq_new() should always initialize iterator
  sched_ext: Fix rq lock state in hotplug ops
  sched_ext: Remove duplicate BTF_ID_FLAGS definitions
  sched_ext: Fix missing rq lock in scx_bpf_cpuperf_set()
  sched_ext: Track currently locked rq
2025-05-12 18:02:05 -07:00
Linus Torvalds
d471045e75 Merge tag 'cgroup-for-6.15-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fix from Tejun Heo:
 "One low-risk patch to fix a cpuset bug where it over-eagerly tries to
  modify CPU affinity of kernel threads"

* tag 'cgroup-for-6.15-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup/cpuset: Extend kthread_is_per_cpu() check to all PF_NO_SETAFFINITY tasks
2025-05-12 17:58:57 -07:00
Kyoji Ogasawara
4ce2affc6e btrfs: add back warning for mount option commit values exceeding 300
The Btrfs documentation states that if the commit value is greater than
300 a warning should be issued. The warning was accidentally lost in the
new mount API update.

Fixes: 6941823cc8 ("btrfs: remove old mount API code")
CC: stable@vger.kernel.org # 6.12+
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Kyoji Ogasawara <sawara04.o@gmail.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2025-05-12 21:39:34 +02:00
Boris Burkov
a0fd1c6098 btrfs: fix folio leak in submit_one_async_extent()
If btrfs_reserve_extent() fails while submitting an async_extent for a
compressed write, then we fail to call free_async_extent_pages() on the
async_extent and leak its folios. A likely cause for such a failure
would be btrfs_reserve_extent() failing to find a large enough
contiguous free extent for the compressed extent.

I was able to reproduce this by:

1. mount with compress-force=zstd:3
2. fallocating most of a filesystem to a big file
3. fragmenting the remaining free space
4. trying to copy in a file which zstd would generate large compressed
   extents for (vmlinux worked well for this)

Step 4. hits the memory leak and can be repeated ad nauseam to
eventually exhaust the system memory.

Fix this by detecting the case where we fallback to uncompressed
submission for a compressed async_extent and ensuring that we call
free_async_extent_pages().

Fixes: 131a821a24 ("btrfs: fallback if compressed IO fails for ENOSPC")
CC: stable@vger.kernel.org # 6.1+
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Co-developed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Boris Burkov <boris@bur.io>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2025-05-12 21:39:13 +02:00
Filipe Manana
54db6d1bdd btrfs: fix discard worker infinite loop after disabling discard
If the discard worker is running and there's currently only one block
group, that block group is a data block group, it's in the unused block
groups discard list and is being used (it got an extent allocated from it
after becoming unused), the worker can end up in an infinite loop if a
transaction abort happens or the async discard is disabled (during remount
or unmount for example).

This happens like this:

1) Task A, the discard worker, is at peek_discard_list() and
   find_next_block_group() returns block group X;

2) Block group X is in the unused block groups discard list (its discard
   index is BTRFS_DISCARD_INDEX_UNUSED) since at some point in the past
   it become an unused block group and was added to that list, but then
   later it got an extent allocated from it, so its ->used counter is not
   zero anymore;

3) The current transaction is aborted by task B and we end up at
   __btrfs_handle_fs_error() in the transaction abort path, where we call
   btrfs_discard_stop(), which clears BTRFS_FS_DISCARD_RUNNING from
   fs_info, and then at __btrfs_handle_fs_error() we set the fs to RO mode
   (setting SB_RDONLY in the super block's s_flags field);

4) Task A calls __add_to_discard_list() with the goal of moving the block
   group from the unused block groups discard list into another discard
   list, but at __add_to_discard_list() we end up doing nothing because
   btrfs_run_discard_work() returns false, since the super block has
   SB_RDONLY set in its flags and BTRFS_FS_DISCARD_RUNNING is not set
   anymore in fs_info->flags. So block group X remains in the unused block
   groups discard list;

5) Task A then does a goto into the 'again' label, calls
   find_next_block_group() again we gets block group X again. Then it
   repeats the previous steps over and over since there are not other
   block groups in the discard lists and block group X is never moved
   out of the unused block groups discard list since
   btrfs_run_discard_work() keeps returning false and therefore
   __add_to_discard_list() doesn't move block group X out of that discard
   list.

When this happens we can get a soft lockup report like this:

  [71.957] watchdog: BUG: soft lockup - CPU#0 stuck for 27s! [kworker/u4:3:97]
  [71.957] Modules linked in: xfs af_packet rfkill (...)
  [71.957] CPU: 0 UID: 0 PID: 97 Comm: kworker/u4:3 Tainted: G        W          6.14.2-1-default #1 openSUSE Tumbleweed 968795ef2b1407352128b466fe887416c33af6fa
  [71.957] Tainted: [W]=WARN
  [71.957] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
  [71.957] Workqueue: btrfs_discard btrfs_discard_workfn [btrfs]
  [71.957] RIP: 0010:btrfs_discard_workfn+0xc4/0x400 [btrfs]
  [71.957] Code: c1 01 48 83 (...)
  [71.957] RSP: 0018:ffffafaec03efe08 EFLAGS: 00000246
  [71.957] RAX: ffff897045500000 RBX: ffff8970413ed8d0 RCX: 0000000000000000
  [71.957] RDX: 0000000000000001 RSI: ffff8970413ed8d0 RDI: 0000000a8f1272ad
  [71.957] RBP: 0000000a9d61c60e R08: ffff897045500140 R09: 8080808080808080
  [71.957] R10: ffff897040276800 R11: fefefefefefefeff R12: ffff8970413ed860
  [71.957] R13: ffff897045500000 R14: ffff8970413ed868 R15: 0000000000000000
  [71.957] FS:  0000000000000000(0000) GS:ffff89707bc00000(0000) knlGS:0000000000000000
  [71.957] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [71.957] CR2: 00005605bcc8d2f0 CR3: 000000010376a001 CR4: 0000000000770ef0
  [71.957] PKRU: 55555554
  [71.957] Call Trace:
  [71.957]  <TASK>
  [71.957]  process_one_work+0x17e/0x330
  [71.957]  worker_thread+0x2ce/0x3f0
  [71.957]  ? __pfx_worker_thread+0x10/0x10
  [71.957]  kthread+0xef/0x220
  [71.957]  ? __pfx_kthread+0x10/0x10
  [71.957]  ret_from_fork+0x34/0x50
  [71.957]  ? __pfx_kthread+0x10/0x10
  [71.957]  ret_from_fork_asm+0x1a/0x30
  [71.957]  </TASK>
  [71.957] Kernel panic - not syncing: softlockup: hung tasks
  [71.987] CPU: 0 UID: 0 PID: 97 Comm: kworker/u4:3 Tainted: G        W    L     6.14.2-1-default #1 openSUSE Tumbleweed 968795ef2b1407352128b466fe887416c33af6fa
  [71.989] Tainted: [W]=WARN, [L]=SOFTLOCKUP
  [71.989] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
  [71.991] Workqueue: btrfs_discard btrfs_discard_workfn [btrfs]
  [71.992] Call Trace:
  [71.993]  <IRQ>
  [71.994]  dump_stack_lvl+0x5a/0x80
  [71.994]  panic+0x10b/0x2da
  [71.995]  watchdog_timer_fn.cold+0x9a/0xa1
  [71.996]  ? __pfx_watchdog_timer_fn+0x10/0x10
  [71.997]  __hrtimer_run_queues+0x132/0x2a0
  [71.997]  hrtimer_interrupt+0xff/0x230
  [71.998]  __sysvec_apic_timer_interrupt+0x55/0x100
  [71.999]  sysvec_apic_timer_interrupt+0x6c/0x90
  [72.000]  </IRQ>
  [72.000]  <TASK>
  [72.001]  asm_sysvec_apic_timer_interrupt+0x1a/0x20
  [72.002] RIP: 0010:btrfs_discard_workfn+0xc4/0x400 [btrfs]
  [72.002] Code: c1 01 48 83 (...)
  [72.005] RSP: 0018:ffffafaec03efe08 EFLAGS: 00000246
  [72.006] RAX: ffff897045500000 RBX: ffff8970413ed8d0 RCX: 0000000000000000
  [72.006] RDX: 0000000000000001 RSI: ffff8970413ed8d0 RDI: 0000000a8f1272ad
  [72.007] RBP: 0000000a9d61c60e R08: ffff897045500140 R09: 8080808080808080
  [72.008] R10: ffff897040276800 R11: fefefefefefefeff R12: ffff8970413ed860
  [72.009] R13: ffff897045500000 R14: ffff8970413ed868 R15: 0000000000000000
  [72.010]  ? btrfs_discard_workfn+0x51/0x400 [btrfs 23b01089228eb964071fb7ca156eee8cd3bf996f]
  [72.011]  process_one_work+0x17e/0x330
  [72.012]  worker_thread+0x2ce/0x3f0
  [72.013]  ? __pfx_worker_thread+0x10/0x10
  [72.014]  kthread+0xef/0x220
  [72.014]  ? __pfx_kthread+0x10/0x10
  [72.015]  ret_from_fork+0x34/0x50
  [72.015]  ? __pfx_kthread+0x10/0x10
  [72.016]  ret_from_fork_asm+0x1a/0x30
  [72.017]  </TASK>
  [72.017] Kernel Offset: 0x15000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
  [72.019] Rebooting in 90 seconds..

So fix this by making sure we move a block group out of the unused block
groups discard list when calling __add_to_discard_list().

Fixes: 2bee7eb8bb ("btrfs: discard one region at a time in async discard")
Link: https://bugzilla.suse.com/show_bug.cgi?id=1242012
CC: stable@vger.kernel.org # 5.10+
Reviewed-by: Boris Burkov <boris@bur.io>
Reviewed-by: Daniel Vacek <neelx@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2025-05-12 21:38:56 +02:00
Linus Torvalds
7a8bdc7fe0 Merge tag 'platform-drivers-x86-v6.15-5' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform drivers fixes from Ilpo Järvinen:

 - amd/pmc: Use spurious 8042 quirk with MECHREVO Wujie 14XA

 - amd/pmf:
     - Ensure Smart PC policies are valid
     - Fix memory leak when the engine fails to start

 - amd/hsmp: Make amd_hsmp and hsmp_acpi as mutually exclusive drivers

 - asus-wmi: Fix wlan_ctrl_by_user detection

 - thinkpad_acpi: Add support for NEC Lavie X1475JAS

* tag 'platform-drivers-x86-v6.15-5' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
  platform/x86: asus-wmi: Fix wlan_ctrl_by_user detection
  platform/x86/amd/pmc: Declare quirk_spurious_8042 for MECHREVO Wujie 14XA (GX4HRXL)
  platform/x86: thinkpad_acpi: Support also NEC Lavie X1475JAS
  platform/x86/amd/hsmp: Make amd_hsmp and hsmp_acpi as mutually exclusive drivers
  drivers/platform/x86/amd: pmf: Check for invalid Smart PC Policies
  drivers/platform/x86/amd: pmf: Check for invalid sideloaded Smart PC Policies
2025-05-12 10:48:02 -07:00
Linus Torvalds
8b64199a7f Merge tag 'udf_for_v6.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull UDF fix from Jan Kara:
 "Fix a bug in UDF inode eviction leading to spewing pointless
  error messages"

* tag 'udf_for_v6.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  udf: Make sure i_lenExtents is uptodate on inode eviction
2025-05-12 10:23:20 -07:00
Steven Rostedt
1b0c192c92 tracing: samples: Initialize trace_array_printk() with the correct function
When using trace_array_printk() on a created instance, the correct
function to use to initialize it is:

  trace_array_init_printk()

Not

  trace_printk_init_buffer()

The former is a proper function to use, the latter is for initializing
trace_printk() and causes the NOTICE banner to be displayed.

Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Divya Indi <divya.indi@oracle.com>
Link: https://lore.kernel.org/20250509152657.0f6744d9@gandalf.local.home
Fixes: 89ed42495e ("tracing: Sample module to demonstrate kernel access to Ftrace instances.")
Fixes: 38ce2a9e33 ("tracing: Add trace_array_init_printk() to initialize instance trace_printk() buffers")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-05-12 13:07:22 -04:00
Linus Torvalds
e238e49b18 Merge tag 'vfs-6.15-rc7.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:

 - Ensure that simple_xattr_list() always includes security.* xattrs

 - Fix eventpoll busy loop optimization when combined with timeouts

 - Disable swapon() for devices with block sizes greater than page sizes

 - Don't call errseq_set() twice during mark_buffer_write_io_error().
   Just use mapping_set_error() which takes care to not deference
   unconditionally

* tag 'vfs-6.15-rc7.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  fs: Remove redundant errseq_set call in mark_buffer_write_io_error.
  swapfile: disable swapon for bs > ps devices
  fs/eventpoll: fix endless busy loop after timeout has expired
  fs/xattr.c: fix simple_xattr_list to always include security.* xattrs
2025-05-12 10:04:14 -07:00
Wolfram Sang
579cb52ecd MAINTAINERS: add me as maintainer for the gpio sloppy logic analyzer
This was forgotten when the analyzer went upstream.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Link: https://lore.kernel.org/r/20250424212234.5313-2-wsa+renesas@sang-engineering.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2025-05-12 17:50:18 +02:00
Jens Axboe
f446c6311e io_uring/memmap: don't use page_address() on a highmem page
For older/32-bit systems with highmem, don't assume that the pages in
a mapped region are always going to be mapped. If io_region_init_ptr()
finds that the pages are coalescable, also check if the first page is
a HighMem page or not. If it is, fall through to the usual vmap()
mapping rather than attempt to get the unmapped page address.

Cc: stable@vger.kernel.org
Fixes: c4d0ac1c15 ("io_uring/memmap: optimise single folio regions")
Link: https://lore.kernel.org/all/681fe2fb.050a0220.f2294.001a.GAE@google.com/
Reported-by: syzbot+5b8c4abafcb1d791ccfc@syzkaller.appspotmail.com
Link: https://lore.kernel.org/all/681fed0a.050a0220.f2294.001c.GAE@google.com/
Reported-by: syzbot+6456a99dfdc2e78c4feb@syzkaller.appspotmail.com
Tested-by: syzbot+6456a99dfdc2e78c4feb@syzkaller.appspotmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-12 09:27:41 -06:00
hexue
63166b815d io_uring/uring_cmd: fix hybrid polling initialization issue
Modify the check for whether the timer is initialized during IO transfer
when passthrough is used with hybrid polling, to ensure that it's always
setup correctly.

Cc: stable@vger.kernel.org
Fixes: 01ee194d1a ("io_uring: add support for hybrid IOPOLL")
Signed-off-by: hexue <xue01.he@samsung.com>
Link: https://lore.kernel.org/r/20250512052025.293031-1-xue01.he@samsung.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-12 07:17:02 -06:00
Keith Busch
8098514bd5 block: always allocate integrity buffer when required
Many nvme metadata formats can not strip or generate the metadata on the
controller side. For these formats, a host provided integrity buffer is
mandatory even if it isn't checked.

The block integrity read_verify and write_generate attributes prevent
allocating the metadata buffer, but we need it when the format requires
it, otherwise reads and writes will be rejected by the driver with IO
errors.

Assume the integrity buffer can be offloaded to the controller if the
metadata size is the same as the protection information size. Otherwise
provide an unchecked host buffer when the read verify or write
generation attributes are disabled. This fixes the following nvme
warning:

 ------------[ cut here ]------------
 WARNING: CPU: 1 PID: 371 at drivers/nvme/host/core.c:1036 nvme_setup_rw+0x122/0x210
 ...
 RIP: 0010:nvme_setup_rw+0x122/0x210
 ...
 Call Trace:
  <TASK>
  nvme_setup_cmd+0x1b4/0x280
  nvme_queue_rqs+0xc4/0x1f0 [nvme]
  blk_mq_dispatch_queue_requests+0x24a/0x430
  blk_mq_flush_plug_list+0x50/0x140
  __blk_flush_plug+0xc1/0x100
  __submit_bio+0x1c1/0x360
  ? submit_bio_noacct_nocheck+0x2d6/0x3c0
  submit_bio_noacct_nocheck+0x2d6/0x3c0
  ? submit_bio_noacct+0x47/0x4c0
  submit_bio_wait+0x48/0xa0
  __blkdev_direct_IO_simple+0xee/0x210
  ? current_time+0x1d/0x100
  ? current_time+0x1d/0x100
  ? __bio_clone+0xb0/0xb0
  blkdev_read_iter+0xbb/0x140
  vfs_read+0x239/0x310
  ksys_read+0x58/0xc0
  do_syscall_64+0x6c/0x180
  entry_SYSCALL_64_after_hwframe+0x4b/0x53

Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20250509153802.3482493-1-kbusch@meta.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-12 07:14:03 -06:00
Markus Burri
2f0c6fceae accel/ivpu: Use effective buffer size for zero terminator
Use the effective written size instead of original size as index for zero
termination. If the input from user-space is to larger and the input is
truncated, the original size is out-of-bound.
Since there is an upfront size check here, the change is for consistency.

Signed-off-by: Markus Burri <markus.burri@mt.com>
Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://lore.kernel.org/r/20250508130612.82270-3-markus.burri@mt.com
2025-05-12 15:12:44 +02:00
Alexey Makhalov
386cd3dcfd MAINTAINERS: Update Alexey Makhalov's email address
Fix a typo in an email address.

Closes: https://lore.kernel.org/all/20240925-rational-succinct-vulture-cca9fb@lemur/T/
Reported-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
Reported-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Alexey Makhalov <alexey.makhalov@broadcom.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/20250318004031.2703923-1-alexey.makhalov@broadcom.com
2025-05-12 13:41:06 +02:00
Mickaël Salaün
3039ed4327 landlock: Improve bit operations in audit code
Use the BIT() and BIT_ULL() macros in the new audit code instead of
explicit shifts to improve readability.  Use bitmask instead of modulo
operation to simplify code.

Add test_range1_rand15() and test_range2_rand15() KUnit tests to improve
get_id_range() coverage.

Signed-off-by: Günther Noack <gnoack3000@gmail.com>
Link: https://lore.kernel.org/r/20250512093732.1408485-1-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-05-12 11:38:53 +02:00
Fabio Estevam
9c1798259b drm/tiny: panel-mipi-dbi: Use drm_client_setup_with_fourcc()
Since commit 559358282e ("drm/fb-helper: Don't use the preferred depth
for the BPP default"), RGB565 displays such as the CFAF240320X no longer
render correctly: colors are distorted and the content is shown twice
horizontally.

This regression is due to the fbdev emulation layer defaulting to 32 bits
per pixel, whereas the display expects 16 bpp (RGB565). As a result, the
framebuffer data is incorrectly interpreted by the panel.

Fix the issue by calling drm_client_setup_with_fourcc() with a format
explicitly selected based on the display's bits-per-pixel value. For 16
bpp, use DRM_FORMAT_RGB565; for other values, fall back to the previous
behavior. This ensures that the allocated framebuffer format matches the
hardware expectations, avoiding color and layout corruption.

Tested on a CFAF240320X display with an RGB565 configuration, confirming
correct colors and layout after applying this patch.

Cc: stable@vger.kernel.org
Fixes: 559358282e ("drm/fb-helper: Don't use the preferred depth for the BPP default")
Signed-off-by: Fabio Estevam <festevam@denx.de>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://lore.kernel.org/r/20250417103458.2496790-1-festevam@gmail.com
2025-05-12 08:57:04 +02:00
Masahiro Yamada
e0cd396d89 kbuild: fix typos "module.builtin" to "modules.builtin"
The filenames in the comments do not match the actual generated files.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2025-05-12 15:04:52 +09:00
Thomas Weißschuh
8cf5b3f836 Revert "kbuild, rust: use -fremap-path-prefix to make paths relative"
This reverts commit dbdffaf50f.

--remap-path-prefix breaks the ability of debuggers to find the source
file corresponding to object files. As there is no simple or uniform
way to specify the source directory explicitly, this breaks developers
workflows.

Revert the unconditional usage of --remap-path-prefix, equivalent to the
same change for -ffile-prefix-map in KBUILD_CPPFLAGS.

Fixes: dbdffaf50f ("kbuild, rust: use -fremap-path-prefix to make paths relative")
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Acked-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2025-05-12 15:04:13 +09:00
Thomas Weißschuh
020d7f1448 Revert "kbuild: make all file references relative to source root"
This reverts commit cacd22ce69.

-ffile-prefix-map breaks the ability of debuggers to find the source
file corresponding to object files. As there is no simple or uniform
way to specify the source directory explicitly, this breaks developers
workflows.

Revert the unconditional usage of -ffile-prefix-map.

Reported-by: Matthieu Baerts <matttbe@kernel.org>
Closes: https://lore.kernel.org/lkml/edc50aa7-0740-4942-8c15-96f12f2acc7e@kernel.org/
Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Closes: https://lore.kernel.org/lkml/aBEttQH4kimHFScx@intel.com/
Fixes: cacd22ce69 ("kbuild: make all file references relative to source root")
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2025-05-12 15:04:13 +09:00
Masahiro Yamada
f0e4b333cf kbuild: fix dependency on sorttable
Commit ac4f06789b ("kbuild: Create intermediate vmlinux build with
relocations preserved") missed replacing one occurrence of "vmlinux"
that was added during the same development cycle.

Fixes: ac4f06789b ("kbuild: Create intermediate vmlinux build with relocations preserved")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
2025-05-12 15:04:09 +09:00
Masahiro Yamada
d1b99cdf22 init: remove unused CONFIG_CC_CAN_LINK_STATIC
This is a leftover from commit 98e20e5e13 ("bpfilter: remove bpfilter").

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2025-05-12 15:03:46 +09:00
Masahiro Yamada
ab09da7570 um: let 'make clean' properly clean underlying SUBARCH as well
Building the kernel with O= is affected by stale in-tree build artifacts.

So, if the source tree is not clean, Kbuild displays the following:

  $ make ARCH=um O=build defconfig
  make[1]: Entering directory '/.../linux/build'
  ***
  *** The source tree is not clean, please run 'make ARCH=um mrproper'
  *** in /.../linux
  ***
  make[2]: *** [/.../linux/Makefile:673: outputmakefile] Error 1
  make[1]: *** [/.../linux/Makefile:248: __sub-make] Error 2
  make[1]: Leaving directory '/.../linux/build'
  make: *** [Makefile:248: __sub-make] Error 2

Usually, running 'make mrproper' is sufficient for cleaning the source
tree for out-of-tree builds.

However, building UML generates build artifacts not only in arch/um/,
but also in the SUBARCH directory (i.e., arch/x86/). If in-tree stale
files remain under arch/x86/, Kbuild will reuse them instead of creating
new ones under the specified build directory.

This commit makes 'make ARCH=um clean' recurse into the SUBARCH directory.

Reported-by: Shuah Khan <skhan@linuxfoundation.org>
Closes: https://lore.kernel.org/lkml/20250502172459.14175-1-skhan@linuxfoundation.org/
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
2025-05-12 15:03:42 +09:00
Nathan Chancellor
d0afcfeb9e kbuild: Disable -Wdefault-const-init-unsafe
A new on by default warning in clang [1] aims to flags instances where
const variables without static or thread local storage or const members
in aggregate types are not initialized because it can lead to an
indeterminate value. This is quite noisy for the kernel due to
instances originating from header files such as:

  drivers/gpu/drm/i915/gt/intel_ring.h:62:2: error: default initialization of an object of type 'typeof (ring->size)' (aka 'const unsigned int') leaves the object uninitialized [-Werror,-Wdefault-const-init-var-unsafe]
     62 |         typecheck(typeof(ring->size), next);
        |         ^
  include/linux/typecheck.h:10:9: note: expanded from macro 'typecheck'
     10 | ({      type __dummy; \
        |              ^

  include/net/ip.h:478:14: error: default initialization of an object of type 'typeof (rt->dst.expires)' (aka 'const unsigned long') leaves the object uninitialized [-Werror,-Wdefault-const-init-var-unsafe]
    478 |                 if (mtu && time_before(jiffies, rt->dst.expires))
        |                            ^
  include/linux/jiffies.h:138:26: note: expanded from macro 'time_before'
    138 | #define time_before(a,b)        time_after(b,a)
        |                                 ^
  include/linux/jiffies.h:128:3: note: expanded from macro 'time_after'
    128 |         (typecheck(unsigned long, a) && \
        |          ^
  include/linux/typecheck.h:11:12: note: expanded from macro 'typecheck'
     11 |         typeof(x) __dummy2; \
        |                   ^

  include/linux/list.h:409:27: warning: default initialization of an object of type 'union (unnamed union at include/linux/list.h:409:27)' with const member leaves the object uninitialized [-Wdefault-const-init-field-unsafe]
    409 |         struct list_head *next = smp_load_acquire(&head->next);
        |                                  ^
  include/asm-generic/barrier.h:176:29: note: expanded from macro 'smp_load_acquire'
    176 | #define smp_load_acquire(p) __smp_load_acquire(p)
        |                             ^
  arch/arm64/include/asm/barrier.h:164:59: note: expanded from macro '__smp_load_acquire'
    164 |         union { __unqual_scalar_typeof(*p) __val; char __c[1]; } __u;   \
        |                                                                  ^
  include/linux/list.h:409:27: note: member '__val' declared 'const' here

  crypto/scatterwalk.c:66:22: error: default initialization of an object of type 'struct scatter_walk' with const member leaves the object uninitialized [-Werror,-Wdefault-const-init-field-unsafe]
     66 |         struct scatter_walk walk;
        |                             ^
  include/crypto/algapi.h:112:15: note: member 'addr' declared 'const' here
    112 |                 void *const addr;
        |                             ^

  fs/hugetlbfs/inode.c:733:24: error: default initialization of an object of type 'struct vm_area_struct' with const member leaves the object uninitialized [-Werror,-Wdefault-const-init-field-unsafe]
    733 |         struct vm_area_struct pseudo_vma;
        |                               ^
  include/linux/mm_types.h:803:20: note: member 'vm_flags' declared 'const' here
    803 |                 const vm_flags_t vm_flags;
        |                                  ^

Silencing the instances from typecheck.h is difficult because '= {}' is
not available in older but supported compilers and '= {0}' would cause
warnings about a literal 0 being treated as NULL. While it might be
possible to come up with a local hack to silence the warning for
clang-21+, it may not be worth it since -Wuninitialized will still
trigger if an uninitialized const variable is actually used.

In all audited cases of the "field" variant of the warning, the members
are either not used in the particular call path, modified through other
means such as memset() / memcpy() because the containing object is not
const, or are within a union with other non-const members.

Since this warning does not appear to have a high signal to noise ratio,
just disable it.

Cc: stable@vger.kernel.org
Link: 576161cb60 [1]
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Closes: https://lore.kernel.org/CA+G9fYuNjKcxFKS_MKPRuga32XbndkLGcY-PVuoSwzv6VWbY=w@mail.gmail.com/
Reported-by: Marcus Seyfarth <m.seyfarth@gmail.com>
Closes: https://github.com/ClangBuiltLinux/linux/issues/2088
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2025-05-12 15:03:17 +09:00
WangYuli
5bd6bdd0f7 kbuild: rpm-pkg: Add (elfutils-devel or libdw-devel) to BuildRequires
The dwarf.h header, which is included by
scripts/gendwarfksyms/gendwarfksyms.h, resides within elfutils-devel
or libdw-devel package.

This portion of the code is compiled under the condition that
CONFIG_GENDWARFKSYMS is enabled.

Consequently, add (elfutils-devel or libdw-devel) to BuildRequires to
prevent unforeseen compilation failures.

Fix follow possible error:
  In file included from scripts/gendwarfksyms/cache.c:6:
  scripts/gendwarfksyms/gendwarfksyms.h:6:10: fatal error: 'dwarf.h' file not found
      6 | #include <dwarf.h>
        |          ^~~~~~~~~

Link: https://lore.kernel.org/all/3e52d80d-0c60-4df5-8cb5-21d4b1fce7b7@suse.com/
Fixes: f28568841a ("tools: Add gendwarfksyms")
Suggested-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: WangYuli <wangyuli@uniontech.com>
Reviewed-by: Nicolas Schier <n.schier@avm.de>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2025-05-12 15:03:17 +09:00
WangYuli
657f96cb7c kbuild: deb-pkg: Add libdw-dev:native to Build-Depends-Arch
The dwarf.h header, which is included by
scripts/gendwarfksyms/gendwarfksyms.h, resides within the libdw-dev
package.

This portion of the code is compiled under the condition that
CONFIG_GENDWARFKSYMS is enabled.

Consequently, add libdw-dev to Build-Depends-Arch to prevent
unforeseen compilation failures.

Fix follow possible error:
  In file included from scripts/gendwarfksyms/symbols.c:6:
  scripts/gendwarfksyms/gendwarfksyms.h:6:10: fatal error: 'dwarf.h' file not found
      6 | #include <dwarf.h>
        |          ^~~~~~~~~

Fixes: f28568841a ("tools: Add gendwarfksyms")
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: WangYuli <wangyuli@uniontech.com>
Reviewed-by: Nicolas Schier <n.schier@avm.de>
Tested-by: Nicolas Schier <n.schier@avm.de>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2025-05-12 15:03:17 +09:00
Randy Dunlap
56045757ac usr/include: openrisc: don't HDRTEST bpf_perf_event.h
Since openrisc does not support PERF_EVENTS, omit the HDRTEST of
bpf_perf_event.h for arch/openrisc/.

Fixes a build error:
usr/include/linux/bpf_perf_event.h:14:28: error: field 'regs' has incomplete type

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Stafford Horne <shorne@gmail.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2025-05-12 15:03:17 +09:00
Sami Tolvanen
9520a2b3f0 kbuild: Require pahole <v1.28 or >v1.29 with GENDWARFKSYMS on X86
With CONFIG_GENDWARFKSYMS, __gendwarfksyms_ptr variables are
added to the kernel in EXPORT_SYMBOL() to ensure DWARF type
information is available for exported symbols in the TUs where
they're actually exported. These symbols are dropped when linking
vmlinux, but dangling references to them remain in DWARF.

With CONFIG_DEBUG_INFO_BTF enabled on X86, pahole versions after
commit 47dcb534e253 ("btf_encoder: Stop indexing symbols for
VARs") and before commit 9810758003ce ("btf_encoder: Verify 0
address DWARF variables are in ELF section") place these symbols
in the .data..percpu section, which results in an "Invalid
offset" error in btf_datasec_check_meta() during boot, as all
the variables are at zero offset and have non-zero size. If
CONFIG_DEBUG_INFO_BTF_MODULES is enabled, this also results in a
failure to load modules with:

  failed to validate module [$module] BTF: -22

As the issue occurs in pahole v1.28 and the fix was merged
after v1.29 was released, require pahole <v1.28 or >v1.29 when
GENDWARFKSYMS is enabled with DEBUG_INFO_BTF on X86.

Reported-by: Paolo Pisati <paolo.pisati@canonical.com>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2025-05-12 15:03:16 +09:00
Linus Torvalds
627277ba7c Merge tag 'arm64_cbpf_mitigation_2025_05_08' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 cBPF BHB mitigation from James Morse:
 "This adds the BHB mitigation into the code JITted for cBPF programs as
  these can be loaded by unprivileged users via features like seccomp.

  The existing mechanisms to disable the BHB mitigation will also
  prevent the mitigation being JITted. In addition, cBPF programs loaded
  by processes with the SYS_ADMIN capability are not mitigated as these
  could equally load an eBPF program that does the same thing.

  For good measure, the list of 'k' values for CPU's local mitigations
  is updated from the version on arm's website"

* tag 'arm64_cbpf_mitigation_2025_05_08' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: proton-pack: Add new CPUs 'k' values for branch mitigation
  arm64: bpf: Only mitigate cBPF programs loaded by unprivileged users
  arm64: bpf: Add BHB mitigation to the epilogue for cBPF programs
  arm64: proton-pack: Expose whether the branchy loop k value
  arm64: proton-pack: Expose whether the platform is mitigated by firmware
  arm64: insn: Add support for encoding DSB
2025-05-11 17:45:00 -07:00
Barry Song
75cb1cca2c mm: userfaultfd: correct dirty flags set for both present and swap pte
As David pointed out, what truly matters for mremap and userfaultfd move
operations is the soft dirty bit.  The current comment and
implementation—which always sets the dirty bit for present PTEs and
fails to set the soft dirty bit for swap PTEs—are incorrect.  This could
break features like Checkpoint-Restore in Userspace (CRIU).

This patch updates the behavior to correctly set the soft dirty bit for
both present and swap PTEs in accordance with mremap.

Link: https://lkml.kernel.org/r/20250508220912.7275-1-21cnbao@gmail.com
Fixes: adef440691 ("userfaultfd: UFFDIO_MOVE uABI")
Signed-off-by: Barry Song <v-songbaohua@oppo.com>
Reported-by: David Hildenbrand <david@redhat.com>
Closes: https://lore.kernel.org/linux-mm/02f14ee1-923f-47e3-a994-4950afb9afcc@redhat.com/
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Cc: Lokesh Gidra <lokeshgidra@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-11 17:29:55 -07:00
Sergey Senozhatsky
02f5bf89f0 zsmalloc: don't underflow size calculation in zs_obj_write()
Do not mix class->size and object size during offsets/sizes calculation in
zs_obj_write().  Size classes can merge into clusters, based on
objects-per-zspage and pages-per-zspage characteristics, so some size
classes can store objects smaller than class->size.  This becomes
problematic when object size is much smaller than class->size.  zsmalloc
can falsely decide that object spans two physical pages, because a larger
class->size value is used for that check, while the actual object is much
smaller and fits the free space of the first physical page, so there is
nothing to write to the second page and memcpy() size calculation
underflows.

 Unable to handle kernel paging request at virtual address ffffc00081ff4000
 pc : __memcpy+0x10/0x24
 lr : zs_obj_write+0x1b0/0x1d0 [zsmalloc]
 Call trace:
  __memcpy+0x10/0x24 (P)
  zram_write_page+0x150/0x4fc [zram]
  zram_submit_bio+0x5e0/0x6a4 [zram]
  __submit_bio+0x168/0x220
  submit_bio_noacct_nocheck+0x128/0x2c8
  submit_bio_noacct+0x19c/0x2f8

This is mostly seen on system with larger page-sizes, because size class
cluters of such systems hold wider size ranges than on 4K PAGE_SIZE
systems.

Assume a 16K PAGE_SIZE system, a write of 820 bytes object to a 864-bytes
size class at offset 15560.  15560 + 864 is more than 16384 so zsmalloc
attempts to memcpy() it to two physical pages.  However, 16384 - 15560 =
824 which is more than 820, so the object in fact doesn't span two
physical pages, and there is no data to write to the second physical page.

We always know the exact size in bytes of the object that we are about to
write (store), so use it instead of class->size.

Link: https://lkml.kernel.org/r/20250507054312.4135983-1-senozhatsky@chromium.org
Fixes: 44f7641349 ("zsmalloc: introduce new object mapping API")
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Reported-by: Igor Belousov <igor.b@beldev.am>
Tested-by: Igor Belousov <igor.b@beldev.am>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-11 17:26:07 -07:00
Kirill A. Shutemov
fefc075182 mm/page_alloc: fix race condition in unaccepted memory handling
The page allocator tracks the number of zones that have unaccepted memory
using static_branch_enc/dec() and uses that static branch in hot paths to
determine if it needs to deal with unaccepted memory.

Borislav and Thomas pointed out that the tracking is racy: operations on
static_branch are not serialized against adding/removing unaccepted pages
to/from the zone.

Sanity checks inside static_branch machinery detects it:

WARNING: CPU: 0 PID: 10 at kernel/jump_label.c:276 __static_key_slow_dec_cpuslocked+0x8e/0xa0

The comment around the WARN() explains the problem:

	/*
	 * Warn about the '-1' case though; since that means a
	 * decrement is concurrent with a first (0->1) increment. IOW
	 * people are trying to disable something that wasn't yet fully
	 * enabled. This suggests an ordering problem on the user side.
	 */

The effect of this static_branch optimization is only visible on
microbenchmark.

Instead of adding more complexity around it, remove it altogether.

Link: https://lkml.kernel.org/r/20250506133207.1009676-1-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Fixes: dcdfdd40fa ("mm: Add support for unaccepted memory")
Link: https://lore.kernel.org/all/20250506092445.GBaBnVXXyvnazly6iF@fat_crate.local
Reported-by: Borislav Petkov <bp@alien8.de>
Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
Reported-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Brendan Jackman <jackmanb@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: <stable@vger.kernel.org>	[6.5+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-11 17:26:07 -07:00
Kirill A. Shutemov
23fa022a07 mm/page_alloc: ensure try_alloc_pages() plays well with unaccepted memory
try_alloc_pages() will not attempt to allocate memory if the system has
*any* unaccepted memory.  Memory is accepted as needed and can remain in
the system indefinitely, causing the interface to always fail.

Rather than immediately giving up, attempt to use already accepted memory
on free lists.

Pass 'alloc_flags' to cond_accept_memory() and do not accept new memory
for ALLOC_TRYLOCK requests.

Found via code inspection - only BPF uses this at present and the
runtime effects are unclear.

Link: https://lkml.kernel.org/r/20250506112509.905147-2-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Fixes: 97769a53f1 ("mm, bpf: Introduce try_alloc_pages() for opportunistic page allocation")
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Brendan Jackman <jackmanb@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-11 17:26:07 -07:00
Lorenzo Stoakes
d55582d6c9 MAINTAINERS: add mm GUP section
As part of the ongoing efforts to sub-divide memory management
maintainership and reviewership, establish a section for GUP (Get User
Pages) support and add appropriate maintainers and reviewers.

Link: https://lkml.kernel.org/r/20250506173601.97562-1-lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-11 17:26:07 -07:00
David Wang
0ae0227fa3 mm/codetag: move tag retrieval back upfront in __free_pages()
Commit 51ff4d7486 ("mm: avoid extra mem_alloc_profiling_enabled()
 checks") introduces a possible use-after-free scenario, when page
is non-compound, page[0] could be released by other thread right
after put_page_testzero failed in current thread, pgalloc_tag_sub_pages
afterwards would manipulate an invalid page for accounting remaining
pages:

[timeline]   [thread1]                     [thread2]
  |          alloc_page non-compound
  V
  |                                        get_page, rf counter inc
  V
  |          in ___free_pages
  |          put_page_testzero fails
  V
  |                                        put_page, page released
  V
  |          in ___free_pages,
  |          pgalloc_tag_sub_pages
  |          manipulate an invalid page
  V

Restore __free_pages() to its state before, retrieve alloc tag
beforehand.

Link: https://lkml.kernel.org/r/20250505193034.91682-1-00107082@163.com
Fixes: 51ff4d7486 ("mm: avoid extra mem_alloc_profiling_enabled() checks")
Signed-off-by: David Wang <00107082@163.com>
Acked-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Brendan Jackman <jackmanb@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-11 17:26:06 -07:00
Kairui Song
4b7c0857f8 mm/memory: fix mapcount / refcount sanity check for mTHP reuse
The following WARNING was triggered during swap stress test with mTHP
enabled:

[ 6609.335758] ------------[ cut here ]------------
[ 6609.337758] WARNING: CPU: 82 PID: 755116 at mm/memory.c:3794 do_wp_page+0x1084/0x10e0
[ 6609.340922] Modules linked in: zram virtiofs
[ 6609.342699] CPU: 82 UID: 0 PID: 755116 Comm: sh Kdump: loaded Not tainted 6.15.0-rc1+ #1429 PREEMPT(voluntary)
[ 6609.347620] Hardware name: Red Hat KVM/RHEL-AV, BIOS 0.0.0 02/06/2015
[ 6609.349909] RIP: 0010:do_wp_page+0x1084/0x10e0
[ 6609.351532] Code: ff ff 48 c7 c6 80 ba 49 82 4c 89 ef e8 95 fd fe ff 0f 0b bd f5 ff ff ff e9 43 fb ff ff 41 83 a9 bc 12 00 00 01 e9 5c fb ff ff <0f> 0b e9 a6 fc ff ff 65 ff 00 f0 48 0f b
a 6d 00 1f 0f 83 82 fc ff
[ 6609.357959] RSP: 0000:ffffc90002273d40 EFLAGS: 00010287
[ 6609.359915] RAX: 000000000000000f RBX: 0000000000000000 RCX: 000fffffffe00000
[ 6609.362606] RDX: 0000000000000010 RSI: 000055a119ac1000 RDI: ffffea000ae6ec00
[ 6609.365143] RBP: ffffea000ae6ec68 R08: 84000002b9bb1025 R09: 000055a119ab6000
[ 6609.367569] R10: ffff8881caa2ad80 R11: 0000000000000000 R12: ffff8881caa2ad80
[ 6609.370255] R13: ffffea000ae6ec00 R14: 000055a119ac1c9c R15: ffffc90002273dd8
[ 6609.373007] FS:  00007f08e467f740(0000) GS:ffff88a07c214000(0000) knlGS:0000000000000000
[ 6609.375999] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 6609.377946] CR2: 000055a119ac1c9c CR3: 00000001adfd6005 CR4: 0000000000770eb0
[ 6609.380376] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 6609.382853] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 6609.385216] PKRU: 55555554
[ 6609.386141] Call Trace:
[ 6609.387017]  <TASK>
[ 6609.387718]  ? ___pte_offset_map+0x1b/0x110
[ 6609.389056]  __handle_mm_fault+0xa51/0xf00
[ 6609.390363]  ? exc_page_fault+0x6a/0x140
[ 6609.391629]  handle_mm_fault+0x13d/0x360
[ 6609.392856]  do_user_addr_fault+0x2f2/0x7f0
[ 6609.394160]  ? sigprocmask+0x77/0xa0
[ 6609.395375]  exc_page_fault+0x6a/0x140
[ 6609.396735]  asm_exc_page_fault+0x26/0x30
[ 6609.398224] RIP: 0033:0x55a1050bc18b
[ 6609.399567] Code: 8b 3f 4d 85 ff 74 40 41 39 5f 18 75 f2 49 8b 7f 08 44 38 27 75 e9 4c 89 c6 4c 89 45 c8 e8 bd 83 fa ff 4c 8b 45 c8 85 c0 75 d5 <41> 83 47 1c 01 48 83 c4 28 4c 89 f8 5b 4
1 5c 41 5d 41 5e 41 5f 5d
[ 6609.405971] RSP: 002b:00007ffcf5f37d90 EFLAGS: 00010246
[ 6609.407737] RAX: 0000000000000000 RBX: 00000000182768fa RCX: 0000000000000000
[ 6609.410151] RDX: 00000000000000fa RSI: 000055a105175c7b RDI: 000055a119ac1c60
[ 6609.412606] RBP: 00007ffcf5f37de0 R08: 000055a105175c7b R09: 0000000000000000
[ 6609.414998] R10: 000000004d2dfb5a R11: 0000000000000246 R12: 0000000000000050
[ 6609.417193] R13: 00000000000000fa R14: 000055a119abaf60 R15: 000055a119ac1c80
[ 6609.419268]  </TASK>
[ 6609.419928] ---[ end trace 0000000000000000 ]---

The WARN_ON here is simply incorrect.  The refcount here must be at least
the mapcount, not the opposite.  Each mapcount must have a corresponding
refcount, but the refcount may increase if other components grab the
folio, which is acceptable.  Meanwhile, having a mapcount larger than
refcount is a real problem.

So fix the WARN_ON condition.

Link: https://lkml.kernel.org/r/20250425074325.61833-1-ryncsn@gmail.com
Fixes: 1da190f4d0 ("mm: Copy-on-Write (COW) reuse support for PTE-mapped THP")
Signed-off-by: Kairui Song <kasong@tencent.com>
Reported-by: Kairui Song <kasong@tencent.com>
Closes: https://lore.kernel.org/all/CAMgjq7D+ea3eg9gRCVvRnto3Sv3_H3WVhupX4e=k8T5QAfBHbw@mail.gmail.com/
Suggested-by: David Hildenbrand <david@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-11 17:26:06 -07:00
David Hildenbrand
e9f180d7cf kernel/fork: only call untrack_pfn_clear() on VMAs duplicated for fork()
Not intuitive, but vm_area_dup() located in kernel/fork.c is not only used
for duplicating VMAs during fork(), but also for duplicating VMAs when
splitting VMAs or when mremap()'ing them.

VM_PFNMAP mappings can at least get ordinarily mremap()'ed (no change in
size) and apparently also shrunk during mremap(), which implies
duplicating the VMA in __split_vma() first.

In case of ordinary mremap() (no change in size), we first duplicate the
VMA in copy_vma_and_data()->copy_vma() to then call untrack_pfn_clear() on
the old VMA: we effectively move the VM_PAT reservation.  So the
untrack_pfn_clear() call on the new VMA duplicating is wrong in that
context.

Splitting of VMAs seems problematic, because we don't duplicate/adjust the
reservation when splitting the VMA.  Instead, in memtype_erase() -- called
during zapping/munmap -- we shrink a reservation in case only the end
address matches: Assume we split a VMA into A and B, both would share a
reservation until B is unmapped.

So when unmapping B, the reservation would be updated to cover only A. 
When unmapping A, we would properly remove the now-shrunk reservation. 
That scenario describes the mremap() shrinking (old_size > new_size),
where we split + unmap B, and the untrack_pfn_clear() on the new VMA when
is wrong.

What if we manage to split a VM_PFNMAP VMA into A and B and unmap A first?
It would be broken because we would never free the reservation.  Likely,
there are ways to trigger such a VMA split outside of mremap().

Affecting other VMA duplication was not intended, vm_area_dup() being used
outside of kernel/fork.c was an oversight.  So let's fix that for; how to
handle VMA splits better should be investigated separately.


With a simple reproducer that uses mprotect() to split such a VMA I can
trigger

x86/PAT: pat_mremap:26448 freeing invalid memtype [mem 0x00000000-0x00000fff]

Link: https://lkml.kernel.org/r/20250422144942.2871395-1-david@redhat.com
Fixes: dc84bc2aba ("x86/mm/pat: Fix VM_PAT handling when fork() fails in copy_page_range()")
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Rik van Riel <riel@surriel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-11 17:26:06 -07:00
Wupeng Ma
a833a693a4 mm: hugetlb: fix incorrect fallback for subpool
During our testing with hugetlb subpool enabled, we observe that
hstate->resv_huge_pages may underflow into negative values.  Root cause
analysis reveals a race condition in subpool reservation fallback handling
as follow:

hugetlb_reserve_pages()
    /* Attempt subpool reservation */
    gbl_reserve = hugepage_subpool_get_pages(spool, chg);

    /* Global reservation may fail after subpool allocation */
    if (hugetlb_acct_memory(h, gbl_reserve) < 0)
        goto out_put_pages;

out_put_pages:
    /* This incorrectly restores reservation to subpool */
    hugepage_subpool_put_pages(spool, chg);

When hugetlb_acct_memory() fails after subpool allocation, the current
implementation over-commits subpool reservations by returning the full
'chg' value instead of the actual allocated 'gbl_reserve' amount.  This
discrepancy propagates to global reservations during subsequent releases,
eventually causing resv_huge_pages underflow.

This problem can be trigger easily with the following steps:
1. reverse hugepage for hugeltb allocation
2. mount hugetlbfs with min_size to enable hugetlb subpool
3. alloc hugepages with two task(make sure the second will fail due to
   insufficient amount of hugepages)
4. with for a few seconds and repeat step 3 which will make
   hstate->resv_huge_pages to go below zero.

To fix this problem, return corrent amount of pages to subpool during the
fallback after hugepage_subpool_get_pages is called.

Link: https://lkml.kernel.org/r/20250410062633.3102457-1-mawupeng1@huawei.com
Fixes: 1c5ecae3a9 ("hugetlbfs: add minimum size accounting to subpools")
Signed-off-by: Wupeng Ma <mawupeng1@huawei.com>
Tested-by: Joshua Hahn <joshua.hahnjy@gmail.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: Ma Wupeng <mawupeng1@huawei.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-11 17:26:06 -07:00
Linus Torvalds
6f5bf947ba Merge tag 'its-for-linus-20250509' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 ITS mitigation from Dave Hansen:
 "Mitigate Indirect Target Selection (ITS) issue.

  I'd describe this one as a good old CPU bug where the behavior is
  _obviously_ wrong, but since it just results in bad predictions it
  wasn't wrong enough to notice. Well, the researchers noticed and also
  realized that thus bug undermined a bunch of existing indirect branch
  mitigations.

  Thus the unusually wide impact on this one. Details:

  ITS is a bug in some Intel CPUs that affects indirect branches
  including RETs in the first half of a cacheline. Due to ITS such
  branches may get wrongly predicted to a target of (direct or indirect)
  branch that is located in the second half of a cacheline. Researchers
  at VUSec found this behavior and reported to Intel.

  Affected processors:

   - Cascade Lake, Cooper Lake, Whiskey Lake V, Coffee Lake R, Comet
     Lake, Ice Lake, Tiger Lake and Rocket Lake.

  Scope of impact:

   - Guest/host isolation:

     When eIBRS is used for guest/host isolation, the indirect branches
     in the VMM may still be predicted with targets corresponding to
     direct branches in the guest.

   - Intra-mode using cBPF:

     cBPF can be used to poison the branch history to exploit ITS.
     Realigning the indirect branches and RETs mitigates this attack
     vector.

   - User/kernel:

     With eIBRS enabled user/kernel isolation is *not* impacted by ITS.

   - Indirect Branch Prediction Barrier (IBPB):

     Due to this bug indirect branches may be predicted with targets
     corresponding to direct branches which were executed prior to IBPB.
     This will be fixed in the microcode.

  Mitigation:

  As indirect branches in the first half of cacheline are affected, the
  mitigation is to replace those indirect branches with a call to thunk that
  is aligned to the second half of the cacheline.

  RETs that take prediction from RSB are not affected, but they may be
  affected by RSB-underflow condition. So, RETs in the first half of
  cacheline are also patched to a return thunk that executes the RET aligned
  to second half of cacheline"

* tag 'its-for-linus-20250509' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  selftest/x86/bugs: Add selftests for ITS
  x86/its: FineIBT-paranoid vs ITS
  x86/its: Use dynamic thunks for indirect branches
  x86/ibt: Keep IBT disabled during alternative patching
  mm/execmem: Unify early execmem_cache behaviour
  x86/its: Align RETs in BHB clear sequence to avoid thunking
  x86/its: Add support for RSB stuffing mitigation
  x86/its: Add "vmexit" option to skip mitigation on some CPUs
  x86/its: Enable Indirect Target Selection mitigation
  x86/its: Add support for ITS-safe return thunk
  x86/its: Add support for ITS-safe indirect thunk
  x86/its: Enumerate Indirect Target Selection (ITS) bug
  Documentation: x86/bugs/its: Add ITS documentation
2025-05-11 17:23:03 -07:00
Linus Torvalds
caf12fa9c0 Merge tag 'ibti-hisory-for-linus-2025-05-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 IBTI mitigation from Dave Hansen:
 "Mitigate Intra-mode Branch History Injection via classic BFP programs

  This adds the branch history clearing mitigation to cBPF programs for
  x86. Intra-mode BHI attacks via cBPF a.k.a IBTI-History was reported
  by researchers at VUSec.

  For hardware that doesn't support BHI_DIS_S, the recommended
  mitigation is to run the short software sequence followed by the IBHF
  instruction after cBPF execution. On hardware that does support
  BHI_DIS_S, enable BHI_DIS_S and execute the IBHF after cBPF execution.

  The Indirect Branch History Fence (IBHF) is a new instruction that
  prevents indirect branch target predictions after the barrier from
  using branch history from before the barrier while BHI_DIS_S is
  enabled. On older systems this will map to a NOP. It is recommended to
  add this fence at the end of the cBPF program to support VM migration.
  This instruction is required on newer parts with BHI_NO to fully
  mitigate against these attacks.

  The current code disables the mitigation for anything running with the
  SYS_ADMIN capability bit set. The intention was not to waste time
  mitigating a process that has access to anything it wants anyway"

* tag 'ibti-hisory-for-linus-2025-05-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/bhi: Do not set BHI_DIS_S in 32-bit mode
  x86/bpf: Add IBHF call at end of classic BPF
  x86/bpf: Call branch history clearing sequence on exit
2025-05-11 17:17:06 -07:00
Linus Torvalds
82f2b0b97b Linux 6.15-rc6 2025-05-11 14:54:11 -07:00
Linus Torvalds
cd802e7e5f Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
 "ARM:

   - Avoid use of uninitialized memcache pointer in user_mem_abort()

   - Always set HCR_EL2.xMO bits when running in VHE, allowing
     interrupts to be taken while TGE=0 and fixing an ugly bug on
     AmpereOne that occurs when taking an interrupt while clearing the
     xMO bits (AC03_CPU_36)

   - Prevent VMMs from hiding support for AArch64 at any EL virtualized
     by KVM

   - Save/restore the host value for HCRX_EL2 instead of restoring an
     incorrect fixed value

   - Make host_stage2_set_owner_locked() check that the entire requested
     range is memory rather than just the first page

  RISC-V:

   - Add missing reset of smstateen CSRs

  x86:

   - Forcibly leave SMM on SHUTDOWN interception on AMD CPUs to avoid
     causing problems due to KVM stuffing INIT on SHUTDOWN (KVM needs to
     sanitize the VMCB as its state is undefined after SHUTDOWN,
     emulating INIT is the least awful choice).

   - Track the valid sync/dirty fields in kvm_run as a u64 to ensure KVM
     KVM doesn't goof a sanity check in the future.

   - Free obsolete roots when (re)loading the MMU to fix a bug where
     pre-faulting memory can get stuck due to always encountering a
     stale root.

   - When dumping GHCB state, use KVM's snapshot instead of the raw GHCB
     page to print state, so that KVM doesn't print stale/wrong
     information.

   - When changing memory attributes (e.g. shared <=> private), add
     potential hugepage ranges to the mmu_invalidate_range_{start,end}
     set so that KVM doesn't create a shared/private hugepage when the
     the corresponding attributes will become mixed (the attributes are
     commited *after* KVM finishes the invalidation).

   - Rework the SRSO mitigation to enable BP_SPEC_REDUCE only when KVM
     has at least one active VM. Effectively BP_SPEC_REDUCE when KVM is
     loaded led to very measurable performance regressions for non-KVM
     workloads"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: SVM: Set/clear SRSO's BP_SPEC_REDUCE on 0 <=> 1 VM count transitions
  KVM: arm64: Fix memory check in host_stage2_set_owner_locked()
  KVM: arm64: Kill HCRX_HOST_FLAGS
  KVM: arm64: Properly save/restore HCRX_EL2
  KVM: arm64: selftest: Don't try to disable AArch64 support
  KVM: arm64: Prevent userspace from disabling AArch64 support at any virtualisable EL
  KVM: arm64: Force HCR_EL2.xMO to 1 at all times in VHE mode
  KVM: arm64: Fix uninitialized memcache pointer in user_mem_abort()
  KVM: x86/mmu: Prevent installing hugepages when mem attributes are changing
  KVM: SVM: Update dump_ghcb() to use the GHCB snapshot fields
  KVM: RISC-V: reset smstateen CSRs
  KVM: x86/mmu: Check and free obsolete roots in kvm_mmu_reload()
  KVM: x86: Check that the high 32bits are clear in kvm_arch_vcpu_ioctl_run()
  KVM: SVM: Forcibly leave SMM mode on SHUTDOWN interception
2025-05-11 11:30:13 -07:00
Linus Torvalds
ecb9194d17 Merge tag 'mips-fixes_6.15_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Pull MIPS fixes from Thomas Bogendoerfer:

 - Fix delayed timers

 - Fix NULL pointer deref

 - Fix wrong range check

* tag 'mips-fixes_6.15_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
  MIPS: Fix MAX_REG_OFFSET
  MIPS: CPS: Fix potential NULL pointer dereferences in cps_prepare_cpus()
  MIPS: rename rollback_handler with skipover_handler
  MIPS: Move r4k_wait() to .cpuidle.text section
  MIPS: Fix idle VS timer enqueue
2025-05-11 11:19:52 -07:00
Linus Torvalds
b9e62a2b8f Merge tag 'x86-urgent-2025-05-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Ingo Molnar:
 "Fix a boot regression on very old x86 CPUs without CPUID support"

* tag 'x86-urgent-2025-05-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/microcode: Consolidate the loader enablement checking
2025-05-11 11:08:55 -07:00
Linus Torvalds
ac814cbbab Merge tag 'timers-urgent-2025-05-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc timers fixes from Ingo Molnar:

 - Fix time keeping bugs in CLOCK_MONOTONIC_COARSE clocks

 - Work around absolute relocations into vDSO code that GCC erroneously
   emits in certain arm64 build environments

 - Fix a false positive lockdep warning in the i8253 clocksource driver

* tag 'timers-urgent-2025-05-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  clocksource/i8253: Use raw_spinlock_irqsave() in clockevent_i8253_disable()
  arm64: vdso: Work around invalid absolute relocations from GCC
  timekeeping: Prevent coarse clocks going backwards
2025-05-11 10:33:25 -07:00
Linus Torvalds
fea9123979 Merge tag 'input-for-v6.15-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input fixes from Dmitry Torokhov:

 - Synaptics touchpad on multiple laptops (Dynabook Portege X30L-G,
   Dynabook Portege X30-D, TUXEDO InfinityBook Pro 14 v5, Dell Precision
   M3800, HP Elitebook 850 G1) switched from PS/2 to SMBus mode

 - a number of new controllers added to xpad driver: HORI Drum
   controller, PowerA Fusion Pro 4, PowerA MOGA XP-Ultra controller,
   8BitDo Ultimate 2 Wireless Controller, 8BitDo Ultimate 3-mode
   Controller, Hyperkin DuchesS Xbox One controller

 - fixes to xpad driver to properly handle Mad Catz JOYTECH NEO SE
   Advanced and PDP Mirror's Edge Official controllers

 - fixes to xpad driver to properly handle "Share" button on some
   controllers

 - a fix for device initialization timing and for waking up the
   controller in cyttsp5 driver

 - a fix for hisi_powerkey driver to properly wake up from s2idle state

 - other assorted cleanups and fixes

* tag 'input-for-v6.15-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: xpad - fix xpad_device sorting
  Input: xpad - add support for several more controllers
  Input: xpad - fix Share button on Xbox One controllers
  Input: xpad - fix two controller table values
  Input: hisi_powerkey - enable system-wakeup for s2idle
  Input: synaptics - enable InterTouch on Dell Precision M3800
  Input: synaptics - enable InterTouch on TUXEDO InfinityBook Pro 14 v5
  Input: synaptics - enable InterTouch on Dynabook Portege X30L-G
  Input: synaptics - enable InterTouch on Dynabook Portege X30-D
  Input: synaptics - enable SMBus for HP Elitebook 850 G1
  Input: mtk-pmic-keys - fix possible null pointer dereference
  Input: xpad - add support for 8BitDo Ultimate 2 Wireless Controller
  Input: cyttsp5 - fix power control issue on wakeup
  MAINTAINERS: .mailmap: update Mattijs Korpershoek's email address
  dt-bindings: mediatek,mt6779-keypad: Update Mattijs' email address
  Input: stmpe-ts - use module alias instead of device table
  Input: cyttsp5 - ensure minimum reset pulse width
  Input: sparcspkr - avoid unannotated fall-through
  input/joystick: magellan: Mark __nonstring look-up table
2025-05-11 10:29:29 -07:00
Linus Torvalds
f717acc6e9 Merge tag 'fixes-2025-05-11' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock
Pull memblock fixes from Mike Rapoport:

 - Mark set_high_memory() as __init to fix section mismatch

 - Accept memory allocated in memblock_double_array() to mitigate crash
   of SNP guests

* tag 'fixes-2025-05-11' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
  memblock: Accept allocated memory before use in memblock_double_array()
  mm,mm_init: Mark set_high_memory as __init
2025-05-11 10:23:53 -07:00
Takashi Iwai
dd33993a97 ALSA: ump: Fix a typo of snd_ump_stream_msg_device_info
s/devince/device/

It's used only internally, so no any behavior changes.

Fixes: 37e0e14128 ("ALSA: ump: Support UMP Endpoint and Function Block parsing")
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://patch.msgid.link/20250511141147.10246-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-05-11 16:27:55 +02:00
Peter Ujfalusi
1f93d877f0 ALSA/hda: intel-sdw-acpi: Correct sdw_intel_acpi_scan() function parameter
The acpi_handle should be just a handle and not a pointer in
sdw_intel_acpi_scan() parameter list.
It is called with 'acpi_handle handle' as parameter and it is passing it to
acpi_walk_namespace, which also expects acpi_handle and not  acpi_handle*

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Reviewed-by: Liam Girdwood <liam.r.girdwood@intel.com>
Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Link: https://patch.msgid.link/20250508181207.22113-1-peter.ujfalusi@linux.intel.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-05-11 16:14:21 +02:00
Takashi Iwai
ff7b190aef ALSA: seq: Fix delivery of UMP events to group ports
When an event with UMP message is sent to a UMP client, the EP port
receives always no matter where the event is sent to, as it's a
catch-all port.  OTOH, if an event is sent to EP port, and if the
event has a certain UMP Group, it should have been delivered to the
associated UMP Group port, too, but this was ignored, so far.

This patch addresses the behavior.  Now a UMP event sent to the
Endpoint port will be delivered to the subscribers of the UMP group
port the event is associated with.

The patch also does a bit of refactoring to simplify the code about
__deliver_to_subscribers().

Fixes: 177ccf811d ("ALSA: seq: Support MIDI 2.0 UMP Endpoint port")
Link: https://patch.msgid.link/20250511134528.6314-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-05-11 15:58:06 +02:00
Seongman Lee
f7387eff4b x86/sev: Fix operator precedence in GHCB_MSR_VMPL_REQ_LEVEL macro
The GHCB_MSR_VMPL_REQ_LEVEL macro lacked parentheses around the bitmask
expression, causing the shift operation to bind too early. As a result,
when requesting VMPL1 (e.g., GHCB_MSR_VMPL_REQ_LEVEL(1)), incorrect
values such as 0x000000016 were generated instead of the intended
0x100000016 (the requested VMPL level is specified in GHCBData[39:32]).

Fix the precedence issue by grouping the masked value before applying
the shift.

  [ bp: Massage commit message. ]

Fixes: 34ff659017 ("x86/sev: Use kernel provided SVSM Calling Areas")
Signed-off-by: Seongman Lee <augustus92@kaist.ac.kr>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250511092329.12680-1-cloudlee1719@gmail.com
2025-05-11 11:38:03 +02:00
Vicki Pfau
93406e9d02 Input: xpad - fix xpad_device sorting
A recent commit put one entry in the wrong place. This just moves it to the
right place.

Signed-off-by: Vicki Pfau <vi@endrift.com>
Link: https://lore.kernel.org/r/20250328234345.989761-5-vi@endrift.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-05-10 23:08:19 -07:00
Vicki Pfau
7026d23cb3 Input: xpad - add support for several more controllers
This adds support for several new controllers, all of which include
Share buttons:

- HORI Drum controller
- PowerA Fusion Pro 4
- 8BitDo Ultimate 3-mode Controller
- Hyperkin DuchesS Xbox One controller
- PowerA MOGA XP-Ultra controller

Signed-off-by: Vicki Pfau <vi@endrift.com>
Link: https://lore.kernel.org/r/20250328234345.989761-4-vi@endrift.com
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-05-10 23:08:18 -07:00
Vicki Pfau
4ef4636707 Input: xpad - fix Share button on Xbox One controllers
The Share button, if present, is always one of two offsets from the end of the
file, depending on the presence of a specific interface. As we lack parsing for
the identify packet we can't automatically determine the presence of that
interface, but we can hardcode which of these offsets is correct for a given
controller.

More controllers are probably fixable by adding the MAP_SHARE_BUTTON in the
future, but for now I only added the ones that I have the ability to test
directly.

Signed-off-by: Vicki Pfau <vi@endrift.com>
Link: https://lore.kernel.org/r/20250328234345.989761-2-vi@endrift.com
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-05-10 23:08:17 -07:00
Vicki Pfau
d05a424bea Input: xpad - fix two controller table values
Two controllers -- Mad Catz JOYTECH NEO SE Advanced and PDP Mirror's
Edge Official -- were missing the value of the mapping field, and thus
wouldn't detect properly.

Signed-off-by: Vicki Pfau <vi@endrift.com>
Link: https://lore.kernel.org/r/20250328234345.989761-1-vi@endrift.com
Fixes: 540602a43a ("Input: xpad - add a few new VID/PID combinations")
Fixes: 3492321e2e ("Input: xpad - add multiple supported devices")
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-05-10 22:54:18 -07:00
Ulf Hansson
e98960bc4d Input: hisi_powerkey - enable system-wakeup for s2idle
To wake up the system from s2idle when pressing the power-button, let's
convert from using pm_wakeup_event() to pm_wakeup_dev_event(), as it allows
us to specify the "hard" in-parameter, which needs to be set for s2idle.

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Link: https://lore.kernel.org/r/20250306115021.797426-1-ulf.hansson@linaro.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-05-10 22:26:33 -07:00
Linus Torvalds
3ce9925823 Merge tag 'mm-hotfixes-stable-2025-05-10-14-23' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc hotfixes from Andrew Morton:
 "22 hotfixes. 13 are cc:stable and the remainder address post-6.14
  issues or aren't considered necessary for -stable kernels.

  About half are for MM. Five OCFS2 fixes and a few MAINTAINERS updates"

* tag 'mm-hotfixes-stable-2025-05-10-14-23' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (22 commits)
  mm: fix folio_pte_batch() on XEN PV
  nilfs2: fix deadlock warnings caused by lock dependency in init_nilfs()
  mm/hugetlb: copy the CMA flag when demoting
  mm, swap: fix false warning for large allocation with !THP_SWAP
  selftests/mm: fix a build failure on powerpc
  selftests/mm: fix build break when compiling pkey_util.c
  mm: vmalloc: support more granular vrealloc() sizing
  tools/testing/selftests: fix guard region test tmpfs assumption
  ocfs2: stop quota recovery before disabling quotas
  ocfs2: implement handshaking with ocfs2 recovery thread
  ocfs2: switch osb->disable_recovery to enum
  mailmap: map Uwe's BayLibre addresses to a single one
  MAINTAINERS: add mm THP section
  mm/userfaultfd: fix uninitialized output field for -EAGAIN race
  selftests/mm: compaction_test: support platform with huge mount of memory
  MAINTAINERS: add core mm section
  ocfs2: fix panic in failed foilio allocation
  mm/huge_memory: fix dereferencing invalid pmd migration entry
  MAINTAINERS: add reverse mapping section
  x86: disable image size check for test builds
  ...
2025-05-10 15:50:56 -07:00
Linus Torvalds
345030986d Merge tag 'driver-core-6.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core
Pull driver core fix from Greg KH:
 "Here is a single driver core fix for a regression for platform devices
  that is a regression from a change that went into 6.15-rc1 that
  affected Pixel devices. It has been in linux-next for over a week with
  no reported problems"

* tag 'driver-core-6.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core:
  platform: Fix race condition during DMA configure at IOMMU probe time
2025-05-10 09:53:11 -07:00
Linus Torvalds
bec6f00f12 Merge tag 'usb-6.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
 "Here are some small USB driver fixes for 6.15-rc6. Included in here
  are:

   - typec driver fixes

   - usbtmc ioctl fixes

   - xhci driver fixes

   - cdnsp driver fixes

   - some gadget driver fixes

  Nothing really major, just all little stuff that people have reported
  being issues. All of these have been in linux-next this week with no
  reported issues"

* tag 'usb-6.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  xhci: dbc: Avoid event polling busyloop if pending rx transfers are inactive.
  usb: xhci: Don't trust the EP Context cycle bit when moving HW dequeue
  usb: usbtmc: Fix erroneous generic_read ioctl return
  usb: usbtmc: Fix erroneous wait_srq ioctl return
  usb: usbtmc: Fix erroneous get_stb ioctl error returns
  usb: typec: tcpm: delay SNK_TRY_WAIT_DEBOUNCE to SRC_TRYWAIT transition
  USB: usbtmc: use interruptible sleep in usbtmc_read
  usb: cdnsp: fix L1 resume issue for RTL_REVISION_NEW_LPM version
  usb: typec: ucsi: displayport: Fix NULL pointer access
  usb: typec: ucsi: displayport: Fix deadlock
  usb: misc: onboard_usb_dev: fix support for Cypress HX3 hubs
  usb: uhci-platform: Make the clock really optional
  usb: dwc3: gadget: Make gadget_wakeup asynchronous
  usb: gadget: Use get_status callback to set remote wakeup capability
  usb: gadget: f_ecm: Add get_status callback
  usb: host: tegra: Prevent host controller crash when OTG port is used
  usb: cdnsp: Fix issue with resuming from L1
  usb: gadget: tegra-xudc: ACK ST_RC after clearing CTRL_RUN
2025-05-10 09:18:05 -07:00
Linus Torvalds
067dd5875d Merge tag 'staging-6.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging driver fixes from Greg KH:
 "Here are three small staging driver fixes for 6.15-rc6. These are:

   - bcm2835-camera driver fix

   - two axis-fifo driver fixes

  All of these have been in linux-next for a few weeks with no reported
  issues"

* tag 'staging-6.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  staging: axis-fifo: Remove hardware resets for user errors
  staging: axis-fifo: Correct handling of tx_fifo_depth for size validation
  staging: bcm2835-camera: Initialise dev in v4l2_dev
2025-05-10 09:08:19 -07:00
Linus Torvalds
914a1fe5f8 Merge tag 'char-misc-6.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc/IIO driver fixes from Greg KH:
 "Here are a bunch of small driver fixes (mostly all IIO) for 6.15-rc6.
  Included in here are:

   - loads of tiny IIO driver fixes for reported issues

   - hyperv driver fix for a much-reported and worked on sysfs ring
     buffer creation bug

  All of these have been in linux-next for over a week (the IIO ones for
  many weeks now), with no reported issues"

* tag 'char-misc-6.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (30 commits)
  Drivers: hv: Make the sysfs node size for the ring buffer dynamic
  uio_hv_generic: Fix sysfs creation path for ring buffer
  iio: adis16201: Correct inclinometer channel resolution
  iio: adc: ad7606: fix serial register access
  iio: pressure: mprls0025pa: use aligned_s64 for timestamp
  iio: imu: adis16550: align buffers for timestamp
  staging: iio: adc: ad7816: Correct conditional logic for store mode
  iio: adc: ad7266: Fix potential timestamp alignment issue.
  iio: adc: ad7768-1: Fix insufficient alignment of timestamp.
  iio: adc: dln2: Use aligned_s64 for timestamp
  iio: accel: adxl355: Make timestamp 64-bit aligned using aligned_s64
  iio: temp: maxim-thermocouple: Fix potential lack of DMA safe buffer.
  iio: chemical: pms7003: use aligned_s64 for timestamp
  iio: chemical: sps30: use aligned_s64 for timestamp
  iio: imu: inv_mpu6050: align buffer for timestamp
  iio: imu: st_lsm6dsx: Fix wakeup source leaks on device unbind
  iio: adc: qcom-spmi-iadc: Fix wakeup source leaks on device unbind
  iio: accel: fxls8962af: Fix wakeup source leaks on device unbind
  iio: adc: ad7380: fix event threshold shift
  iio: hid-sensor-prox: Fix incorrect OFFSET calculation
  ...
2025-05-10 08:55:15 -07:00
Linus Torvalds
ed36b437d9 Merge tag 'i2c-for-6.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:

 - omap: use correct function to read from device tree

 - MAINTAINERS: remove Seth from ISMT maintainership

* tag 'i2c-for-6.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  MAINTAINERS: Remove entry for Seth Heasley
  i2c: omap: fix deprecated of_property_read_bool() use
2025-05-10 08:52:41 -07:00
Linus Torvalds
86c019597c Merge tag 'for-linus-6.15a-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:

 - A fix for the xenbus driver allowing to use a PVH Dom0 with
   Xenstore running in another domain

 - A fix for the xenbus driver addressing a rare race condition
   resulting in NULL dereferences and other problems

 - A fix for the xen-swiotlb driver fixing a problem seen on Arm
   platforms

* tag 'for-linus-6.15a-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xenbus: Use kref to track req lifetime
  xenbus: Allow PVH dom0 a non-local xenstore
  xen: swiotlb: Use swiotlb bouncing if kmalloc allocation demands it
2025-05-10 08:44:36 -07:00
Linus Torvalds
acbf235235 Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull mount fixes from Al Viro:
 "A couple of races around legalize_mnt vs umount (both fairly old and
  hard to hit) plus two bugs in move_mount(2) - both around 'move
  detached subtree in place' logics"

* tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fix IS_MNT_PROPAGATING uses
  do_move_mount(): don't leak MNTNS_PROPAGATING on failures
  do_umount(): add missing barrier before refcount checks in sync case
  __legitimize_mnt(): check for MNT_SYNC_UMOUNT should be under mount_lock
2025-05-10 08:36:07 -07:00
Paolo Bonzini
add20321af Merge tag 'kvm-x86-fixes-6.15-rcN' of https://github.com/kvm-x86/linux into HEAD
KVM x86 fixes for 6.15-rcN

 - Forcibly leave SMM on SHUTDOWN interception on AMD CPUs to avoid causing
   problems due to KVM stuffing INIT on SHUTDOWN (KVM needs to sanitize the
   VMCB as its state is undefined after SHUTDOWN, emulating INIT is the
   least awful choice).

 - Track the valid sync/dirty fields in kvm_run as a u64 to ensure KVM
   KVM doesn't goof a sanity check in the future.

 - Free obsolete roots when (re)loading the MMU to fix a bug where
   pre-faulting memory can get stuck due to always encountering a stale
   root.

 - When dumping GHCB state, use KVM's snapshot instead of the raw GHCB page
   to print state, so that KVM doesn't print stale/wrong information.

 - When changing memory attributes (e.g. shared <=> private), add potential
   hugepage ranges to the mmu_invalidate_range_{start,end} set so that KVM
   doesn't create a shared/private hugepage when the the corresponding
   attributes will become mixed (the attributes are commited *after* KVM
   finishes the invalidation).

 - Rework the SRSO mitigation to enable BP_SPEC_REDUCE only when KVM has at
   least one active VM.  Effectively BP_SPEC_REDUCE when KVM is loaded led
   to very measurable performance regressions for non-KVM workloads.
2025-05-10 11:11:06 -04:00
Paolo Bonzini
36867c0e94 Merge tag 'kvmarm-fixes-6.15-3' of https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for 6.15, round #3

 - Avoid use of uninitialized memcache pointer in user_mem_abort()

 - Always set HCR_EL2.xMO bits when running in VHE, allowing interrupts
   to be taken while TGE=0 and fixing an ugly bug on AmpereOne that
   occurs when taking an interrupt while clearing the xMO bits
   (AC03_CPU_36)

 - Prevent VMMs from hiding support for AArch64 at any EL virtualized by
   KVM

 - Save/restore the host value for HCRX_EL2 instead of restoring an
   incorrect fixed value

 - Make host_stage2_set_owner_locked() check that the entire requested
   range is memory rather than just the first page
2025-05-10 11:10:02 -04:00
Paolo Bonzini
6a74470b33 Merge tag 'kvm-riscv-fixes-6.15-1' of https://github.com/kvm-riscv/linux into HEAD
KVM/riscv fixes for 6.15, take #1

- Add missing reset of smstateen CSRs
2025-05-10 11:09:26 -04:00
Wolfram Sang
b6c08bcddb Merge tag 'i2c-host-fixes-6.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux into i2c/for-current
i2c-host-fixes for v6.15-rc6

- omap: use correct function to read from device tree
- MAINTAINERS: remove Seth from ISMT maintainership
2025-05-10 11:41:13 +02:00
Arnd Bergmann
15eaaa71e8 Merge tag 'imx-fixes-6.15-2' of https://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into arm/fixes
i.MX fixes for 6.15, 2nd round:

- One more i.MX8MP nominal drive mode DT fix from Ahmad Fatoum to use
  800MHz NoC OPP
- A imx8mp-var-som DT change from Himanshu Bhavani to fix SD card
  timeout caused by LDO5

* tag 'imx-fixes-6.15-2' of https://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
  arm64: dts: imx8mp-var-som: Fix LDO5 shutdown causing SD card timeout
  arm64: dts: imx8mp: use 800MHz NoC OPP for nominal drive mode

Link: https://lore.kernel.org/r/aB6h/woeyG1bSo12@dragon
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-05-10 11:10:38 +02:00
Jakub Kicinski
4d64321c4f Merge tag 'batadv-net-pullrequest-20250509' of git://git.open-mesh.org/linux-merge
Simon Wunderlich says:

====================
Here is a batman-adv bugfix:

 - fix duplicate MAC address check, by Matthias Schiffer

* tag 'batadv-net-pullrequest-20250509' of git://git.open-mesh.org/linux-merge:
  batman-adv: fix duplicate MAC address check
====================

Link: https://patch.msgid.link/20250509090240.107796-1-sw@simonwunderlich.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-09 17:09:39 -07:00
Linus Torvalds
1a33418a69 Merge tag '6.15-rc5-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:

 - Fix dentry leak which can cause umount crash

 - Add warning for parse contexts error on compounded operation

* tag '6.15-rc5-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  smb: client: Avoid race in open_cached_dir with lease breaks
  smb3 client: warn when parse contexts returns error on compounded operation
2025-05-09 16:45:21 -07:00
Paul Cacheux
e41b5af451 tracing: add missing trace_probe_log_clear for eprobes
Make sure trace_probe_log_clear is called in the tracing
eprobe code path, matching the trace_probe_log_init call.

Link: https://lore.kernel.org/all/20250504-fix-trace-probe-log-race-v3-1-9e99fec7eddc@gmail.com/

Signed-off-by: Paul Cacheux <paulcacheux@gmail.com>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2025-05-10 08:44:50 +09:00
Breno Leitao
9dda18a32b tracing: fprobe: Fix RCU warning message in list traversal
When CONFIG_PROVE_RCU_LIST is enabled, fprobe triggers the following
warning:

    WARNING: suspicious RCU usage
    kernel/trace/fprobe.c:457 RCU-list traversed in non-reader section!!

    other info that might help us debug this:
	#1: ffffffff863c4e08 (fprobe_mutex){+.+.}-{4:4}, at: fprobe_module_callback+0x7b/0x8c0

    Call Trace:
	fprobe_module_callback
	notifier_call_chain
	blocking_notifier_call_chain

This warning occurs because fprobe_remove_node_in_module() traverses an
RCU list using RCU primitives without holding an RCU read lock. However,
the function is only called from fprobe_module_callback(), which holds
the fprobe_mutex lock that provides sufficient protection for safely
traversing the list.

Fix the warning by specifying the locking design to the
CONFIG_PROVE_RCU_LIST mechanism. Add the lockdep_is_held() argument to
hlist_for_each_entry_rcu() to inform the RCU checker that fprobe_mutex
provides the required protection.

Link: https://lore.kernel.org/all/20250410-fprobe-v1-1-068ef5f41436@debian.org/

Fixes: a3dc2983ca ("tracing: fprobe: Cleanup fprobe hash when module unloading")
Signed-off-by: Breno Leitao <leitao@debian.org>
Tested-by: Antonio Quartulli <antonio@mandelbit.com>
Tested-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2025-05-10 08:28:02 +09:00
Andrew Jeffery
e4f349bd6e net: mctp: Ensure keys maintain only one ref to corresponding dev
mctp_flow_prepare_output() is called in mctp_route_output(), which
places outbound packets onto a given interface. The packet may represent
a message fragment, in which case we provoke an unbalanced reference
count to the underlying device. This causes trouble if we ever attempt
to remove the interface:

    [   48.702195] usb 1-1: USB disconnect, device number 2
    [   58.883056] unregister_netdevice: waiting for mctpusb0 to become free. Usage count = 2
    [   69.022548] unregister_netdevice: waiting for mctpusb0 to become free. Usage count = 2
    [   79.172568] unregister_netdevice: waiting for mctpusb0 to become free. Usage count = 2
    ...

Predicate the invocation of mctp_dev_set_key() in
mctp_flow_prepare_output() on not already having associated the device
with the key. It's not yet realistic to uphold the property that the key
maintains only one device reference earlier in the transmission sequence
as the route (and therefore the device) may not be known at the time the
key is associated with the socket.

Fixes: 67737c4572 ("mctp: Pass flow data & flow release events to drivers")
Acked-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: Andrew Jeffery <andrew@codeconstruct.com.au>
Link: https://patch.msgid.link/20250508-mctp-dev-refcount-v1-1-d4f965c67bb5@codeconstruct.com.au
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-09 16:22:53 -07:00
Al Viro
d1ddc6f1d9 fix IS_MNT_PROPAGATING uses
propagate_mnt() does not attach anything to mounts created during
propagate_mnt() itself.  What's more, anything on ->mnt_slave_list
of such new mount must also be new, so we don't need to even look
there.

When move_mount() had been introduced, we've got an additional
class of mounts to skip - if we are moving from anon namespace,
we do not want to propagate to mounts we are moving (i.e. all
mounts in that anon namespace).

Unfortunately, the part about "everything on their ->mnt_slave_list
will also be ignorable" is not true - if we have propagation graph
	A -> B -> C
and do OPEN_TREE_CLONE open_tree() of B, we get
	A -> [B <-> B'] -> C
as propagation graph, where B' is a clone of B in our detached tree.
Making B private will result in
	A -> B' -> C
C still gets propagation from A, as it would after making B private
if we hadn't done that open_tree(), but now the propagation goes
through B'.  Trying to move_mount() our detached tree on subdirectory
in A should have
	* moved B' on that subdirectory in A
	* skipped the corresponding subdirectory in B' itself
	* copied B' on the corresponding subdirectory in C.
As it is, the logics in propagation_next() and friends ends up
skipping propagation into C, since it doesn't consider anything
downstream of B'.

IOW, walking the propagation graph should only skip the ->mnt_slave_list
of new mounts; the only places where the check for "in that one
anon namespace" are applicable are propagate_one() (where we should
treat that as the same kind of thing as "mountpoint we are looking
at is not visible in the mount we are looking at") and
propagation_would_overmount().  The latter is better dealt with
in the caller (can_move_mount_beneath()); on the first call of
propagation_would_overmount() the test is always false, on the
second it is always true in "move from anon namespace" case and
always false in "move within our namespace" one, so it's easier
to just use check_mnt() before bothering with the second call and
be done with that.

Fixes: 064fe6e233 ("mount: handle mount propagation for detached mount trees")
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-05-09 18:06:27 -04:00
Al Viro
267fc3a06a do_move_mount(): don't leak MNTNS_PROPAGATING on failures
as it is, a failed move_mount(2) from anon namespace breaks
all further propagation into that namespace, including normal
mounts in non-anon namespaces that would otherwise propagate
there.

Fixes: 064fe6e233 ("mount: handle mount propagation for detached mount trees")
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-05-09 18:06:10 -04:00
Al Viro
65781e19dc do_umount(): add missing barrier before refcount checks in sync case
do_umount() analogue of the race fixed in 119e1ef80e "fix
__legitimize_mnt()/mntput() race".  Here we want to make sure that
if __legitimize_mnt() doesn't notice our lock_mount_hash(), we will
notice their refcount increment.  Harder to hit than mntput_no_expire()
one, fortunately, and consequences are milder (sync umount acting
like umount -l on a rare race with RCU pathwalk hitting at just the
wrong time instead of use-after-free galore mntput_no_expire()
counterpart used to be hit).  Still a bug...

Fixes: 48a066e72d ("RCU'd vfsmounts")
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-05-09 18:05:55 -04:00
Cosmin Ratiu
97c4e094a4 tests/ncdevmem: Fix double-free of queue array
netdev_bind_rx takes ownership of the queue array passed as parameter
and frees it, so a queue array buffer cannot be reused across multiple
netdev_bind_rx calls.

This commit fixes that by always passing in a newly created queue array
to all netdev_bind_rx calls in ncdevmem.

Fixes: 85585b4bc8 ("selftests: add ncdevmem, netcat for devmem TCP")
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Link: https://patch.msgid.link/20250508084434.1933069-1-cratiu@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-09 15:05:07 -07:00
Matt Johnston
f11cf946c0 net: mctp: Don't access ifa_index when missing
In mctp_dump_addrinfo, ifa_index can be used to filter interfaces, but
only when the struct ifaddrmsg is provided. Otherwise it will be
comparing to uninitialised memory - reproducible in the syzkaller case from
dhcpd, or busybox "ip addr show".

The kernel MCTP implementation has always filtered by ifa_index, so
existing userspace programs expecting to dump MCTP addresses must
already be passing a valid ifa_index value (either 0 or a real index).

BUG: KMSAN: uninit-value in mctp_dump_addrinfo+0x208/0xac0 net/mctp/device.c:128
 mctp_dump_addrinfo+0x208/0xac0 net/mctp/device.c:128
 rtnl_dump_all+0x3ec/0x5b0 net/core/rtnetlink.c:4380
 rtnl_dumpit+0xd5/0x2f0 net/core/rtnetlink.c:6824
 netlink_dump+0x97b/0x1690 net/netlink/af_netlink.c:2309

Fixes: 583be982d9 ("mctp: Add device handling and netlink interface")
Reported-by: syzbot+e76d52dadc089b9d197f@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68135815.050a0220.3a872c.000e.GAE@google.com/
Reported-by: syzbot+1065a199625a388fce60@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/681357d6.050a0220.14dd7d.000d.GAE@google.com/
Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
Link: https://patch.msgid.link/20250508-mctp-addr-dump-v2-1-c8a53fd2dd66@codeconstruct.com.au
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-09 15:03:53 -07:00
Hangbin Liu
45375814eb tools/net/ynl: ethtool: fix crash when Hardware Clock info is missing
Fix a crash in the ethtool YNL implementation when Hardware Clock information
is not present in the response. This ensures graceful handling of devices or
drivers that do not provide this optional field. e.g.

  Traceback (most recent call last):
    File "/net/tools/net/ynl/pyynl/./ethtool.py", line 438, in <module>
      main()
      ~~~~^^
    File "/net/tools/net/ynl/pyynl/./ethtool.py", line 341, in main
      print(f'PTP Hardware Clock: {tsinfo["phc-index"]}')
                                   ~~~~~~^^^^^^^^^^^^^
  KeyError: 'phc-index'

Fixes: f3d07b02b2 ("tools: ynl: ethtool testing tool")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250508035414.82974-1-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-09 15:03:07 -07:00
Al Viro
250cf36930 __legitimize_mnt(): check for MNT_SYNC_UMOUNT should be under mount_lock
... or we risk stealing final mntput from sync umount - raising mnt_count
after umount(2) has verified that victim is not busy, but before it
has set MNT_SYNC_UMOUNT; in that case __legitimize_mnt() doesn't see
that it's safe to quietly undo mnt_count increment and leaves dropping
the reference to caller, where it'll be a full-blown mntput().

Check under mount_lock is needed; leaving the current one done before
taking that makes no sense - it's nowhere near common enough to bother
with.

Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-05-09 18:02:12 -04:00
Linus Torvalds
0e1329d404 Merge tag 'rust-fixes-6.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux
Pull rust fixes from Miguel Ojeda:

 - Make CFI_AUTO_DEFAULT depend on !RUST or Rust >= 1.88.0

 - Clean Rust (and Clippy) lints for the upcoming Rust 1.87.0 and 1.88.0
   releases

 - Clean objtool warning for the upcoming Rust 1.87.0 release by adding
   one more noreturn function

* tag 'rust-fixes-6.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux:
  x86/Kconfig: make CFI_AUTO_DEFAULT depend on !RUST or Rust >= 1.88
  rust: clean Rust 1.88.0's `clippy::uninlined_format_args` lint
  rust: clean Rust 1.88.0's warning about `clippy::disallowed_macros` configuration
  rust: clean Rust 1.88.0's `unnecessary_transmutes` lint
  rust: allow Rust 1.87.0's `clippy::ptr_eq` lint
  objtool/rust: add one more `noreturn` Rust function for Rust 1.87.0
2025-05-09 14:06:34 -07:00
Pawan Gupta
7a9b709e7c selftest/x86/bugs: Add selftests for ITS
Below are the tests added for Indirect Target Selection (ITS):

- its_sysfs.py - Check if sysfs reflects the correct mitigation status for
  the mitigation selected via the kernel cmdline.

- its_permutations.py - tests mitigation selection with cmdline
  permutations with other bugs like spectre_v2 and retbleed.

- its_indirect_alignment.py - verifies that for addresses in
  .retpoline_sites section that belong to lower half of cacheline are
  patched to ITS-safe thunk. Typical output looks like below:

  Site 49: function symbol: __x64_sys_restart_syscall+0x1f <0xffffffffbb1509af>
  #     vmlinux: 0xffffffff813509af:    jmp     0xffffffff81f5a8e0
  #     kcore:   0xffffffffbb1509af:    jmpq    *%rax
  #     ITS thunk NOT expected for site 49
  #     PASSED: Found *%rax
  #
  Site 50: function symbol: __resched_curr+0xb0 <0xffffffffbb181910>
  #     vmlinux: 0xffffffff81381910:    jmp     0xffffffff81f5a8e0
  #     kcore:   0xffffffffbb181910:    jmp     0xffffffffc02000fc
  #     ITS thunk expected for site 50
  #     PASSED: Found 0xffffffffc02000fc -> jmpq *%rax <scattered-thunk?>

- its_ret_alignment.py - verifies that for addresses in .return_sites
  section that belong to lower half of cacheline are patched to
  its_return_thunk. Typical output looks like below:

  Site 97: function symbol: collect_event+0x48 <0xffffffffbb007f18>
  #     vmlinux: 0xffffffff81207f18:    jmp     0xffffffff81f5b500
  #     kcore:   0xffffffffbb007f18:    jmp     0xffffffffbbd5b560
  #     PASSED: Found jmp 0xffffffffbbd5b560 <its_return_thunk>
  #
  Site 98: function symbol: collect_event+0xa4 <0xffffffffbb007f74>
  #     vmlinux: 0xffffffff81207f74:    jmp     0xffffffff81f5b500
  #     kcore:   0xffffffffbb007f74:    retq
  #     PASSED: Found retq

Some of these tests have dependency on tools like virtme-ng[1] and drgn[2].
When the dependencies are not met, the test will be skipped.

[1] https://github.com/arighi/virtme-ng
[2] https://github.com/osandov/drgn

Co-developed-by: Tao Zhang <tao1.zhang@linux.intel.com>
Signed-off-by: Tao Zhang <tao1.zhang@linux.intel.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
2025-05-09 13:39:45 -07:00
Peter Zijlstra
e52c1dc745 x86/its: FineIBT-paranoid vs ITS
FineIBT-paranoid was using the retpoline bytes for the paranoid check,
disabling retpolines, because all parts that have IBT also have eIBRS
and thus don't need no stinking retpolines.

Except... ITS needs the retpolines for indirect calls must not be in
the first half of a cacheline :-/

So what was the paranoid call sequence:

  <fineibt_paranoid_start>:
   0:   41 ba 78 56 34 12       mov    $0x12345678, %r10d
   6:   45 3b 53 f7             cmp    -0x9(%r11), %r10d
   a:   4d 8d 5b <f0>           lea    -0x10(%r11), %r11
   e:   75 fd                   jne    d <fineibt_paranoid_start+0xd>
  10:   41 ff d3                call   *%r11
  13:   90                      nop

Now becomes:

  <fineibt_paranoid_start>:
   0:   41 ba 78 56 34 12       mov    $0x12345678, %r10d
   6:   45 3b 53 f7             cmp    -0x9(%r11), %r10d
   a:   4d 8d 5b f0             lea    -0x10(%r11), %r11
   e:   2e e8 XX XX XX XX	cs call __x86_indirect_paranoid_thunk_r11

  Where the paranoid_thunk looks like:

   1d:  <ea>                    (bad)
   __x86_indirect_paranoid_thunk_r11:
   1e:  75 fd                   jne 1d
   __x86_indirect_its_thunk_r11:
   20:  41 ff eb                jmp *%r11
   23:  cc                      int3

[ dhansen: remove initialization to false ]

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
2025-05-09 13:39:36 -07:00
Peter Zijlstra
872df34d7c x86/its: Use dynamic thunks for indirect branches
ITS mitigation moves the unsafe indirect branches to a safe thunk. This
could degrade the prediction accuracy as the source address of indirect
branches becomes same for different execution paths.

To improve the predictions, and hence the performance, assign a separate
thunk for each indirect callsite. This is also a defense-in-depth measure
to avoid indirect branches aliasing with each other.

As an example, 5000 dynamic thunks would utilize around 16 bits of the
address space, thereby gaining entropy. For a BTB that uses
32 bits for indexing, dynamic thunks could provide better prediction
accuracy over fixed thunks.

Have ITS thunks be variable sized and use EXECMEM_MODULE_TEXT such that
they are both more flexible (got to extend them later) and live in 2M TLBs,
just like kernel code, avoiding undue TLB pressure.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
2025-05-09 13:36:58 -07:00
Pawan Gupta
ebebe30794 x86/ibt: Keep IBT disabled during alternative patching
cfi_rewrite_callers() updates the fineIBT hash matching at the caller side,
but except for paranoid-mode it relies on apply_retpoline() and friends for
any ENDBR relocation. This could temporarily cause an indirect branch to
land on a poisoned ENDBR.

For instance, with para-virtualization enabled, a simple wrmsrl() could
have an indirect branch pointing to native_write_msr() who's ENDBR has been
relocated due to fineIBT:

<wrmsrl>:
       push   %rbp
       mov    %rsp,%rbp
       mov    %esi,%eax
       mov    %rsi,%rdx
       shr    $0x20,%rdx
       mov    %edi,%edi
       mov    %rax,%rsi
       call   *0x21e65d0(%rip)        # <pv_ops+0xb8>
       ^^^^^^^^^^^^^^^^^^^^^^^

Such an indirect call during the alternative patching could #CP if the
caller is not *yet* adjusted for the new target ENDBR. To prevent a false
 #CP, keep CET-IBT disabled until all callers are patched.

Patching during the module load does not need to be guarded by IBT-disable
because the module code is not executed until the patching is complete.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
2025-05-09 13:33:35 -07:00
Peter Zijlstra
d6d1e3e658 mm/execmem: Unify early execmem_cache behaviour
Early kernel memory is RWX, only at the end of early boot (before SMP)
do we mark things ROX. Have execmem_cache mirror this behaviour for
early users.

This avoids having to remember what code is execmem and what is not --
we can poke everything with impunity ;-) Also performance for not
having to do endless text_poke_mm switches.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
2025-05-09 13:33:20 -07:00
Rob Herring (Arm)
09acc3266c arm64: dts: amazon: Fix simple-bus node name schema warnings
Fix a couple of node name warnings from the schema checks:

arch/arm64/boot/dts/amazon/alpine-v2-evp.dt.yaml: io-fabric: $nodename:0: 'io-fabric' does not match '^(bus|soc|axi|ahb|apb)(@[0-9a-f]+)?$'
arch/arm64/boot/dts/amazon/alpine-v3-evp.dt.yaml: io-fabric: $nodename:0: 'io-fabric' does not match '^(bus|soc|axi|ahb|apb)(@[0-9a-f]+)?$'

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://lore.kernel.org/r/20250409210255.1541298-1-robh@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-05-09 22:28:46 +02:00
Wolfram Sang
3238532ba5 MAINTAINERS: delete email for Shiraz Hashim
The email address bounced. I couldn't find a newer one in recent git
history (last activity 9 years ago), so delete this email entry.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Link: https://lore.kernel.org/r/20250331190731.5094-2-wsa+renesas@sang-engineering.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-05-09 22:27:27 +02:00
Pawan Gupta
f0cd7091cc x86/its: Align RETs in BHB clear sequence to avoid thunking
The software mitigation for BHI is to execute BHB clear sequence at syscall
entry, and possibly after a cBPF program. ITS mitigation thunks RETs in the
lower half of the cacheline. This causes the RETs in the BHB clear sequence
to be thunked as well, adding unnecessary branches to the BHB clear
sequence.

Since the sequence is in hot path, align the RET instructions in the
sequence to avoid thunking.

This is how disassembly clear_bhb_loop() looks like after this change:

   0x44 <+4>:     mov    $0x5,%ecx
   0x49 <+9>:     call   0xffffffff81001d9b <clear_bhb_loop+91>
   0x4e <+14>:    jmp    0xffffffff81001de5 <clear_bhb_loop+165>
   0x53 <+19>:    int3
   ...
   0x9b <+91>:    call   0xffffffff81001dce <clear_bhb_loop+142>
   0xa0 <+96>:    ret
   0xa1 <+97>:    int3
   ...
   0xce <+142>:   mov    $0x5,%eax
   0xd3 <+147>:   jmp    0xffffffff81001dd6 <clear_bhb_loop+150>
   0xd5 <+149>:   nop
   0xd6 <+150>:   sub    $0x1,%eax
   0xd9 <+153>:   jne    0xffffffff81001dd3 <clear_bhb_loop+147>
   0xdb <+155>:   sub    $0x1,%ecx
   0xde <+158>:   jne    0xffffffff81001d9b <clear_bhb_loop+91>
   0xe0 <+160>:   ret
   0xe1 <+161>:   int3
   0xe2 <+162>:   int3
   0xe3 <+163>:   int3
   0xe4 <+164>:   int3
   0xe5 <+165>:   lfence
   0xe8 <+168>:   pop    %rbp
   0xe9 <+169>:   ret

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
2025-05-09 13:22:05 -07:00
Pawan Gupta
facd226f7e x86/its: Add support for RSB stuffing mitigation
When retpoline mitigation is enabled for spectre-v2, enabling
call-depth-tracking and RSB stuffing also mitigates ITS. Add cmdline option
indirect_target_selection=stuff to allow enabling RSB stuffing mitigation.

When retpoline mitigation is not enabled, =stuff option is ignored, and
default mitigation for ITS is deployed.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
2025-05-09 13:22:05 -07:00
Pawan Gupta
2665281a07 x86/its: Add "vmexit" option to skip mitigation on some CPUs
Ice Lake generation CPUs are not affected by guest/host isolation part of
ITS. If a user is only concerned about KVM guests, they can now choose a
new cmdline option "vmexit" that will not deploy the ITS mitigation when
CPU is not affected by guest/host isolation. This saves the performance
overhead of ITS mitigation on Ice Lake gen CPUs.

When "vmexit" option selected, if the CPU is affected by ITS guest/host
isolation, the default ITS mitigation is deployed.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
2025-05-09 13:22:05 -07:00
Pawan Gupta
f4818881c4 x86/its: Enable Indirect Target Selection mitigation
Indirect Target Selection (ITS) is a bug in some pre-ADL Intel CPUs with
eIBRS. It affects prediction of indirect branch and RETs in the
lower half of cacheline. Due to ITS such branches may get wrongly predicted
to a target of (direct or indirect) branch that is located in the upper
half of the cacheline.

Scope of impact
===============

Guest/host isolation
--------------------
When eIBRS is used for guest/host isolation, the indirect branches in the
VMM may still be predicted with targets corresponding to branches in the
guest.

Intra-mode
----------
cBPF or other native gadgets can be used for intra-mode training and
disclosure using ITS.

User/kernel isolation
---------------------
When eIBRS is enabled user/kernel isolation is not impacted.

Indirect Branch Prediction Barrier (IBPB)
-----------------------------------------
After an IBPB, indirect branches may be predicted with targets
corresponding to direct branches which were executed prior to IBPB. This is
mitigated by a microcode update.

Add cmdline parameter indirect_target_selection=off|on|force to control the
mitigation to relocate the affected branches to an ITS-safe thunk i.e.
located in the upper half of cacheline. Also add the sysfs reporting.

When retpoline mitigation is deployed, ITS safe-thunks are not needed,
because retpoline sequence is already ITS-safe. Similarly, when call depth
tracking (CDT) mitigation is deployed (retbleed=stuff), ITS safe return
thunk is not used, as CDT prevents RSB-underflow.

To not overcomplicate things, ITS mitigation is not supported with
spectre-v2 lfence;jmp mitigation. Moreover, it is less practical to deploy
lfence;jmp mitigation on ITS affected parts anyways.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
2025-05-09 13:22:05 -07:00
Pawan Gupta
a75bf27fe4 x86/its: Add support for ITS-safe return thunk
RETs in the lower half of cacheline may be affected by ITS bug,
specifically when the RSB-underflows. Use ITS-safe return thunk for such
RETs.

RETs that are not patched:

- RET in retpoline sequence does not need to be patched, because the
  sequence itself fills an RSB before RET.
- RET in Call Depth Tracking (CDT) thunks __x86_indirect_{call|jump}_thunk
  and call_depth_return_thunk are not patched because CDT by design
  prevents RSB-underflow.
- RETs in .init section are not reachable after init.
- RETs that are explicitly marked safe with ANNOTATE_UNRET_SAFE.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
2025-05-09 13:22:05 -07:00
Pawan Gupta
8754e67ad4 x86/its: Add support for ITS-safe indirect thunk
Due to ITS, indirect branches in the lower half of a cacheline may be
vulnerable to branch target injection attack.

Introduce ITS-safe thunks to patch indirect branches in the lower half of
cacheline with the thunk. Also thunk any eBPF generated indirect branches
in emit_indirect_jump().

Below category of indirect branches are not mitigated:

- Indirect branches in the .init section are not mitigated because they are
  discarded after boot.
- Indirect branches that are explicitly marked retpoline-safe.

Note that retpoline also mitigates the indirect branches against ITS. This
is because the retpoline sequence fills an RSB entry before RET, and it
does not suffer from RSB-underflow part of the ITS.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
2025-05-09 13:22:04 -07:00
Pawan Gupta
159013a7ca x86/its: Enumerate Indirect Target Selection (ITS) bug
ITS bug in some pre-Alderlake Intel CPUs may allow indirect branches in the
first half of a cache line get predicted to a target of a branch located in
the second half of the cache line.

Set X86_BUG_ITS on affected CPUs. Mitigation to follow in later commits.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
2025-05-09 13:22:04 -07:00
Pawan Gupta
1ac116ce64 Documentation: x86/bugs/its: Add ITS documentation
Add the admin-guide for Indirect Target Selection (ITS).

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
2025-05-09 13:22:04 -07:00
Linus Torvalds
ea34704d6a Merge tag 'drm-fixes-2025-05-10' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
 "Weekly drm fixes, bit bigger than last week, but overall amdgpu/xe
  with some ivpu bits and a random few fixes, and dropping the
  ttm_backup struct which wrapped struct file and was recently
  frowned at.

  drm:
   - Fix overflow when generating wedged event

  ttm:
   - Fix documentation
   - Remove struct ttm_backup

  panel:
   - simple: Fix timings for AUO G101EVN010

  amdgpu:
   - DC FP fixes
   - Freesync fix
   - DMUB AUX fixes
   - VCN fix
   - Hibernation fixes
   - HDP fixes

  xe:
   - Prevent PF queue overflow
   - Hold all forcewake during mocs test
   - Remove GSC flush on reset path
   - Fix forcewake put on error path
   - Fix runtime warning when building without svm

  i915:
   - Fix oops on resume after disconnecting DP MST sinks during suspend
   - Fix SPLC num_waiters refcounting

  ivpu:
   - Increase timeouts
   - Fix deadlock in cmdq ioctl
   - Unlock mutices in correct order

  v3d:
   - Avoid memory leak in job handling"

* tag 'drm-fixes-2025-05-10' of https://gitlab.freedesktop.org/drm/kernel: (32 commits)
  drm/i915/dp: Fix determining SST/MST mode during MTP TU state computation
  drm/xe: Add config control for svm flush work
  drm/xe: Release force wake first then runtime power
  drm/xe/gsc: do not flush the GSC worker from the reset path
  drm/xe/tests/mocs: Hold XE_FORCEWAKE_ALL for LNCF regs
  drm/xe: Add page queue multiplier
  drm/amdgpu/hdp7: use memcfg register to post the write for HDP flush
  drm/amdgpu/hdp6: use memcfg register to post the write for HDP flush
  drm/amdgpu/hdp5.2: use memcfg register to post the write for HDP flush
  drm/amdgpu/hdp5: use memcfg register to post the write for HDP flush
  drm/amdgpu/hdp4: use memcfg register to post the write for HDP flush
  drm/amdgpu: fix pm notifier handling
  Revert "drm/amd: Stop evicting resources on APUs in suspend"
  drm/amdgpu/vcn: using separate VCN1_AON_SOC offset
  drm/amd/display: Fix wrong handling for AUX_DEFER case
  drm/amd/display: Copy AUX read reply data whenever length > 0
  drm/amd/display: Remove incorrect checking in dmub aux handler
  drm/amd/display: Fix the checking condition in dmub aux handling
  drm/amd/display: Shift DMUB AUX reply command if necessary
  drm/amd/display: Call FP Protect Before Mode Programming/Mode Support
  ...
2025-05-09 12:41:34 -07:00
Dave Airlie
c2c64ed09c Merge tag 'drm-intel-fixes-2025-05-09' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes
drm/i915 fixes for v6.15-rc6:
- Fix oops on resume after disconnecting DP MST sinks during suspend
- Fix SPLC num_waiters refcounting

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://lore.kernel.org/r/87tt5umeaw.fsf@intel.com
2025-05-10 05:07:18 +10:00
Dave Airlie
f7be784caf Merge tag 'drm-xe-fixes-2025-05-09' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes
Driver Changes:
- Prevent PF queue overflow
- Hold all forcewake during mocs test
- Remove GSC flush on reset path
- Fix forcewake put on error path
- Fix runtime warning when building without svm

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/jffqa56f2zp4i5ztz677cdspgxhnw7qfop3dd3l2epykfpfvza@q2nw6wapsphz
2025-05-10 05:02:46 +10:00
I Hsin Cheng
eb0851e144 drm/meson: Use 1000ULL when operating with mode->clock
Coverity scan reported the usage of "mode->clock * 1000" may lead to
integer overflow. Use "1000ULL" instead of "1000"
when utilizing it to avoid potential integer overflow issue.

Link: https://scan5.scan.coverity.com/#/project-view/10074/10063?selectedIssue=1646759
Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
Reviewed-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Fixes: 1017560164 ("drm/meson: use unsigned long long / Hz for frequency types")
Link: https://lore.kernel.org/r/20250505184338.678540-1-richard120310@gmail.com
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
2025-05-09 20:53:07 +02:00
Linus Torvalds
50358c251e Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fix from Catalin Marinas:
 "Move the arm64_use_ng_mappings variable from the .bss to the .data
  section as it is accessed very early during boot with the MMU off and
  before the .bss has been initialised.

  This could lead to incorrect idmap page table"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: cpufeature: Move arm64_use_ng_mappings to the .data section to prevent wrong idmap generation
2025-05-09 11:30:26 -07:00
Linus Torvalds
3013c33dcb Merge tag 'riscv-for-linus-6.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:

 - The compressed half-word misaligned access instructions (c.lhu, c.lh,
   and c.sh) from the Zcb extension are now properly emulated

 - A series of fixes to properly emulate permissions while handling
   userspace misaligned accesses

 - A pair of fixes for PR_GET_TAGGED_ADDR_CTRL to avoid accessing the
   envcfg CSR on systems that don't support that CSR, and to report
   those failures up to userspace

 - The .rela.dyn section is no longer stripped from vmlinux, as it is
   necessary to relocate the kernel under some conditions (including
   kexec)

* tag 'riscv-for-linus-6.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: Disallow PR_GET_TAGGED_ADDR_CTRL without Supm
  scripts: Do not strip .rela.dyn section
  riscv: Fix kernel crash due to PR_SET_TAGGED_ADDR_CTRL
  riscv: misaligned: use get_user() instead of __get_user()
  riscv: misaligned: enable IRQs while handling misaligned accesses
  riscv: misaligned: factorize trap handling
  riscv: misaligned: Add handling for ZCB instructions
2025-05-09 11:17:50 -07:00
Waiman Long
39b5ef791d cgroup/cpuset: Extend kthread_is_per_cpu() check to all PF_NO_SETAFFINITY tasks
Commit ec5fbdfb99 ("cgroup/cpuset: Enable update_tasks_cpumask()
on top_cpuset") enabled us to pull CPUs dedicated to child partitions
from tasks in top_cpuset by ignoring per cpu kthreads. However, there
can be other kthreads that are not per cpu but have PF_NO_SETAFFINITY
flag set to indicate that we shouldn't mess with their CPU affinity.
For other kthreads, their affinity will be changed to skip CPUs dedicated
to child partitions whether it is an isolating or a scheduling one.

As all the per cpu kthreads have PF_NO_SETAFFINITY set, the
PF_NO_SETAFFINITY tasks are essentially a superset of per cpu kthreads.
Fix this issue by dropping the kthread_is_per_cpu() check and checking
the PF_NO_SETAFFINITY flag instead.

Fixes: ec5fbdfb99 ("cgroup/cpuset: Enable update_tasks_cpumask() on top_cpuset")
Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-05-09 07:35:14 -10:00
Linus Torvalds
cc9f0629ca Merge tag 'block-6.15-20250509' of git://git.kernel.dk/linux
Pull block fixes from Jens Axboe:

 - Fix for a regression in this series for loop and read/write iterator
   handling

 - zone append block update tweak

 - remove a broken IO priority test

 - NVMe pull request via Christoph:
      - unblock ctrl state transition for firmware update (Daniel
        Wagner)

* tag 'block-6.15-20250509' of git://git.kernel.dk/linux:
  block: remove test of incorrect io priority level
  nvme: unblock ctrl state transition for firmware update
  block: only update request sector if needed
  loop: Add sanity check for read/write_iter
2025-05-09 10:34:50 -07:00
Linus Torvalds
7380c60b28 Merge tag 'io_uring-6.15-20250509' of git://git.kernel.dk/linux
Pull io_uring fixes from Jens Axboe:

 - Fix for linked timeouts arming and firing wrt prep and issue of the
   request being managed by the linked timeout

 - Fix for a CQE ordering issue between requests with multishot and
   using the same buffer group. This is a dumbed down version for this
   release and for stable, it'll get improved for v6.16

 - Tweak the SQPOLL submit batch size. A previous commit made SQPOLL
   manage its own task_work and chose a tiny batch size, bump it from 8
   to 32 to fix a performance regression due to that

* tag 'io_uring-6.15-20250509' of git://git.kernel.dk/linux:
  io_uring/sqpoll: Increase task_work submission batch size
  io_uring: ensure deferred completions are flushed for multishot
  io_uring: always arm linked timeouts prior to issue
2025-05-09 09:26:46 -07:00
Linus Torvalds
29fe5d50df Merge tag 'modules-6.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux
Pull modules fix from Petr Pavlu:
 "A single fix to prevent use of an uninitialized completion pointer
  when releasing a module_kobject in specific situations.

  This addresses a latent bug exposed by commit f95bbfe185 ("drivers:
  base: handle module_kobject creation")"

* tag 'modules-6.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux:
  module: ensure that kobject_put() is safe for module type kobjects
2025-05-09 09:09:49 -07:00
Arnd Bergmann
33e79299f9 Merge tag 'asahi-soc-fixes-6.15' of https://github.com/AsahiLinux/linux into arm/fixes
Apple SoC fixes for 6.15

This tag contains two small commits since rc1:
- Add a .mailmap entry requested by Asahi Lina to better filter her
  emails
- Mark the power domains for the touchbar support introduced with 6.15
  as always on since the driver cannot initialize the touchbar from
  scratch after the domains are powered off (e.g. during suspend).

* tag 'asahi-soc-fixes-6.15' of https://github.com/AsahiLinux/linux:
  arm64: dts: apple: touchbar: Mark ps_dispdfr_be as always-on
  mailmap: Update email for Asahi Lina

Link: https://lore.kernel.org/r/20250423145047.3098-1-sven@svenpeter.dev
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-05-09 18:02:16 +02:00
Arnd Bergmann
6053915252 Merge tag 'riscv-sophgo-dt-fixes-for-v6.15-rc1' of https://github.com/sophgo/linux into arm/fixes
RISC-V Sophgo Devicetree fixes for v6.15-rc1

Just one minor fix to correct DMA data-width
configuration for CV18xx.

Signed-off-by: Chen Wang <unicorn_wang@outlook.com>

* tag 'riscv-sophgo-dt-fixes-for-v6.15-rc1' of https://github.com/sophgo/linux:
  riscv: dts: sophgo: fix DMA data-width configuration for CV18xx

Link: https://lore.kernel.org/r/MA0P287MB2262454C19B8899BC1694D04FE832@MA0P287MB2262.INDP287.PROD.OUTLOOK.COM
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-05-09 18:01:07 +02:00
Arnd Bergmann
e36f6de6ad Merge tag 'amlogic-fixes-for-v6.15' of https://git.kernel.org/pub/scm/linux/kernel/git/amlogic/linux into arm/fixes
Amlogic Fixes for v6.15:
- fix reference to unknown/untested PWM clock on ARM/ARM64 boards
- fix missing clkc_audio node on dreambox ARM64 DT

* tag 'amlogic-fixes-for-v6.15' of https://git.kernel.org/pub/scm/linux/kernel/git/amlogic/linux:
  arm64: dts: amlogic: dreambox: fix missing clkc_audio node
  arm64: dts: amlogic: g12: fix reference to unknown/untested PWM clock
  arm64: dts: amlogic: gx: fix reference to unknown/untested PWM clock
  ARM: dts: amlogic: meson8b: fix reference to unknown/untested PWM clock
  ARM: dts: amlogic: meson8: fix reference to unknown/untested PWM clock

Link: https://lore.kernel.org/r/e9c520a1-b986-49e1-b9b1-67511c187716@linaro.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-05-09 18:00:03 +02:00
Arnd Bergmann
81b7cf868a Merge tag 'v6.15-rockchip-dtsfixes1' of https://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into arm/fixes
Removal of operating-points above what the rk3588j soc is rated for, and
a number of smaller fixes: Turing RK1 fan can spin down again, fixed pins,
pinmuxing and clocks and some devicetree-correctnes improvements.

* tag 'v6.15-rockchip-dtsfixes1' of https://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip:
  arm64: dts: rockchip: fix Sige5 RTC interrupt pin
  arm64: dts: rockchip: Assign RT5616 MCLK rate on rk3588-friendlyelec-cm3588
  arm64: dts: rockchip: Align wifi node name with bindings in CB2
  arm64: dts: rockchip: Fix mmc-pwrseq clock name on rock-pi-4
  arm64: dts: rockchip: Use "regulator-fixed" for btreg on px30-engicam for vcc3v3-btreg
  arm64: dts: rockchip: Add pinmuxing for eMMC on QNAP TS433
  arm64: dts: rockchip: Remove overdrive-mode OPPs from RK3588J SoC dtsi
  arm64: dts: rockchip: Allow Turing RK1 cooling fan to spin down

Link: https://lore.kernel.org/r/2923598.88bMQJbFj6@diego
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-05-09 17:57:39 +02:00
Dave Hansen
fea4e317f9 x86/mm: Eliminate window where TLB flushes may be inadvertently skipped
tl;dr: There is a window in the mm switching code where the new CR3 is
set and the CPU should be getting TLB flushes for the new mm.  But
should_flush_tlb() has a bug and suppresses the flush.  Fix it by
widening the window where should_flush_tlb() sends an IPI.

Long Version:

=== History ===

There were a few things leading up to this.

First, updating mm_cpumask() was observed to be too expensive, so it was
made lazier.  But being lazy caused too many unnecessary IPIs to CPUs
due to the now-lazy mm_cpumask().  So code was added to cull
mm_cpumask() periodically[2].  But that culling was a bit too aggressive
and skipped sending TLB flushes to CPUs that need them.  So here we are
again.

=== Problem ===

The too-aggressive code in should_flush_tlb() strikes in this window:

	// Turn on IPIs for this CPU/mm combination, but only
	// if should_flush_tlb() agrees:
	cpumask_set_cpu(cpu, mm_cpumask(next));

	next_tlb_gen = atomic64_read(&next->context.tlb_gen);
	choose_new_asid(next, next_tlb_gen, &new_asid, &need_flush);
	load_new_mm_cr3(need_flush);
	// ^ After 'need_flush' is set to false, IPIs *MUST*
	// be sent to this CPU and not be ignored.

        this_cpu_write(cpu_tlbstate.loaded_mm, next);
	// ^ Not until this point does should_flush_tlb()
	// become true!

should_flush_tlb() will suppress TLB flushes between load_new_mm_cr3()
and writing to 'loaded_mm', which is a window where they should not be
suppressed.  Whoops.

=== Solution ===

Thankfully, the fuzzy "just about to write CR3" window is already marked
with loaded_mm==LOADED_MM_SWITCHING.  Simply checking for that state in
should_flush_tlb() is sufficient to ensure that the CPU is targeted with
an IPI.

This will cause more TLB flush IPIs.  But the window is relatively small
and I do not expect this to cause any kind of measurable performance
impact.

Update the comment where LOADED_MM_SWITCHING is written since it grew
yet another user.

Peter Z also raised a concern that should_flush_tlb() might not observe
'loaded_mm' and 'is_lazy' in the same order that switch_mm_irqs_off()
writes them.  Add a barrier to ensure that they are observed in the
order they are written.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Rik van Riel <riel@surriel.com>
Link: https://lore.kernel.org/oe-lkp/202411282207.6bd28eae-lkp@intel.com/ [1]
Fixes: 6db2526c1d ("x86/mm/tlb: Only trim the mm_cpumask once a second") [2]
Reported-by: Stephen Dolan <sdolan@janestreet.com>
Cc: stable@vger.kernel.org
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-05-09 08:00:31 -07:00
Himanshu Bhavani
c688898313 arm64: dts: imx8mp-var-som: Fix LDO5 shutdown causing SD card timeout
Fix SD card timeout issue caused by LDO5 regulator getting disabled
after boot.

The kernel log shows LDO5 being disabled, which leads to a timeout
on USDHC2:
[   33.760561] LDO5: disabling
[   81.119861] mmc1: Timeout waiting for hardware interrupt.

To prevent this, set regulator-boot-on and regulator-always-on for
LDO5. Also add the vqmmc regulator to properly support 1.8V/3.3V
signaling for USDHC2 using a GPIO-controlled regulator.

Fixes: 6c2a1f4f71 ("arm64: dts: imx8mp-var-som-symphony: Add Variscite Symphony board and VAR-SOM-MX8MP SoM")
Signed-off-by: Himanshu Bhavani <himanshu.bhavani@siliconsignals.io>
Acked-by: Tarang Raval <tarang.raval@siliconsignals.io>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2025-05-09 22:48:35 +08:00
Gabriel Krisman Bertazi
92835cebab io_uring/sqpoll: Increase task_work submission batch size
Our QA team reported a 10%-23%, throughput reduction on an io_uring
sqpoll testcase doing IO to a null_blk, that I traced back to a
reduction of the device submission queue depth utilization. It turns out
that, after commit af5d68f889 ("io_uring/sqpoll: manage task_work
privately"), we capped the number of task_work entries that can be
completed from a single spin of sqpoll to only 8 entries, before the
sqpoll goes around to (potentially) sleep.  While this cap doesn't drive
the submission side directly, it impacts the completion behavior, which
affects the number of IO queued by fio per sqpoll cycle on the
submission side, and io_uring ends up seeing less ios per sqpoll cycle.
As a result, block layer plugging is less effective, and we see more
time spent inside the block layer in profilings charts, and increased
submission latency measured by fio.

There are other places that have increased overhead once sqpoll sleeps
more often, such as the sqpoll utilization calculation.  But, in this
microbenchmark, those were not representative enough in perf charts, and
their removal didn't yield measurable changes in throughput.  The major
overhead comes from the fact we plug less, and less often, when submitting
to the block layer.

My benchmark is:

fio --ioengine=io_uring --direct=1 --iodepth=128 --runtime=300 --bs=4k \
    --invalidate=1 --time_based  --ramp_time=10 --group_reporting=1 \
    --filename=/dev/nullb0 --name=RandomReads-direct-nullb-sqpoll-4k-1 \
    --rw=randread --numjobs=1 --sqthread_poll

In one machine, tested on top of Linux 6.15-rc1, we have the following
baseline:
  READ: bw=4994MiB/s (5236MB/s), 4994MiB/s-4994MiB/s (5236MB/s-5236MB/s), io=439GiB (471GB), run=90001-90001msec

With this patch:
  READ: bw=5762MiB/s (6042MB/s), 5762MiB/s-5762MiB/s (6042MB/s-6042MB/s), io=506GiB (544GB), run=90001-90001msec

which is a 15% improvement in measured bandwidth.  The average
submission latency is noticeably lowered too.  As measured by
fio:

Baseline:
   lat (usec): min=20, max=241, avg=99.81, stdev=3.38
Patched:
   lat (usec): min=26, max=226, avg=86.48, stdev=4.82

If we look at blktrace, we can also see the plugging behavior is
improved. In the baseline, we end up limited to plugging 8 requests in
the block layer regardless of the device queue depth size, while after
patching we can drive more io, and we manage to utilize the full device
queue.

In the baseline, after a stabilization phase, an ordinary submission
looks like:
  254,0    1    49942     0.016028795  5977  U   N [iou-sqp-5976] 7

After patching, I see consistently more requests per unplug.
  254,0    1     4996     0.001432872  3145  U   N [iou-sqp-3144] 32

Ideally, the cap size would at least be the deep enough to fill the
device queue, but we can't predict that behavior, or assume all IO goes
to a single device, and thus can't guess the ideal batch size.  We also
don't want to let the tw run unbounded, though I'm not sure it would
really be a problem.  Instead, let's just give it a more sensible value
that will allow for more efficient batching.  I've tested with different
cap values, and initially proposed to increase the cap to 1024.  Jens
argued it is too big of a bump and I observed that, with 32, I'm no
longer able to observe this bottleneck in any of my machines.

Fixes: af5d68f889 ("io_uring/sqpoll: manage task_work privately")
Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de>
Link: https://lore.kernel.org/r/20250508181203.3785544-1-krisman@suse.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-09 07:56:53 -06:00
David S. Miller
12f4ee312c Merge branch 'net_sched-gso_skb-flushing'
Cong Wang says:

====================
net_sched: Fix gso_skb flushing during qdisc change

This patchset contains a bug fix and its test cases, please check each
patch description for more details. To keep the bug fix minimum, I
intentionally limit the code changes to the cases reported here.

---
v2: added a missing qlen--
    fixed the new boolean parameter for two qdiscs

====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2025-05-09 12:34:38 +01:00
Cong Wang
16ce349b15 selftests/tc-testing: Add qdisc limit trimming tests
Added new test cases for FQ, FQ_CODEL, FQ_PIE, and HHF qdiscs to verify queue
trimming behavior when the qdisc limit is dynamically reduced.

Each test injects packets, reduces the qdisc limit, and checks that the new
limit is enforced. This is still best effort since timing qdisc backlog
is not easy.

Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-05-09 12:34:38 +01:00
Cong Wang
2d3cbfd6d5 net_sched: Flush gso_skb list too during ->change()
Previously, when reducing a qdisc's limit via the ->change() operation, only
the main skb queue was trimmed, potentially leaving packets in the gso_skb
list. This could result in NULL pointer dereference when we only check
sch->limit against sch->q.qlen.

This patch introduces a new helper, qdisc_dequeue_internal(), which ensures
both the gso_skb list and the main queue are properly flushed when trimming
excess packets. All relevant qdiscs (codel, fq, fq_codel, fq_pie, hhf, pie)
are updated to use this helper in their ->change() routines.

Fixes: 76e3cc126b ("codel: Controlled Delay AQM")
Fixes: 4b549a2ef4 ("fq_codel: Fair Queue Codel AQM")
Fixes: afe4fd0624 ("pkt_sched: fq: Fair Queue packet scheduler")
Fixes: ec97ecf1eb ("net: sched: add Flow Queue PIE packet scheduler")
Fixes: 10239edf86 ("net-qdisc-hhf: Heavy-Hitter Filter (HHF) qdisc")
Fixes: d4b36210c2 ("net: pkt_sched: PIE AQM scheme")
Reported-by: Will <willsroot@protonmail.com>
Reported-by: Savy <savy@syst3mfailure.io>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-05-09 12:34:38 +01:00
Jeremy Bongio
04679f3c27 fs: Remove redundant errseq_set call in mark_buffer_write_io_error.
mark_buffer_write_io_error sets sb->s_wb_err to -EIO twice.
Once in mapping_set_error and once in errseq_set.
Only mapping_set_error checks if bh->b_assoc_map->host is NULL.

Discovered during null pointer dereference during writeback
to a failing device:

[<ffffffff9a416dc8>] ? mark_buffer_write_io_error+0x98/0xc0
[<ffffffff9a416dbe>] ? mark_buffer_write_io_error+0x8e/0xc0
[<ffffffff9ad4bda0>] end_buffer_async_write+0x90/0xd0
[<ffffffff9ad4e3eb>] end_bio_bh_io_sync+0x2b/0x40
[<ffffffff9adbafe6>] blk_update_request+0x1b6/0x480
[<ffffffff9adbb3d8>] blk_mq_end_request+0x18/0x30
[<ffffffff9adbc6aa>] blk_mq_dispatch_rq_list+0x4da/0x8e0
[<ffffffff9adc0a68>] __blk_mq_sched_dispatch_requests+0x218/0x6a0
[<ffffffff9adc07fa>] blk_mq_sched_dispatch_requests+0x3a/0x80
[<ffffffff9adbbb98>] blk_mq_run_hw_queue+0x108/0x330
[<ffffffff9adbcf58>] blk_mq_flush_plug_list+0x178/0x5f0
[<ffffffff9adb6741>] __blk_flush_plug+0x41/0x120
[<ffffffff9adb6852>] blk_finish_plug+0x22/0x40
[<ffffffff9ad47cb0>] wb_writeback+0x150/0x280
[<ffffffff9ac5343f>] ? set_worker_desc+0x9f/0xc0
[<ffffffff9ad4676e>] wb_workfn+0x24e/0x4a0

Fixes: 485e9605c0 ("fs/buffer.c: record blockdev write errors in super_block that it backs")
Signed-off-by: Jeremy Bongio <jbongio@google.com>
Link: https://lore.kernel.org/20250507123010.1228243-1-jbongio@google.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-05-09 12:31:57 +02:00
Ahmad Fatoum
9379508f06 arm64: dts: imx8mp: use 800MHz NoC OPP for nominal drive mode
When running in nominal drive mode, the maximum allowed frequency for
the NoC is 800MHz, but the OPP table for the i.MX8MP interconnect device
listed the 1GHz operating point for the NoC, regardless of the active
mode.

The newly introduced imx8mp-nominal.dtsi header reconfigures the clock
controller to observe nominal drive mode limits, so have it modify the
maximum NoC OPP as well.

Fixes: 255fbd9eab ("arm64: dts: imx8mp: Add optional nominal drive mode DTSI")
Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2025-05-09 17:53:20 +08:00
Kai Vehmanen
6052f05254 ASoc: SOF: topology: connect DAI to a single DAI link
The partial matching of DAI widget to link names, can cause problems if
one of the widget names is a substring of another. E.g. with names
"Foo1" and Foo10", it's not possible to correctly link up "Foo1".

Modify the logic so that if multiple DAI links match the widget stream
name, prioritize a full match if one is found.

Fixes: fe88788779 ("ASoC: SOF: topology: Use partial match for connecting DAI link and DAI widget")
Link: https://github.com/thesofproject/linux/issues/5308
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com>
Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Link: https://patch.msgid.link/20250509085318.13936-1-peter.ujfalusi@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2025-05-09 18:20:15 +09:00
Peter Ujfalusi
4e7010826e ASoC: SOF: Intel: hda-bus: Use PIO mode on ACE2+ platforms
Keep using the PIO mode for commands on ACE2+ platforms, similarly how
the legacy stack is configured.

Fixes: 05cf17f1bf ("ASoC: SOF: Intel: hda-bus: Use PIO mode for Lunar Lake")
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20250509081308.13784-1-peter.ujfalusi@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2025-05-09 18:20:14 +09:00
Peter Ujfalusi
98db16f314 ASoC: SOF: ipc4-pcm: Delay reporting is only supported for playback direction
The firmware does not provide any information for capture streams via the
shared pipeline registers.

To avoid reporting invalid delay value for capture streams to user space
we need to disable it.

Fixes: af74dbd0db ("ASoC: SOF: ipc4-pcm: allocate time info for pcm delay feature")
Cc: stable@vger.kernel.org
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Reviewed-by: Liam Girdwood <liam.r.girdwood@intel.com>
Link: https://patch.msgid.link/20250509085951.15696-1-peter.ujfalusi@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2025-05-09 18:20:13 +09:00
Peter Ujfalusi
4d14b1069e ASoC: SOF: ipc4-control: Use SOF_CTRL_CMD_BINARY as numid for bytes_ext
The header.numid is set to scontrol->comp_id in bytes_ext_get and it is
ignored during bytes_ext_put.
The use of comp_id is not quite great as it is kernel internal
identification number.

Set the header.numid to SOF_CTRL_CMD_BINARY during get and validate the
numid during put to provide consistent and compatible identification
number as IPC3.

For IPC4 existing tooling also ignored the numid but with the use of
SOF_CTRL_CMD_BINARY the different handling of the blobs can be dropped,
providing better user experience.

Reported-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>
Closes: https://github.com/thesofproject/linux/issues/5282
Fixes: a062c8899f ("ASoC: SOF: ipc4-control: Add support for bytes control get and put")
Cc: stable@vger.kernel.org
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Reviewed-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>
Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Reviewed-by: Liam Girdwood <liam.r.girdwood@intel.com>
Link: https://patch.msgid.link/20250509085633.14930-1-peter.ujfalusi@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2025-05-09 18:20:13 +09:00
Imre Deak
732b87a409 drm/i915/dp: Fix determining SST/MST mode during MTP TU state computation
Determining the SST/MST mode during state computation must be done based
on the output type stored in the CRTC state, which in turn is set once
based on the modeset connector's SST vs. MST type and will not change as
long as the connector is using the CRTC. OTOH the MST mode indicated by
the given connector's intel_dp::is_mst flag can change independently of
the above output type, based on what sink is at any moment plugged to
the connector.

Fix the state computation accordingly.

Cc: Jani Nikula <jani.nikula@intel.com>
Fixes: f6971d7427 ("drm/i915/mst: adapt intel_dp_mtp_tu_compute_config() for 128b/132b SST")
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4607
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://lore.kernel.org/r/20250507151953.251846-1-imre.deak@intel.com
(cherry picked from commit 0f45696ddb)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-05-09 10:15:52 +03:00
Tom Lendacky
da8bf5daa5 memblock: Accept allocated memory before use in memblock_double_array()
When increasing the array size in memblock_double_array() and the slab
is not yet available, a call to memblock_find_in_range() is used to
reserve/allocate memory. However, the range returned may not have been
accepted, which can result in a crash when booting an SNP guest:

  RIP: 0010:memcpy_orig+0x68/0x130
  Code: ...
  RSP: 0000:ffffffff9cc03ce8 EFLAGS: 00010006
  RAX: ff11001ff83e5000 RBX: 0000000000000000 RCX: fffffffffffff000
  RDX: 0000000000000bc0 RSI: ffffffff9dba8860 RDI: ff11001ff83e5c00
  RBP: 0000000000002000 R08: 0000000000000000 R09: 0000000000002000
  R10: 000000207fffe000 R11: 0000040000000000 R12: ffffffff9d06ef78
  R13: ff11001ff83e5000 R14: ffffffff9dba7c60 R15: 0000000000000c00
  memblock_double_array+0xff/0x310
  memblock_add_range+0x1fb/0x2f0
  memblock_reserve+0x4f/0xa0
  memblock_alloc_range_nid+0xac/0x130
  memblock_alloc_internal+0x53/0xc0
  memblock_alloc_try_nid+0x3d/0xa0
  swiotlb_init_remap+0x149/0x2f0
  mem_init+0xb/0xb0
  mm_core_init+0x8f/0x350
  start_kernel+0x17e/0x5d0
  x86_64_start_reservations+0x14/0x30
  x86_64_start_kernel+0x92/0xa0
  secondary_startup_64_no_verify+0x194/0x19b

Mitigate this by calling accept_memory() on the memory range returned
before the slab is available.

Prior to v6.12, the accept_memory() interface used a 'start' and 'end'
parameter instead of 'start' and 'size', therefore the accept_memory()
call must be adjusted to specify 'start + size' for 'end' when applying
to kernels prior to v6.12.

Cc: stable@vger.kernel.org # see patch description, needs adjustments for <= 6.11
Fixes: dcdfdd40fa ("mm: Add support for unaccepted memory")
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Link: https://lore.kernel.org/r/da1ac73bf4ded761e21b4e4bb5178382a580cd73.1746725050.git.thomas.lendacky@amd.com
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
2025-05-09 08:53:12 +03:00
Nathan Chancellor
6b3ab7f2cb net: qede: Initialize qede_ll_ops with designated initializer
After a recent change [1] in clang's randstruct implementation to
randomize structures that only contain function pointers, there is an
error because qede_ll_ops get randomized but does not use a designated
initializer for the first member:

  drivers/net/ethernet/qlogic/qede/qede_main.c:206:2: error: a randomized struct can only be initialized with a designated initializer
    206 |         {
        |         ^

Explicitly initialize the common member using a designated initializer
to fix the build.

Cc: stable@vger.kernel.org
Fixes: 035f7f87b7 ("randstruct: Enable Clang support")
Link: 04364fb888 [1]
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Link: https://patch.msgid.link/20250507-qede-fix-clang-randstruct-v1-1-5ccc15626fba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-08 19:20:57 -07:00
Jakub Kicinski
ea9a83d7f3 Merge tag 'for-net-2025-05-08' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth
Luiz Augusto von Dentz says:

====================
bluetooth pull request for net:

 - MGMT: Fix MGMT_OP_ADD_DEVICE invalid device flags
 - hci_event: Fix not using key encryption size when its known

* tag 'for-net-2025-05-08' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
  Bluetooth: hci_event: Fix not using key encryption size when its known
  Bluetooth: MGMT: Fix MGMT_OP_ADD_DEVICE invalid device flags
====================

Link: https://patch.msgid.link/20250508150927.385675-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-08 18:38:00 -07:00
Dave Airlie
80e12f3e2a Merge tag 'amd-drm-fixes-6.15-2025-05-08' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-6.15-2025-05-08:

amdgpu:
- DC FP fixes
- Freesync fix
- DMUB AUX fixes
- VCN fix
- Hibernation fixes
- HDP fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20250508194102.3242372-1-alexander.deucher@amd.com
2025-05-09 11:11:25 +10:00
Dave Airlie
20a4c8142e Merge tag 'drm-misc-fixes-2025-05-08' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
Short summary of fixes pull:

drm:
- Fix overflow when generating wedged event

ivpu:
- Increate timeouts
- Fix deadlock in cmdq ioctl
- Unlock mutices in correct order

panel:
- simple: Fix timings for AUO G101EVN010

ttm:
- Fix documentation
- Remove struct ttm_backup

v3d:
- Avoid memory leak in job handling

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://lore.kernel.org/r/20250508104939.GA76697@2a02-2454-fd5e-fd00-c110-cbf2-6528-c5be.dyn6.pyur.net
2025-05-09 08:52:24 +10:00
Linus Torvalds
9c69f88849 Merge tag 'bcachefs-2025-05-08' of git://evilpiepirate.org/bcachefs
Pull bcachefs fixes from Kent Overstreet:

 - Some fixes to help with filesystem analysis: ensure superblock
   error count gets written if we go ERO, don't discard the journal
   aggressively (so it's available for list_journal -a)

 - Fix lost wakeup on arm causing us to get stuck when reading btree
   nodes

 - Fix fsck failing to exit on ctrl-c

 - An additional fix for filesystems with misaligned bucket sizes: we
   now ensure that allocations are properly aligned

 - Setting background target but not promote target will now leave that
   data cached on the foreground target, as it used to

 - Revert a change to when we allocate the VFS superblock, this was done
   for implementing blk_holder_ops but ended up not being needed, and
   allocating a superblock and not setting SB_BORN while we do recovery
   caused sync() calls and other things to hang

 - Assorted fixes for harmless error messages that caused concern to
   users

* tag 'bcachefs-2025-05-08' of git://evilpiepirate.org/bcachefs:
  bcachefs: Don't aggressively discard the journal
  bcachefs: Ensure superblock gets written when we go ERO
  bcachefs: Filter out harmless EROFS error messages
  bcachefs: journal_shutdown is EROFS, not EIO
  bcachefs: Call bch2_fs_start before getting vfs superblock
  bcachefs: fix hung task timeout in journal read
  bcachefs: Add missing barriers before wake_up_bit()
  bcachefs: Ensure proper write alignment
  bcachefs: Improve want_cached_ptr()
  bcachefs: thread_with_stdio: fix spinning instead of exiting
2025-05-08 14:28:49 -07:00
André Draszik
3e14c7207a clk: s2mps11: initialise clk_hw_onecell_data::num before accessing ::hws[] in probe()
With UBSAN enabled, we're getting the following trace:

    UBSAN: array-index-out-of-bounds in .../drivers/clk/clk-s2mps11.c:186:3
    index 0 is out of range for type 'struct clk_hw *[] __counted_by(num)' (aka 'struct clk_hw *[]')

This is because commit f316cdff8d ("clk: Annotate struct
clk_hw_onecell_data with __counted_by") annotated the hws member of
that struct with __counted_by, which informs the bounds sanitizer about
the number of elements in hws, so that it can warn when hws is accessed
out of bounds.

As noted in that change, the __counted_by member must be initialised
with the number of elements before the first array access happens,
otherwise there will be a warning from each access prior to the
initialisation because the number of elements is zero. This occurs in
s2mps11_clk_probe() due to ::num being assigned after ::hws access.

Move the assignment to satisfy the requirement of assign-before-access.

Cc: stable@vger.kernel.org
Fixes: f316cdff8d ("clk: Annotate struct clk_hw_onecell_data with __counted_by")
Signed-off-by: André Draszik <andre.draszik@linaro.org>
Link: https://lore.kernel.org/r/20250326-s2mps11-ubsan-v1-1-fcc6fce5c8a9@linaro.org
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2025-05-08 14:01:44 -07:00
Shuicheng Lin
564467e9d0 drm/xe: Add config control for svm flush work
Without CONFIG_DRM_XE_GPUSVM set, GPU SVM is not initialized thus below
warning pops. Refine the flush work code to be controlled by the config
to avoid below warning:
"
[  453.132028] ------------[ cut here ]------------
[  453.132527] WARNING: CPU: 9 PID: 4491 at kernel/workqueue.c:4205 __flush_work+0x379/0x3a0
[  453.133355] Modules linked in: xe drm_ttm_helper ttm gpu_sched drm_buddy drm_suballoc_helper drm_gpuvm drm_exec
[  453.134352] CPU: 9 UID: 0 PID: 4491 Comm: xe_exec_mix_mod Tainted: G     U  W           6.15.0-rc3+ #7 PREEMPT(full)
[  453.135405] Tainted: [U]=USER, [W]=WARN
...
[  453.136921] RIP: 0010:__flush_work+0x379/0x3a0
[  453.137417] Code: 8b 45 00 48 8b 55 08 89 c7 48 c1 e8 04 83 e7 08 83 e0 0f 83 cf 02 89 c6 48 0f ba 6d 00 03 e9 d5 fe ff ff 0f 0b e9 db fd ff ff <0f> 0b 45 31 e4 e9 d1 fd ff ff 0f 0b e9 03 ff ff ff 0f 0b e9 d6 fe
[  453.139250] RSP: 0018:ffffc90000c67b18 EFLAGS: 00010246
[  453.139782] RAX: 0000000000000000 RBX: ffff888108a24000 RCX: 0000000000002000
[  453.140521] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8881016d61c8
[  453.141253] RBP: ffff8881016d61c8 R08: 0000000000000000 R09: 0000000000000000
[  453.141985] R10: 0000000000000000 R11: 0000000008a24000 R12: 0000000000000001
[  453.142709] R13: 0000000000000002 R14: 0000000000000000 R15: ffff888107db8c00
[  453.143450] FS:  00007f44853d4c80(0000) GS:ffff8882f469b000(0000) knlGS:0000000000000000
[  453.144276] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  453.144853] CR2: 00007f4487629228 CR3: 00000001016aa000 CR4: 00000000000406f0
[  453.145594] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  453.146320] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  453.147061] Call Trace:
[  453.147336]  <TASK>
[  453.147579]  ? tick_nohz_tick_stopped+0xd/0x30
[  453.148067]  ? xas_load+0x9/0xb0
[  453.148435]  ? xa_load+0x6f/0xb0
[  453.148781]  __xe_vm_bind_ioctl+0xbd5/0x1500 [xe]
[  453.149338]  ? dev_printk_emit+0x48/0x70
[  453.149762]  ? _dev_printk+0x57/0x80
[  453.150148]  ? drm_ioctl+0x17c/0x440
[  453.150544]  ? __drm_dev_vprintk+0x36/0x90
[  453.150983]  ? __pfx_xe_vm_bind_ioctl+0x10/0x10 [xe]
[  453.151575]  ? drm_ioctl_kernel+0x9f/0xf0
[  453.151998]  ? __pfx_xe_vm_bind_ioctl+0x10/0x10 [xe]
[  453.152560]  drm_ioctl_kernel+0x9f/0xf0
[  453.152968]  drm_ioctl+0x20f/0x440
[  453.153332]  ? __pfx_xe_vm_bind_ioctl+0x10/0x10 [xe]
[  453.153893]  ? ioctl_has_perm.constprop.0.isra.0+0xae/0x100
[  453.154489]  ? memory_bm_test_bit+0x5/0x60
[  453.154935]  xe_drm_ioctl+0x47/0x70 [xe]
[  453.155419]  __x64_sys_ioctl+0x8d/0xc0
[  453.155824]  do_syscall_64+0x47/0x110
[  453.156228]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
"

v2 (Matt):
    refine commit message to have more details
    add Fixes tag
    move the code to xe_svm.h which already have the config
    remove a blank line per codestyle suggestion

Fixes: 63f6e480d1 ("drm/xe: Add SVM garbage collector")
Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250502170052.1787973-1-shuicheng.lin@intel.com
(cherry picked from commit 9d80698bcd)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-08 13:57:52 -07:00
Shuicheng Lin
9d271a4f5b drm/xe: Release force wake first then runtime power
xe_force_wake_get() is dependent on xe_pm_runtime_get(), so for
the release path, xe_force_wake_put() should be called first then
xe_pm_runtime_put().
Combine the error path and normal path together with goto.

Fixes: 85d547608e ("drm/xe/xe_gt_debugfs: Update handling of xe_force_wake_get return")
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://lore.kernel.org/r/20250507022302.2187527-1-shuicheng.lin@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit 432cd94efd)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-08 12:44:51 -07:00
Daniele Ceraolo Spurio
03552d8ac0 drm/xe/gsc: do not flush the GSC worker from the reset path
The workqueue used for the reset worker is marked as WQ_MEM_RECLAIM,
while the GSC one isn't (and can't be as we need to do memory
allocations in the gsc worker). Therefore, we can't flush the latter
from the former.

The reason why we had such a flush was to avoid interrupting either
the GSC FW load or in progress GSC proxy operations. GSC proxy
operations fall into 2 categories:

1) GSC proxy init: this only happens once immediately after GSC FW load
   and does not support being interrupted. The only way to recover from
   an interruption of the proxy init is to do an FLR and re-load the GSC.

2) GSC proxy request: this can happen in response to a request that
   the driver sends to the GSC. If this is interrupted, the GSC FW will
   timeout and the driver request will be failed, but overall the GSC
   will keep working fine.

Flushing the work allowed us to avoid interruption in both cases (unless
the hang came from the GSC engine itself, in which case we're toast
anyway). However, a failure on a proxy request is tolerable if we're in
a scenario where we're triggering a GT reset (i.e., something is already
gone pretty wrong), so what we really need to avoid is interrupting
the init flow, which we can do by polling on the register that reports
when the proxy init is complete (as that ensure us that all the load and
init operations have been completed).

Note that during suspend we still want to do a flush of the worker to
make sure it completes any operations involving the HW before the power
is cut.

v2: fix spelling in commit msg, rename waiter function (Julia)

Fixes: dd0e89e5ed ("drm/xe/gsc: GSC FW load")
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4830
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com>
Link: https://lore.kernel.org/r/20250502155104.2201469-1-daniele.ceraolospurio@intel.com
(cherry picked from commit 12370bfcc4)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-08 12:44:41 -07:00
Tejas Upadhyay
51c0ee84e4 drm/xe/tests/mocs: Hold XE_FORCEWAKE_ALL for LNCF regs
LNCF registers report wrong values when XE_FORCEWAKE_GT
only is held. Holding XE_FORCEWAKE_ALL ensures correct
operations on LNCF regs.

V2(Himal):
 - Use xe_force_wake_ref_has_domain

Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1999
Fixes: a6a4ea6d7d ("drm/xe: Add mocs kunit")
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250428082357.1730068-1-tejas.upadhyay@intel.com
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
(cherry picked from commit 70a2585e58)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-08 12:44:34 -07:00
Matthew Brost
391008f34e drm/xe: Add page queue multiplier
For an unknown reason the math to determine the PF queue size does is
not correct - compute UMD applications are overflowing the PF queue
which is fatal. A multippier of 8 fixes the problem.

Fixes: 3338e4f90c ("drm/xe: Use topology to determine page fault queue size")
Cc: stable@vger.kernel.org
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Jagmeet Randhawa <jagmeet.randhawa@intel.com>
Link: https://lore.kernel.org/r/20250408155915.78770-1-matthew.brost@intel.com
(cherry picked from commit 29582e0ea7)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-08 12:44:20 -07:00
Linus Torvalds
acaa3e726f Merge tag 'vfio-v6.15-rc6' of https://github.com/awilliam/linux-vfio
Pull vfio fix from Alex Williamson:

 - Fix an issue in vfio-pci huge_fault handling by aligning faults to
   the order, resulting in deterministic use of huge pages.  This
   avoids a race where simultaneous aligned and unaligned faults to
   the same PMD can result in a VM_FAULT_OOM and subsequent VM crash.
   (Alex Williamson)

* tag 'vfio-v6.15-rc6' of https://github.com/awilliam/linux-vfio:
  vfio/pci: Align huge faults to order
2025-05-08 12:09:22 -07:00
Jeremy Linton
adfab6b392 ACPI: PPTT: Fix processor subtable walk
The original PPTT code had a bug where the processor subtable length
was not correctly validated when encountering a truncated
acpi_pptt_processor node.

Commit 7ab4f0e37a ("ACPI PPTT: Fix coding mistakes in a couple of
sizeof() calls") attempted to fix this by validating the size is as
large as the acpi_pptt_processor node structure. This introduced a
regression where the last processor node in the PPTT table is ignored
if it doesn't contain any private resources. That results errors like:

  ACPI PPTT: PPTT table found, but unable to locate core XX (XX)
  ACPI: SPE must be homogeneous

Furthermore, it fails in a common case where the node length isn't
equal to the acpi_pptt_processor structure size, leaving the original
bug in a modified form.

Correct the regression by adjusting the loop termination conditions as
suggested by the bug reporters. An additional check performed after
the subtable node type is detected, validates the acpi_pptt_processor
node is fully contained in the PPTT table. Repeating the check in
acpi_pptt_leaf_node() is largely redundant as the node is already
known to be fully contained in the table.

The case where a final truncated node's parent property is accepted,
but the node itself is rejected should not be considered a bug.

Fixes: 7ab4f0e37a ("ACPI PPTT: Fix coding mistakes in a couple of sizeof() calls")
Reported-by: Maximilian Heyne <mheyne@amazon.de>
Closes: https://lore.kernel.org/linux-acpi/20250506-draco-taped-15f475cd@mheyne-amazon/
Reported-by: Yicong Yang <yangyicong@hisilicon.com>
Closes: https://lore.kernel.org/linux-acpi/20250507035124.28071-1-yangyicong@huawei.com/
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Tested-by: Yicong Yang <yangyicong@hisilicon.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Maximilian Heyne <mheyne@amazon.de>
Cc: All applicable <stable@vger.kernel.org> # 7ab4f0e37a: ACPI PPTT: Fix coding mistakes ...
Link: https://patch.msgid.link/20250508023025.1301030-1-jeremy.linton@arm.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-05-08 20:23:16 +02:00
Palmer Dabbelt
01534f3e0d Merge tag 'riscv-fixes-6.15-rc6' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/alexghiti/linux into fixes
riscv fixes for 6.15-rc6

- A fix to handle compressed halfword load/store instructions misaligned accesses
- A fix to allow user memory access while handling a misaligned access
- 2 fixes to return an error if the pointer masking extension is not implemented on the platform but userspace still tries to access it, which caused oops on some early platforms
- A fix to prevent the stripping of .rela.dyn so that a vmlinux loaded by kexec can successfully boot

* tag 'riscv-fixes-6.15-rc6' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/alexghiti/linux:
  riscv: Disallow PR_GET_TAGGED_ADDR_CTRL without Supm
  scripts: Do not strip .rela.dyn section
  riscv: Fix kernel crash due to PR_SET_TAGGED_ADDR_CTRL
  riscv: misaligned: use get_user() instead of __get_user()
  riscv: misaligned: enable IRQs while handling misaligned accesses
  riscv: misaligned: factorize trap handling
  riscv: misaligned: Add handling for ZCB instructions
2025-05-08 09:40:21 -07:00
Alex Deucher
5a11a27677 drm/amdgpu/hdp7: use memcfg register to post the write for HDP flush
Reading back the remapped HDP flush register seems to cause
problems on some platforms. All we need is a read, so read back
the memcfg register.

Fixes: 689275140c ("drm/amdgpu/hdp7.0: do a posting read when flushing HDP")
Reported-by: Alexey Klimov <alexey.klimov@linaro.org>
Link: https://lists.freedesktop.org/archives/amd-gfx/2025-April/123150.html
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4119
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3908
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit dbc064adfc)
Cc: stable@vger.kernel.org
2025-05-08 11:48:12 -04:00
Alex Deucher
ca28e80abe drm/amdgpu/hdp6: use memcfg register to post the write for HDP flush
Reading back the remapped HDP flush register seems to cause
problems on some platforms. All we need is a read, so read back
the memcfg register.

Fixes: abe1cbaec6 ("drm/amdgpu/hdp6.0: do a posting read when flushing HDP")
Reported-by: Alexey Klimov <alexey.klimov@linaro.org>
Link: https://lists.freedesktop.org/archives/amd-gfx/2025-April/123150.html
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4119
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3908
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 84141ff615)
Cc: stable@vger.kernel.org
2025-05-08 11:47:54 -04:00
Alex Deucher
dbc988c689 drm/amdgpu/hdp5.2: use memcfg register to post the write for HDP flush
Reading back the remapped HDP flush register seems to cause
problems on some platforms. All we need is a read, so read back
the memcfg register.

Fixes: f756dbac1c ("drm/amdgpu/hdp5.2: do a posting read when flushing HDP")
Reported-by: Alexey Klimov <alexey.klimov@linaro.org>
Link: https://lists.freedesktop.org/archives/amd-gfx/2025-April/123150.html
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4119
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3908
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 4a89b7698e)
Cc: stable@vger.kernel.org
2025-05-08 11:47:23 -04:00
Alex Deucher
0e33e0f339 drm/amdgpu/hdp5: use memcfg register to post the write for HDP flush
Reading back the remapped HDP flush register seems to cause
problems on some platforms. All we need is a read, so read back
the memcfg register.

Fixes: cf424020e0 ("drm/amdgpu/hdp5.0: do a posting read when flushing HDP")
Reported-by: Alexey Klimov <alexey.klimov@linaro.org>
Link: https://lists.freedesktop.org/archives/amd-gfx/2025-April/123150.html
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4119
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3908
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit a5cb344033)
Cc: stable@vger.kernel.org
2025-05-08 11:46:57 -04:00
Linus Torvalds
2c89c1b655 Merge tag 'net-6.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
 "Including fixes from CAN, WiFi and netfilter.

  We have still a comple of regressions open due to the recent
  drivers locking refactor. The patches are in-flight, but not
  ready yet.

  Current release - regressions:

   - core: lock netdevices during dev_shutdown

   - sch_htb: make htb_deactivate() idempotent

   - eth: virtio-net: don't re-enable refill work too early

  Current release - new code bugs:

   - eth: icssg-prueth: fix kernel panic during concurrent Tx queue
     access

  Previous releases - regressions:

   - gre: fix again IPv6 link-local address generation.

   - eth: b53: fix learning on VLAN unaware bridges

  Previous releases - always broken:

   - wifi: fix out-of-bounds access during multi-link element
     defragmentation

   - can:
       - initialize spin lock on device probe
       - fix order of unregistration calls

   - openvswitch: fix unsafe attribute parsing in output_userspace()

   - eth:
       - virtio-net: fix total qstat values
       - mtk_eth_soc: reset all TX queues on DMA free
       - fbnic: firmware IPC mailbox fixes"

* tag 'net-6.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (55 commits)
  virtio-net: fix total qstat values
  net: export a helper for adding up queue stats
  fbnic: Do not allow mailbox to toggle to ready outside fbnic_mbx_poll_tx_ready
  fbnic: Pull fbnic_fw_xmit_cap_msg use out of interrupt context
  fbnic: Improve responsiveness of fbnic_mbx_poll_tx_ready
  fbnic: Cleanup handling of completions
  fbnic: Actually flush_tx instead of stalling out
  fbnic: Add additional handling of IRQs
  fbnic: Gate AXI read/write enabling on FW mailbox
  fbnic: Fix initialization of mailbox descriptor rings
  net: dsa: b53: do not set learning and unicast/multicast on up
  net: dsa: b53: fix learning on VLAN unaware bridges
  net: dsa: b53: fix toggling vlan_filtering
  net: dsa: b53: do not program vlans when vlan filtering is off
  net: dsa: b53: do not allow to configure VLAN 0
  net: dsa: b53: always rejoin default untagged VLAN on bridge leave
  net: dsa: b53: fix VLAN ID for untagged vlan on bridge leave
  net: dsa: b53: fix flushing old pvid VLAN on pvid change
  net: dsa: b53: fix clearing PVID of a port
  net: dsa: b53: keep CPU port always tagged again
  ...
2025-05-08 08:33:56 -07:00
Linus Torvalds
26a9a47ccd Merge tag 's390-6.15-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Heiko Carstens:

 - Fix potential use-after-free bug and missing error handling in PCI
   code

 - Fix dcssblk build error

 - Fix last breaking event handling in case of stack corruption to allow
   for better error reporting

 - Update defconfigs

* tag 's390-6.15-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/pci: Fix duplicate pci_dev_put() in disable_slot() when PF has child VFs
  s390/pci: Fix missing check for zpci_create_device() error return
  s390: Update defconfigs
  s390/dcssblk: Fix build error with CONFIG_DAX=m and CONFIG_DCSSBLK=y
  s390/entry: Fix last breaking event handling in case of stack corruption
  s390/configs: Enable options required for TC flow offload
  s390/configs: Enable VDPA on Nvidia ConnectX-6 network card
2025-05-08 08:29:13 -07:00
Linus Torvalds
80ae5fb229 Merge tag 'v6.15-rc5-ksmbd-server-fixes' of git://git.samba.org/ksmbd
Pull smb server fixes from Steve French:

 - Fix UAF closing file table (e.g. in tree disconnect)

 - Fix potential out of bounds write

 - Fix potential memory leak parsing lease state in open

 - Fix oops in rename with empty target

* tag 'v6.15-rc5-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
  ksmbd: Fix UAF in __close_file_table_ids
  ksmbd: prevent out-of-bounds stream writes by validating *pos
  ksmbd: fix memory leak in parse_lease_state()
  ksmbd: prevent rename with empty string
2025-05-08 08:22:35 -07:00
Jens Axboe
dd90905d5a Merge tag 'nvme-6.15-2025-05-08' of git://git.infradead.org/nvme into block-6.15
Pull NVMe fix from Christoph:

"nvme fixes for linux 6.15

 - unblock ctrl state transition for firmware update (Daniel Wagner)"

* tag 'nvme-6.15-2025-05-08' of git://git.infradead.org/nvme:
  nvme: unblock ctrl state transition for firmware update
2025-05-08 09:08:23 -06:00
Aaron Lu
c0d0a9ff6d block: remove test of incorrect io priority level
Ever since commit eca2040972b4("scsi: block: ioprio: Clean up interface
definition"), the macro IOPRIO_PRIO_LEVEL() will mask the level value to
something between 0 and 7 so necessarily, level will always be lower than
IOPRIO_NR_LEVELS(8).

Remove this obsolete check.

Reported-by: Kexin Wei <ys.weikexin@h3c.com>
Cc: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Aaron Lu <ziqianlu@bytedance.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20250508083018.GA769554@bytedance
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-08 09:04:12 -06:00
James Morse
efe676a1a7 arm64: proton-pack: Add new CPUs 'k' values for branch mitigation
Update the list of 'k' values for the branch mitigation from arm's
website.

Add the values for Cortex-X1C. The MIDR_EL1 value can be found here:
https://developer.arm.com/documentation/101968/0002/Register-descriptions/AArch>

Link: https://developer.arm.com/documentation/110280/2-0/?lang=en
Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
2025-05-08 15:29:28 +01:00
James Morse
f300769ead arm64: bpf: Only mitigate cBPF programs loaded by unprivileged users
Support for eBPF programs loaded by unprivileged users is typically
disabled. This means only cBPF programs need to be mitigated for BHB.

In addition, only mitigate cBPF programs that were loaded by an
unprivileged user. Privileged users can also load the same program
via eBPF, making the mitigation pointless.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
2025-05-08 15:29:03 +01:00
James Morse
0dfefc2ea2 arm64: bpf: Add BHB mitigation to the epilogue for cBPF programs
A malicious BPF program may manipulate the branch history to influence
what the hardware speculates will happen next.

On exit from a BPF program, emit the BHB mititgation sequence.

This is only applied for 'classic' cBPF programs that are loaded by
seccomp.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
2025-05-08 15:28:35 +01:00
James Morse
a1152be30a arm64: proton-pack: Expose whether the branchy loop k value
Add a helper to expose the k value of the branchy loop. This is needed
by the BPF JIT to generate the mitigation sequence in BPF programs.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
2025-05-08 15:28:35 +01:00
James Morse
e7956c92f3 arm64: proton-pack: Expose whether the platform is mitigated by firmware
is_spectre_bhb_fw_affected() allows the caller to determine if the CPU
is known to need a firmware mitigation. CPUs are either on the list
of CPUs we know about, or firmware has been queried and reported that
the platform is affected - and mitigated by firmware.

This helper is not useful to determine if the platform is mitigated
by firmware. A CPU could be on the know list, but the firmware may
not be implemented. Its affected but not mitigated.

spectre_bhb_enable_mitigation() handles this distinction by checking
the firmware state before enabling the mitigation.

Add a helper to expose this state. This will be used by the BPF JIT
to determine if calling firmware for a mitigation is necessary and
supported.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
2025-05-08 15:28:35 +01:00
James Morse
63de8abd97 arm64: insn: Add support for encoding DSB
To generate code in the eBPF epilogue that uses the DSB instruction,
insn.c needs a heler to encode the type and domain.

Re-use the crm encoding logic from the DMB instruction.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
2025-05-08 15:28:29 +01:00
Luiz Augusto von Dentz
c82b6357a5 Bluetooth: hci_event: Fix not using key encryption size when its known
This fixes the regression introduced by 50c1241e6a8a ("Bluetooth: l2cap:
Check encryption key size on incoming connection") introduced a check for
l2cap_check_enc_key_size which checks for hcon->enc_key_size which may
not be initialized if HCI_OP_READ_ENC_KEY_SIZE is still pending.

If the key encryption size is known, due previously reading it using
HCI_OP_READ_ENC_KEY_SIZE, then store it as part of link_key/smp_ltk
structures so the next time the encryption is changed their values are
used as conn->enc_key_size thus avoiding the racing against
HCI_OP_READ_ENC_KEY_SIZE.

Now that the enc_size is stored as part of key the information the code
then attempts to check that there is no downgrade of security if
HCI_OP_READ_ENC_KEY_SIZE returns a value smaller than what has been
previously stored.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=220061
Link: https://bugzilla.kernel.org/show_bug.cgi?id=220063
Fixes: 522e9ed157 ("Bluetooth: l2cap: Check encryption key size on incoming connection")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2025-05-08 10:24:15 -04:00
Sean Christopherson
e3417ab75a KVM: SVM: Set/clear SRSO's BP_SPEC_REDUCE on 0 <=> 1 VM count transitions
Set the magic BP_SPEC_REDUCE bit to mitigate SRSO when running VMs if and
only if KVM has at least one active VM.  Leaving the bit set at all times
unfortunately degrades performance by a wee bit more than expected.

Use a dedicated spinlock and counter instead of hooking virtualization
enablement, as changing the behavior of kvm.enable_virt_at_load based on
SRSO_BP_SPEC_REDUCE is painful, and has its own drawbacks, e.g. could
result in performance issues for flows that are sensitive to VM creation
latency.

Defer setting BP_SPEC_REDUCE until VMRUN is imminent to avoid impacting
performance on CPUs that aren't running VMs, e.g. if a setup is using
housekeeping CPUs.  Setting BP_SPEC_REDUCE in task context, i.e. without
blasting IPIs to all CPUs, also helps avoid serializing 1<=>N transitions
without incurring a gross amount of complexity (see the Link for details
on how ugly coordinating via IPIs gets).

Link: https://lore.kernel.org/all/aBOnzNCngyS_pQIW@google.com
Fixes: 8442df2b49 ("x86/bugs: KVM: Add support for SRSO_MSR_FIX")
Reported-by: Michael Larabel <Michael@michaellarabel.com>
Closes: https://www.phoronix.com/review/linux-615-amd-regression
Cc: Borislav Petkov <bp@alien8.de>
Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20250505180300.973137-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-05-08 07:17:10 -07:00
Nícolas F. R. A. Prado
b7e3ec4e17 ASoC: mediatek: mt8188-mt6359: Depend on MT6359_ACCDET set or disabled
Commit 0116a7d84b ("ASoC: mediatek: mt6359: Add stub for
mt6359_accdet_enable_jack_detect") added a stub for
mt6359_accdet_enable_jack_detect() in order to allow the mt8188-mt6359
driver to be enabled without requiring the mt6359-accdet to also be
enabled, since it is not always needed.

However, in the case that CONFIG_SND_SOC_MT8188_MT6359=y and
CONFIG_SND_SOC_MT6359_ACCDET=m, a link error will happen, which commit
b19fa45715 ("ASoC: mediatek: mt8188-mt6359: select
CONFIG_SND_SOC_MT6359_ACCDET") solved by selecting
CONFIG_SND_SOC_MT6359_ACCDET.

In order to not require CONFIG_SND_SOC_MT6359_ACCDET as originally
intended, but also prevent the link error, depend on ACCDET being
enabled or disabled (which will force MT8188_MT6359=m if
MT6359_ACCDET=m).

Fixes: f35d834d67 ("ASoC: mediatek: mt8188-mt6359: Add accdet headset jack detect support")
Fixes: b19fa45715 ("ASoC: mediatek: mt8188-mt6359: select CONFIG_SND_SOC_MT6359_ACCDET")
Suggested-by: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patch.msgid.link/20250507-mt8188-mt6359-accdet-depend-v1-1-aad70ce62964@collabora.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2025-05-08 23:12:29 +09:00
Samuel Holland
7f1c3de137 riscv: Disallow PR_GET_TAGGED_ADDR_CTRL without Supm
When the prctl() interface for pointer masking was added, it did not
check that the pointer masking ISA extension was supported, only the
individual submodes. Userspace could still attempt to disable pointer
masking and query the pointer masking state. commit 81de1afb2dd1
("riscv: Fix kernel crash due to PR_SET_TAGGED_ADDR_CTRL") disallowed
the former, as the senvcfg write could crash on older systems.
PR_GET_TAGGED_ADDR_CTRL state does not crash, because it reads only
kernel-internal state and not senvcfg, but it should still be disallowed
for consistency.

Fixes: 09d6775f50 ("riscv: Add support for userspace pointer masking")
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Nam Cao <namcao@linutronix.de>
Link: https://lore.kernel.org/r/20250507145230.2272871-1-samuel.holland@sifive.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-05-08 12:01:01 +00:00
Alexandre Ghiti
e9d86b8e17 scripts: Do not strip .rela.dyn section
The .rela.dyn section contains runtime relocations and is only emitted
for a relocatable kernel.

riscv uses this section to relocate the kernel at runtime but that section
is stripped from vmlinux. That prevents kexec to successfully load vmlinux
since it does not contain the relocations info needed.

Fixes: 559d1e45a1 ("riscv: Use --emit-relocs in order to move .rela.dyn in init")
Tested-by: Björn Töpel <bjorn@rivosinc.com>
Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250408072851.90275-1-alexghiti@rivosinc.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-05-08 12:01:01 +00:00
Nam Cao
ae08d55807 riscv: Fix kernel crash due to PR_SET_TAGGED_ADDR_CTRL
When userspace does PR_SET_TAGGED_ADDR_CTRL, but Supm extension is not
available, the kernel crashes:

Oops - illegal instruction [#1]
    [snip]
epc : set_tagged_addr_ctrl+0x112/0x15a
 ra : set_tagged_addr_ctrl+0x74/0x15a
epc : ffffffff80011ace ra : ffffffff80011a30 sp : ffffffc60039be10
    [snip]
status: 0000000200000120 badaddr: 0000000010a79073 cause: 0000000000000002
    set_tagged_addr_ctrl+0x112/0x15a
    __riscv_sys_prctl+0x352/0x73c
    do_trap_ecall_u+0x17c/0x20c
    andle_exception+0x150/0x15c

Fix it by checking if Supm is available.

Fixes: 09d6775f50 ("riscv: Add support for userspace pointer masking")
Signed-off-by: Nam Cao <namcao@linutronix.de>
Cc: stable@vger.kernel.org
Reviewed-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20250504101920.3393053-1-namcao@linutronix.de
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-05-08 12:01:01 +00:00
Clément Léger
897e8aece3 riscv: misaligned: use get_user() instead of __get_user()
Now that we can safely handle user memory accesses while in the
misaligned access handlers, use get_user() instead of __get_user() to
have user memory access checks.

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250422162324.956065-4-cleger@rivosinc.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-05-08 12:00:58 +00:00
Clément Léger
453805f0a2 riscv: misaligned: enable IRQs while handling misaligned accesses
We can safely reenable IRQs if coming from userspace. This allows to
access user memory that could potentially trigger a page fault.

Fixes: b686ecdeac ("riscv: misaligned: Restrict user access to kernel memory")
Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250422162324.956065-3-cleger@rivosinc.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-05-08 12:00:36 +00:00
Clément Léger
fd94de9f9e riscv: misaligned: factorize trap handling
Since both load/store and user/kernel should use almost the same path and
that we are going to add some code around that, factorize it.

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250422162324.956065-2-cleger@rivosinc.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-05-08 12:00:13 +00:00
Dan Carpenter
0f5757667e pmdomain: core: Fix error checking in genpd_dev_pm_attach_by_id()
The error checking for of_count_phandle_with_args() does not handle
negative error codes correctly.  The problem is that "index" is a u32 so
in the condition "if (index >= num_domains)" negative error codes stored
in "num_domains" are type promoted to very high positive values and
"index" is always going to be valid.

Test for negative error codes first and then test if "index" is valid.

Fixes: 3ccf3f0cd1 ("PM / Domains: Enable genpd_dev_pm_attach_by_id|name() for single PM domain")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/aBxPQ8AI8N5v-7rL@stanley.mountain
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2025-05-08 13:29:30 +02:00
Andi Shyti
10aba126bc MAINTAINERS: Remove entry for Seth Heasley
Seth's mails bounce back, remove his maintainership.

Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Link: https://lore.kernel.org/r/20250505231511.3175151-1-andi.shyti@kernel.org
2025-05-08 12:08:56 +02:00
Paolo Abeni
3c44b2d615 Merge branch 'virtio-net-fix-total-qstat-values'
Jakub Kicinski says:

====================
virtio-net: fix total qstat values

Another small fix discovered after we enabled virtio multi-queue
in netdev CI. The queue stat test fails:

  # Exception| Exception: Qstats are lower, fetched later
  not ok 3 stats.pkt_byte_sum

The queue stats from disabled queues are supposed to be reported
in the "base" stats.
====================

Link: https://patch.msgid.link/20250507003221.823267-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-08 11:56:14 +02:00
Jakub Kicinski
001160ec8c virtio-net: fix total qstat values
NIPA tests report that the interface statistics reported
via qstat are lower than those reported via ip link.
Looks like this is because some tests flip the queue
count up and down, and we end up with some of the traffic
accounted on disabled queues.

Add up counters from disabled queues.

Fixes: d888f04c09 ("virtio-net: support queue stat")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20250507003221.823267-3-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-08 11:56:12 +02:00
Jakub Kicinski
23fa6a23d9 net: export a helper for adding up queue stats
Older drivers and drivers with lower queue counts often have a static
array of queues, rather than allocating structs for each queue on demand.
Add a helper for adding up qstats from a queue range. Expectation is
that driver will pass a queue range [netdev->real_num_*x_queues, MAX).
It was tempting to always use num_*x_queues as the end, but virtio
seems to clamp its queue count after allocating the netdev. And this
way we can trivaly reuse the helper for [0, real_..).

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20250507003221.823267-2-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-08 11:56:12 +02:00
Paolo Abeni
95d2f25871 Merge branch 'fbnic-fw-ipc-mailbox-fixes'
Alexander Duyck says:

====================
fbnic: FW IPC Mailbox fixes

This series is meant to address a number of issues that have been found in
the FW IPC mailbox over the past several months.

The main issues addressed are:
1. Resolve a potential race between host and FW during initialization that
can cause the FW to only have the lower 32b of an address.
2. Block the FW from issuing DMA requests after we have closed the mailbox
and before we have started issuing requests on it.
3. Fix races in the IRQ handlers that can cause the IRQ to unmask itself if
it is being processed while we are trying to disable it.
4. Cleanup the Tx flush logic so that we actually lock down the Tx path
before we start flushing it instead of letting it free run while we are
shutting it down.
5. Fix several memory leaks that could occur if we failed initialization.
6. Cleanup the mailbox completion if we are flushing Tx since we are no
longer processing Rx.
7. Move several allocations out of a potential IRQ/atomic context.

There have been a few optimizations we also picked up since then. Rather
than split them out I just folded them into these diffs. They mostly
address minor issues such as how long it takes to initialize and/or fail so
I thought they could probably go in with the rest of the patches. They
consist of:
1. Do not sleep more than 20ms waiting on FW to respond as the 200ms value
likely originated from simulation/emulation testing.
2. Use jiffies to determine timeout instead of sleep * attempts for better
accuracy.

Reviewed-by: Jakub Kicinski <kuba@kernel.org>
====================

Link: https://patch.msgid.link/174654659243.499179.11194817277075480209.stgit@ahduyck-xeon-server.home.arpa
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-08 11:33:33 +02:00
Alexander Duyck
ce2fa1dba2 fbnic: Do not allow mailbox to toggle to ready outside fbnic_mbx_poll_tx_ready
We had originally thought to have the mailbox go to ready in the background
while we were doing other things. One issue with this though is that we
can't disable it by clearing the ready state without also blocking
interrupts or calls to mbx_poll as it will just pop back to life during an
interrupt.

In order to prevent that from happening we can pull the code for toggling
to ready out of the interrupt path and instead place it in the
fbnic_mbx_poll_tx_ready path so that it becomes the only spot where the
Rx/Tx can toggle to the ready state. By doing this we can prevent races
where we disable the DMA and/or free buffers only to have an interrupt fire
and undo what we have done.

Fixes: da3cde0820 ("eth: fbnic: Add FW communication mechanism")
Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/174654722518.499179.11612865740376848478.stgit@ahduyck-xeon-server.home.arpa
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-08 11:33:30 +02:00
Alexander Duyck
1b34d1c1dc fbnic: Pull fbnic_fw_xmit_cap_msg use out of interrupt context
This change pulls the call to fbnic_fw_xmit_cap_msg out of
fbnic_mbx_init_desc_ring and instead places it in the polling function for
getting the Tx ready. Doing that we can avoid the potential issue with an
interrupt coming in later from the firmware that causes it to get fired in
interrupt context.

Fixes: 20d2e88cc7 ("eth: fbnic: Add initial messaging to notify FW of our presence")
Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/174654721876.499179.9839651602256668493.stgit@ahduyck-xeon-server.home.arpa
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-08 11:33:30 +02:00
Alexander Duyck
ab064f6005 fbnic: Improve responsiveness of fbnic_mbx_poll_tx_ready
There were a couple different issues found in fbnic_mbx_poll_tx_ready.
Among them were the fact that we were sleeping much longer than we actually
needed to as the actual FW could respond in under 20ms. The other issue was
that we would just keep polling the mailbox even if the device itself had
gone away.

To address the responsiveness issues we can decrease the sleeps to 20ms and
use a jiffies based timeout value rather than just counting the number of
times we slept and then polled.

To address the hardware going away we can move the check for the firmware
BAR being present from where it was and place it inside the loop after the
mailbox descriptor ring is initialized and before we sleep so that we just
abort and return an error if the device went away during initialization.

With these two changes we see a significant improvement in boot times for
the driver.

Fixes: da3cde0820 ("eth: fbnic: Add FW communication mechanism")
Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/174654721224.499179.2698616208976624755.stgit@ahduyck-xeon-server.home.arpa
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-08 11:33:30 +02:00
Alexander Duyck
cdbb2dc399 fbnic: Cleanup handling of completions
There was an issue in that if we were to shutdown we could be left with
a completion in flight as the mailbox went away. To address that I have
added an fbnic_mbx_evict_all_cmpl function that is meant to essentially
create a "broken pipe" type response so that all callers will receive an
error indicating that the connection has been broken as a result of us
shutting down the mailbox.

Fixes: 378e5cc1c6 ("eth: fbnic: hwmon: Add completion infrastructure for firmware requests")
Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/174654720578.499179.380252598204530873.stgit@ahduyck-xeon-server.home.arpa
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-08 11:33:30 +02:00
Alexander Duyck
0f9a959a0a fbnic: Actually flush_tx instead of stalling out
The fbnic_mbx_flush_tx function had a number of issues.

First, we were waiting 200ms for the firmware to process the packets. We
can drop this to 20ms and in almost all cases this should be more than
enough time. So by changing this we can significantly reduce shutdown time.

Second, we were not making sure that the Tx path was actually shut off. As
such we could still have packets added while we were flushing the mailbox.
To prevent that we can now clear the ready flag for the Tx side and it
should stay down since the interrupt is disabled.

Third, we kept re-reading the tail due to the second issue. The tail should
not move after we have started the flush so we can just read it once while
we are holding the mailbox Tx lock. By doing that we are guaranteed that
the value should be consistent.

Fourth, we were keeping a count of descriptors cleaned due to the second
and third issues called out. That count is not a valid reason to be exiting
the cleanup, and with the tail only being read once we shouldn't see any
cases where the tail moves after the disable so the tracking of count can
be dropped.

Fifth, we were using attempts * sleep time to determine how long we would
wait in our polling loop to flush out the Tx. This can be very imprecise.
In order to tighten up the timing we are shifting over to using a jiffies
value of jiffies + 10 * HZ + 1 to determine the jiffies value we should
stop polling at as this should be accurate within once sleep cycle for the
total amount of time spent polling.

Fixes: da3cde0820 ("eth: fbnic: Add FW communication mechanism")
Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/174654719929.499179.16406653096197423749.stgit@ahduyck-xeon-server.home.arpa
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-08 11:33:30 +02:00
Alexander Duyck
682a61281d fbnic: Add additional handling of IRQs
We have two issues that need to be addressed in our IRQ handling.

One is the fact that we can end up double-freeing IRQs in the event of an
exception handling error such as a PCIe reset/recovery that fails. To
prevent that from becoming an issue we can use the msix_vector values to
indicate that we have successfully requested/freed the IRQ by only setting
or clearing them when we have completed the given action.

The other issue is that we have several potential races in our IRQ path due
to us manipulating the mask before the vector has been truly disabled. In
order to handle that in the case of the FW mailbox we need to not
auto-enable the IRQ and instead will be enabling/disabling it separately.
In the case of the PCS vector we can mitigate this by unmapping it and
synchronizing the IRQ before we clear the mask.

The general order of operations after this change is now to request the
interrupt, poll the FW mailbox to ready, and then enable the interrupt. For
the shutdown we do the reverse where we disable the interrupt, flush any
pending Tx, and then free the IRQ. I am renaming the enable/disable to
request/free to be equivilent with the IRQ calls being used. We may see
additions in the future to enable/disable the IRQs versus request/free them
for certain use cases.

Fixes: da3cde0820 ("eth: fbnic: Add FW communication mechanism")
Fixes: 69684376ee ("eth: fbnic: Add link detection")
Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/174654719271.499179.3634535105127848325.stgit@ahduyck-xeon-server.home.arpa
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-08 11:33:30 +02:00
Alexander Duyck
3b12f00ddd fbnic: Gate AXI read/write enabling on FW mailbox
In order to prevent the device from throwing spurious writes and/or reads
at us we need to gate the AXI fabric interface to the PCIe until such time
as we know the FW is in a known good state.

To accomplish this we use the mailbox as a mechanism for us to recognize
that the FW has acknowledged our presence and is no longer sending any
stale message data to us.

We start in fbnic_mbx_init by calling fbnic_mbx_reset_desc_ring function,
disabling the DMA in both directions, and then invalidating all the
descriptors in each ring.

We then poll the mailbox in fbnic_mbx_poll_tx_ready and when the interrupt
is set by the FW we pick it up and mark the mailboxes as ready, while also
enabling the DMA.

Once we have completed all the transactions and need to shut down we call
into fbnic_mbx_clean which will in turn call fbnic_mbx_reset_desc_ring for
each ring and shut down the DMA and once again invalidate the descriptors.

Fixes: 3646153161 ("eth: fbnic: Add register init to set PCIe/Ethernet device config")
Fixes: da3cde0820 ("eth: fbnic: Add FW communication mechanism")
Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/174654718623.499179.7445197308109347982.stgit@ahduyck-xeon-server.home.arpa
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-08 11:33:30 +02:00
Alexander Duyck
f34343cc11 fbnic: Fix initialization of mailbox descriptor rings
Address to issues with the FW mailbox descriptor initialization.

We need to reverse the order of accesses when we invalidate an entry versus
writing an entry. When writing an entry we write upper and then lower as
the lower 32b contain the valid bit that makes the entire address valid.
However for invalidation we should write it in the reverse order so that
the upper is marked invalid before we update it.

Without this change we may see FW attempt to access pages with the upper
32b of the address set to 0 which will likely result in DMAR faults due to
write access failures on mailbox shutdown.

Fixes: da3cde0820 ("eth: fbnic: Add FW communication mechanism")
Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/174654717972.499179.8083789731819297034.stgit@ahduyck-xeon-server.home.arpa
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-08 11:33:30 +02:00
Petr Vaněk
7b08b74f3d mm: fix folio_pte_batch() on XEN PV
On XEN PV, folio_pte_batch() can incorrectly batch beyond the end of a
folio due to a corner case in pte_advance_pfn().  Specifically, when the
PFN following the folio maps to an invalidated MFN,

	expected_pte = pte_advance_pfn(expected_pte, nr);

produces a pte_none().  If the actual next PTE in memory is also
pte_none(), the pte_same() succeeds,

	if (!pte_same(pte, expected_pte))
		break;

the loop is not broken, and batching continues into unrelated memory.

For example, with a 4-page folio, the PTE layout might look like this:

[   53.465673] [ T2552] folio_pte_batch: printing PTE values at addr=0x7f1ac9dc5000
[   53.465674] [ T2552]   PTE[453] = 000000010085c125
[   53.465679] [ T2552]   PTE[454] = 000000010085d125
[   53.465682] [ T2552]   PTE[455] = 000000010085e125
[   53.465684] [ T2552]   PTE[456] = 000000010085f125
[   53.465686] [ T2552]   PTE[457] = 0000000000000000 <-- not present
[   53.465689] [ T2552]   PTE[458] = 0000000101da7125

pte_advance_pfn(PTE[456]) returns a pte_none() due to invalid PFN->MFN
mapping.  The next actual PTE (PTE[457]) is also pte_none(), so the loop
continues and includes PTE[457] in the batch, resulting in 5 batched
entries for a 4-page folio.  This triggers the following warning:

[   53.465751] [ T2552] page: refcount:85 mapcount:20 mapping:ffff88813ff4f6a8 index:0x110 pfn:0x10085c
[   53.465754] [ T2552] head: order:2 mapcount:80 entire_mapcount:0 nr_pages_mapped:4 pincount:0
[   53.465756] [ T2552] memcg:ffff888003573000
[   53.465758] [ T2552] aops:0xffffffff8226fd20 ino:82467c dentry name(?):"libc.so.6"
[   53.465761] [ T2552] flags: 0x2000000000416c(referenced|uptodate|lru|active|private|head|node=0|zone=2)
[   53.465764] [ T2552] raw: 002000000000416c ffffea0004021f08 ffffea0004021908 ffff88813ff4f6a8
[   53.465767] [ T2552] raw: 0000000000000110 ffff888133d8bd40 0000005500000013 ffff888003573000
[   53.465768] [ T2552] head: 002000000000416c ffffea0004021f08 ffffea0004021908 ffff88813ff4f6a8
[   53.465770] [ T2552] head: 0000000000000110 ffff888133d8bd40 0000005500000013 ffff888003573000
[   53.465772] [ T2552] head: 0020000000000202 ffffea0004021701 000000040000004f 00000000ffffffff
[   53.465774] [ T2552] head: 0000000300000003 8000000300000002 0000000000000013 0000000000000004
[   53.465775] [ T2552] page dumped because: VM_WARN_ON_FOLIO((_Generic((page + nr_pages - 1), const struct page *: (const struct folio *)_compound_head(page + nr_pages - 1), struct page *: (struct folio *)_compound_head(page + nr_pages - 1))) != folio)

Original code works as expected everywhere, except on XEN PV, where
pte_advance_pfn() can yield a pte_none() after balloon inflation due to
MFNs invalidation.  In XEN, pte_advance_pfn() ends up calling
__pte()->xen_make_pte()->pte_pfn_to_mfn(), which returns pte_none() when
mfn == INVALID_P2M_ENTRY.

The pte_pfn_to_mfn() documents that nastiness:

	If there's no mfn for the pfn, then just create an
	empty non-present pte.  Unfortunately this loses
	information about the original pfn, so
	pte_mfn_to_pfn is asymmetric.

While such hacks should certainly be removed, we can do better in
folio_pte_batch() and simply check ahead of time how many PTEs we can
possibly batch in our folio.

This way, we can not only fix the issue but cleanup the code: removing the
pte_pfn() check inside the loop body and avoiding end_ptr comparison +
arithmetic.

Link: https://lkml.kernel.org/r/20250502215019.822-2-arkamar@atlas.cz
Fixes: f8d937761d ("mm/memory: optimize fork() with PTE-mapped THP")
Co-developed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Petr Vaněk <arkamar@atlas.cz>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-07 23:39:42 -07:00
Ryusuke Konishi
fb881cd760 nilfs2: fix deadlock warnings caused by lock dependency in init_nilfs()
After commit c0e473a0d2 ("block: fix race between set_blocksize and read
paths") was merged, set_blocksize() called by sb_set_blocksize() now locks
the inode of the backing device file.  As a result of this change, syzbot
started reporting deadlock warnings due to a circular dependency involving
the semaphore "ns_sem" of the nilfs object, the inode lock of the backing
device file, and the locks that this inode lock is transitively dependent
on.

This is caused by a new lock dependency added by the above change, since
init_nilfs() calls sb_set_blocksize() in the lock section of "ns_sem". 
However, these warnings are false positives because init_nilfs() is called
in the early stage of the mount operation and the filesystem has not yet
started.

The reason why "ns_sem" is locked in init_nilfs() was to avoid a race
condition in nilfs_fill_super() caused by sharing a nilfs object among
multiple filesystem instances (super block structures) in the early
implementation.  However, nilfs objects and super block structures have
long ago become one-to-one, and there is no longer any need to use the
semaphore there.

So, fix this issue by removing the use of the semaphore "ns_sem" in
init_nilfs().

Link: https://lkml.kernel.org/r/20250503053327.12294-1-konishi.ryusuke@gmail.com
Fixes: c0e473a0d2 ("block: fix race between set_blocksize and read paths")
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Reported-by: syzbot+00f7f5b884b117ee6773@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=00f7f5b884b117ee6773
Tested-by: syzbot+00f7f5b884b117ee6773@syzkaller.appspotmail.com
Reported-by: syzbot+f30591e72bfc24d4715b@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=f30591e72bfc24d4715b
Tested-by: syzbot+f30591e72bfc24d4715b@syzkaller.appspotmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-07 23:39:42 -07:00
Frank van der Linden
dac2a4f663 mm/hugetlb: copy the CMA flag when demoting
Since commit d2d7867140 ("mm/hugetlb: enable bootmem allocation from CMA
areas"), a flag is used to mark hugetlb folios as allocated from CMA. 
This flag is also used to decide if it should be freed to CMA.

However, the flag isn't copied to the smaller folios when a hugetlb folio
is broken up for demotion, which would cause it to be freed incorrectly.

Fix this by copying the flag to the smaller order hugetlb pages created
from the original one.

Link: https://lkml.kernel.org/r/20250501044325.20365-1-fvdl@google.com
Fixes: d2d7867140 ("mm/hugetlb: enable bootmem allocation from CMA areas")
Signed-off-by: Frank van der Linden <fvdl@google.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: Jane Chu <Jane.Chu@oracle.com>
Cc: Muchun Song <muchun.song@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-07 23:39:42 -07:00
Kairui Song
9a9794a81a mm, swap: fix false warning for large allocation with !THP_SWAP
The !CONFIG_THP_SWAP check existed before just fine because slot cache
would reject high order allocation and let the caller split all folios and
try again.

But slot cache is gone, so large allocation will directly go to the
allocator, and the allocator should just fail silently to inform caller to
do the folio split, this is totally fine and expected.

Remove this meaningless warning.

Link: https://lkml.kernel.org/r/20250429094803.85518-1-ryncsn@gmail.com
Fixes: 0ff67f990b ("mm, swap: remove swap slot cache")
Signed-off-by: Kairui Song <kasong@tencent.com>
Reported-by: Heiko Carstens <hca@linux.ibm.com>
Closes: https://lore.kernel.org/linux-mm/20250428135252.25453B17-hca@linux.ibm.com/
Tested-by: Heiko Carstens <hca@linux.ibm.com>
Cc: Baoquan He <bhe@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-07 23:39:41 -07:00
Nysal Jan K.A.
8cf6ecb18b selftests/mm: fix a build failure on powerpc
The compiler is unaware of the size of code generated by the ".rept"
assembler directive.  This results in the compiler emitting branch
instructions where the offset to branch to exceeds the maximum allowed
value, resulting in build failures like the following:

  CC       protection_keys
  /tmp/ccypKWAE.s: Assembler messages:
  /tmp/ccypKWAE.s:2073: Error: operand out of range (0x0000000000020158
  is not between 0xffffffffffff8000 and 0x0000000000007ffc)
  /tmp/ccypKWAE.s:2509: Error: operand out of range (0x0000000000020130
  is not between 0xffffffffffff8000 and 0x0000000000007ffc)

Fix the issue by manually adding nop instructions using the preprocessor.

Link: https://lkml.kernel.org/r/20250428131937.641989-2-nysal@linux.ibm.com
Fixes: 46036188ea ("selftests/mm: build with -O2")
Reported-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Nysal Jan K.A. <nysal@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Reviewed-by: Donet Tom <donettom@linux.ibm.com>
Tested-by: Donet Tom <donettom@linux.ibm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-07 23:39:41 -07:00
Madhavan Srinivasan
22adb52862 selftests/mm: fix build break when compiling pkey_util.c
Commit 50910acd6f ("selftests/mm: use sys_pkey helpers consistently")
added a pkey_util.c to refactor some of the protection_keys functions
accessible by other tests.  But this broken the build in powerpc in two
ways,

pkey-powerpc.h: In function `arch_is_powervm':
pkey-powerpc.h:73:21: error: storage size of `buf' isn't known
   73 |         struct stat buf;
      |                     ^~~
pkey-powerpc.h:75:14: error: implicit declaration of function `stat'; did you mean `strcat'? [-Wimplicit-function-declaration]
   75 |         if ((stat("/sys/firmware/devicetree/base/ibm,partition-name", &buf) == 0) &&
      |              ^~~~
      |              strcat

Since pkey_util.c includes pkeys-helper.h, which in turn includes pkeys-powerpc.h,
stat.h including is missing for "struct stat". This is fixed by adding "sys/stat.h"
in pkeys-powerpc.h

Secondly,

pkey-powerpc.h:55:18: warning: format `%llx' expects argument of type `long long unsigned int', but argument 3 has type `u64' {aka `long unsigned int'} [-Wformat=]
   55 |         dprintf4("%s() changing %016llx to %016llx\n",
      |                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   56 |                          __func__, __read_pkey_reg(), pkey_reg);
      |                                    ~~~~~~~~~~~~~~~~~
      |                                    |
      |                                    u64 {aka long unsigned int}
pkey-helpers.h:63:32: note: in definition of macro `dprintf_level'
   63 |                 sigsafe_printf(args);           \
      |                                ^~~~

These format specifier related warning are removed by adding
"__SANE_USERSPACE_TYPES__" to pkeys_utils.c.

Link: https://lkml.kernel.org/r/20250428131937.641989-1-nysal@linux.ibm.com
Fixes: 50910acd6f ("selftests/mm: use sys_pkey helpers consistently")
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Nysal Jan K.A. <nysal@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-07 23:39:41 -07:00
Kees Cook
a0309faf1c mm: vmalloc: support more granular vrealloc() sizing
Introduce struct vm_struct::requested_size so that the requested
(re)allocation size is retained separately from the allocated area size. 
This means that KASAN will correctly poison the correct spans of requested
bytes.  This also means we can support growing the usable portion of an
allocation that can already be supported by the existing area's existing
allocation.

Link: https://lkml.kernel.org/r/20250426001105.it.679-kees@kernel.org
Fixes: 3ddc2fefe6 ("mm: vmalloc: implement vrealloc()")
Signed-off-by: Kees Cook <kees@kernel.org>
Reported-by: Erhard Furtner <erhard_f@mailbox.org>
Closes: https://lore.kernel.org/all/20250408192503.6149a816@outsider.home/
Reviewed-by: Danilo Krummrich <dakr@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Uladzislau Rezki (Sony)" <urezki@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-07 23:39:41 -07:00
Lorenzo Stoakes
a8efadda86 tools/testing/selftests: fix guard region test tmpfs assumption
The current implementation of the guard region tests assume that /tmp is
mounted as tmpfs, that is shmem.

This isn't always the case, and at least one instance of a spurious test
failure has been reported as a result.

This assumption is unsafe, rushed and silly - and easily remedied by
simply using memfd, so do so.

We also have to fixup the readonly_file test to explicitly only be
applicable to file-backed cases.

Link: https://lkml.kernel.org/r/20250425162436.564002-1-lorenzo.stoakes@oracle.com
Fixes: 272f37d3e9 ("tools/selftests: expand all guard region tests to file-backed")
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reported-by: Ryan Roberts <ryan.roberts@arm.com>
Closes: https://lore.kernel.org/linux-mm/a2d2766b-0ab4-437b-951a-8595a7506fe9@arm.com/
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-07 23:39:40 -07:00
Jan Kara
fcaf3b2683 ocfs2: stop quota recovery before disabling quotas
Currently quota recovery is synchronized with unmount using sb->s_umount
semaphore.  That is however prone to deadlocks because
flush_workqueue(osb->ocfs2_wq) called from umount code can wait for quota
recovery to complete while ocfs2_finish_quota_recovery() waits for
sb->s_umount semaphore.

Grabbing of sb->s_umount semaphore in ocfs2_finish_quota_recovery() is
only needed to protect that function from disabling of quotas from
ocfs2_dismount_volume().  Handle this problem by disabling quota recovery
early during unmount in ocfs2_dismount_volume() instead so that we can
drop acquisition of sb->s_umount from ocfs2_finish_quota_recovery().

Link: https://lkml.kernel.org/r/20250424134515.18933-6-jack@suse.cz
Fixes: 5f530de63c ("ocfs2: Use s_umount for quota recovery protection")
Signed-off-by: Jan Kara <jack@suse.cz>
Reported-by: Shichangkuo <shi.changkuo@h3c.com>
Reported-by: Murad Masimov <m.masimov@mt-integration.ru>
Reviewed-by: Heming Zhao <heming.zhao@suse.com>
Tested-by: Heming Zhao <heming.zhao@suse.com>
Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Jun Piao <piaojun@huawei.com>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-07 23:39:40 -07:00
Jan Kara
8f947e0fd5 ocfs2: implement handshaking with ocfs2 recovery thread
We will need ocfs2 recovery thread to acknowledge transitions of
recovery_state when disabling particular types of recovery.  This is
similar to what currently happens when disabling recovery completely, just
more general.  Implement the handshake and use it for exit from recovery.

Link: https://lkml.kernel.org/r/20250424134515.18933-5-jack@suse.cz
Fixes: 5f530de63c ("ocfs2: Use s_umount for quota recovery protection")
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Heming Zhao <heming.zhao@suse.com>
Tested-by: Heming Zhao <heming.zhao@suse.com>
Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Jun Piao <piaojun@huawei.com>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Murad Masimov <m.masimov@mt-integration.ru>
Cc: Shichangkuo <shi.changkuo@h3c.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-07 23:39:40 -07:00
Jan Kara
c0fb83088f ocfs2: switch osb->disable_recovery to enum
Patch series "ocfs2: Fix deadlocks in quota recovery", v3.

This implements another approach to fixing quota recovery deadlocks.  We
avoid grabbing sb->s_umount semaphore from ocfs2_finish_quota_recovery()
and instead stop quota recovery early in ocfs2_dismount_volume().


This patch (of 3):

We will need more recovery states than just pure enable / disable to fix
deadlocks with quota recovery.  Switch osb->disable_recovery to enum.

Link: https://lkml.kernel.org/r/20250424134301.1392-1-jack@suse.cz
Link: https://lkml.kernel.org/r/20250424134515.18933-4-jack@suse.cz
Fixes: 5f530de63c ("ocfs2: Use s_umount for quota recovery protection")
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Heming Zhao <heming.zhao@suse.com>
Tested-by: Heming Zhao <heming.zhao@suse.com>
Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Cc: Murad Masimov <m.masimov@mt-integration.ru>
Cc: Shichangkuo <shi.changkuo@h3c.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-07 23:39:40 -07:00
Uwe Kleine-König
80fbee76eb mailmap: map Uwe's BayLibre addresses to a single one
When I started working for BayLibre I wasn't aware that the mailserver
rewrote the sender address and so a few commits entered kernel history
with a working but unexpected address.  Map the unexpected to the intended
one.  This also makes the author of those commits (e.g.  32b4f1a4f0
("pwm: jz4740: Another few conversions to regmap_{set,clear}_bits()"))
match the address used in the sign-off line.

Link: https://lkml.kernel.org/r/20250424060250.3085683-2-u.kleine-koenig@baylibre.com
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-07 23:39:39 -07:00
Lorenzo Stoakes
09fc97b3ab MAINTAINERS: add mm THP section
As part of the ongoing efforts to sub-divide memory management
maintainership and reviewership, establish a section for Transparent Huge
Page support and add appropriate maintainers and reviewers.

[lorenzo.stoakes@oracle.com: add Dev Jain as THP reviewer]
  Link: https://lkml.kernel.org/r/327e6f2f-0f0f-48af-9ca2-3f8cadf0d8bf@lucifer.local
Link: https://lkml.kernel.org/r/20250424111632.103637-1-lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Acked-by: Zi Yan <ziy@nvidia.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mariano Pache <npache@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-07 23:39:39 -07:00
Peter Xu
9556772917 mm/userfaultfd: fix uninitialized output field for -EAGAIN race
While discussing some userfaultfd relevant issues recently, Andrea noticed
a potential ABI breakage with -EAGAIN on almost all userfaultfd ioctl()s.

Quote from Andrea, explaining how -EAGAIN was processed, and how this
should fix it (taking example of UFFDIO_COPY ioctl):

  The "mmap_changing" and "stale pmd" conditions are already reported as
  -EAGAIN written in the copy field, this does not change it. This change
  removes the subnormal case that left copy.copy uninitialized and required
  apps to explicitly set the copy field to get deterministic
  behavior (which is a requirement contrary to the documentation in both
  the manpage and source code). In turn there's no alteration to backwards
  compatibility as result of this change because userland will find the
  copy field consistently set to -EAGAIN, and not anymore sometime -EAGAIN
  and sometime uninitialized.

  Even then the change only can make a difference to non cooperative users
  of userfaultfd, so when UFFD_FEATURE_EVENT_* is enabled, which is not
  true for the vast majority of apps using userfaultfd or this unintended
  uninitialized field may have been noticed sooner.

Meanwhile, since this bug existed for years, it also almost affects all
ioctl()s that was introduced later.  Besides UFFDIO_ZEROPAGE, these also
get affected in the same way:

  - UFFDIO_CONTINUE
  - UFFDIO_POISON
  - UFFDIO_MOVE

This patch should have fixed all of them.

Link: https://lkml.kernel.org/r/20250424215729.194656-2-peterx@redhat.com
Fixes: df2cc96e77 ("userfaultfd: prevent non-cooperative events vs mcopy_atomic races")
Fixes: f619147104 ("userfaultfd: add UFFDIO_CONTINUE ioctl")
Fixes: fc71884a5f ("mm: userfaultfd: add new UFFDIO_POISON ioctl")
Fixes: adef440691 ("userfaultfd: UFFDIO_MOVE uABI")
Signed-off-by: Peter Xu <peterx@redhat.com>
Reported-by: Andrea Arcangeli <aarcange@redhat.com>
Suggested-by: Andrea Arcangeli <aarcange@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-07 23:39:39 -07:00
Feng Tang
ab00ddd802 selftests/mm: compaction_test: support platform with huge mount of memory
When running mm selftest to verify mm patches, 'compaction_test' case
failed on an x86 server with 1TB memory.  And the root cause is that it
has too much free memory than what the test supports.

The test case tries to allocate 100000 huge pages, which is about 200 GB
for that x86 server, and when it succeeds, it expects it's large than 1/3
of 80% of the free memory in system.  This logic only works for platform
with 750 GB ( 200 / (1/3) / 80% ) or less free memory, and may raise false
alarm for others.

Fix it by changing the fixed page number to self-adjustable number
according to the real number of free memory.

Link: https://lkml.kernel.org/r/20250423103645.2758-1-feng.tang@linux.alibaba.com
Fixes: bd67d5c15c ("Test compaction of mlocked memory")
Signed-off-by: Feng Tang <feng.tang@linux.alibaba.com>
Acked-by: Dev Jain <dev.jain@arm.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Tested-by: Baolin Wang <baolin.wang@inux.alibaba.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Sri Jayaramappa <sjayaram@akamai.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-07 23:39:39 -07:00
Lorenzo Stoakes
a47694ecb8 MAINTAINERS: add core mm section
In furtherance of ongoing efforts to ensure people are aware of who
de-facto maintains/has an interest in specific parts of mm, as well trying
to avoid get_maintainers.pl listing only Andrew and the mailing list for
mm files - establish a 'core' memory management section establishing David
as co-maintainer alongside Andrew (thanks David for volunteering!) along
with a number of relevant reviewers.

We try to keep things as fine-grained as possible, so we place only
obviously 'general' mm things here.  For files which are specific to a
particular part of mm, we prefer new entries.

Link: https://lkml.kernel.org/r/20250423123042.59082-1-lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Acked-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-07 23:39:38 -07:00
Mark Tinguely
31d4cd4eb2 ocfs2: fix panic in failed foilio allocation
commit 7e119cff9d ("ocfs2: convert w_pages to w_folios") and commit
9a5e08652d ("ocfs2: use an array of folios instead of an array of
pages") save -ENOMEM in the folio array upon allocation failure and call
the folio array free code.

The folio array free code expects either valid folio pointers or NULL. 
Finding the -ENOMEM will result in a panic.  Fix by NULLing the error
folio entry.

Link: https://lkml.kernel.org/r/c879a52b-835c-4fa0-902b-8b2e9196dcbd@oracle.com
Fixes: 7e119cff9d ("ocfs2: convert w_pages to w_folios")
Fixes: 9a5e08652d ("ocfs2: use an array of folios instead of an array of pages")
Signed-off-by: Mark Tinguely <mark.tinguely@oracle.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-07 23:39:38 -07:00
Gavin Guo
be6e843fc5 mm/huge_memory: fix dereferencing invalid pmd migration entry
When migrating a THP, concurrent access to the PMD migration entry during
a deferred split scan can lead to an invalid address access, as
illustrated below.  To prevent this invalid access, it is necessary to
check the PMD migration entry and return early.  In this context, there is
no need to use pmd_to_swp_entry and pfn_swap_entry_to_page to verify the
equality of the target folio.  Since the PMD migration entry is locked, it
cannot be served as the target.

Mailing list discussion and explanation from Hugh Dickins: "An anon_vma
lookup points to a location which may contain the folio of interest, but
might instead contain another folio: and weeding out those other folios is
precisely what the "folio != pmd_folio((*pmd)" check (and the "risk of
replacing the wrong folio" comment a few lines above it) is for."

BUG: unable to handle page fault for address: ffffea60001db008
CPU: 0 UID: 0 PID: 2199114 Comm: tee Not tainted 6.14.0+ #4 NONE
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
RIP: 0010:split_huge_pmd_locked+0x3b5/0x2b60
Call Trace:
<TASK>
try_to_migrate_one+0x28c/0x3730
rmap_walk_anon+0x4f6/0x770
unmap_folio+0x196/0x1f0
split_huge_page_to_list_to_order+0x9f6/0x1560
deferred_split_scan+0xac5/0x12a0
shrinker_debugfs_scan_write+0x376/0x470
full_proxy_write+0x15c/0x220
vfs_write+0x2fc/0xcb0
ksys_write+0x146/0x250
do_syscall_64+0x6a/0x120
entry_SYSCALL_64_after_hwframe+0x76/0x7e

The bug is found by syzkaller on an internal kernel, then confirmed on
upstream.

Link: https://lkml.kernel.org/r/20250421113536.3682201-1-gavinguo@igalia.com
Link: https://lore.kernel.org/all/20250414072737.1698513-1-gavinguo@igalia.com/
Link: https://lore.kernel.org/all/20250418085802.2973519-1-gavinguo@igalia.com/
Fixes: 84c3fc4e9c ("mm: thp: check pmd migration entry in common path")
Signed-off-by: Gavin Guo <gavinguo@igalia.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Hugh Dickins <hughd@google.com>
Acked-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Cc: Florent Revest <revest@google.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-07 23:39:38 -07:00
Lorenzo Stoakes
e81224f0ba MAINTAINERS: add reverse mapping section
Separate out the reverse mapping part of memory management and assign
appropriate maintainers and reviewers.

David has long been invovled in work with the reverse mapping and
continues to do so, so is well suited to maintain this area of the kernel.

I have a lot of experience working with the anonymous reverse mapping and
continue to work in this area, and also have good knowledge of the walking
code and code related to VMAs.

This helps people identify who to ask for help, and also additionally makes
life easier in review.

Link: https://lkml.kernel.org/r/20250418150052.299220-1-lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Acked-by: Harry Yoo <harry.yoo@oracle.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-07 23:39:38 -07:00
Guenter Roeck
00a241f528 x86: disable image size check for test builds
64-bit allyesconfig builds fail with

x86_64-linux-ld: kernel image bigger than KERNEL_IMAGE_SIZE

Bisect points to commit 6f110a5e4f ("Disable SLUB_TINY for build
testing") as the responsible commit.  Reverting that patch does indeed fix
the problem.  Further analysis shows that disabling SLUB_TINY enables
KASAN, and that KASAN is responsible for the image size increase.

Solve the build problem by disabling the image size check for test
builds.

[akpm@linux-foundation.org: add comment, fix nearby typo (sink->sync)]
[akpm@linux-foundation.org: fix comment snafu
  Link: https://lore.kernel.org/oe-kbuild-all/202504191813.4r9H6Glt-lkp@intel.com/
Link: https://lkml.kernel.org/r/20250417010950.2203847-1-linux@roeck-us.net
Fixes: 6f110a5e4f ("Disable SLUB_TINY for build testing")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Borislav Betkov <bp@alien8.de>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleinxer <tglx@linutronix.de>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: <x86@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-07 23:39:37 -07:00
Heming Zhao
bd1261b16d ocfs2: fix the issue with discontiguous allocation in the global_bitmap
commit 4eb7b93e03 ("ocfs2: improve write IO performance when
fragmentation is high") introduced another regression.

The following ocfs2-test case can trigger this issue:
> discontig_runner.sh => activate_discontig_bg.sh => resv_unwritten:
> ${RESV_UNWRITTEN_BIN} -f ${WORK_PLACE}/large_testfile -s 0 -l \
> $((${FILE_MAJOR_SIZE_M}*1024*1024))

In my env, test disk size (by "fdisk -l <dev>"):
> 53687091200 bytes, 104857600 sectors.

Above command is:
> /usr/local/ocfs2-test/bin/resv_unwritten -f \
> /mnt/ocfs2/ocfs2-activate-discontig-bg-dir/large_testfile -s 0 -l \
> 53187969024

Error log:
> [*] Reserve 50724M space for a LARGE file, reserve 200M space for future test.
> ioctl error 28: "No space left on device"
> resv allocation failed Unknown error -1
> reserve unwritten region from 0 to 53187969024.

Call flow:
__ocfs2_change_file_space //by ioctl OCFS2_IOC_RESVSP64
 ocfs2_allocate_unwritten_extents //start:0 len:53187969024
  while()
   + ocfs2_get_clusters //cpos:0, alloc_size:1623168 (cluster number)
   + ocfs2_extend_allocation
     + ocfs2_lock_allocators
     |  + choose OCFS2_AC_USE_MAIN & ocfs2_cluster_group_search
     |
     + ocfs2_add_inode_data
        ocfs2_add_clusters_in_btree
         __ocfs2_claim_clusters
          ocfs2_claim_suballoc_bits
          + During the allocation of the final part of the large file
	    (after ~47GB), no chain had the required contiguous
            bits_wanted. Consequently, the allocation failed.

How to fix:
When OCFS2 is encountering fragmented allocation, the file system should
stop attempting bits_wanted contiguous allocation and instead provide the
largest available contiguous free bits from the cluster groups.

Link: https://lkml.kernel.org/r/20250414060125.19938-2-heming.zhao@suse.com
Fixes: 4eb7b93e03 ("ocfs2: improve write IO performance when fragmentation is high")
Signed-off-by: Heming Zhao <heming.zhao@suse.com>
Reported-by: Gautham Ananthakrishna <gautham.ananthakrishna@oracle.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-07 23:39:37 -07:00
Lance Yang
df84d2fd35 mailmap: add entries for Lance Yang
I'm moving to @linux.dev and mapping my old addresses to it.

Link: https://lkml.kernel.org/r/20250417153041.38977-1-lance.yang@linux.dev
Signed-off-by: Lance Yang <lance.yang@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-07 23:39:37 -07:00
Jakub Kicinski
bdc6470a4f Merge branch 'net-dsa-b53-accumulated-fixes'
Jonas Gorski says:

====================
net: dsa: b53: accumulated fixes

This patchset aims at fixing most issues observed while running the
vlan_unaware_bridge, vlan_aware_bridge and local_termination selftests.

Most tests succeed with these patches on BCM53115, connected to a
BCM6368.

It took me a while to figure out that a lot of tests will fail if all
ports have the same MAC address, as the switches drop any frames with
DA == SA. Luckily BCM63XX boards often have enough MACs allocated for
all ports, so I just needed to assign them.

The still failing tests are:

FDB learning, both vlan aware aware and unaware:

This is expected, as b53 currently does not implement changing the
ageing time, and both the bridge code and DSA ignore that, so the
learned entries don't age out as expected.

ping and ping6 in vlan unaware:

These fail because of the now fixed learning, the switch trying to
forward packet ingressing on one of the standalone ports to the learned
port of the mac address when the packets ingressed on the bridged port.

The port VLAN masks only prevent forwarding to other ports, but the ARL
lookup will still happen, and the packet gets dropped because the port
isn't allowed to forward there.

I have a fix/workaround for that, but as it is a bit more controversial
and makes use of an unrelated feature, I decided to hold off from that
and post it later.

This wasn't noticed so far, because learning was never working in VLAN
unaware mode, so the traffic was always broadcast (which sidesteps the
issue).

Finally some of the multicast tests from local_termination fail, where
the reception worked except it shouldn't. This doesn't seem to me as a
super serious issue, so I didn't attempt to debug/fix these yet.

I'm not super confident I didn't break sf2 along the way, but I did
compile test and tried to find ways it cause issues (I failed to find
any). I hope Florian will tell me.
====================

Link: https://patch.msgid.link/20250429201710.330937-1-jonas.gorski@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07 19:30:38 -07:00
Jonas Gorski
2e7179c628 net: dsa: b53: do not set learning and unicast/multicast on up
When a port gets set up, b53 disables learning and enables the port for
flooding. This can undo any bridge configuration on the port.

E.g. the following flow would disable learning on a port:

$ ip link add br0 type bridge
$ ip link set sw1p1 master br0 <- enables learning for sw1p1
$ ip link set br0 up
$ ip link set sw1p1 up <- disables learning again

Fix this by populating dsa_switch_ops::port_setup(), and set up initial
config there.

Fixes: f9b3827ee6 ("net: dsa: b53: Support setting learning on port")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250429201710.330937-12-jonas.gorski@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07 19:30:35 -07:00
Jonas Gorski
9f34ad89bc net: dsa: b53: fix learning on VLAN unaware bridges
When VLAN filtering is off, we configure the switch to forward, but not
learn on VLAN table misses. This effectively disables learning while not
filtering.

Fix this by switching to forward and learn. Setting the learning disable
register will still control whether learning actually happens.

Fixes: dad8d7c645 ("net: dsa: b53: Properly account for VLAN filtering")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250429201710.330937-11-jonas.gorski@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07 19:30:35 -07:00
Jonas Gorski
2dc2bd5711 net: dsa: b53: fix toggling vlan_filtering
To allow runtime switching between vlan aware and vlan non-aware mode,
we need to properly keep track of any bridge VLAN configuration.
Likewise, we need to know when we actually switch between both modes, to
not have to rewrite the full VLAN table every time we update the VLANs.

So keep track of the current vlan_filtering mode, and on changes, apply
the appropriate VLAN configuration.

Fixes: 0ee2af4ebb ("net: dsa: set configure_vlan_while_not_filtering to true by default")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250429201710.330937-10-jonas.gorski@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07 19:30:35 -07:00
Jonas Gorski
f089652b6b net: dsa: b53: do not program vlans when vlan filtering is off
Documentation/networking/switchdev.rst says:

- with VLAN filtering turned off: the bridge is strictly VLAN unaware and its
  data path will process all Ethernet frames as if they are VLAN-untagged.
  The bridge VLAN database can still be modified, but the modifications should
  have no effect while VLAN filtering is turned off.

This breaks if we immediately apply the VLAN configuration, so skip
writing it when vlan_filtering is off.

Fixes: 0ee2af4ebb ("net: dsa: set configure_vlan_while_not_filtering to true by default")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250429201710.330937-9-jonas.gorski@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07 19:30:35 -07:00
Jonas Gorski
45e9d59d39 net: dsa: b53: do not allow to configure VLAN 0
Since we cannot set forwarding destinations per VLAN, we should not have
a VLAN 0 configured, as it would allow untagged traffic to work across
ports on VLAN aware bridges regardless if a PVID untagged VLAN exists.

So remove the VLAN 0 on join, an re-add it on leave. But only do so if
we have a VLAN aware bridge, as without it, untagged traffic would
become tagged with VID 0 on a VLAN unaware bridge.

Fixes: a2482d2ce3 ("net: dsa: b53: Plug in VLAN support")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250429201710.330937-8-jonas.gorski@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07 19:30:35 -07:00
Jonas Gorski
13b152ae40 net: dsa: b53: always rejoin default untagged VLAN on bridge leave
While JOIN_ALL_VLAN allows to join all VLANs, we still need to keep the
default VLAN enabled so that untagged traffic stays untagged.

So rejoin the default VLAN even for switches with JOIN_ALL_VLAN support.

Fixes: 48aea33a77 ("net: dsa: b53: Add JOIN_ALL_VLAN support")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250429201710.330937-7-jonas.gorski@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07 19:30:35 -07:00
Jonas Gorski
a1c1901c5c net: dsa: b53: fix VLAN ID for untagged vlan on bridge leave
The untagged default VLAN is added to the default vlan, which may be
one, but we modify the VLAN 0 entry on bridge leave.

Fix this to use the correct VLAN entry for the default pvid.

Fixes: fea8335317 ("net: dsa: b53: Fix default VLAN ID")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250429201710.330937-6-jonas.gorski@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07 19:30:34 -07:00
Jonas Gorski
083c6b28c0 net: dsa: b53: fix flushing old pvid VLAN on pvid change
Presumably the intention here was to flush the VLAN of the old pvid, not
the added VLAN again, which we already flushed before.

Fixes: a2482d2ce3 ("net: dsa: b53: Plug in VLAN support")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250429201710.330937-5-jonas.gorski@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07 19:30:34 -07:00
Jonas Gorski
f480851981 net: dsa: b53: fix clearing PVID of a port
Currently the PVID of ports are only set when adding/updating VLANs with
PVID set or removing VLANs, but not when clearing the PVID flag of a
VLAN.

E.g. the following flow

$ ip link add br0 type bridge vlan_filtering 1
$ ip link set sw1p1 master bridge
$ bridge vlan add dev sw1p1 vid 10 pvid untagged
$ bridge vlan add dev sw1p1 vid 10 untagged

Would keep the PVID set as 10, despite the flag being cleared. Fix this
by checking if we need to unset the PVID on vlan updates.

Fixes: a2482d2ce3 ("net: dsa: b53: Plug in VLAN support")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250429201710.330937-4-jonas.gorski@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07 19:30:34 -07:00
Jonas Gorski
425f11d4cc net: dsa: b53: keep CPU port always tagged again
The Broadcom management header does not carry the original VLAN tag
state information, just the ingress port, so for untagged frames we do
not know from which VLAN they originated.

Therefore keep the CPU port always tagged except for VLAN 0.

Fixes the following setup:

$ ip link add br0 type bridge vlan_filtering 1
$ ip link set sw1p1 master br0
$ bridge vlan add dev br0 pvid untagged self
$ ip link add sw1p2.10 link sw1p2 type vlan id 10

Where VID 10 would stay untagged on the CPU port.

Fixes: 2c32a3d3c2 ("net: dsa: b53: Do not force CPU to be always tagged")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250429201710.330937-3-jonas.gorski@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07 19:30:34 -07:00
Jonas Gorski
5f93185a75 net: dsa: b53: allow leaky reserved multicast
Allow reserved multicast to ignore VLAN membership so STP and other
management protocols work without a PVID VLAN configured when using a
vlan aware bridge.

Fixes: 967dd82ffc ("net: dsa: b53: Add support for Broadcom RoboSwitch")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250429201710.330937-2-jonas.gorski@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07 19:30:34 -07:00
Aaron Kling
e979a7c79f spi: tegra114: Use value to check for invalid delays
A delay unit of 0 is a valid entry, thus it is not valid to check for
unused delays. Instead, check the value field; if that is zero, the
given delay is unset.

Fixes: 4426e6b4ec ("spi: tegra114: Don't fail set_cs_timing when delays are zero")
Cc: stable@vger.kernel.org
Signed-off-by: Aaron Kling <webgeek1234@gmail.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Link: https://patch.msgid.link/20250506-spi-tegra114-fixup-v1-1-136dc2f732f3@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2025-05-08 10:32:40 +09:00
Jakub Kicinski
ea78f20175 Merge branch 'bug-fixes-from-xdp-patch-series'
Meghana Malladi says:

====================
Bug fixes from XDP patch series

This patch series fixes the bugs introduced while adding
xdp support in the icssg driver, and were reproduced while
running xdp-trafficgen to generate xdp traffic on icssg interfaces.

v1: https://lore.kernel.org/all/20250428120459.244525-1-m-malladi@ti.com/
====================

Link: https://patch.msgid.link/20250506110546.4065715-1-m-malladi@ti.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07 18:19:17 -07:00
Meghana Malladi
1884fc85ae net: ti: icssg-prueth: Report BQL before sending XDP packets
When sending out any kind of traffic, it is essential that the driver
keeps reporting BQL of the number of bytes that have been sent so that
BQL can track the amount of data in the queue and prevents it from
overflowing. If BQL is not reported, the driver may continue sending
packets even when the queue is full, leading to packet loss, congestion
and decreased network performance. Currently this is missing in
emac_xmit_xdp_frame() and this patch fixes it.

Fixes: 62aa3246f4 ("net: ti: icssg-prueth: Add XDP support")
Signed-off-by: Meghana Malladi <m-malladi@ti.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20250506110546.4065715-4-m-malladi@ti.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07 18:19:12 -07:00
Meghana Malladi
8b3fae3e23 net: ti: icssg-prueth: Fix kernel panic during concurrent Tx queue access
Add __netif_tx_lock() to ensure that only one packet is being
transmitted at a time to avoid race conditions in the netif_txq
struct and prevent packet data corruption. Failing to do so causes
kernel panic with the following error:

[ 2184.746764] ------------[ cut here ]------------
[ 2184.751412] kernel BUG at lib/dynamic_queue_limits.c:99!
[ 2184.756728] Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP

logs: https://gist.github.com/MeghanaMalladiTI/9c7aa5fc3b7fb03f87c74aad487956e9

The lock is acquired before calling emac_xmit_xdp_frame() and released after the
call returns. This ensures that the TX queue is protected from concurrent access
during the transmission of XDP frames.

Fixes: 62aa3246f4 ("net: ti: icssg-prueth: Add XDP support")
Signed-off-by: Meghana Malladi <m-malladi@ti.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20250506110546.4065715-3-m-malladi@ti.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07 18:19:11 -07:00
Meghana Malladi
e5641daa0e net: ti: icssg-prueth: Set XDP feature flags for ndev
xdp_features demonstrates what all XDP capabilities are supported
on a given network device. The driver needs to set these xdp_features
flag to let the network stack know what XDP features a given driver
is supporting. These flags need to be set for a given ndev irrespective
of any XDP program being loaded or not.

Fixes: 62aa3246f4 ("net: ti: icssg-prueth: Add XDP support")
Signed-off-by: Meghana Malladi <m-malladi@ti.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20250506110546.4065715-2-m-malladi@ti.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07 18:19:11 -07:00
Paul Chaignon
f5c79ffdc2 bpf: Clarify handling of mark and tstamp by redirect_peer
When switching network namespaces with the bpf_redirect_peer helper, the
skb->mark and skb->tstamp fields are not zeroed out like they can be on
a typical netns switch. This patch clarifies that in the helper
description.

Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/ccc86af26d43c5c0b776bcba2601b7479c0d46d0.1746460653.git.paul.chaignon@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07 18:16:33 -07:00
Paul Chaignon
c432722994 bpf: Scrub packet on bpf_redirect_peer
When bpf_redirect_peer is used to redirect packets to a device in
another network namespace, the skb isn't scrubbed. That can lead skb
information from one namespace to be "misused" in another namespace.

As one example, this is causing Cilium to drop traffic when using
bpf_redirect_peer to redirect packets that just went through IPsec
decryption to a container namespace. The following pwru trace shows (1)
the packet path from the host's XFRM layer to the container's XFRM
layer where it's dropped and (2) the number of active skb extensions at
each function.

    NETNS       MARK  IFACE  TUPLE                                FUNC
    4026533547  d00   eth0   10.244.3.124:35473->10.244.2.158:53  xfrm_rcv_cb
                             .active_extensions = (__u8)2,
    4026533547  d00   eth0   10.244.3.124:35473->10.244.2.158:53  xfrm4_rcv_cb
                             .active_extensions = (__u8)2,
    4026533547  d00   eth0   10.244.3.124:35473->10.244.2.158:53  gro_cells_receive
                             .active_extensions = (__u8)2,
    [...]
    4026533547  0     eth0   10.244.3.124:35473->10.244.2.158:53  skb_do_redirect
                             .active_extensions = (__u8)2,
    4026534999  0     eth0   10.244.3.124:35473->10.244.2.158:53  ip_rcv
                             .active_extensions = (__u8)2,
    4026534999  0     eth0   10.244.3.124:35473->10.244.2.158:53  ip_rcv_core
                             .active_extensions = (__u8)2,
    [...]
    4026534999  0     eth0   10.244.3.124:35473->10.244.2.158:53  udp_queue_rcv_one_skb
                             .active_extensions = (__u8)2,
    4026534999  0     eth0   10.244.3.124:35473->10.244.2.158:53  __xfrm_policy_check
                             .active_extensions = (__u8)2,
    4026534999  0     eth0   10.244.3.124:35473->10.244.2.158:53  __xfrm_decode_session
                             .active_extensions = (__u8)2,
    4026534999  0     eth0   10.244.3.124:35473->10.244.2.158:53  security_xfrm_decode_session
                             .active_extensions = (__u8)2,
    4026534999  0     eth0   10.244.3.124:35473->10.244.2.158:53  kfree_skb_reason(SKB_DROP_REASON_XFRM_POLICY)
                             .active_extensions = (__u8)2,

In this case, there are no XFRM policies in the container's network
namespace so the drop is unexpected. When we decrypt the IPsec packet,
the XFRM state used for decryption is set in the skb extensions. This
information is preserved across the netns switch. When we reach the
XFRM policy check in the container's netns, __xfrm_policy_check drops
the packet with LINUX_MIB_XFRMINNOPOLS because a (container-side) XFRM
policy can't be found that matches the (host-side) XFRM state used for
decryption.

This patch fixes this by scrubbing the packet when using
bpf_redirect_peer, as is done on typical netns switches via veth
devices except skb->mark and skb->tstamp are not zeroed.

Fixes: 9aa1206e8f ("bpf: Add redirect_peer helper")
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/1728ead5e0fe45e7a6542c36bd4e3ca07a73b7d6.1746460653.git.paul.chaignon@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07 18:16:33 -07:00
Lorenzo Bianconi
4a7843cc8a net: airoha: Add missing field to ppe_mbox_data struct
The official Airoha EN7581 firmware requires adding max_packet field in
ppe_mbox_data struct while the unofficial one used to develop the Airoha
EN7581 flowtable support does not require this field.
This patch does not introduce any real backwards compatible issue since
EN7581 fw is not publicly available in linux-firmware or other
repositories (e.g. OpenWrt) yet and the official fw version will use this
new layout. For this reason this change needs to be backported.
Moreover, make explicit the padding added by the compiler introducing
the rsv array in init_info struct.
At the same time use u32 instead of int for init_info and set_info
struct definitions in ppe_mbox_data struct.

Fixes: 23290c7bc1 ("net: airoha: Introduce Airoha NPU support")
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20250506-airoha-en7581-fix-ppe_mbox_data-v5-1-29cabed6864d@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07 18:08:41 -07:00
Jakub Kicinski
dc75a43c07 Merge tag 'nf-25-05-08' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:

====================
Netfilter/IPVS fixes for net

The following patchset contain Netfilter/IPVS fixes for net:

1) Fix KMSAN uninit-value in do_output_route4, reported by syzbot.
   Patch from Julian Anastasov.

2) ipset hashtable set type breaks up the hashtable into regions of
   2^10 buckets. Fix the macro that determines the hashtable lock
   region to protect concurrent updates. From Jozsef Kadlecsik.

* tag 'nf-25-05-08' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  netfilter: ipset: fix region locking in hash types
  ipvs: fix uninit-value for saddr in do_output_route4
====================

Link: https://patch.msgid.link/20250507221952.86505-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07 17:57:04 -07:00
Eelco Chaudron
6beb6835c1 openvswitch: Fix unsafe attribute parsing in output_userspace()
This patch replaces the manual Netlink attribute iteration in
output_userspace() with nla_for_each_nested(), which ensures that only
well-formed attributes are processed.

Fixes: ccb1352e76 ("net: Add Open vSwitch kernel components.")
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Link: https://patch.msgid.link/0bd65949df61591d9171c0dc13e42cea8941da10.1746541734.git.echaudro@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07 16:51:02 -07:00
Alex Deucher
f690e39747 drm/amdgpu/hdp4: use memcfg register to post the write for HDP flush
Reading back the remapped HDP flush register seems to cause
problems on some platforms. All we need is a read, so read back
the memcfg register.

Fixes: c9b8dcabb5 ("drm/amdgpu/hdp4.0: do a posting read when flushing HDP")
Reported-by: Alexey Klimov <alexey.klimov@linaro.org>
Link: https://lists.freedesktop.org/archives/amd-gfx/2025-April/123150.html
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4119
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3908
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 5c937b4a60)
Cc: stable@vger.kernel.org
2025-05-07 18:24:56 -04:00
Alex Deucher
4aaffc8575 drm/amdgpu: fix pm notifier handling
Set the s3/s0ix and s4 flags in the pm notifier so that we can skip
the resource evictions properly in pm prepare based on whether
we are suspending or hibernating.  Drop the eviction as processes
are not frozen at this time, we we can end up getting stuck trying
to evict VRAM while applications continue to submit work which
causes the buffers to get pulled back into VRAM.

v2: Move suspend flags out of pm notifier (Mario)

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4178
Fixes: 2965e6355d ("drm/amd: Add Suspend/Hibernate notification callback support")
Cc: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 06f2dcc241)
Cc: stable@vger.kernel.org
2025-05-07 18:24:30 -04:00
Alex Deucher
d0ce1aaa85 Revert "drm/amd: Stop evicting resources on APUs in suspend"
This reverts commit 3a9626c816.

This breaks S4 because we end up setting the s3/s0ix flags
even when we are entering s4 since prepare is used by both
flows.  The causes both the S3/s0ix and s4 flags to be set
which breaks several checks in the driver which assume they
are mutually exclusive.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3634
Cc: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ce8f7d9589)
Cc: stable@vger.kernel.org
2025-05-07 18:24:04 -04:00
Ruijing Dong
b7e84fb708 drm/amdgpu/vcn: using separate VCN1_AON_SOC offset
VCN1_AON_SOC_ADDRESS_3_0 offset varies on different
VCN generations, the issue in vcn4.0.5 is caused by
a different VCN1_AON_SOC_ADDRESS_3_0 offset.

This patch does the following:

    1. use the same offset for other VCN generations.
    2. use the vcn4.0.5 special offset
    3. update vcn_4_0 and vcn_5_0

Acked-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 5c89ceda99)
Cc: stable@vger.kernel.org
2025-05-07 18:23:40 -04:00
Wayne Lin
65924ec69b drm/amd/display: Fix wrong handling for AUX_DEFER case
[Why]
We incorrectly ack all bytes get written when the reply actually is defer.
When it's defer, means sink is not ready for the request. We should
retry the request.

[How]
Only reply all data get written when receive I2C_ACK|AUX_ACK. Otherwise,
reply the number of actual written bytes received from the sink.
Add some messages to facilitate debugging as well.

Fixes: ad6756b4d7 ("drm/amd/display: Shift dc link aux to aux_payload")
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Ray Wu <ray.wu@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 3637e457eb)
Cc: stable@vger.kernel.org
2025-05-07 18:23:09 -04:00
Wayne Lin
3924f45d4d drm/amd/display: Copy AUX read reply data whenever length > 0
[Why]
amdgpu_dm_process_dmub_aux_transfer_sync() should return all exact data
reply from the sink side. Don't do the analysis job in it.

[How]
Remove unnecessary check condition AUX_TRANSACTION_REPLY_AUX_ACK.

Fixes: ead08b95fa ("drm/amd/display: Fix race condition in DPIA AUX transfer")
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Ray Wu <ray.wu@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 9b540e3fe6)
Cc: stable@vger.kernel.org
2025-05-07 18:20:24 -04:00
Wayne Lin
396dc51b3b drm/amd/display: Remove incorrect checking in dmub aux handler
[Why & How]
"Request length != reply length" is expected behavior defined in spec.
It's not an invalid reply. Besides, replied data handling logic is not
designed to be written in amdgpu_dm_process_dmub_aux_transfer_sync().
Remove the incorrectly handling section.

Fixes: ead08b95fa ("drm/amd/display: Fix race condition in DPIA AUX transfer")
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Ray Wu <ray.wu@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 81b5c6fa62)
Cc: stable@vger.kernel.org
2025-05-07 18:19:36 -04:00
Wayne Lin
bc70e11b55 drm/amd/display: Fix the checking condition in dmub aux handling
[Why & How]
Fix the checking condition for detecting AUX_RET_ERROR_PROTOCOL_ERROR.
It was wrongly checking by "not equals to"

Reviewed-by: Ray Wu <ray.wu@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 1db6c9e9b6)
Cc: stable@vger.kernel.org
2025-05-07 18:17:42 -04:00
Wayne Lin
5a3846648c drm/amd/display: Shift DMUB AUX reply command if necessary
[Why]
Defined value of dmub AUX reply command field get updated but didn't
adjust dm receiving side accordingly.

[How]
Check the received reply command value to see if it's updated version
or not. Adjust it if necessary.

Fixes: ead08b95fa ("drm/amd/display: Fix race condition in DPIA AUX transfer")
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Ray Wu <ray.wu@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit d5c9ade755)
Cc: stable@vger.kernel.org
2025-05-07 18:14:55 -04:00
Austin Zheng
eba692ca3a drm/amd/display: Call FP Protect Before Mode Programming/Mode Support
[Why]
Memory allocation occurs within dml21_validate() for adding phantom planes.
May cause kernel to be tainted due to usage of FP Start.

[How]
Move FP start from dml21_validate to before mode programming/mode support.
Calculations requiring floating point are all done within mode programming
or mode support.

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Austin Zheng <Austin.Zheng@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit fe3250f108)
2025-05-07 18:14:20 -04:00
Alex Hung
2a24755774 drm/amd/display: Remove unnecessary DC_FP_START/DC_FP_END
[WHY & HOW]
Remove the unnecessary DC_FP_START/DC_FP_END pair to reduce time in
preempt_disable. It also fixes "BUG: sleeping function called from
invalid context" error messages because of calling kzalloc with
GFP_KERNEL which can sleep.

Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 94da0735b6)
2025-05-07 18:14:02 -04:00
Aurabindo Pillai
f1c6be3999 drm/amd/display: more liberal vmin/vmax update for freesync
[Why]
FAMS2 expects vmin/vmax to be updated in the case when freesync is
off, but supported. But we only update it when freesync is enabled.

[How]
Change the vsync handler such that dc_stream_adjust_vmin_vmax() its called
irrespective of whether freesync is enabled. If freesync is supported,
then there is no harm in updating vmin/vmax registers.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3546
Reviewed-by: ChiaHsuan Chung <chiahsuan.chung@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit cfb2d41831)
Cc: stable@vger.kernel.org
2025-05-07 18:08:36 -04:00
Roman Li
9984db6374 drm/amd/display: Fix invalid context error in dml helper
[Why]
"BUG: sleeping function called from invalid context" error.
after:
"drm/amd/display: Protect FPU in dml2_validate()/dml21_validate()"

The populate_dml_plane_cfg_from_plane_state() uses the GFP_KERNEL flag
for memory allocation, which shouldn't be used in atomic contexts.

The allocation is needed only for using another helper function
get_scaler_data_for_plane().

[How]
Modify helpers to pass a pointer to scaler_data within existing context,
eliminating the need for dynamic memory allocation/deallocation
and copying.

Fixes: 366e77cd49 ("drm/amd/display: Protect FPU in dml2_validate()/dml21_validate()")
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Roman Li <Roman.Li@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit bd3e84bc98)
Cc: stable@vger.kernel.org
2025-05-07 18:07:44 -04:00
Jozsef Kadlecsik
8478a729c0 netfilter: ipset: fix region locking in hash types
Region locking introduced in v5.6-rc4 contained three macros to handle
the region locks: ahash_bucket_start(), ahash_bucket_end() which gave
back the start and end hash bucket values belonging to a given region
lock and ahash_region() which should give back the region lock belonging
to a given hash bucket. The latter was incorrect which can lead to a
race condition between the garbage collector and adding new elements
when a hash type of set is defined with timeouts.

Fixes: f66ee0410b ("netfilter: ipset: Fix "INFO: rcu detected stall in hash_xxx" reports")
Reported-by: Kota Toda <kota.toda@gmo-cybersecurity.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-05-07 23:57:31 +02:00
Julian Anastasov
e34090d721 ipvs: fix uninit-value for saddr in do_output_route4
syzbot reports for uninit-value for the saddr argument [1].
commit 4754957f04 ("ipvs: do not use random local source address for
tunnels") already implies that the input value of saddr
should be ignored but the code is still reading it which can prevent
to connect the route. Fix it by changing the argument to ret_saddr.

[1]
BUG: KMSAN: uninit-value in do_output_route4+0x42c/0x4d0 net/netfilter/ipvs/ip_vs_xmit.c:147
 do_output_route4+0x42c/0x4d0 net/netfilter/ipvs/ip_vs_xmit.c:147
 __ip_vs_get_out_rt+0x403/0x21d0 net/netfilter/ipvs/ip_vs_xmit.c:330
 ip_vs_tunnel_xmit+0x205/0x2380 net/netfilter/ipvs/ip_vs_xmit.c:1136
 ip_vs_in_hook+0x1aa5/0x35b0 net/netfilter/ipvs/ip_vs_core.c:2063
 nf_hook_entry_hookfn include/linux/netfilter.h:154 [inline]
 nf_hook_slow+0xf7/0x400 net/netfilter/core.c:626
 nf_hook include/linux/netfilter.h:269 [inline]
 __ip_local_out+0x758/0x7e0 net/ipv4/ip_output.c:118
 ip_local_out net/ipv4/ip_output.c:127 [inline]
 ip_send_skb+0x6a/0x3c0 net/ipv4/ip_output.c:1501
 udp_send_skb+0xfda/0x1b70 net/ipv4/udp.c:1195
 udp_sendmsg+0x2fe3/0x33c0 net/ipv4/udp.c:1483
 inet_sendmsg+0x1fc/0x280 net/ipv4/af_inet.c:851
 sock_sendmsg_nosec net/socket.c:712 [inline]
 __sock_sendmsg+0x267/0x380 net/socket.c:727
 ____sys_sendmsg+0x91b/0xda0 net/socket.c:2566
 ___sys_sendmsg+0x28d/0x3c0 net/socket.c:2620
 __sys_sendmmsg+0x41d/0x880 net/socket.c:2702
 __compat_sys_sendmmsg net/compat.c:360 [inline]
 __do_compat_sys_sendmmsg net/compat.c:367 [inline]
 __se_compat_sys_sendmmsg net/compat.c:364 [inline]
 __ia32_compat_sys_sendmmsg+0xc8/0x140 net/compat.c:364
 ia32_sys_call+0x3ffa/0x41f0 arch/x86/include/generated/asm/syscalls_32.h:346
 do_syscall_32_irqs_on arch/x86/entry/syscall_32.c:83 [inline]
 __do_fast_syscall_32+0xb0/0x110 arch/x86/entry/syscall_32.c:306
 do_fast_syscall_32+0x38/0x80 arch/x86/entry/syscall_32.c:331
 do_SYSENTER_32+0x1f/0x30 arch/x86/entry/syscall_32.c:369
 entry_SYSENTER_compat_after_hwframe+0x84/0x8e

Uninit was created at:
 slab_post_alloc_hook mm/slub.c:4167 [inline]
 slab_alloc_node mm/slub.c:4210 [inline]
 __kmalloc_cache_noprof+0x8fa/0xe00 mm/slub.c:4367
 kmalloc_noprof include/linux/slab.h:905 [inline]
 ip_vs_dest_dst_alloc net/netfilter/ipvs/ip_vs_xmit.c:61 [inline]
 __ip_vs_get_out_rt+0x35d/0x21d0 net/netfilter/ipvs/ip_vs_xmit.c:323
 ip_vs_tunnel_xmit+0x205/0x2380 net/netfilter/ipvs/ip_vs_xmit.c:1136
 ip_vs_in_hook+0x1aa5/0x35b0 net/netfilter/ipvs/ip_vs_core.c:2063
 nf_hook_entry_hookfn include/linux/netfilter.h:154 [inline]
 nf_hook_slow+0xf7/0x400 net/netfilter/core.c:626
 nf_hook include/linux/netfilter.h:269 [inline]
 __ip_local_out+0x758/0x7e0 net/ipv4/ip_output.c:118
 ip_local_out net/ipv4/ip_output.c:127 [inline]
 ip_send_skb+0x6a/0x3c0 net/ipv4/ip_output.c:1501
 udp_send_skb+0xfda/0x1b70 net/ipv4/udp.c:1195
 udp_sendmsg+0x2fe3/0x33c0 net/ipv4/udp.c:1483
 inet_sendmsg+0x1fc/0x280 net/ipv4/af_inet.c:851
 sock_sendmsg_nosec net/socket.c:712 [inline]
 __sock_sendmsg+0x267/0x380 net/socket.c:727
 ____sys_sendmsg+0x91b/0xda0 net/socket.c:2566
 ___sys_sendmsg+0x28d/0x3c0 net/socket.c:2620
 __sys_sendmmsg+0x41d/0x880 net/socket.c:2702
 __compat_sys_sendmmsg net/compat.c:360 [inline]
 __do_compat_sys_sendmmsg net/compat.c:367 [inline]
 __se_compat_sys_sendmmsg net/compat.c:364 [inline]
 __ia32_compat_sys_sendmmsg+0xc8/0x140 net/compat.c:364
 ia32_sys_call+0x3ffa/0x41f0 arch/x86/include/generated/asm/syscalls_32.h:346
 do_syscall_32_irqs_on arch/x86/entry/syscall_32.c:83 [inline]
 __do_fast_syscall_32+0xb0/0x110 arch/x86/entry/syscall_32.c:306
 do_fast_syscall_32+0x38/0x80 arch/x86/entry/syscall_32.c:331
 do_SYSENTER_32+0x1f/0x30 arch/x86/entry/syscall_32.c:369
 entry_SYSENTER_compat_after_hwframe+0x84/0x8e

CPU: 0 UID: 0 PID: 22408 Comm: syz.4.5165 Not tainted 6.15.0-rc3-syzkaller-00019-gbc3372351d0c #0 PREEMPT(undef)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025

Reported-by: syzbot+04b9a82855c8aed20860@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/68138dfa.050a0220.14dd7d.0017.GAE@google.com/
Fixes: 4754957f04 ("ipvs: do not use random local source address for tunnels")
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Simon Horman <horms@kernel.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-05-07 23:57:24 +02:00
Aditya Garg
a609cb4cc0 Input: synaptics - enable InterTouch on Dell Precision M3800
Enable InterTouch mode on Dell Precision M3800 by adding "DLL060d" to
the list of SMBus-enabled variants.

Reported-by: Markus Rathgeb <maggu2810@gmail.com>
Signed-off-by: Aditya Garg <gargaditya08@live.com>
Link: https://lore.kernel.org/r/PN3PR01MB959789DD6D574E16141E5DC4B888A@PN3PR01MB9597.INDPRD01.PROD.OUTLOOK.COM
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-05-07 14:56:05 -07:00
Aditya Garg
2abc698ac7 Input: synaptics - enable InterTouch on TUXEDO InfinityBook Pro 14 v5
Enable InterTouch mode on TUXEDO InfinityBook Pro 14 v5 by adding
"SYN1221" to the list of SMBus-enabled variants.

Add support for InterTouch on SYN1221 by adding it to the list of
SMBus-enabled variants.

Reported-by: Matthias Eilert <kernel.hias@eilert.tech>
Tested-by: Matthias Eilert <kernel.hias@eilert.tech>
Signed-off-by: Aditya Garg <gargaditya08@live.com>
Link: https://lore.kernel.org/r/PN3PR01MB9597C033C4BC20EE2A0C4543B888A@PN3PR01MB9597.INDPRD01.PROD.OUTLOOK.COM
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-05-07 14:56:04 -07:00
Aditya Garg
47d768b32e Input: synaptics - enable InterTouch on Dynabook Portege X30L-G
Enable InterTouch mode on Dynabook Portege X30L-G by adding "TOS01f6" to
the list of SMBus-enabled variants.

Reported-by: Xuntao Chi <chotaotao1qaz2wsx@gmail.com>
Tested-by: Xuntao Chi <chotaotao1qaz2wsx@gmail.com>
Signed-off-by: Aditya Garg <gargaditya08@live.com>
Link: https://lore.kernel.org/r/PN3PR01MB959786E4AC797160CDA93012B888A@PN3PR01MB9597.INDPRD01.PROD.OUTLOOK.COM
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-05-07 14:56:03 -07:00
Manuel Fombuena
6d7ea08810 Input: synaptics - enable InterTouch on Dynabook Portege X30-D
[    5.989588] psmouse serio1: synaptics: Your touchpad (PNP: TOS0213 PNP0f03) says it can support a different bus. If i2c-hid and hid-rmi are not used, you might want to try setting psmouse.synaptics_intertouch to 1 and report this to linux-input@vger.kernel.org.
[    6.039923] psmouse serio1: synaptics: Touchpad model: 1, fw: 9.32, id: 0x1e2a1, caps: 0xf00223/0x840300/0x12e800/0x52d884, board id: 3322, fw id: 2658004

The board is labelled TM3322.

Present on the Toshiba / Dynabook Portege X30-D and possibly others.

Confirmed working well with psmouse.synaptics_intertouch=1 and local build.

Signed-off-by: Manuel Fombuena <fombuena@outlook.com>
Signed-off-by: Aditya Garg <gargaditya08@live.com>
Link: https://lore.kernel.org/r/PN3PR01MB9597711E7933A08389FEC31DB888A@PN3PR01MB9597.INDPRD01.PROD.OUTLOOK.COM
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-05-07 14:56:02 -07:00
Dmitry Torokhov
f04f03d3e9 Input: synaptics - enable SMBus for HP Elitebook 850 G1
The kernel reports that the touchpad for this device can support
SMBus mode.

Reported-by: jt <enopatch@gmail.com>
Link: https://lore.kernel.org/r/iys5dbv3ldddsgobfkxldazxyp54kay4bozzmagga6emy45jop@2ebvuxgaui4u
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-05-07 14:56:01 -07:00
Kent Overstreet
8e4d28036c bcachefs: Don't aggressively discard the journal
We frequently use 'bcachefs list_journal -a' for debugging, as it
provides a record of all btree transactions, and a history of what
happened.

But it's not so useful if we immediately discard journal buckets right
after they're no longer dirty.

This tweaks journal reclaim to only discard when we're low on space,
keeping the journal mostly un-discarded.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-07 17:10:10 -04:00
Kent Overstreet
da18dabc37 bcachefs: Ensure superblock gets written when we go ERO
When we go emergency read-only, make sure we do a final write_super() to
persist counters and error counts - this can be critical for piecing
together what fsck was doing.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-07 17:09:59 -04:00
Kent Overstreet
2fea3aa76e bcachefs: Filter out harmless EROFS error messages
These just indicate that we're shutting down.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-07 16:58:32 -04:00
Kent Overstreet
473f09f362 bcachefs: journal_shutdown is EROFS, not EIO
We often filter out EROFS errors to avoid log spew after an emergency
shutdown - journal_shutdown is just another emergency shutdown error.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-07 16:58:26 -04:00
Paul Aurich
3ca02e63ed smb: client: Avoid race in open_cached_dir with lease breaks
A pre-existing valid cfid returned from find_or_create_cached_dir might
race with a lease break, meaning open_cached_dir doesn't consider it
valid, and thinks it's newly-constructed. This leaks a dentry reference
if the allocation occurs before the queued lease break work runs.

Avoid the race by extending holding the cfid_list_lock across
find_or_create_cached_dir and when the result is checked.

Cc: stable@vger.kernel.org
Reviewed-by: Henrique Carvalho <henrique.carvalho@suse.com>
Signed-off-by: Paul Aurich <paul@darkrain42.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-05-07 15:24:46 -05:00
Dmitry Antipov
a6aeb73997 module: ensure that kobject_put() is safe for module type kobjects
In 'lookup_or_create_module_kobject()', an internal kobject is created
using 'module_ktype'. So call to 'kobject_put()' on error handling
path causes an attempt to use an uninitialized completion pointer in
'module_kobject_release()'. In this scenario, we just want to release
kobject without an extra synchronization required for a regular module
unloading process, so adding an extra check whether 'complete()' is
actually required makes 'kobject_put()' safe.

Reported-by: syzbot+7fb8a372e1f6add936dd@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=7fb8a372e1f6add936dd
Fixes: 942e443127 ("module: Fix mod->mkobj.kobj potentially freed too early")
Cc: stable@vger.kernel.org
Suggested-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Link: https://lore.kernel.org/r/20250507065044.86529-1-dmantipov@yandex.ru
Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
2025-05-07 20:24:59 +02:00
Linus Torvalds
d76bb1ebb5 Merge tag 'erofs-for-6.15-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs
Pull erofs fixes from Gao Xiang:

 - Add a new reviewer, Hongbo Li, for better community development

 - Fix an I/O hang out of file-backed mounts

 - Address a rare data corruption caused by concurrent I/Os on the same
   deduplicated compressed data

 - Minor cleanup

* tag 'erofs-for-6.15-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
  erofs: ensure the extra temporary copy is valid for shortened bvecs
  erofs: remove unused enum type
  fs/erofs/fileio: call erofs_onlinefolio_split() after bio_add_folio()
  MAINTAINERS: erofs: add myself as reviewer
2025-05-07 10:19:47 -07:00
Luiz Augusto von Dentz
1e2e3044c1 Bluetooth: MGMT: Fix MGMT_OP_ADD_DEVICE invalid device flags
Device flags could be updated in the meantime while MGMT_OP_ADD_DEVICE
is pending on hci_update_passive_scan_sync so instead of setting the
current_flags as cmd->user_data just do a lookup using
hci_conn_params_lookup and use the latest stored flags.

Fixes: a182d9c84f ("Bluetooth: MGMT: Fix Add Device to responding before completing")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2025-05-07 12:47:53 -04:00
Tejun Heo
428dc9fc08 sched_ext: bpf_iter_scx_dsq_new() should always initialize iterator
BPF programs may call next() and destroy() on BPF iterators even after new()
returns an error value (e.g. bpf_for_each() macro ignores error returns from
new()). bpf_iter_scx_dsq_new() could leave the iterator in an uninitialized
state after an error return causing bpf_iter_scx_dsq_next() to dereference
garbage data. Make bpf_iter_scx_dsq_new() always clear $kit->dsq so that
next() and destroy() become noops.

Signed-off-by: Tejun Heo <tj@kernel.org>
Fixes: 650ba21b13 ("sched_ext: Implement DSQ iterator")
Cc: stable@vger.kernel.org # v6.12+
Acked-by: Andrea Righi <arighi@nvidia.com>
2025-05-07 06:24:07 -10:00
Jason Andryuk
1f0304dfd9 xenbus: Use kref to track req lifetime
Marek reported seeing a NULL pointer fault in the xenbus_thread
callstack:
BUG: kernel NULL pointer dereference, address: 0000000000000000
RIP: e030:__wake_up_common+0x4c/0x180
Call Trace:
 <TASK>
 __wake_up_common_lock+0x82/0xd0
 process_msg+0x18e/0x2f0
 xenbus_thread+0x165/0x1c0

process_msg+0x18e is req->cb(req).  req->cb is set to xs_wake_up(), a
thin wrapper around wake_up(), or xenbus_dev_queue_reply().  It seems
like it was xs_wake_up() in this case.

It seems like req may have woken up the xs_wait_for_reply(), which
kfree()ed the req.  When xenbus_thread resumes, it faults on the zero-ed
data.

Linux Device Drivers 2nd edition states:
"Normally, a wake_up call can cause an immediate reschedule to happen,
meaning that other processes might run before wake_up returns."
... which would match the behaviour observed.

Change to keeping two krefs on each request.  One for the caller, and
one for xenbus_thread.  Each will kref_put() when finished, and the last
will free it.

This use of kref matches the description in
Documentation/core-api/kref.rst

Link: https://lore.kernel.org/xen-devel/ZO0WrR5J0xuwDIxW@mail-itl/
Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Fixes: fd8aa9095a ("xen: optimize xenbus driver for multiple concurrent xenstore accesses")
Cc: stable@vger.kernel.org
Signed-off-by: Jason Andryuk <jason.andryuk@amd.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Message-ID: <20250506210935.5607-1-jason.andryuk@amd.com>
2025-05-07 16:21:41 +02:00
Jason Andryuk
90989869ba xenbus: Allow PVH dom0 a non-local xenstore
Make xenbus_init() allow a non-local xenstore for a PVH dom0 - it is
currently forced to XS_LOCAL.  With Hyperlaunch booting dom0 and a
xenstore stubdom, dom0 can be handled as a regular XS_HVM following the
late init path.

Ideally we'd drop the use of xen_initial_domain() and just check for the
event channel instead.  However, ARM has a xen,enhanced no-xenstore
mode, where the event channel and PFN would both be 0.  Retain the
xen_initial_domain() check, and use that for an additional check when
the event channel is 0.

Check the full 64bit HVM_PARAM_STORE_EVTCHN value to catch the off
chance that high bits are set for the 32bit event channel.

Signed-off-by: Jason Andryuk <jason.andryuk@amd.com>
Change-Id: I5506da42e4c6b8e85079fefb2f193c8de17c7437
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
Message-ID: <20250506204456.5220-1-jason.andryuk@amd.com>
2025-05-07 16:15:27 +02:00
Linus Torvalds
707df33751 Merge tag 'media/v6.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media fixes from Mauro Carvalho Chehab:
 "Some Kconfig dependency fixes"

* tag 'media/v6.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  media: cec: tda9950: add back i2c dependency
  media: i2c: lt6911uxe: add two selects to Kconfig
  media: platform: synopsys: VIDEO_SYNOPSYS_HDMIRX should depend on ARCH_ROCKCHIP
  media: i2c: lt6911uxe: Fix Kconfig dependencies:
  media: vivid: fix FB dependency
2025-05-07 07:00:15 -07:00
Jens Axboe
687b2bae0e io_uring: ensure deferred completions are flushed for multishot
Multishot normally uses io_req_post_cqe() to post completions, but when
stopping it, it may finish up with a deferred completion. This is fine,
except if another multishot event triggers before the deferred completions
get flushed. If this occurs, then CQEs may get reordered in the CQ ring,
as new multishot completions get posted before the deferred ones are
flushed. This can cause confusion on the application side, if strict
ordering is required for the use case.

When multishot posting via io_req_post_cqe(), flush any pending deferred
completions first, if any.

Cc: stable@vger.kernel.org # 6.1+
Reported-by: Norman Maurer <norman_maurer@apple.com>
Reported-by: Christian Mazakas <christian.mazakas@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-07 07:55:15 -06:00
John Ernberg
cd9c058489 xen: swiotlb: Use swiotlb bouncing if kmalloc allocation demands it
Xen swiotlb support was missed when the patch set starting with
4ab5f8ec7d ("mm/slab: decouple ARCH_KMALLOC_MINALIGN from
ARCH_DMA_MINALIGN") was merged.

When running Xen on iMX8QXP, a SoC without IOMMU, the effect was that USB
transfers ended up corrupted when there was more than one URB inflight at
the same time.

Add a call to dma_kmalloc_needs_bounce() to make sure that allocations too
small for DMA get bounced via swiotlb.

Closes: https://lore.kernel.org/linux-usb/ab2776f0-b838-4cf6-a12a-c208eb6aad59@actia.se/
Fixes: 4ab5f8ec7d ("mm/slab: decouple ARCH_KMALLOC_MINALIGN from ARCH_DMA_MINALIGN")
Cc: stable@kernel.org # v6.5+
Signed-off-by: John Ernberg <john.ernberg@actia.se>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
Message-ID: <20250502114043.1968976-2-john.ernberg@actia.se>
2025-05-07 15:27:42 +02:00
Hans de Goede
bfcfe6d335 platform/x86: asus-wmi: Fix wlan_ctrl_by_user detection
The wlan_ctrl_by_user detection was introduced by commit a50bd128f2
("asus-wmi: record wlan status while controlled by userapp").

Quoting from that commit's commit message:

"""
When you call WMIMethod(DSTS, 0x00010011) to get WLAN status, it may return

(1) 0x00050001 (On)
(2) 0x00050000 (Off)
(3) 0x00030001 (On)
(4) 0x00030000 (Off)
(5) 0x00000002 (Unknown)

(1), (2) means that the model has hardware GPIO for WLAN, you can call
WMIMethod(DEVS, 0x00010011, 1 or 0) to turn WLAN on/off.
(3), (4) means that the model doesn’t have hardware GPIO, you need to use
API or driver library to turn WLAN on/off, and call
WMIMethod(DEVS, 0x00010012, 1 or 0) to set WLAN LED status.
After you set WLAN LED status, you can see the WLAN status is changed with
WMIMethod(DSTS, 0x00010011). Because the status is recorded lastly
(ex: Windows), you can use it for synchronization.
(5) means that the model doesn’t have WLAN device.

WLAN is the ONLY special case with upper rule.
"""

The wlan_ctrl_by_user flag should be set on 0x0003000? ((3), (4) above)
return values, but the flag mistakenly also gets set on laptops with
0x0005000? ((1), (2)) return values. This is causing rfkill problems on
laptops where 0x0005000? is returned.

Fix the check to only set the wlan_ctrl_by_user flag for 0x0003000?
return values.

Fixes: a50bd128f2 ("asus-wmi: record wlan status while controlled by userapp")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=219786
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Armin Wolf <W_Armin@gmx.de>
Link: https://lore.kernel.org/r/20250501131702.103360-2-hdegoede@redhat.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-05-07 15:46:34 +03:00
Runhua He
0887817e49 platform/x86/amd/pmc: Declare quirk_spurious_8042 for MECHREVO Wujie 14XA (GX4HRXL)
MECHREVO Wujie 14XA (GX4HRXL) wakes up immediately after s2idle entry.
This happens regardless of whether the laptop is plugged into AC power,
or whether any peripheral is plugged into the laptop.

Similar to commit a55bdad5df ("platform/x86/amd/pmc: Disable keyboard
wakeup on AMD Framework 13"), the MECHREVO Wujie 14XA wakes up almost
instantly after s2idle suspend entry (IRQ1 is the keyboard):

2025-04-18 17:23:57,588 DEBUG:  PM: Triggering wakeup from IRQ 9
2025-04-18 17:23:57,588 DEBUG:  PM: Triggering wakeup from IRQ 1

Add this model to the spurious_8042 quirk to workaround this.

This patch does not affect the wake-up function of the built-in keyboard.
Because the firmware of this machine adds an insurance for keyboard
wake-up events, as it always triggers an additional IRQ 9 to wake up the
system.

Suggested-by: Mingcong Bai <jeffbai@aosc.io>
Suggested-by: Xinhui Yang <cyan@cyano.uk>
Suggested-by: Rong Zhang <i@rong.moe>
Fixes: a55bdad5df ("platform/x86/amd/pmc: Disable keyboard wakeup on AMD Framework 13")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4166
Cc: Mario Limonciello <mario.limonciello@amd.com>
Link: https://zhuanldan.zhihu.com/p/730538041
Tested-by: Yemu Lu <prcups@krgm.moe>
Signed-off-by: Runhua He <hua@aosc.io>
Link: https://lore.kernel.org/r/20250507100103.995395-1-hua@aosc.io
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-05-07 15:43:43 +03:00
John Chau
a032f29a15 platform/x86: thinkpad_acpi: Support also NEC Lavie X1475JAS
Change get_thinkpad_model_data() to check for additional vendor name
"NEC" in order to support NEC Lavie X1475JAS notebook (and perhaps
more).

The reason of this works with minimal changes is because NEC Lavie
X1475JAS is a Thinkpad inside. ACPI dumps reveals its OEM ID to be
"LENOVO", BIOS version "R2PET30W" matches typical Lenovo BIOS version,
the existence of HKEY of LEN0268, with DMI fw string is "R2PHT24W".

I compiled and tested with my own machine, attached the dmesg
below as proof of work:
[    6.288932] thinkpad_acpi: ThinkPad ACPI Extras v0.26
[    6.288937] thinkpad_acpi: http://ibm-acpi.sf.net/
[    6.288938] thinkpad_acpi: ThinkPad BIOS R2PET30W (1.11 ), EC R2PHT24W
[    6.307000] thinkpad_acpi: radio switch found; radios are enabled
[    6.307030] thinkpad_acpi: This ThinkPad has standard ACPI backlight brightness control, supported by the ACPI video driver
[    6.307033] thinkpad_acpi: Disabling thinkpad-acpi brightness events by default...
[    6.320322] thinkpad_acpi: rfkill switch tpacpi_bluetooth_sw: radio is unblocked
[    6.371963] thinkpad_acpi: secondary fan control detected & enabled
[    6.391922] thinkpad_acpi: battery 1 registered (start 0, stop 85, behaviours: 0x7)
[    6.398375] input: ThinkPad Extra Buttons as /devices/platform/thinkpad_acpi/input/input13

Signed-off-by: John Chau <johnchau@0atlas.com>
Link: https://lore.kernel.org/r/20250504165513.295135-1-johnchau@0atlas.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-05-07 15:41:40 +03:00
Niklas Schnelle
05a2538f2b s390/pci: Fix duplicate pci_dev_put() in disable_slot() when PF has child VFs
With commit bcb5d6c769 ("s390/pci: introduce lock to synchronize state
of zpci_dev's") the code to ignore power off of a PF that has child VFs
was changed from a direct return to a goto to the unlock and
pci_dev_put() section. The change however left the existing pci_dev_put()
untouched resulting in a doubple put. This can subsequently cause a use
after free if the struct pci_dev is released in an unexpected state.
Fix this by removing the extra pci_dev_put().

Cc: stable@vger.kernel.org
Fixes: bcb5d6c769 ("s390/pci: introduce lock to synchronize state of zpci_dev's")
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-05-07 13:35:05 +02:00
Niklas Schnelle
42420c50c6 s390/pci: Fix missing check for zpci_create_device() error return
The zpci_create_device() function returns an error pointer that needs to
be checked before dereferencing it as a struct zpci_dev pointer. Add the
missing check in __clp_add() where it was missed when adding the
scan_list in the fixed commit. Simply not adding the device to the scan
list results in the previous behavior.

Cc: stable@vger.kernel.org
Fixes: 0467cdde8c ("s390/pci: Sort PCI functions prior to creating virtual busses")
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-05-07 13:35:05 +02:00
Jan Kara
55dd5b4db3 udf: Make sure i_lenExtents is uptodate on inode eviction
UDF maintains total length of all extents in i_lenExtents. Generally we
keep extent lengths (and thus i_lenExtents) block aligned because it
makes the file appending logic simpler. However the standard mandates
that the inode size must match the length of all extents and thus we
trim the last extent when closing the file. To catch possible bugs we
also verify that i_lenExtents matches i_size when evicting inode from
memory. Commit b405c1e58b ("udf: refactor udf_next_aext() to handle
error") however broke the code updating i_lenExtents and thus
udf_evict_inode() ended up spewing lots of errors about incorrectly
sized extents although the extents were actually sized properly. Fix the
updating of i_lenExtents to silence the errors.

Fixes: b405c1e58b ("udf: refactor udf_next_aext() to handle error")
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
2025-05-07 12:04:07 +02:00
Mostafa Saleh
3949e28786 KVM: arm64: Fix memory check in host_stage2_set_owner_locked()
I found this simple bug while preparing some patches for pKVM.
AFAICT, it should be harmless (besides crashing the kernel if it
was misbehaving)

Fixes: e94a7dea29 ("KVM: arm64: Move host page ownership tracking to the hyp vmemmap")
Signed-off-by: Mostafa Saleh <smostafa@google.com>
Link: https://lore.kernel.org/r/20250501162450.2784043-1-smostafa@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-05-07 00:17:05 -07:00
Marc Zyngier
ef296ee98b KVM: arm64: Kill HCRX_HOST_FLAGS
HCRX_HOST_FLAGS, like most of these hardcoded setups, are not
a good match for options that can be selectively enabled or
disabled.

Nothing but the early setup is relying on it now, so kill the
macro and move the bag of bits where they belong.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250430105916.3815157-3-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-05-07 00:16:44 -07:00
Marc Zyngier
ffea7c73d1 KVM: arm64: Properly save/restore HCRX_EL2
Rather than restoring HCRX_EL2 to a fixed value on vcpu exit,
perform a full save/restore of the register, ensuring that
we don't lose bits that would have been set at some point in
the host kernel lifetime, such as the GCSEn bit.

Fixes: ff5181d8a2 ("arm64/gcs: Provide basic EL2 setup to allow GCS usage at EL0 and EL1")
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250430105916.3815157-2-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-05-07 00:16:44 -07:00
Daniel Wagner
650415fca0 nvme: unblock ctrl state transition for firmware update
The original nvme subsystem design didn't have a CONNECTING state; the
state machine allowed transitions from RESETTING to LIVE directly.

With the introduction of nvme fabrics the CONNECTING state was
introduce. Over time the nvme-pci started to use the CONNECTING state as
well.

Eventually, a bug fix for the nvme-fc started to depend that the only
valid transition to LIVE was from CONNECTING. Though this change didn't
update the firmware update handler which was still depending on
RESETTING to LIVE transition.

The simplest way to address it for the time being is to switch into
CONNECTING state before going to LIVE state.

Fixes: d2fe192348 ("nvme: only allow entering LIVE from CONNECTING state")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Daniel Wagner <wagi@kernel.org>
Closes: https://lore.kernel.org/all/0134ea15-8d5f-41f7-9e9a-d7e6d82accaa@roeck-us.net
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
2025-05-07 09:01:20 +02:00
Jakub Kicinski
9540984da6 Merge tag 'wireless-2025-05-06' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless
Johannes Berg says:

====================
Couple of fixes:
 * iwlwifi: add two missing device entries
 * cfg80211: fix a potential out-of-bounds access
 * mac80211: fix format of TID to link mapping action frames

* tag 'wireless-2025-05-06' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
  wifi: iwlwifi: add support for Killer on MTL
  wifi: mac80211: fix the type of status_code for negotiated TID to Link Mapping
  wifi: cfg80211: fix out-of-bounds access during multi-link element defragmentation
====================

Link: https://patch.msgid.link/20250506203506.158818-3-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-06 19:06:50 -07:00
Jakub Kicinski
2e6259d821 Merge tag 'linux-can-fixes-for-6.15-20250506' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:

====================
pull-request: can 2025-05-06

The first patch is by Antonios Salios and adds a missing
spin_lock_init() to the m_can driver.

The next 3 patches are by me and fix the unregistration order in the
mcp251xfd, rockchip_canfd and m_can driver.

The last patch is by Oliver Hartkopp and fixes RCU and BH
locking/handling in the CAN gw protocol.

* tag 'linux-can-fixes-for-6.15-20250506' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
  can: gw: fix RCU/BH usage in cgw_create_job()
  can: mcan: m_can_class_unregister(): fix order of unregistration calls
  can: rockchip_canfd: rkcanfd_remove(): fix order of unregistration calls
  can: mcp251xfd: mcp251xfd_remove(): fix order of unregistration calls
  can: mcp251xfd: fix TDC setting for low data bit rates
  can: m_can: m_can_class_allocate_dev(): initialize spin lock on device probe
====================

Link: https://patch.msgid.link/20250506135939.652543-1-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-06 18:56:36 -07:00
Stanislav Fomichev
78cd408356 net: add missing instance lock to dev_set_promiscuity
Accidentally spotted while trying to understand what else needs
to be renamed to netif_ prefix. Most of the calls to dev_set_promiscuity
are adjacent to dev_set_allmulti or dev_disable_lro so it should
be safe to add the lock. Note that new netif_set_promiscuity is
currently unused, the locked paths call __dev_set_promiscuity directly.

Fixes: ad7c7b2172 ("net: hold netdev instance lock during sysfs operations")
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250506011919.2882313-1-sdf@fomichev.me
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-06 18:52:39 -07:00
Gao Xiang
35076d2223 erofs: ensure the extra temporary copy is valid for shortened bvecs
When compressed data deduplication is enabled, multiple logical extents
may reference the same compressed physical cluster.

The previous commit 94c43de735 ("erofs: fix wrong primary bvec
selection on deduplicated extents") already avoids using shortened
bvecs.  However, in such cases, the extra temporary buffers also
need to be preserved for later use in z_erofs_fill_other_copies() to
to prevent data corruption.

IOWs, extra temporary buffers have to be retained not only due to
varying start relative offsets (`pageofs_out`, as indicated by
`pcl->multibases`) but also because of shortened bvecs.

android.hardware.graphics.composer@2.1.so : 270696 bytes
   0:        0..  204185 |  204185 :  628019200.. 628084736 |   65536
-> 1:   204185..  225536 |   21351 :  544063488.. 544129024 |   65536
   2:   225536..  270696 |   45160 :          0..         0 |       0

com.android.vndk.v28.apex : 93814897 bytes
...
   364: 53869896..54095257 |  225361 :  543997952.. 544063488 |   65536
-> 365: 54095257..54309344 |  214087 :  544063488.. 544129024 |   65536
   366: 54309344..54514557 |  205213 :  544129024.. 544194560 |   65536
...

Both 204185 and 54095257 have the same start relative offset of 3481,
but the logical page 55 of `android.hardware.graphics.composer@2.1.so`
ranges from 225280 to 229632, forming a shortened bvec [225280, 225536)
that cannot be used for decompressing the range from 54095257 to
54309344 of `com.android.vndk.v28.apex`.

Since `pcl->multibases` is already meaningless, just mark `be->keepxcpy`
on demand for simplicity.

Again, this issue can only lead to data corruption if `-Ededupe` is on.

Fixes: 94c43de735 ("erofs: fix wrong primary bvec selection on deduplicated extents")
Reviewed-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20250506101850.191506-1-hsiangkao@linux.alibaba.com
2025-05-07 09:50:51 +08:00
Cosmin Ratiu
08e9f2d584 net: Lock netdevices during dev_shutdown
__qdisc_destroy() calls into various qdiscs .destroy() op, which in turn
can call .ndo_setup_tc(), which requires the netdev instance lock.

This commit extends the critical section in
unregister_netdevice_many_notify() to cover dev_shutdown() (and
dev_tcx_uninstall() as a side-effect) and acquires the netdev instance
lock in __dev_change_net_namespace() for the other dev_shutdown() call.

This should now guarantee that for all qdisc ops, the netdev instance
lock is held during .ndo_setup_tc().

Fixes: a0527ee2df ("net: hold netdev instance lock during qdisc ndo_setup_tc")
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250505194713.1723399-1-cratiu@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-06 18:31:32 -07:00
Przemek Kitszel
0093cb194a ice: use DSN instead of PCI BDF for ice_adapter index
Use Device Serial Number instead of PCI bus/device/function for
the index of struct ice_adapter.

Functions on the same physical device should point to the very same
ice_adapter instance, but with two PFs, when at least one of them is
PCI-e passed-through to a VM, it is no longer the case - PFs will get
seemingly random PCI BDF values, and thus indices, what finally leds to
each of them being on their own instance of ice_adapter. That causes them
to don't attempt any synchronization of the PTP HW clock usage, or any
other future resources.

DSN works nicely in place of the index, as it is "immutable" in terms of
virtualization.

Fixes: 0e2bddf9e5 ("ice: add ice_adapter for shared data across PFs on the same NIC")
Suggested-by: Jacob Keller <jacob.e.keller@intel.com>
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Suggested-by: Jiri Pirko <jiri@resnulli.us>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://patch.msgid.link/20250505161939.2083581-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-06 18:27:14 -07:00
Paweł Anikiel
5595c31c37 x86/Kconfig: make CFI_AUTO_DEFAULT depend on !RUST or Rust >= 1.88
Calling core::fmt::write() from rust code while FineIBT is enabled
results in a kernel panic:

[ 4614.199779] kernel BUG at arch/x86/kernel/cet.c:132!
[ 4614.205343] Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[ 4614.211781] CPU: 2 UID: 0 PID: 6057 Comm: dmabuf_dump Tainted: G     U     O       6.12.17-android16-0-g6ab38c534a43 #1 9da040f27673ec3945e23b998a0f8bd64c846599
[ 4614.227832] Tainted: [U]=USER, [O]=OOT_MODULE
[ 4614.241247] RIP: 0010:do_kernel_cp_fault+0xea/0xf0
...
[ 4614.398144] RIP: 0010:_RNvXs5_NtNtNtCs3o2tGsuHyou_4core3fmt3num3impyNtB9_7Display3fmt+0x0/0x20
[ 4614.407792] Code: 48 f7 df 48 0f 48 f9 48 89 f2 89 c6 5d e9 18 fd ff ff 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 41 81 ea 14 61 af 2c 74 03 0f 0b 90 <66> 0f 1f 00 55 48 89 e5 48 89 f2 48 8b 3f be 01 00 00 00 5d e9 e7
[ 4614.428775] RSP: 0018:ffffb95acfa4ba68 EFLAGS: 00010246
[ 4614.434609] RAX: 0000000000000000 RBX: 0000000000000010 RCX: 0000000000000000
[ 4614.442587] RDX: 0000000000000007 RSI: ffffb95acfa4ba70 RDI: ffffb95acfa4bc88
[ 4614.450557] RBP: ffffb95acfa4bae0 R08: ffff0a00ffffff05 R09: 0000000000000070
[ 4614.458527] R10: 0000000000000000 R11: ffffffffab67eaf0 R12: ffffb95acfa4bcc8
[ 4614.466493] R13: ffffffffac5d50f0 R14: 0000000000000000 R15: 0000000000000000
[ 4614.474473]  ? __cfi__RNvXs5_NtNtNtCs3o2tGsuHyou_4core3fmt3num3impyNtB9_7Display3fmt+0x10/0x10
[ 4614.484118]  ? _RNvNtCs3o2tGsuHyou_4core3fmt5write+0x1d2/0x250

This happens because core::fmt::write() calls
core::fmt::rt::Argument::fmt(), which currently has CFI disabled:

library/core/src/fmt/rt.rs:
171     // FIXME: Transmuting formatter in new and indirectly branching to/calling
172     // it here is an explicit CFI violation.
173     #[allow(inline_no_sanitize)]
174     #[no_sanitize(cfi, kcfi)]
175     #[inline]
176     pub(super) unsafe fn fmt(&self, f: &mut Formatter<'_>) -> Result {

This causes a Control Protection exception, because FineIBT has sealed
off the original function's endbr64.

This makes rust currently incompatible with FineIBT. Add a Kconfig
dependency that prevents FineIBT from getting turned on by default
if rust is enabled.

[ Rust 1.88.0 (scheduled for 2025-06-26) should have this fixed [1],
  and thus we relaxed the condition with Rust >= 1.88.

  When `objtool` lands checking for this with e.g. [2], the plan is
  to ideally run that in upstream Rust's CI to prevent regressions
  early [3], since we do not control `core`'s source code.

  Alice tested the Rust PR backported to an older compiler.

  Peter would like that Rust provides a stable `core` which can be
  pulled into the kernel: "Relying on that much out of tree code is
  'unfortunate'".

    - Miguel ]

Signed-off-by: Paweł Anikiel <panikiel@google.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://github.com/rust-lang/rust/pull/139632 [1]
Link: https://lore.kernel.org/rust-for-linux/20250410154556.GB9003@noisy.programming.kicks-ass.net/ [2]
Link: https://github.com/rust-lang/rust/pull/139632#issuecomment-2801950873 [3]
Link: https://lore.kernel.org/r/20250410115420.366349-1-panikiel@google.com
Link: https://lore.kernel.org/r/att0-CANiq72kjDM0cKALVy4POEzhfdT4nO7tqz0Pm7xM+3=_0+L1t=A@mail.gmail.com
[ Reduced splat. - Miguel ]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2025-05-07 00:11:47 +02:00
Miguel Ojeda
211dcf7785 rust: clean Rust 1.88.0's clippy::uninlined_format_args lint
Starting with Rust 1.88.0 (expected 2025-06-26) [1], `rustc` may move
back the `uninlined_format_args` to `style` from `pedantic` (it was
there waiting for rust-analyzer suppotr), and thus we will start to see
lints like:

    warning: variables can be used directly in the `format!` string
       --> rust/macros/kunit.rs:105:37
        |
    105 |         let kunit_wrapper_fn_name = format!("kunit_rust_wrapper_{}", test);
        |                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        |
        = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#uninlined_format_args
    help: change this to
        |
    105 -         let kunit_wrapper_fn_name = format!("kunit_rust_wrapper_{}", test);
    105 +         let kunit_wrapper_fn_name = format!("kunit_rust_wrapper_{test}");

There is even a case that is a pure removal:

    warning: variables can be used directly in the `format!` string
      --> rust/macros/module.rs:51:13
       |
    51 |             format!("{field}={content}\0", field = field, content = content)
       |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
       |
       = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#uninlined_format_args
    help: change this to
       |
    51 -             format!("{field}={content}\0", field = field, content = content)
    51 +             format!("{field}={content}\0")

The lints all seem like nice cleanups, thus just apply them.

We may want to disable `allow-mixed-uninlined-format-args` in the future.

Cc: stable@vger.kernel.org # Needed in 6.12.y and later (Rust is pinned in older LTSs).
Link: https://github.com/rust-lang/rust-clippy/pull/14160 [1]
Acked-by: Benno Lossin <lossin@kernel.org>
Reviewed-by: Tamir Duberstein <tamird@gmail.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/r/20250502140237.1659624-6-ojeda@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2025-05-07 00:11:47 +02:00
Miguel Ojeda
c016722fd5 rust: clean Rust 1.88.0's warning about clippy::disallowed_macros configuration
Starting with Rust 1.88.0 (expected 2025-06-26) [1], Clippy may start
warning about paths that do not resolve in the `disallowed_macros`
configuration:

    warning: `kernel::dbg` does not refer to an existing macro
      --> .clippy.toml:10:5
       |
    10 |     { path = "kernel::dbg", reason = "the `dbg!` macro is intended as a debugging tool" },
       |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This is a lint we requested at [2], due to the trouble debugging
the lint due to false negatives (e.g. [3]), which we use to emulate
`clippy::dbg_macro` [4]. See commit 8577c9dca7 ("rust: replace
`clippy::dbg_macro` with `disallowed_macros`") for more details.

Given the false negatives are not resolved yet, it is expected that
Clippy complains about not finding this macro.

Thus, until the false negatives are fixed (and, even then, probably we
will need to wait for the MSRV to raise enough), use the escape hatch
to allow an invalid path.

Cc: stable@vger.kernel.org # Needed in 6.12.y and later (Rust is pinned in older LTSs).
Link: https://github.com/rust-lang/rust-clippy/pull/14397 [1]
Link: https://github.com/rust-lang/rust-clippy/issues/11432 [2]
Link: https://github.com/rust-lang/rust-clippy/issues/11431 [3]
Link: https://github.com/rust-lang/rust-clippy/issues/11303 [4]
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/r/20250502140237.1659624-5-ojeda@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2025-05-07 00:11:47 +02:00
Miguel Ojeda
7129ea6e24 rust: clean Rust 1.88.0's unnecessary_transmutes lint
Starting with Rust 1.88.0 (expected 2025-06-26) [1][2], `rustc` may
introduce a new lint that catches unnecessary transmutes, e.g.:

     error: unnecessary transmute
         --> rust/uapi/uapi_generated.rs:23242:18
          |
    23242 |         unsafe { ::core::mem::transmute(self._bitfield_1.get(0usize, 1u8) as u8) }
          |                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: replace this with: `(self._bitfield_1.get(0usize, 1u8) as u8 == 1)`
          |
          = note: `-D unnecessary-transmutes` implied by `-D warnings`
          = help: to override `-D warnings` add `#[allow(unnecessary_transmutes)]`

There are a lot of them (at least 300), but luckily they are all in
`bindgen`-generated code.

Thus clean all up by allowing it there.

Since unknown lints trigger a lint itself in older compilers, do it
conditionally so that we can keep the `unknown_lints` lint enabled.

Cc: stable@vger.kernel.org # Needed in 6.12.y and later (Rust is pinned in older LTSs).
Link: https://github.com/rust-lang/rust/pull/136083 [1]
Link: https://github.com/rust-lang/rust/issues/136067 [2]
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/r/20250502140237.1659624-4-ojeda@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2025-05-07 00:11:47 +02:00
Miguel Ojeda
a39f308709 rust: allow Rust 1.87.0's clippy::ptr_eq lint
Starting with Rust 1.87.0 (expected 2025-05-15) [1], Clippy may expand
the `ptr_eq` lint, e.g.:

    error: use `core::ptr::eq` when comparing raw pointers
       --> rust/kernel/list.rs:438:12
        |
    438 |         if self.first == item {
        |            ^^^^^^^^^^^^^^^^^^ help: try: `core::ptr::eq(self.first, item)`
        |
        = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#ptr_eq
        = note: `-D clippy::ptr-eq` implied by `-D warnings`
        = help: to override `-D warnings` add `#[allow(clippy::ptr_eq)]`

It is expected that a PR to relax the lint will be backported [2] by
the time Rust 1.87.0 releases, since the lint was considered too eager
(at least by default) [3].

Thus allow the lint temporarily just in case.

Cc: stable@vger.kernel.org # Needed in 6.12.y and later (Rust is pinned in older LTSs).
Link: https://github.com/rust-lang/rust-clippy/pull/14339 [1]
Link: https://github.com/rust-lang/rust-clippy/pull/14526 [2]
Link: https://github.com/rust-lang/rust-clippy/issues/14525 [3]
Link: https://lore.kernel.org/r/20250502140237.1659624-3-ojeda@kernel.org
[ Converted to `allow`s since backport was confirmed. - Miguel ]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2025-05-07 00:11:02 +02:00
Miguel Ojeda
19f5ca461d objtool/rust: add one more noreturn Rust function for Rust 1.87.0
Starting with Rust 1.87.0 (expected 2025-05-15), `objtool` may report:

    rust/core.o: warning: objtool: _R..._4core9panicking9panic_fmt() falls
    through to next function _R..._4core9panicking18panic_nounwind_fmt()

    rust/core.o: warning: objtool: _R..._4core9panicking18panic_nounwind_fmt()
    falls through to next function _R..._4core9panicking5panic()

The reason is that `rust_begin_unwind` is now mangled:

    _R..._7___rustc17rust_begin_unwind

Thus add the mangled one to the list so that `objtool` knows it is
actually `noreturn`.

See commit 56d680dd23 ("objtool/rust: list `noreturn` Rust functions")
for more details.

Alternatively, we could remove the fixed one in `noreturn.h` and relax
this test to cover both, but it seems best to be strict as long as we can.

Cc: stable@vger.kernel.org # Needed in 6.12.y and later (Rust is pinned in older LTSs).
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/r/20250502140237.1659624-2-ojeda@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2025-05-07 00:08:51 +02:00
Stephen Boyd
2bc3ada090 Merge tag 'v6.15-rockchip-clkfixes1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into clk-fixes
Pull a Rockchip clk driver fix from Heiko Stuebner:

Actually define the gate-clk for the otg-phy on rk3576 to make the nvmem-
support work, that was merged for 6.15.

* tag 'v6.15-rockchip-clkfixes1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip:
  clk: rockchip: rk3576: define clk_otp_phy_g
2025-05-06 13:55:25 -07:00
Johannes Berg
ebedf8b7f0 wifi: iwlwifi: add support for Killer on MTL
For now, we need another entry for these devices, this
will be changed completely for 6.16.

Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219926
Link: https://patch.msgid.link/20250506214258.2efbdc9e9a82.I31915ec252bd1c74bd53b89a0e214e42a74b6f2e@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-05-06 21:50:19 +02:00
Michael-CY Lee
e12a42f64f wifi: mac80211: fix the type of status_code for negotiated TID to Link Mapping
The status code should be type of __le16.

Fixes: 83e897a961 ("wifi: ieee80211: add definitions for negotiated TID to Link map")
Fixes: 8f500fbc6c ("wifi: mac80211: process and save negotiated TID to Link mapping request")
Signed-off-by: Michael-CY Lee <michael-cy.lee@mediatek.com>
Link: https://patch.msgid.link/20250505081946.3927214-1-michael-cy.lee@mediatek.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-05-06 21:27:10 +02:00
Veerendranath Jakkam
023c1f2f06 wifi: cfg80211: fix out-of-bounds access during multi-link element defragmentation
Currently during the multi-link element defragmentation process, the
multi-link element length added to the total IEs length when calculating
the length of remaining IEs after the multi-link element in
cfg80211_defrag_mle(). This could lead to out-of-bounds access if the
multi-link element or its corresponding fragment elements are the last
elements in the IEs buffer.

To address this issue, correctly calculate the remaining IEs length by
deducting the multi-link element end offset from total IEs end offset.

Cc: stable@vger.kernel.org
Fixes: 2481b5da9c ("wifi: cfg80211: handle BSS data contained in ML probe responses")
Signed-off-by: Veerendranath Jakkam <quic_vjakkam@quicinc.com>
Link: https://patch.msgid.link/20250424-fix_mle_defragmentation_oob_access-v1-1-84412a1743fa@quicinc.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-05-06 21:04:40 +02:00
Alex Williamson
c1d9dac0db vfio/pci: Align huge faults to order
The vfio-pci huge_fault handler doesn't make any attempt to insert a
mapping containing the faulting address, it only inserts mappings if the
faulting address and resulting pfn are aligned.  This works in a lot of
cases, particularly in conjunction with QEMU where DMA mappings linearly
fault the mmap.  However, there are configurations where we don't get
that linear faulting and pages are faulted on-demand.

The scenario reported in the bug below is such a case, where the physical
address width of the CPU is greater than that of the IOMMU, resulting in a
VM where guest firmware has mapped device MMIO beyond the address width of
the IOMMU.  In this configuration, the MMIO is faulted on demand and
tracing indicates that occasionally the faults generate a VM_FAULT_OOM.
Given the use case, this results in a "error: kvm run failed Bad address",
killing the VM.

The host is not under memory pressure in this test, therefore it's
suspected that VM_FAULT_OOM is actually the result of a NULL return from
__pte_offset_map_lock() in the get_locked_pte() path from insert_pfn().
This suggests a potential race inserting a pte concurrent to a pmd, and
maybe indicates some deficiency in the mm layer properly handling such a
case.

Nevertheless, Peter noted the inconsistency of vfio-pci's huge_fault
handler where our mapping granularity depends on the alignment of the
faulting address relative to the order rather than aligning the faulting
address to the order to more consistently insert huge mappings.  This
change not only uses the page tables more consistently and efficiently, but
as any fault to an aligned page results in the same mapping, the race
condition suspected in the VM_FAULT_OOM is avoided.

Reported-by: Adolfo <adolfotregosa@gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220057
Fixes: 09dfc8a5f2 ("vfio/pci: Fallback huge faults for unaligned pfn")
Cc: stable@vger.kernel.org
Tested-by: Adolfo <adolfotregosa@gmail.com>
Co-developed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20250502224035.3183451-1-alex.williamson@redhat.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2025-05-06 12:59:12 -06:00
Zhu Yanjun
d0706bfd3e RDMA/core: Fix "KASAN: slab-use-after-free Read in ib_register_device" problem
Call Trace:

 __dump_stack lib/dump_stack.c:94 [inline]
 dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120
 print_address_description mm/kasan/report.c:408 [inline]
 print_report+0xc3/0x670 mm/kasan/report.c:521
 kasan_report+0xe0/0x110 mm/kasan/report.c:634
 strlen+0x93/0xa0 lib/string.c:420
 __fortify_strlen include/linux/fortify-string.h:268 [inline]
 get_kobj_path_length lib/kobject.c:118 [inline]
 kobject_get_path+0x3f/0x2a0 lib/kobject.c:158
 kobject_uevent_env+0x289/0x1870 lib/kobject_uevent.c:545
 ib_register_device drivers/infiniband/core/device.c:1472 [inline]
 ib_register_device+0x8cf/0xe00 drivers/infiniband/core/device.c:1393
 rxe_register_device+0x275/0x320 drivers/infiniband/sw/rxe/rxe_verbs.c:1552
 rxe_net_add+0x8e/0xe0 drivers/infiniband/sw/rxe/rxe_net.c:550
 rxe_newlink+0x70/0x190 drivers/infiniband/sw/rxe/rxe.c:225
 nldev_newlink+0x3a3/0x680 drivers/infiniband/core/nldev.c:1796
 rdma_nl_rcv_msg+0x387/0x6e0 drivers/infiniband/core/netlink.c:195
 rdma_nl_rcv_skb.constprop.0.isra.0+0x2e5/0x450
 netlink_unicast_kernel net/netlink/af_netlink.c:1313 [inline]
 netlink_unicast+0x53a/0x7f0 net/netlink/af_netlink.c:1339
 netlink_sendmsg+0x8d1/0xdd0 net/netlink/af_netlink.c:1883
 sock_sendmsg_nosec net/socket.c:712 [inline]
 __sock_sendmsg net/socket.c:727 [inline]
 ____sys_sendmsg+0xa95/0xc70 net/socket.c:2566
 ___sys_sendmsg+0x134/0x1d0 net/socket.c:2620
 __sys_sendmsg+0x16d/0x220 net/socket.c:2652
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xcd/0x260 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

This problem is similar to the problem that the
commit 1d6a9e7449 ("RDMA/core: Fix use-after-free when rename device name")
fixes.

The root cause is: the function ib_device_rename() renames the name with
lock. But in the function kobject_uevent(), this name is accessed without
lock protection at the same time.

The solution is to add the lock protection when this name is accessed in
the function kobject_uevent().

Fixes: 779e0bf476 ("RDMA/core: Do not indicate device ready when device enablement fails")
Link: https://patch.msgid.link/r/20250506151008.75701-1-yanjun.zhu@linux.dev
Reported-by: syzbot+e2ce9e275ecc70a30b72@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=e2ce9e275ecc70a30b72
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-05-06 14:36:57 -03:00
Oscar Salvador
42e31f0daf mm,mm_init: Mark set_high_memory as __init
set_high_memory() touches arch_zone_lowest_possible_pfn which is
marked as __initdata, which creates a section mismatch.
Since the only user of the function is free_area_init() which is also marked
as __init, mark set_high_memory() as __init as well.

Signed-off-by: Oscar Salvador <osalvador@suse.de>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202505060901.Qcs06UoB-lkp@intel.com/
Link: https://lore.kernel.org/r/20250506111012.108743-1-osalvador@suse.de
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
2025-05-06 20:07:25 +03:00
Linus Torvalds
0d8d44db29 Merge tag 'for-6.15-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:

 - revert device path canonicalization, this does not work as intended
   with namespaces and is not reliable in all setups

 - fix crash in scrub when checksum tree is not valid, e.g. when mounted
   with rescue=ignoredatacsums

 - fix crash when tracepoint btrfs_prelim_ref_insert is enabled

 - other minor fixups:
     - open code folio_index(), meant to be used in MM code
     - use matching type for sizeof in compression allocation

* tag 'for-6.15-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: open code folio_index() in btree_clear_folio_dirty_tag()
  Revert "btrfs: canonicalize the device path before adding it"
  btrfs: avoid NULL pointer dereference if no valid csum tree
  btrfs: handle empty eb->folios in num_extent_folios()
  btrfs: correct the order of prelim_ref arguments in btrfs__prelim_ref
  btrfs: compression: adjust cb->compressed_folios allocation type
2025-05-06 08:19:09 -07:00
Pawan Gupta
073fdbe02c x86/bhi: Do not set BHI_DIS_S in 32-bit mode
With the possibility of intra-mode BHI via cBPF, complete mitigation for
BHI is to use IBHF (history fence) instruction with BHI_DIS_S set. Since
this new instruction is only available in 64-bit mode, setting BHI_DIS_S in
32-bit mode is only a partial mitigation.

Do not set BHI_DIS_S in 32-bit mode so as to avoid reporting misleading
mitigated status. With this change IBHF won't be used in 32-bit mode, also
remove the CONFIG_X86_64 check from emit_spectre_bhb_barrier().

Suggested-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
2025-05-06 08:18:59 -07:00
Daniel Sneddon
9f725eec8f x86/bpf: Add IBHF call at end of classic BPF
Classic BPF programs can be run by unprivileged users, allowing
unprivileged code to execute inside the kernel. Attackers can use this to
craft branch history in kernel mode that can influence the target of
indirect branches.

BHI_DIS_S provides user-kernel isolation of branch history, but cBPF can be
used to bypass this protection by crafting branch history in kernel mode.
To stop intra-mode attacks via cBPF programs, Intel created a new
instruction Indirect Branch History Fence (IBHF). IBHF prevents the
predicted targets of subsequent indirect branches from being influenced by
branch history prior to the IBHF. IBHF is only effective while BHI_DIS_S is
enabled.

Add the IBHF instruction to cBPF jitted code's exit path. Add the new fence
when the hardware mitigation is enabled (i.e., X86_FEATURE_CLEAR_BHB_HW is
set) or after the software sequence (X86_FEATURE_CLEAR_BHB_LOOP) is being
used in a virtual machine. Note that X86_FEATURE_CLEAR_BHB_HW and
X86_FEATURE_CLEAR_BHB_LOOP are mutually exclusive, so the JIT compiler will
only emit the new fence, not the SW sequence, when X86_FEATURE_CLEAR_BHB_HW
is set.

Hardware that enumerates BHI_NO basically has BHI_DIS_S protections always
enabled, regardless of the value of BHI_DIS_S. Since BHI_DIS_S doesn't
protect against intra-mode attacks, enumerate BHI bug on BHI_NO hardware as
well.

Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
2025-05-06 08:18:48 -07:00
Daniel Sneddon
d4e89d212d x86/bpf: Call branch history clearing sequence on exit
Classic BPF programs have been identified as potential vectors for
intra-mode Branch Target Injection (BTI) attacks. Classic BPF programs can
be run by unprivileged users. They allow unprivileged code to execute
inside the kernel. Attackers can use unprivileged cBPF to craft branch
history in kernel mode that can influence the target of indirect branches.

Introduce a branch history buffer (BHB) clearing sequence during the JIT
compilation of classic BPF programs. The clearing sequence is the same as
is used in previous mitigations to protect syscalls. Since eBPF programs
already have their own mitigations in place, only insert the call on
classic programs that aren't run by privileged users.

Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
2025-05-06 08:18:32 -07:00
Linus Torvalds
cccd033714 Merge tag 'for-6.15/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mikulas Patocka:

 - fix reading past the end of allocated memory

 - fix missing dm_put_live_table() in dm_keyslot_evict()

* tag 'for-6.15/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm: fix copying after src array boundaries
  dm: add missing unlock on in dm_keyslot_evict()
2025-05-06 08:14:20 -07:00
Steve French
d90b023718 smb3 client: warn when parse contexts returns error on compounded operation
Coverity noticed that the rc on smb2_parse_contexts() was not being checked
in the case of compounded operations.  Since we don't want to stop parsing
the following compounded responses which are likely valid, we can't easily
error out here, but at least print a warning message if server has a bug
causing us to skip parsing the open response contexts.

Addresses-Coverity: 1639191
Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-05-06 09:05:15 -05:00
Oliver Hartkopp
511e64e13d can: gw: fix RCU/BH usage in cgw_create_job()
As reported by Sebastian Andrzej Siewior the use of local_bh_disable()
is only feasible in uni processor systems to update the modification rules.
The usual use-case to update the modification rules is to update the data
of the modifications but not the modification types (AND/OR/XOR/SET) or
the checksum functions itself.

To omit additional memory allocations to maintain fast modification
switching times, the modification description space is doubled at gw-job
creation time so that only the reference to the active modification
description is changed under rcu protection.

Rename cgw_job::mod to cf_mod and make it a RCU pointer. Allocate in
cgw_create_job() and free it together with cgw_job in
cgw_job_free_rcu(). Update all users to dereference cgw_job::cf_mod with
a RCU accessor and if possible once.

[bigeasy: Replace mod1/mod2 from the Oliver's original patch with dynamic
allocation, use RCU annotation and accessor]

Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Closes: https://lore.kernel.org/linux-can/20231031112349.y0aLoBrz@linutronix.de/
Fixes: dd895d7f21 ("can: cangw: introduce optional uid to reference created routing jobs")
Tested-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/20250429070555.cs-7b_eZ@linutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-05-06 15:55:36 +02:00
Marc Kleine-Budde
f695e3083a Merge patch series "can: rx-offload: fix order of unregistration calls"
Marc Kleine-Budde <mkl@pengutronix.de> says:

If a driver is removed, the driver framework invokes the driver's
remove callback. A CAN driver's remove function calls
unregister_candev(), which calls net_device_ops::ndo_stop further down
in the call stack for interfaces which are in the "up" state.

With the mcp251xfd driver the removal of the module causes the
following warning:

| WARNING: CPU: 0 PID: 352 at net/core/dev.c:7342 __netif_napi_del_locked+0xc8/0xd8

as can_rx_offload_del() deletes the NAPI, while it is still active,
because the interface is still up.

To fix the warning, first unregister the network interface, which
calls net_device_ops::ndo_stop, which disables the NAPI, and then call
can_rx_offload_del().

All other driver using the rx-offload helper have been checked and the
same issue has been found in the rockchip and m_can driver. These have
been fixed, but only compile time tested. On the mcp251xfd the fix was
tested on hardware.

Link: https://patch.msgid.link/20250502-can-rx-offload-del-v1-0-59a9b131589d@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-05-06 15:55:19 +02:00
Marc Kleine-Budde
0713a1b327 can: mcan: m_can_class_unregister(): fix order of unregistration calls
If a driver is removed, the driver framework invokes the driver's
remove callback. A CAN driver's remove function calls
unregister_candev(), which calls net_device_ops::ndo_stop further down
in the call stack for interfaces which are in the "up" state.

The removal of the module causes a warning, as can_rx_offload_del()
deletes the NAPI, while it is still active, because the interface is
still up.

To fix the warning, first unregister the network interface, which
calls net_device_ops::ndo_stop, which disables the NAPI, and then call
can_rx_offload_del().

Fixes: 1be37d3b04 ("can: m_can: fix periph RX path: use rx-offload to ensure skbs are sent from softirq context")
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20250502-can-rx-offload-del-v1-3-59a9b131589d@pengutronix.de
Reviewed-by: Markus Schneider-Pargmann <msp@baylibre.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-05-06 15:54:36 +02:00
Marc Kleine-Budde
037ada7a31 can: rockchip_canfd: rkcanfd_remove(): fix order of unregistration calls
If a driver is removed, the driver framework invokes the driver's
remove callback. A CAN driver's remove function calls
unregister_candev(), which calls net_device_ops::ndo_stop further down
in the call stack for interfaces which are in the "up" state.

The removal of the module causes a warning, as can_rx_offload_del()
deletes the NAPI, while it is still active, because the interface is
still up.

To fix the warning, first unregister the network interface, which
calls net_device_ops::ndo_stop, which disables the NAPI, and then call
can_rx_offload_del().

Fixes: ff60bfbaf6 ("can: rockchip_canfd: add driver for Rockchip CAN-FD controller")
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20250502-can-rx-offload-del-v1-2-59a9b131589d@pengutronix.de
Reviewed-by: Markus Schneider-Pargmann <msp@baylibre.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-05-06 15:54:04 +02:00
Johannes Thumshirn
db492e24f9 block: only update request sector if needed
In case of a ZONE APPEND write, regardless of native ZONE APPEND or the
emulation layer in the zone write plugging code, the sector the data got
written to by the device needs to be updated in the bio.

At the moment, this is done for every native ZONE APPEND write and every
request that is flagged with 'BIO_ZONE_WRITE_PLUGGING'. But thus
superfluously updates the sector for regular writes to a zoned block
device.

Check if a bio is a native ZONE APPEND write or if the bio is flagged as
'BIO_EMULATES_ZONE_APPEND', meaning the block layer's zone write plugging
code handles the ZONE APPEND and translates it into a regular write and
back. Only if one of these two criterion is met, update the sector in the
bio upon completion.

Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/dea089581cb6b777c1cd1500b38ac0b61df4b2d1.1746530748.git.jth@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-06 07:45:59 -06:00
Marc Kleine-Budde
84f5eb833f can: mcp251xfd: mcp251xfd_remove(): fix order of unregistration calls
If a driver is removed, the driver framework invokes the driver's
remove callback. A CAN driver's remove function calls
unregister_candev(), which calls net_device_ops::ndo_stop further down
in the call stack for interfaces which are in the "up" state.

With the mcp251xfd driver the removal of the module causes the
following warning:

| WARNING: CPU: 0 PID: 352 at net/core/dev.c:7342 __netif_napi_del_locked+0xc8/0xd8

as can_rx_offload_del() deletes the NAPI, while it is still active,
because the interface is still up.

To fix the warning, first unregister the network interface, which
calls net_device_ops::ndo_stop, which disables the NAPI, and then call
can_rx_offload_del().

Fixes: 55e5b97f00 ("can: mcp25xxfd: add driver for Microchip MCP25xxFD SPI CAN")
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20250502-can-rx-offload-del-v1-1-59a9b131589d@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-05-06 15:45:46 +02:00
Kelsey Maes
5e1663810e can: mcp251xfd: fix TDC setting for low data bit rates
The TDC is currently hardcoded enabled. This means that even for lower
CAN-FD data bitrates (with a DBRP (data bitrate prescaler) > 2) a TDC
is configured. This leads to a bus-off condition.

ISO 11898-1 section 11.3.3 says "Transmitter delay compensation" (TDC)
is only applicable if DBRP is 1 or 2.

To fix the problem, switch the driver to use the TDC calculation
provided by the CAN driver framework (which respects ISO 11898-1
section 11.3.3). This has the positive side effect that userspace can
control TDC as needed.

Demonstration of the feature in action:
| $ ip link set can0 up type can bitrate 125000 dbitrate 500000 fd on
| $ ip -details link show can0
| 3: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 72 qdisc pfifo_fast state UP mode DEFAULT group default qlen 10
|     link/can  promiscuity 0  allmulti 0 minmtu 0 maxmtu 0
|     can <FD> state ERROR-ACTIVE (berr-counter tx 0 rx 0) restart-ms 0
| 	  bitrate 125000 sample-point 0.875
| 	  tq 50 prop-seg 69 phase-seg1 70 phase-seg2 20 sjw 10 brp 2
| 	  mcp251xfd: tseg1 2..256 tseg2 1..128 sjw 1..128 brp 1..256 brp_inc 1
| 	  dbitrate 500000 dsample-point 0.875
| 	  dtq 125 dprop-seg 6 dphase-seg1 7 dphase-seg2 2 dsjw 1 dbrp 5
| 	  mcp251xfd: dtseg1 1..32 dtseg2 1..16 dsjw 1..16 dbrp 1..256 dbrp_inc 1
| 	  tdcv 0..63 tdco 0..63
| 	  clock 40000000 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 tso_max_size 65536 tso_max_segs 65535 gro_max_size 65536 parentbus spi parentdev spi0.0
| $ ip link set can0 up type can bitrate 1000000 dbitrate 4000000 fd on
| $ ip -details link show can0
| 3: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 72 qdisc pfifo_fast state UP mode DEFAULT group default qlen 10
|     link/can  promiscuity 0  allmulti 0 minmtu 0 maxmtu 0
|     can <FD,TDC-AUTO> state ERROR-ACTIVE (berr-counter tx 0 rx 0) restart-ms 0
| 	  bitrate 1000000 sample-point 0.750
| 	  tq 25 prop-seg 14 phase-seg1 15 phase-seg2 10 sjw 5 brp 1
| 	  mcp251xfd: tseg1 2..256 tseg2 1..128 sjw 1..128 brp 1..256 brp_inc 1
| 	  dbitrate 4000000 dsample-point 0.700
| 	  dtq 25 dprop-seg 3 dphase-seg1 3 dphase-seg2 3 dsjw 1 dbrp 1
| 	  tdco 7
| 	  mcp251xfd: dtseg1 1..32 dtseg2 1..16 dsjw 1..16 dbrp 1..256 dbrp_inc 1
| 	  tdcv 0..63 tdco 0..63
| 	  clock 40000000 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 tso_max_size 65536 tso_max_segs 65535 gro_max_size 65536 parentbus spi parentdev spi0.0

There has been some confusion about the MCP2518FD using a relative or
absolute TDCO due to the datasheet specifying a range of [-64,63]. I
have a custom board with a 40 MHz clock and an estimated loop delay of
100 to 216 ns. During testing at a data bit rate of 4 Mbit/s I found
that using can_get_relative_tdco() resulted in bus-off errors. The
final TDCO value was 1 which corresponds to a 10% SSP in an absolute
configuration. This behavior is expected if the TDCO value is really
absolute and not relative. Using priv->can.tdc.tdco instead results in
a final TDCO of 8, setting the SSP at exactly 80%. This configuration
works.

The automatic, manual, and off TDC modes were tested at speeds up to,
and including, 8 Mbit/s on real hardware and behave as expected.

Fixes: 55e5b97f00 ("can: mcp25xxfd: add driver for Microchip MCP25xxFD SPI CAN")
Reported-by: Kelsey Maes <kelsey@vpprocess.com>
Closes: https://lore.kernel.org/all/C2121586-C87F-4B23-A933-845362C29CA1@vpprocess.com
Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: Kelsey Maes <kelsey@vpprocess.com>
Link: https://patch.msgid.link/20250430161501.79370-1-kelsey@vpprocess.com
[mkl: add comment]
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-05-06 15:45:45 +02:00
Antonios Salios
dcaeeb8ae8 can: m_can: m_can_class_allocate_dev(): initialize spin lock on device probe
The spin lock tx_handling_spinlock in struct m_can_classdev is not
being initialized. This leads the following spinlock bad magic
complaint from the kernel, eg. when trying to send CAN frames with
cansend from can-utils:

| BUG: spinlock bad magic on CPU#0, cansend/95
|  lock: 0xff60000002ec1010, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
| CPU: 0 UID: 0 PID: 95 Comm: cansend Not tainted 6.15.0-rc3-00032-ga79be02bba5c #5 NONE
| Hardware name: MachineWare SIM-V (DT)
| Call Trace:
| [<ffffffff800133e0>] dump_backtrace+0x1c/0x24
| [<ffffffff800022f2>] show_stack+0x28/0x34
| [<ffffffff8000de3e>] dump_stack_lvl+0x4a/0x68
| [<ffffffff8000de70>] dump_stack+0x14/0x1c
| [<ffffffff80003134>] spin_dump+0x62/0x6e
| [<ffffffff800883ba>] do_raw_spin_lock+0xd0/0x142
| [<ffffffff807a6fcc>] _raw_spin_lock_irqsave+0x20/0x2c
| [<ffffffff80536dba>] m_can_start_xmit+0x90/0x34a
| [<ffffffff806148b0>] dev_hard_start_xmit+0xa6/0xee
| [<ffffffff8065b730>] sch_direct_xmit+0x114/0x292
| [<ffffffff80614e2a>] __dev_queue_xmit+0x3b0/0xaa8
| [<ffffffff8073b8fa>] can_send+0xc6/0x242
| [<ffffffff8073d1c0>] raw_sendmsg+0x1a8/0x36c
| [<ffffffff805ebf06>] sock_write_iter+0x9a/0xee
| [<ffffffff801d06ea>] vfs_write+0x184/0x3a6
| [<ffffffff801d0a88>] ksys_write+0xa0/0xc0
| [<ffffffff801d0abc>] __riscv_sys_write+0x14/0x1c
| [<ffffffff8079ebf8>] do_trap_ecall_u+0x168/0x212
| [<ffffffff807a830a>] handle_exception+0x146/0x152

Initializing the spin lock in m_can_class_allocate_dev solves that
problem.

Fixes: 1fa80e23c1 ("can: m_can: Introduce a tx_fifo_in_flight counter")
Signed-off-by: Antonios Salios <antonios@mwa.re>
Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Link: https://patch.msgid.link/20250425111744.37604-2-antonios@mwa.re
Reviewed-by: Markus Schneider-Pargmann <msp@baylibre.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-05-06 15:45:31 +02:00
Sean Heelan
36991c1ccd ksmbd: Fix UAF in __close_file_table_ids
A use-after-free is possible if one thread destroys the file
via __ksmbd_close_fd while another thread holds a reference to
it. The existing checks on fp->refcount are not sufficient to
prevent this.

The fix takes ft->lock around the section which removes the
file from the file table. This prevents two threads acquiring the
same file pointer via __close_file_table_ids, as well as the other
functions which retrieve a file from the IDR and which already use
this same lock.

Cc: stable@vger.kernel.org
Signed-off-by: Sean Heelan <seanheelan@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-05-06 08:37:02 -05:00
Norbert Szetei
0ca6df4f40 ksmbd: prevent out-of-bounds stream writes by validating *pos
ksmbd_vfs_stream_write() did not validate whether the write offset
(*pos) was within the bounds of the existing stream data length (v_len).
If *pos was greater than or equal to v_len, this could lead to an
out-of-bounds memory write.

This patch adds a check to ensure *pos is less than v_len before
proceeding. If the condition fails, -EINVAL is returned.

Cc: stable@vger.kernel.org
Signed-off-by: Norbert Szetei <norbert@doyensec.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-05-06 08:36:36 -05:00
Tudor Ambarus
f1aff4bc19 dm: fix copying after src array boundaries
The blammed commit copied to argv the size of the reallocated argv,
instead of the size of the old_argv, thus reading and copying from
past the old_argv allocated memory.

Following BUG_ON was hit:
[    3.038929][    T1] kernel BUG at lib/string_helpers.c:1040!
[    3.039147][    T1] Internal error: Oops - BUG: 00000000f2000800 [#1]  SMP
...
[    3.056489][    T1] Call trace:
[    3.056591][    T1]  __fortify_panic+0x10/0x18 (P)
[    3.056773][    T1]  dm_split_args+0x20c/0x210
[    3.056942][    T1]  dm_table_add_target+0x13c/0x360
[    3.057132][    T1]  table_load+0x110/0x3ac
[    3.057292][    T1]  dm_ctl_ioctl+0x424/0x56c
[    3.057457][    T1]  __arm64_sys_ioctl+0xa8/0xec
[    3.057634][    T1]  invoke_syscall+0x58/0x10c
[    3.057804][    T1]  el0_svc_common+0xa8/0xdc
[    3.057970][    T1]  do_el0_svc+0x1c/0x28
[    3.058123][    T1]  el0_svc+0x50/0xac
[    3.058266][    T1]  el0t_64_sync_handler+0x60/0xc4
[    3.058452][    T1]  el0t_64_sync+0x1b0/0x1b4
[    3.058620][    T1] Code: f800865e a9bf7bfd 910003fd 941f48aa (d4210000)
[    3.058897][    T1] ---[ end trace 0000000000000000 ]---
[    3.059083][    T1] Kernel panic - not syncing: Oops - BUG: Fatal exception

Fix it by copying the size of src, and not the size of dst, as it was.

Fixes: 5a2a6c4281 ("dm: always update the array size in realloc_argv on success")
Cc: stable@vger.kernel.org
Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
2025-05-06 14:06:59 +02:00
Kevin Baker
7c6fa1797a drm/panel: simple: Update timings for AUO G101EVN010
Switch to panel timings based on datasheet for the AUO G101EVN01.0
LVDS panel. Default timings were tested on the panel.

Previous mode-based timings resulted in horizontal display shift.

Signed-off-by: Kevin Baker <kevinb@ventureresearch.com>
Fixes: 4fb86404a9 ("drm/panel: simple: Add AUO G101EVN010 panel support")
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://lore.kernel.org/r/20250505170256.1385113-1-kevinb@ventureresearch.com
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://lore.kernel.org/r/20250505170256.1385113-1-kevinb@ventureresearch.com
2025-05-06 13:58:46 +02:00
Frank Wunderlich
e8716b5b0d net: ethernet: mtk_eth_soc: do not reset PSE when setting FE
Remove redundant PSE reset.
When setting FE register there is no need to reset PSE,
doing so may cause FE to work abnormal.

Link: 3a5223473e
Fixes: dee4dd10c7 ("net: ethernet: mtk_eth_soc: ppe: add support for multiple PPEs")
Signed-off-by: Frank Wunderlich <frank-w@public-files.de>
Link: https://patch.msgid.link/18f0ac7d83f82defa3342c11ef0d1362f6b81e88.1746406763.git.daniel@makrotopia.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-06 13:37:34 +02:00
Daniel Golle
4db6c75124 net: ethernet: mtk_eth_soc: reset all TX queues on DMA free
The purpose of resetting the TX queue is to reset the byte and packet
count as well as to clear the software flow control XOFF bit.

MediaTek developers pointed out that netdev_reset_queue would only
resets queue 0 of the network device.

Queues that are not reset may cause unexpected issues.

Packets may stop being sent after reset and "transmit timeout" log may
be displayed.

Import fix from MediaTek's SDK to resolve this issue.

Link: 319c0d9905
Fixes: f63959c7ee ("net: ethernet: mtk_eth_soc: implement multi-queue support for per-port queues")
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Link: https://patch.msgid.link/c9ff9adceac4f152239a0f65c397f13547639175.1746406763.git.daniel@makrotopia.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-06 13:37:34 +02:00
Yeoreum Yun
363cd2b81c arm64: cpufeature: Move arm64_use_ng_mappings to the .data section to prevent wrong idmap generation
The PTE_MAYBE_NG macro sets the nG page table bit according to the value
of "arm64_use_ng_mappings". This variable is currently placed in the
.bss section. create_init_idmap() is called before the .bss section
initialisation which is done in early_map_kernel(). Therefore,
data/test_prot in create_init_idmap() could be set incorrectly through
the PAGE_KERNEL -> PROT_DEFAULT -> PTE_MAYBE_NG macros.

   # llvm-objdump-21 --syms vmlinux-gcc | grep arm64_use_ng_mappings
     ffff800082f242a8 g     O .bss    0000000000000001 arm64_use_ng_mappings

The create_init_idmap() function disassembly compiled with llvm-21:

  // create_init_idmap()
  ffff80008255c058: d10103ff     	sub	sp, sp, #0x40
  ffff80008255c05c: a9017bfd     	stp	x29, x30, [sp, #0x10]
  ffff80008255c060: a90257f6     	stp	x22, x21, [sp, #0x20]
  ffff80008255c064: a9034ff4     	stp	x20, x19, [sp, #0x30]
  ffff80008255c068: 910043fd     	add	x29, sp, #0x10
  ffff80008255c06c: 90003fc8     	adrp	x8, 0xffff800082d54000
  ffff80008255c070: d280e06a     	mov	x10, #0x703     // =1795
  ffff80008255c074: 91400409     	add	x9, x0, #0x1, lsl #12 // =0x1000
  ffff80008255c078: 394a4108     	ldrb	w8, [x8, #0x290] ------------- (1)
  ffff80008255c07c: f2e00d0a     	movk	x10, #0x68, lsl #48
  ffff80008255c080: f90007e9     	str	x9, [sp, #0x8]
  ffff80008255c084: aa0103f3     	mov	x19, x1
  ffff80008255c088: aa0003f4     	mov	x20, x0
  ffff80008255c08c: 14000000     	b	0xffff80008255c08c <__pi_create_init_idmap+0x34>
  ffff80008255c090: aa082d56     	orr	x22, x10, x8, lsl #11 -------- (2)

Note (1) is loading the arm64_use_ng_mappings value in w8 and (2) is set
the text or data prot with the w8 value to set PTE_NG bit. If the .bss
section isn't initialized, x8 could include a garbage value and generate
an incorrect mapping.

Annotate arm64_use_ng_mappings as __read_mostly so that it is placed in
the .data section.

Fixes: 84b04d3e6b ("arm64: kernel: Create initial ID map from C code")
Cc: stable@vger.kernel.org # 6.9.x
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
Link: https://lore.kernel.org/r/20250502180412.3774883-1-yeoreum.yun@arm.com
[catalin.marinas@arm.com: use __read_mostly instead of __ro_after_init]
[catalin.marinas@arm.com: slight tweaking of the code comment]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-05-06 11:43:44 +01:00
Thomas Hellström
d4ad53adfe drm/ttm: Remove the struct ttm_backup abstraction
The abstraction was previously added to support separate
ttm_backup implementations.

However with the current implementation casting from a
struct file to a struct ttm_backup, we run into trouble since
struct file may have randomized the layout and gcc complains.

Remove the struct ttm_backup abstraction

Cc: dri-devel@lists.freedesktop.org
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Reported-by: Kees Cook <kees@kernel.org>
Closes: https://lore.kernel.org/dri-devel/9c8dbbafdaf9f3f089da2cde5a772d69579b3795.camel@linux.intel.com/T/#mb153ab9216cb813b92bdeb36f391ad4808c2ba29
Suggested-by: Christian König <christian.koenig@amd.com>
Fixes: 70d645deac ("drm/ttm: Add helpers for shrinking")
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/r/20250502130014.3156-1-thomas.hellstrom@linux.intel.com
2025-05-06 10:57:27 +02:00
Thomas Hellström
2bb04ea9e5 drm/ttm: Fix ttm_backup kerneldoc
The docs were not properly updated from an earlier version of the code.

Fixes: e7b5d23e5d ("drm/ttm: Provide a shmem backup implementation")
Cc: Christian König <christian.koenig@amd.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250502130101.3185-1-thomas.hellstrom@linux.intel.com
2025-05-06 10:57:11 +02:00
Luis Chamberlain
6ba0982c32 swapfile: disable swapon for bs > ps devices
Devices which have a requirement for bs > ps cannot be supported for
swap as swap still needs work. Now that the block device cache sets the
min order for block devices we need this stop gap otherwise all
swap operations are rejected.

Without this you'll end up with errors on these devices as the swap
code still needs much love to support min order.

With this we at least now put a stop gap of its use, until the
swap subsystem completes its major overhaul:

mkswap: /dev/nvme3n1: warning: wiping old swap signature.
Setting up swapspace version 1, size = 100 GiB (107374178304 bytes)
no label, UUID=6af76b5c-7e7b-4902-b7f7-4c24dde6fa36
swapon: /dev/nvme3n1: swapon failed: Invalid argument

Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Link: https://lore.kernel.org/aBkS926thy9zvdZb@bombadil.infradead.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-05-06 10:46:19 +02:00
David Wei
4720f9707c tools: ynl-gen: validate 0 len strings from kernel
Strings from the kernel are guaranteed to be null terminated and
ynl_attr_validate() checks for this. But it doesn't check if the string
has a len of 0, which would cause problems when trying to access
data[len - 1]. Fix this by checking that len is positive.

Signed-off-by: David Wei <dw@davidwei.uk>
Link: https://patch.msgid.link/20250503043050.861238-1-dw@davidwei.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-05 18:17:47 -07:00
Jakub Kicinski
c645a6b2f3 Merge branch 'selftests-drv-net-fix-ping-py-test-failure'
Mohsin Bashir says:

====================
selftests: drv: net: fix `ping.py` test failure

Fix `ping.py` test failure on an ipv6 system, and appropriately handle the
cases where either one of the two address families (ipv4, ipv6) is not
present.
====================

Link: https://patch.msgid.link/20250503013518.1722913-1-mohsin.bashr@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-05 18:17:19 -07:00
Mohsin Bashir
4a9d494ca2 selftests: drv: net: add version indicator
Currently, the test result does not differentiate between the cases when
either one of the address families are configured or if both the address
families are configured. Ideally, the result should report if a
particular case was skipped.

./drivers/net/ping.py
TAP version 13
1..7
ok 1 ping.test_default_v4 # SKIP Test requires IPv4 connectivity
ok 2 ping.test_default_v6
ok 3 ping.test_xdp_generic_sb
ok 4 ping.test_xdp_generic_mb
ok 5 ping.test_xdp_native_sb
ok 6 ping.test_xdp_native_mb
ok 7 ping.test_xdp_offload # SKIP device does not support offloaded XDP
Totals: pass:5 fail:0 xfail:0 xpass:0 skip:2 error:0

Fixes: 75cc19c8ff ("selftests: drv-net: add xdp cases for ping.py")
Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com>
Reviewed-by: David Wei <dw@davidwei.uk>
Link: https://patch.msgid.link/20250503013518.1722913-4-mohsin.bashr@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-05 18:17:16 -07:00
Mohsin Bashir
8bb7d8e5cf selftests: drv: net: avoid skipping tests
On a system with either of the ipv4 or ipv6 information missing, tests
are currently skipped. Ideally, the test should run as long as at least
one address family is present. This patch make test run whenever
possible.

Before:
./drivers/net/ping.py
TAP version 13
1..6
ok 1 ping.test_default # SKIP Test requires IPv4 connectivity
ok 2 ping.test_xdp_generic_sb # SKIP Test requires IPv4 connectivity
ok 3 ping.test_xdp_generic_mb # SKIP Test requires IPv4 connectivity
ok 4 ping.test_xdp_native_sb # SKIP Test requires IPv4 connectivity
ok 5 ping.test_xdp_native_mb # SKIP Test requires IPv4 connectivity
ok 6 ping.test_xdp_offload # SKIP device does not support offloaded XDP
Totals: pass:0 fail:0 xfail:0 xpass:0 skip:6 error:0

After:
./drivers/net/ping.py
TAP version 13
1..6
ok 1 ping.test_default
ok 2 ping.test_xdp_generic_sb
ok 3 ping.test_xdp_generic_mb
ok 4 ping.test_xdp_native_sb
ok 5 ping.test_xdp_native_mb
ok 6 ping.test_xdp_offload # SKIP device does not support offloaded XDP
Totals: pass:5 fail:0 xfail:0 xpass:0 skip:1 error:0

Fixes: 75cc19c8ff ("selftests: drv-net: add xdp cases for ping.py")
Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com>
Link: https://patch.msgid.link/20250503013518.1722913-3-mohsin.bashr@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-05 18:17:16 -07:00
Mohsin Bashir
b344a48cbe selftests: drv: net: fix test failure on ipv6 sys
The `get_interface_info` call has ip version hard-coded which leads to
failures on an IPV6 system. The NetDrvEnv class already gathers
information about remote interface, so instead of fixing the local
implementation switch to using cfg.remote_ifname.

Before:
./drivers/net/ping.py
Traceback (most recent call last):
  File "/new_tests/./drivers/net/ping.py", line 217, in <module>
    main()
  File "/new_tests/./drivers/net/ping.py", line 204, in main
    get_interface_info(cfg)
  File "/new_tests/./drivers/net/ping.py", line 128, in get_interface_info
    raise KsftFailEx('Can not get remote interface')
net.lib.py.ksft.KsftFailEx: Can not get remote interface

After:
./drivers/net/ping.py
TAP version 13
1..6
ok 1 ping.test_default # SKIP Test requires IPv4 connectivity
ok 2 ping.test_xdp_generic_sb # SKIP Test requires IPv4 connectivity
ok 3 ping.test_xdp_generic_mb # SKIP Test requires IPv4 connectivity
ok 4 ping.test_xdp_native_sb # SKIP Test requires IPv4 connectivity
ok 5 ping.test_xdp_native_mb # SKIP Test requires IPv4 connectivity
ok 6 ping.test_xdp_offload # SKIP device does not support offloaded XDP
Totals: pass:0 fail:0 xfail:0 xpass:0 skip:6 error:0

Fixes: 75cc19c8ff ("selftests: drv-net: add xdp cases for ping.py")
Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com>
Reviewed-by: David Wei <dw@davidwei.uk>
Link: https://patch.msgid.link/20250503013518.1722913-2-mohsin.bashr@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-05 18:17:15 -07:00
Jakub Kicinski
ccb52a9c8d Merge branch 'gre-reapply-ipv6-link-local-address-generation-fix'
Guillaume Nault says:

====================
gre: Reapply IPv6 link-local address generation fix.

Reintroduce the IPv6 link-local address generation fix for GRE and its
kernel selftest. These patches were introduced by merge commit
b3fc5927de ("Merge branch
'gre-fix-regressions-in-ipv6-link-local-address-generation'") but have
been reverted by commit 8417db0be5 ("Merge branch
'gre-revert-ipv6-link-local-address-fix'"), because it uncovered
another bug in multipath routing. Now that this bug has been
investigated and fixed, we can apply the GRE link-local address fix
and its kernel selftest again.

For convenience, here's the original cover letter:

    IPv6 link-local address generation has some special cases for GRE
    devices. This has led to several regressions in the past, and some of
    them are still not fixed. This series fixes the remaining problems,
    like the ipv6.conf.<dev>.addr_gen_mode sysctl being ignored and the
    router discovery process not being started (see details in patch 1).

    To avoid any further regressions, patch 2 adds selftests covering
    IPv4 and IPv6 gre/gretap devices with all combinations of currently
    supported addr_gen_mode values.
====================

Link: https://patch.msgid.link/cover.1746225213.git.gnault@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-05 18:08:19 -07:00
Guillaume Nault
b6a6006b0e selftests: Add IPv6 link-local address generation tests for GRE devices.
GRE devices have their special code for IPv6 link-local address
generation that has been the source of several regressions in the past.

Add selftest to check that all gre, ip6gre, gretap and ip6gretap get an
IPv6 link-link local address in accordance with the
net.ipv6.conf.<dev>.addr_gen_mode sysctl.

Note: This patch was originally applied as commit 6f50175cca ("selftests:
  Add IPv6 link-local address generation tests for GRE devices.").
  However, it was then reverted by commit 355d940f4d ("Revert "selftests:
  Add IPv6 link-local address generation tests for GRE devices."")
  because the commit it depended on was going to be reverted. Now that
  the situation is resolved, we can add this selftest again (no changes
  since original patch, appart from context update in
  tools/testing/selftests/net/Makefile).
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/2c3a5733cb3a6e3119504361a9b9f89fda570a2d.1746225214.git.gnault@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-05 18:08:15 -07:00
Guillaume Nault
3e6a0243ff gre: Fix again IPv6 link-local address generation.
Use addrconf_addr_gen() to generate IPv6 link-local addresses on GRE
devices in most cases and fall back to using add_v4_addrs() only in
case the GRE configuration is incompatible with addrconf_addr_gen().

GRE used to use addrconf_addr_gen() until commit e5dd729460 ("ip/ip6_gre:
use the same logic as SIT interfaces when computing v6LL address")
restricted this use to gretap and ip6gretap devices, and created
add_v4_addrs() (borrowed from SIT) for non-Ethernet GRE ones.

The original problem came when commit 9af28511be ("addrconf: refuse
isatap eui64 for INADDR_ANY") made __ipv6_isatap_ifid() fail when its
addr parameter was 0. The commit says that this would create an invalid
address, however, I couldn't find any RFC saying that the generated
interface identifier would be wrong. Anyway, since gre over IPv4
devices pass their local tunnel address to __ipv6_isatap_ifid(), that
commit broke their IPv6 link-local address generation when the local
address was unspecified.

Then commit e5dd729460 ("ip/ip6_gre: use the same logic as SIT
interfaces when computing v6LL address") tried to fix that case by
defining add_v4_addrs() and calling it to generate the IPv6 link-local
address instead of using addrconf_addr_gen() (apart for gretap and
ip6gretap devices, which would still use the regular
addrconf_addr_gen(), since they have a MAC address).

That broke several use cases because add_v4_addrs() isn't properly
integrated into the rest of IPv6 Neighbor Discovery code. Several of
these shortcomings have been fixed over time, but add_v4_addrs()
remains broken on several aspects. In particular, it doesn't send any
Router Sollicitations, so the SLAAC process doesn't start until the
interface receives a Router Advertisement. Also, add_v4_addrs() mostly
ignores the address generation mode of the interface
(/proc/sys/net/ipv6/conf/*/addr_gen_mode), thus breaking the
IN6_ADDR_GEN_MODE_RANDOM and IN6_ADDR_GEN_MODE_STABLE_PRIVACY cases.

Fix the situation by using add_v4_addrs() only in the specific scenario
where the normal method would fail. That is, for interfaces that have
all of the following characteristics:

  * run over IPv4,
  * transport IP packets directly, not Ethernet (that is, not gretap
    interfaces),
  * tunnel endpoint is INADDR_ANY (that is, 0),
  * device address generation mode is EUI64.

In all other cases, revert back to the regular addrconf_addr_gen().

Also, remove the special case for ip6gre interfaces in add_v4_addrs(),
since ip6gre devices now always use addrconf_addr_gen() instead.

Note:
  This patch was originally applied as commit 183185a18f ("gre: Fix
  IPv6 link-local address generation."). However, it was then reverted
  by commit fc486c2d06 ("Revert "gre: Fix IPv6 link-local address
  generation."") because it uncovered another bug that ended up
  breaking net/forwarding/ip6gre_custom_multipath_hash.sh. That other
  bug has now been fixed by commit 4d0ab3a688 ("ipv6: Start path
  selection from the first nexthop"). Therefore we can now revive this
  GRE patch (no changes since original commit 183185a18f ("gre: Fix
  IPv6 link-local address generation.").

Fixes: e5dd729460 ("ip/ip6_gre: use the same logic as SIT interfaces when computing v6LL address")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/a88cc5c4811af36007645d610c95102dccb360a6.1746225214.git.gnault@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-05 18:08:14 -07:00
Andrew Lunn
c360eb0c3c dt-bindings: net: ethernet-controller: Add informative text about RGMII delays
Device Tree and Ethernet MAC driver writers often misunderstand RGMII
delays. Rewrite the Normative section in terms of the PCB, is the PCB
adding the 2ns delay. This meaning was previous implied by the
definition, but often wrongly interpreted due to the ambiguous wording
and looking at the definition from the wrong perspective. The new
definition concentrates clearly on the hardware, and should be less
ambiguous.

Add an Informative section to the end of the binding describing in
detail what the four RGMII delays mean. This expands on just the PCB
meaning, adding in the implications for the MAC and PHY.

Additionally, when the MAC or PHY needs to add a delay, which is
software configuration, describe how Linux does this, in the hope of
reducing errors. Make it clear other users of device tree binding may
implement the software configuration in other ways while still
conforming to the binding.

Fixes: 9d3de3c583 ("dt-bindings: net: Add YAML schemas for the generic Ethernet options")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://patch.msgid.link/20250430-v6-15-rc3-net-rgmii-delays-v2-1-099ae651d5e5@lunn.ch
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-05 16:49:18 -07:00
Johan Hovold
e66b0a8f04 i2c: omap: fix deprecated of_property_read_bool() use
Using of_property_read_bool() for non-boolean properties is deprecated
and results in a warning during runtime since commit c141ecc3ce ("of:
Warn when of_property_read_bool() is used on non-boolean properties").

Fixes: b6ef830c60 ("i2c: omap: Add support for setting mux")
Cc: Jayesh Choudhary <j-choudhary@ti.com>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Acked-by: Mukesh Kumar Savaliya <quic_msavaliy@quicinc.com>
Link: https://lore.kernel.org/r/20250415075230.16235-1-johan+linaro@kernel.org
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2025-05-06 00:14:25 +02:00
Jakub Kicinski
4397684a29 virtio-net: free xsk_buffs on error in virtnet_xsk_pool_enable()
The selftests added to our CI by Bui Quang Minh recently reveals
that there is a mem leak on the error path of virtnet_xsk_pool_enable():

unreferenced object 0xffff88800a68a000 (size 2048):
  comm "xdp_helper", pid 318, jiffies 4294692778
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace (crc 0):
    __kvmalloc_node_noprof+0x402/0x570
    virtnet_xsk_pool_enable+0x293/0x6a0 (drivers/net/virtio_net.c:5882)
    xp_assign_dev+0x369/0x670 (net/xdp/xsk_buff_pool.c:226)
    xsk_bind+0x6a5/0x1ae0
    __sys_bind+0x15e/0x230
    __x64_sys_bind+0x72/0xb0
    do_syscall_64+0xc1/0x1d0
    entry_SYSCALL_64_after_hwframe+0x77/0x7f

Acked-by: Jason Wang <jasowang@redhat.com>
Fixes: e9f3962441 ("virtio_net: xsk: rx: support fill with xsk buffer")
Link: https://patch.msgid.link/20250430163836.3029761-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-05 13:54:40 -07:00
Jakub Kicinski
1e20324b23 virtio-net: don't re-enable refill work too early when NAPI is disabled
Commit 4bc12818b3 ("virtio-net: disable delayed refill when pausing rx")
fixed a deadlock between reconfig paths and refill work trying to disable
the same NAPI instance. The refill work can't run in parallel with reconfig
because trying to double-disable a NAPI instance causes a stall under the
instance lock, which the reconfig path needs to re-enable the NAPI and
therefore unblock the stalled thread.

There are two cases where we re-enable refill too early. One is in the
virtnet_set_queues() handler. We call it when installing XDP:

   virtnet_rx_pause_all(vi);
   ...
   virtnet_napi_tx_disable(..);
   ...
   virtnet_set_queues(..);
   ...
   virtnet_rx_resume_all(..);

We want the work to be disabled until we call virtnet_rx_resume_all(),
but virtnet_set_queues() kicks it before NAPIs were re-enabled.

The other case is a more trivial case of mis-ordering in
__virtnet_rx_resume() found by code inspection.

Taking the spin lock in virtnet_set_queues() (requested during review)
may be unnecessary as we are under rtnl_lock and so are all paths writing
to ->refill_enabled.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Bui Quang Minh <minhquangbui99@gmail.com>
Fixes: 4bc12818b3 ("virtio-net: disable delayed refill when pausing rx")
Fixes: 413f0271f3 ("net: protect NAPI enablement with netdev_lock()")
Link: https://patch.msgid.link/20250430163758.3029367-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-05 13:53:53 -07:00
Jakub Kicinski
75dbdaad32 Merge branch 'net_sched-fix-a-regression-in-sch_htb'
Cong Wang says:

====================
net_sched: fix a regression in sch_htb

This patchset contains a fix for the regression reported by Alan and a
selftest to cover that case. Please see each patch description for more
details.
====================

Link: https://patch.msgid.link/20250428232955.1740419-1-xiyou.wangcong@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-05 13:51:38 -07:00
Cong Wang
63890286f5 selftests/tc-testing: Add a test case to cover basic HTB+FQ_CODEL case
Integrate the reproducer from Alan into TC selftests and use scapy to
generate TCP traffic instead of relying on ping command.

Cc: Alan J. Wylie <alan@wylie.me.uk>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Link: https://patch.msgid.link/20250428232955.1740419-3-xiyou.wangcong@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-05 13:51:33 -07:00
Cong Wang
3769478610 sch_htb: make htb_deactivate() idempotent
Alan reported a NULL pointer dereference in htb_next_rb_node()
after we made htb_qlen_notify() idempotent.

It turns out in the following case it introduced some regression:

htb_dequeue_tree():
  |-> fq_codel_dequeue()
    |-> qdisc_tree_reduce_backlog()
      |-> htb_qlen_notify()
        |-> htb_deactivate()
  |-> htb_next_rb_node()
  |-> htb_deactivate()

For htb_next_rb_node(), after calling the 1st htb_deactivate(), the
clprio[prio]->ptr could be already set to  NULL, which means
htb_next_rb_node() is vulnerable here.

For htb_deactivate(), although we checked qlen before calling it, in
case of qlen==0 after qdisc_tree_reduce_backlog(), we may call it again
which triggers the warning inside.

To fix the issues here, we need to:

1) Make htb_deactivate() idempotent, that is, simply return if we
   already call it before.
2) Make htb_next_rb_node() safe against ptr==NULL.

Many thanks to Alan for testing and for the reproducer.

Fixes: 5ba8b837b5 ("sch_htb: make htb_qlen_notify() idempotent")
Reported-by: Alan J. Wylie <alan@wylie.me.uk>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Link: https://patch.msgid.link/20250428232955.1740419-2-xiyou.wangcong@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-05 13:51:32 -07:00
Kent Overstreet
9c61856099 bcachefs: Call bch2_fs_start before getting vfs superblock
This reverts

1fdbe0b184 bcachefs: Make sure c->vfs_sb is set before starting fs

switched up bch2_fs_get_tree() so that we got a superblock before
calling bch2_fs_start, so that c->vfs_sb would always be initialized
while the filesystem was active.

This turned out not to be necessary, because blk_holder_ops were
implemented using our own locking, not vfs locking.

And this had the side effect of creating a super_block and doing our
full recovery (including potentially fsck) before setting SB_BORN, which
causes things like sync calls to hang until our recovery is finished.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-05 16:06:35 -04:00
Marc Zyngier
b60e285b6a KVM: arm64: selftest: Don't try to disable AArch64 support
Trying to cut the branch you are sat on is pretty dumb. And so is
trying to disable the instruction set you are executing on.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
Link: https://lore.kernel.org/r/20250429114117.3618800-3-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-05-05 12:19:45 -07:00
Marc Zyngier
7af7cfbe78 KVM: arm64: Prevent userspace from disabling AArch64 support at any virtualisable EL
A sorry excuse for a selftest is trying to disable AArch64 support.
And yes, this goes as well as you can imagine.

Let's forbid this sort of things. Normal userspace shouldn't get
caught doing that.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
Link: https://lore.kernel.org/r/20250429114117.3618800-2-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-05-05 12:19:45 -07:00
Marc Zyngier
859c60276e KVM: arm64: Force HCR_EL2.xMO to 1 at all times in VHE mode
We keep setting and clearing these bits depending on the role of
the host kernel, mimicking what we do for nVHE. But that's actually
pretty pointless, as we always want physical interrupts to make it
to the host, at EL2.

This has also two problems:

- it prevents IRQs from being taken when these bits are cleared
  if the implementation has chosen to implement these bits as
  masks when HCR_EL2.{TGE,xMO}=={0,0}

- it triggers a bad erratum on the AmpereOne HW, which catches
  fire on clearing these bits while an interrupt is being taken
  (AC03_CPU_36).

Let's kill these two birds with a single stone, and permanently
set the xMO bits when running VHE. This involves a bit of surgery
on code paths that rely on flipping these bits on and off for
other purposes.

Note that the earliest setting of hcr_el2 (in the init_hcr_el2
macro) is left untouched as is runs extremely early, with interrupts
disabled, and soon enough overwritten with the final value containing
the xMO bits.

Reported-by: D Scott Phillips <scott@os.amperecomputing.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250429114326.3618875-1-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-05-05 12:19:24 -07:00
Sebastian Ott
157dbc4a32 KVM: arm64: Fix uninitialized memcache pointer in user_mem_abort()
Commit fce886a602 ("KVM: arm64: Plumb the pKVM MMU in KVM") made the
initialization of the local memcache variable in user_mem_abort()
conditional, leaving a codepath where it is used uninitialized via
kvm_pgtable_stage2_map().

This can fail on any path that requires a stage-2 allocation
without transition via a permission fault or dirty logging.

Fix this by making sure that memcache is always valid.

Fixes: fce886a602 ("KVM: arm64: Plumb the pKVM MMU in KVM")
Signed-off-by: Sebastian Ott <sebott@redhat.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/kvmarm/3f5db4c7-ccce-fb95-595c-692fa7aad227@redhat.com/
Link: https://lore.kernel.org/r/20250505173148.33900-1-sebott@redhat.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-05-05 12:12:27 -07:00
Kent Overstreet
aed4ccbf45 bcachefs: fix hung task timeout in journal read
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-05 14:21:28 -04:00
Kent Overstreet
7a69fa6571 bcachefs: Add missing barriers before wake_up_bit()
wake_up() doesn't require a barrier - but wake_up_bit() does.

This only affected non x86, and primarily lead to lost wakeups after
btree node reads.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-05 14:19:10 -04:00
Kent Overstreet
50a7b899a0 bcachefs: Ensure proper write alignment
There was a buggy version of bcachefs-tools which picked misaligned
bucket sizes when formatting, and we're also about to do dynamic block
sizes - which will allow picking logical block size or physical block
size of the device per-write, allowing for better compression ratios at
the cost of slightly worse write performance (i.e. forcing the device to
do RMW or extra buffering).

To account for this, tweak bch2_alloc_sectors_start() to properly align
open_buckets to the blocksize of the write we're about to do.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-05 14:19:01 -04:00
Kent Overstreet
844f766e02 bcachefs: Improve want_cached_ptr()
If promote target isn't set, rebalance should still leave a cached copy
on the faster device.

Fall back to foreground_target if it's set, or allow a cached copy on
any device if neither are set.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-05 14:16:20 -04:00
Linus Torvalds
01f95500a1 Merge tag 'uml-for-linux-6.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux
Pull uml fix from Johannes Berg:
 "There's just a single fix here for the _nofault changes that were
  causing issues with clang, and then when we looked at it some other
  issues seemed to exist"

* tag 'uml-for-linux-6.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux:
  um: fix _nofault accesses
2025-05-05 08:14:53 -07:00
Linus Torvalds
7b26feb436 Merge tag 'soc-fixes-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull SoC fixes from Arnd Bergmann:
 "The main changes are once more for the NXP i.MX platform, addressing
  multiple regressions in recent devicetree updates for the i.MX8MM and
  i.MX6ULL SoCs, a PCIe fix for i.MX9 and a MAINTAINERS file update to
  disambiguate NXP i.MX SoCs from Sony IMX image sensors.

  The stm32 platform devicetree files get some compatibility fixes for
  the interrupt controller node.

  Another compatibility fix is done for the Arm Morello platform's cache
  controller node.

  The code changes are all for firmware drivers, fixing kernel-side bugs
  on the Arm FF-A and SCMI drivers"

* tag 'soc-fixes-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  arm64: dts: st: Use 128kB size for aliased GIC400 register access on stm32mp23 SoCs
  arm64: dts: st: Adjust interrupt-controller for stm32mp23 SoCs
  arm64: dts: st: Use 128kB size for aliased GIC400 register access on stm32mp21 SoCs
  arm64: dts: st: Adjust interrupt-controller for stm32mp21 SoCs
  arm64: dts: st: Use 128kB size for aliased GIC400 register access on stm32mp25 SoCs
  arm64: dts: st: Adjust interrupt-controller for stm32mp25 SoCs
  arm64: dts: imx8mm-verdin: Link reg_usdhc2_vqmmc to usdhc2
  MAINTAINERS: add exclude for dt-bindings to imx entry
  ARM: dts: opos6ul: add ksz8081 phy properties
  arm64: dts: imx95: Correct the range of PCIe app-reg region
  arm64: dts: imx8mp: configure GPU and NPU clocks in nominal DTSI
  arm64: dts: morello: Fix-up cache nodes
  firmware: arm_ffa: Skip Rx buffer ownership release if not acquired
  firmware: arm_scmi: Fix timeout checks on polling path
  firmware: arm_scmi: Balance device refcount when destroying devices
2025-05-05 08:07:24 -07:00
Mathias Nyman
cab63934c3 xhci: dbc: Avoid event polling busyloop if pending rx transfers are inactive.
Event polling delay is set to 0 if there are any pending requests in
either rx or tx requests lists. Checking for pending requests does
not work well for "IN" transfers as the tty driver always queues
requests to the list and TRBs to the ring, preparing to receive data
from the host.

This causes unnecessary busylooping and cpu hogging.

Only set the event polling delay to 0 if there are pending tx "write"
transfers, or if it was less than 10ms since last active data transfer
in any direction.

Cc: Łukasz Bartosik <ukaszb@chromium.org>
Fixes: fb18e5bb96 ("xhci: dbc: poll at different rate depending on data transfer activity")
Cc: stable@vger.kernel.org
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20250505125630.561699-3-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-05 16:30:45 +02:00
Michal Pecio
6328bdc988 usb: xhci: Don't trust the EP Context cycle bit when moving HW dequeue
VIA VL805 doesn't bother updating the EP Context cycle bit when the
endpoint halts. This is seen by patching xhci_move_dequeue_past_td()
to print the cycle bits of the EP Context and the TRB at hw_dequeue
and then disconnecting a flash drive while reading it. Actual cycle
state is random as expected, but the EP Context bit is always 1.

This means that the cycle state produced by this function is wrong
half the time, and then the endpoint stops working.

Work around it by looking at the cycle bit of TD's end_trb instead
of believing the Endpoint or Stream Context. Specifically:

- rename cycle_found to hw_dequeue_found to avoid confusion
- initialize new_cycle from td->end_trb instead of hw_dequeue
- switch new_cycle toggling to happen after end_trb is found

Now a workload which regularly stalls the device works normally for
a few hours and clearly demonstrates the HW bug - the EP Context bit
is not updated in a new cycle until Set TR Dequeue overwrites it:

[  +0,000298] sd 10:0:0:0: [sdc] Attached SCSI disk
[  +0,011758] cycle bits: TRB 1 EP Ctx 1
[  +5,947138] cycle bits: TRB 1 EP Ctx 1
[  +0,065731] cycle bits: TRB 0 EP Ctx 1
[  +0,064022] cycle bits: TRB 0 EP Ctx 0
[  +0,063297] cycle bits: TRB 0 EP Ctx 0
[  +0,069823] cycle bits: TRB 0 EP Ctx 0
[  +0,063390] cycle bits: TRB 1 EP Ctx 0
[  +0,063064] cycle bits: TRB 1 EP Ctx 1
[  +0,062293] cycle bits: TRB 1 EP Ctx 1
[  +0,066087] cycle bits: TRB 0 EP Ctx 1
[  +0,063636] cycle bits: TRB 0 EP Ctx 0
[  +0,066360] cycle bits: TRB 0 EP Ctx 0

Also tested on the buggy ASM1042 which moves EP Context dequeue to
the next TRB after errors, one problem case addressed by the rework
that implemented this loop. In this case hw_dequeue can be enqueue,
so simply picking the cycle bit of TRB at hw_dequeue wouldn't work.

Commit 5255660b20 ("xhci: add quirk for host controllers that
don't update endpoint DCS") tried to solve the stale cycle problem,
but it was more complex and got reverted due to a reported issue.

Cc: Jonathan Bell <jonathan@raspberrypi.org>
Cc: Oliver Neukum <oneukum@suse.com>
Signed-off-by: Michal Pecio <michal.pecio@gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20250505125630.561699-2-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-05 16:30:45 +02:00
Heiko Carstens
3a47b1e3ce s390: Update defconfigs
Just the regular update of all defconfigs.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-05-05 15:46:34 +02:00
Gerald Schaefer
833542b3e3 s390/dcssblk: Fix build error with CONFIG_DAX=m and CONFIG_DCSSBLK=y
After commit 653d7825c1 ("dcssblk: mark DAX broken, remove FS_DAX_LIMITED
support") moved the "select DAX" from config DCSSBLK to the new config
DCSSBLK_DAX, randconfig tests could result in build errors like this:

s390-linux-ld: drivers/s390/block/dcssblk.o: in function `dcssblk_shared_store':
drivers/s390/block/dcssblk.c:417: undefined reference to `kill_dax'
s390-linux-ld: drivers/s390/block/dcssblk.c:418: undefined reference to `put_dax'

This is because it's now possible to have CONFIG_DCSSBLK=y, but CONFIG_DAX=m.
Fix this by adding "depends on DAX || DAX=n" to config DCSSBLK, to make it
explicit that we want either no DAX, or the same "y/m" for both config DAX
and DCSSBLK, similar to config BLK_DEV_DM.

This also requires removing the "select DAX" from config DCSSBLK_DAX, or
else there would be a recursive dependency detected. DCSSBLK_DAX is marked
as BROKEN at the moment, and won't work well with DAX anyway, so it doesn't
really matter if it is selected.

Fixes: 653d7825c1 ("dcssblk: mark DAX broken, remove FS_DAX_LIMITED support")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202504291604.pvjonhWX-lkp@intel.com/
Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-05-05 15:46:34 +02:00
Heiko Carstens
ae952eea6f s390/entry: Fix last breaking event handling in case of stack corruption
In case of stack corruption stack_invalid() is called and the expectation
is that register r10 contains the last breaking event address. This
dependency is quite subtle and broke a couple of years ago without that
anybody noticed.

Fix this by getting rid of the dependency and read the last breaking event
address from lowcore.

Fixes: 56e62a7370 ("s390: convert to generic entry")
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-05-05 15:46:34 +02:00
Konstantin Shkolnyy
d2b8111c22 s390/configs: Enable options required for TC flow offload
While testing Open vSwitch with Nvidia ConnectX-6 NIC, it was noticed
that it didn't offload TC flows into the NIC, and its log contained
many messages such as:
"failed to offload flow: No such file or directory: <network device name>"
and, upon enabling more versose logging, additionally:
"received NAK error=2 - TC classifier not found"

The options enabled here are listed as requirements in Nvidia online
documentation, among other options that were already enabled. Now all
options listed by Nvidia are enabled..

This option is also added because Fedora has it:
  CONFIG_NET_EMATCH

Signed-off-by: Konstantin Shkolnyy <kshk@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-05-05 15:46:34 +02:00
Konstantin Shkolnyy
0db8a9a943 s390/configs: Enable VDPA on Nvidia ConnectX-6 network card
ConnectX-6 is the first VDPA-capable NIC. For earlier NICs, Nvidia
implements a VDPA emulation in s/w, which hasn't been validated on s390.

Add options necessary for VDPA to work.

These options are also added because Fedora has them:
  CONFIG_VDPA_SIM
  CONFIG_VDPA_SIM_NET
  CONFIG_VDPA_SIM_BLOCK
  CONFIG_VDPA_USER
  CONFIG_VP_VDPA

Signed-off-by: Konstantin Shkolnyy <kshk@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-05-05 15:46:34 +02:00
Sebastian Andrzej Siewior
94cff94634 clocksource/i8253: Use raw_spinlock_irqsave() in clockevent_i8253_disable()
On x86 during boot, clockevent_i8253_disable() can be invoked via
x86_late_time_init -> hpet_time_init() -> pit_timer_init() which happens
with enabled interrupts.

If some of the old i8253 hardware is actually used then lockdep will notice
that i8253_lock is used in hard interrupt context. This causes lockdep to
complain because it observed the lock being acquired with interrupts
enabled and in hard interrupt context.

Make clockevent_i8253_disable() acquire the lock with
raw_spinlock_irqsave() to cure this.

[ tglx: Massage change log and use guard() ]

Fixes: c8c4076723 ("x86/timer: Skip PIT initialization on modern chipsets")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20250404133116.p-XRWJXf@linutronix.de
2025-05-05 15:34:49 +02:00
Lizhi Xu
f5c84eff63 loop: Add sanity check for read/write_iter
Some file systems do not support read_iter/write_iter, such as selinuxfs
in this issue.
So before calling them, first confirm that the interface is supported and
then call it.

It is releavant in that vfs_iter_read/write have the check, and removal
of their used caused szybot to be able to hit this issue.

Fixes: f2fed441c6 ("loop: stop using vfs_iter__{read,write} for buffered I/O")
Reported-by: syzbot+6af973a3b8dfd2faefdc@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=6af973a3b8dfd2faefdc
Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20250428143626.3318717-1-lizhi.xu@windriver.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-05 07:18:05 -06:00
Nylon Chen
eb16b3727c riscv: misaligned: Add handling for ZCB instructions
Add support for the Zcb extension's compressed half-word instructions
(C.LHU, C.LH, and C.SH) in the RISC-V misaligned access trap handler.

Signed-off-by: Zong Li <zong.li@sifive.com>
Signed-off-by: Nylon Chen <nylon.chen@sifive.com>
Fixes: 956d705dd2 ("riscv: Unaligned load/store handling for M_MODE")
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250411073850.3699180-2-nylon.chen@sifive.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-05-05 13:09:38 +00:00
Christian Hewitt
0f67578587 arm64: dts: amlogic: dreambox: fix missing clkc_audio node
Add the clkc_audio node to fix audio support on Dreambox One/Two.

Fixes: 83a6f4c62c ("arm64: dts: meson: add initial support for Dreambox One/Two")
CC: stable@vger.kernel.org
Suggested-by: Emanuel Strobel <emanuel.strobel@yahoo.com>
Signed-off-by: Christian Hewitt <christianshewitt@gmail.com>
Reviewed-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Link: https://lore.kernel.org/r/20250503084443.3704866-1-christianshewitt@gmail.com
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
2025-05-05 14:35:46 +02:00
Geert Uytterhoeven
a73fa3690a spi: loopback-test: Do not split 1024-byte hexdumps
spi_test_print_hex_dump() prints buffers holding less than 1024 bytes in
full.  Larger buffers are truncated: only the first 512 and the last 512
bytes are printed, separated by a truncation message.  The latter is
confusing in case the buffer holds exactly 1024 bytes, as all data is
printed anyway.

Fix this by printing buffers holding up to and including 1024 bytes in
full.

Fixes: 84e0c4e5e2 ("spi: add loopback test driver to allow for spi_master regression tests")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/37ee1bc90c6554c9347040adabf04188c8f704aa.1746184171.git.geert+renesas@glider.be
Signed-off-by: Mark Brown <broonie@kernel.org>
2025-05-05 19:30:31 +09:00
Arnd Bergmann
b19fa45715 ASoC: mediatek: mt8188-mt6359: select CONFIG_SND_SOC_MT6359_ACCDET
The driver support was added without selecting the codec, which leads to
a link failure:

aarch64-linux-ld: sound/soc/mediatek/mt8188/mt8188-mt6359.o: in function `mt8188_mt6359_init':
mt8188-mt6359.c:(.text+0x19f0): undefined reference to `mt6359_accdet_enable_jack_detect'

Fixes: f35d834d67 ("ASoC: mediatek: mt8188-mt6359: Add accdet headset jack detect support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://patch.msgid.link/20250505052106.1811802-1-arnd@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
2025-05-05 19:30:27 +09:00
Vinay Belgaumkar
936b73feab drm/i915/slpc: Balance the inc/dec for num_waiters
As seen in some recent failures, SLPC num_waiters value is < 0.
This happens because the inc/dec are not balanced. We should skip
decrement for the same conditions as the increment. Currently, we
do that for power saving profile mode. This patch also ensures that
num_waiters is incremented in the case min_softlimit is at boost
freq. It ensures that we don't reduce the frequency while this request
is in flight.

v2: Add Fixes tags

Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13598
Fixes: f864a29afc ("drm/i915/slpc: Optmize waitboost for SLPC")
Fixes: 4a82ceb04a ("drm/i915/slpc: Add sysfs for SLPC power profiles")
Cc: Sk Anirban <sk.anirban@intel.com>
Reviewed-by: Sk Anirban <sk.anirban@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://lore.kernel.org/r/20250428183555.3250021-1-vinay.belgaumkar@intel.com
(cherry picked from commit d26e55085f)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-05-05 12:36:04 +03:00
Borislav Petkov (AMD)
5214a9f6c0 x86/microcode: Consolidate the loader enablement checking
Consolidate the whole logic which determines whether the microcode loader
should be enabled or not into a single function and call it everywhere.

Well, almost everywhere - not in mk_early_pgtbl_32() because there the kernel
is running without paging enabled and checking dis_ucode_ldr et al would
require physical addresses and uglification of the code.

But since this is 32-bit, the easier thing to do is to simply map the initrd
unconditionally especially since that mapping is getting removed later anyway
by zap_early_initrd_mapping() and avoid the uglification.

In doing so, address the issue of old 486er machines without CPUID
support, not booting current kernels.

  [ mingo: Fix no previous prototype for ‘microcode_loader_disabled’ [-Wmissing-prototypes] ]

Fixes: 4c585af718 ("x86/boot/32: Temporarily map initrd for microcode loading")
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: <stable@kernel.org>
Link: https://lore.kernel.org/r/CANpbe9Wm3z8fy9HbgS8cuhoj0TREYEEkBipDuhgkWFvqX0UoVQ@mail.gmail.com
2025-05-05 10:51:00 +02:00
Johannes Berg
68025adfc1 um: fix _nofault accesses
Nathan reported [1] that when built with clang, the um kernel
crashes pretty much immediately. This turned out to be an issue
with the inline assembly I had added, when clang used %rax/%eax
for both operands. Reorder it so current->thread.segv_continue
is written first, and then the lifetime of _faulted won't have
overlap with the lifetime of segv_continue.

In the email thread Benjamin also pointed out that current->mm
is only NULL for true kernel tasks, but we could do this for a
userspace task, so the current->thread.segv_continue logic must
be lifted out of the mm==NULL check.

Finally, while looking at this, put a barrier() so the NULL
assignment to thread.segv_continue cannot be reorder before
the possibly faulting operation.

Reported-by: Nathan Chancellor <nathan@kernel.org>
Closes: https://lore.kernel.org/r/20250402221254.GA384@ax162 [1]
Fixes: d1d7f01f7c ("um: mark rodata read-only and implement _nofault accesses")
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-05-05 10:06:51 +02:00
Kent Overstreet
df2e19a883 bcachefs: thread_with_stdio: fix spinning instead of exiting
bch2_stdio_redirect_vprintf() was missing a check for stdio->done, i.e.
exiting.

This caused the thread attempting to print to spin, and since it was
being called from the kthread ran by thread_with_stdio, the userspace
side hung as well.

Change it to return -EPIPE - i.e. writing to a pipe that's been closed.

Reported-by: Jan Solanti <jhs@psonet.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-04 14:00:14 -04:00
Jens Axboe
b53e523261 io_uring: always arm linked timeouts prior to issue
There are a few spots where linked timeouts are armed, and not all of
them adhere to the pre-arm, attempt issue, post-arm pattern. This can
be problematic if the linked request returns that it will trigger a
callback later, and does so before the linked timeout is fully armed.

Consolidate all the linked timeout handling into __io_issue_sqe(),
rather than have it spread throughout the various issue entry points.

Cc: stable@vger.kernel.org
Link: https://github.com/axboe/liburing/issues/1390
Reported-by: Chase Hiltz <chase@path.net>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-04 09:15:58 -06:00
Mickaël Salaün
b1525d0a8d landlock: Remove KUnit test that triggers a warning
A KUnit test checking boundaries triggers a canary warning, which may be
disturbing.  Let's remove this test for now.  Hopefully, KUnit will soon
get support for suppressing warning backtraces [1].

Cc: Alessandro Carminati <acarmina@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Günther Noack <gnoack@google.com>
Reported-by: Tingmao Wang <m@maowtm.org>
Closes: https://lore.kernel.org/r/20250327213807.12964-1-m@maowtm.org
Link: https://lore.kernel.org/r/20250425193249.78b45d2589575c15f483c3d8@linux-foundation.org [1]
Link: https://lore.kernel.org/r/20250503065359.3625407-1-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-05-03 08:55:42 +02:00
Sean Christopherson
9129633d56 KVM: x86/mmu: Prevent installing hugepages when mem attributes are changing
When changing memory attributes on a subset of a potential hugepage, add
the hugepage to the invalidation range tracking to prevent installing a
hugepage until the attributes are fully updated.  Like the actual hugepage
tracking updates in kvm_arch_post_set_memory_attributes(), process only
the head and tail pages, as any potential hugepages that are entirely
covered by the range will already be tracked.

Note, only hugepage chunks whose current attributes are NOT mixed need to
be added to the invalidation set, as mixed attributes already prevent
installing a hugepage, and it's perfectly safe to install a smaller
mapping for a gfn whose attributes aren't changing.

Fixes: 8dd2eee9d5 ("KVM: x86/mmu: Handle page fault for private memory")
Cc: stable@vger.kernel.org
Reported-by: Michael Roth <michael.roth@amd.com>
Tested-by: Michael Roth <michael.roth@amd.com>
Link: https://lore.kernel.org/r/20250430220954.522672-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-05-02 13:39:34 -07:00
Tom Lendacky
5fea0c6c0e KVM: SVM: Update dump_ghcb() to use the GHCB snapshot fields
Commit 4e15a0ddc3 ("KVM: SEV: snapshot the GHCB before accessing it")
updated the SEV code to take a snapshot of the GHCB before using it. But
the dump_ghcb() function wasn't updated to use the snapshot locations.
This results in incorrect output from dump_ghcb() for the "is_valid" and
"valid_bitmap" fields.

Update dump_ghcb() to use the proper locations.

Fixes: 4e15a0ddc3 ("KVM: SEV: snapshot the GHCB before accessing it")
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
Link: https://lore.kernel.org/r/8f03878443681496008b1b37b7c4bf77a342b459.1745866531.git.thomas.lendacky@amd.com
[sean: add comment and snapshot qualifier]
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-05-02 13:39:33 -07:00
Maíra Canal
35e4079bf1 drm/v3d: Add job to pending list if the reset was skipped
When a CL/CSD job times out, we check if the GPU has made any progress
since the last timeout. If so, instead of resetting the hardware, we skip
the reset and let the timer get rearmed. This gives long-running jobs a
chance to complete.

However, when `timedout_job()` is called, the job in question is removed
from the pending list, which means it won't be automatically freed through
`free_job()`. Consequently, when we skip the reset and keep the job
running, the job won't be freed when it finally completes.

This situation leads to a memory leak, as exposed in [1] and [2].

Similarly to commit 704d3d60fe ("drm/etnaviv: don't block scheduler when
GPU is still active"), this patch ensures the job is put back on the
pending list when extending the timeout.

Cc: stable@vger.kernel.org # 6.0
Reported-by: Daivik Bhatia <dtgs1208@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12227 [1]
Closes: https://github.com/raspberrypi/linux/issues/6817 [2]
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Link: https://lore.kernel.org/r/20250430210643.57924-1-mcanal@igalia.com
Signed-off-by: Maíra Canal <mcanal@igalia.com>
2025-05-02 16:25:14 -03:00
Thomas Weißschuh
0c314cda93 arm64: vdso: Work around invalid absolute relocations from GCC
All vDSO code needs to be completely position independent.  Symbol
references are marked as hidden so the compiler emits PC-relative
relocations.

However GCC emits absolute relocations for symbol-relative references with
an offset >= 64KiB. After recent refactorings in the vDSO code this is the
case in __arch_get_vdso_u_timens_data() with a page size of 64KiB.

Work around the issue by preventing the optimizer from seeing the offsets.

Fixes: 83a2a6b8cf ("vdso/gettimeofday: Prepare do_hres_timens() for introduction of struct vdso_clock")
Reported-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/all/20250430-vdso-absolute-reloc-v2-1-5efcc3bc4b26@linutronix.de
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120002
Closes: https://lore.kernel.org/lkml/aApGPAoctq_eoE2g@t14ultra/
2025-05-02 20:57:11 +02:00
Feng Jiang
b662b162c3 drm: Fix potential overflow issue in event_string array
When calling scnprintf() to append recovery method to event_string,
the second argument should be `sizeof(event_string) - len`, otherwise
there is a potential overflow problem.

Fixes: b7cf9f4ac1 ("drm: Introduce device wedged event")
Signed-off-by: Feng Jiang <jiangfeng@kylinos.cn>
Reviewed-by: André Almeida <andrealmeid@igalia.com>
Reviewed-by: Raag Jadav <raag.jadav@intel.com>
Link: https://lore.kernel.org/r/20250409014633.31303-1-jiangfeng@kylinos.cn
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-05-02 09:34:22 -04:00
Max Kellermann
d9ec733010 fs/eventpoll: fix endless busy loop after timeout has expired
After commit 0a65bc27bd ("eventpoll: Set epoll timeout if it's in
the future"), the following program would immediately enter a busy
loop in the kernel:

```
int main() {
  int e = epoll_create1(0);
  struct epoll_event event = {.events = EPOLLIN};
  epoll_ctl(e, EPOLL_CTL_ADD, 0, &event);
  const struct timespec timeout = {.tv_nsec = 1};
  epoll_pwait2(e, &event, 1, &timeout, 0);
}
```

This happens because the given (non-zero) timeout of 1 nanosecond
usually expires before ep_poll() is entered and then
ep_schedule_timeout() returns false, but `timed_out` is never set
because the code line that sets it is skipped.  This quickly turns
into a soft lockup, RCU stalls and deadlocks, inflicting severe
headaches to the whole system.

When the timeout has expired, we don't need to schedule a hrtimer, but
we should set the `timed_out` variable.  Therefore, I suggest moving
the ep_schedule_timeout() check into the `timed_out` expression
instead of skipping it.

brauner: Note that there was an earlier fix by Joe Damato in response to
my bug report in [1].

Fixes: 0a65bc27bd ("eventpoll: Set epoll timeout if it's in the future")
Cc: Joe Damato <jdamato@fastly.com>
Cc: stable@vger.kernel.org
Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
Link: https://lore.kernel.org/20250429153419.94723-1-jdamato@fastly.com [1]
Link: https://lore.kernel.org/20250429185827.3564438-1-max.kellermann@ionos.com
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-05-02 14:21:26 +02:00
Naman Jain
65995e97a1 Drivers: hv: Make the sysfs node size for the ring buffer dynamic
The ring buffer size varies across VMBus channels. The size of sysfs
node for the ring buffer is currently hardcoded to 4 MB. Userspace
clients either use fstat() or hardcode this size for doing mmap().
To address this, make the sysfs node size dynamic to reflect the
actual ring buffer size for each channel. This will ensure that
fstat() on ring sysfs node always returns the correct size of
ring buffer.

Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Naman Jain <namjain@linux.microsoft.com>
Link: https://lore.kernel.org/r/20250502074811.2022-3-namjain@linux.microsoft.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-02 13:59:02 +02:00
Naman Jain
f31fe8165d uio_hv_generic: Fix sysfs creation path for ring buffer
On regular bootup, devices get registered to VMBus first, so when
uio_hv_generic driver for a particular device type is probed,
the device is already initialized and added, so sysfs creation in
hv_uio_probe() works fine. However, when the device is removed
and brought back, the channel gets rescinded and the device again gets
registered to VMBus. However this time, the uio_hv_generic driver is
already registered to probe for that device and in this case sysfs
creation is tried before the device's kobject gets initialized
completely.

Fix this by moving the core logic of sysfs creation of ring buffer,
from uio_hv_generic to HyperV's VMBus driver, where the rest of the
sysfs attributes for the channels are defined. While doing that, make
use of attribute groups and macros, instead of creating sysfs
directly, to ensure better error handling and code flow.

Problematic path:
vmbus_process_offer (A new offer comes for the VMBus device)
  vmbus_add_channel_work
    vmbus_device_register
      |-> device_register
      |     |...
      |     |-> hv_uio_probe
      |           |...
      |           |-> sysfs_create_bin_file (leads to a warning as
      |                 the primary channel's kobject, which is used to
      |                 create the sysfs file, is not yet initialized)
      |-> kset_create_and_add
      |-> vmbus_add_channel_kobj (initialization of the primary
                                  channel's kobject happens later)

Above code flow is sequential and the warning is always reproducible in
this path.

Fixes: 9ab877a6cc ("uio_hv_generic: make ring buffer attribute for primary channel")
Cc: stable@kernel.org
Suggested-by: Saurabh Sengar <ssengar@linux.microsoft.com>
Suggested-by: Michael Kelley <mhklinux@outlook.com>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Naman Jain <namjain@linux.microsoft.com>
Link: https://lore.kernel.org/r/20250502074811.2022-2-namjain@linux.microsoft.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-02 13:59:02 +02:00
Kairui Song
38e541051e btrfs: open code folio_index() in btree_clear_folio_dirty_tag()
The folio_index() helper is only needed for mixed usage of page cache
and swap cache, for pure page cache usage, the caller can just use
folio->index instead.

It can't be a swap cache folio here.  Swap mapping may only call into fs
through 'swap_rw' but btrfs does not use that method for swap.

Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Kairui Song <kasong@tencent.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2025-05-02 13:20:56 +02:00
Qu Wenruo
8fb1dcbbcc Revert "btrfs: canonicalize the device path before adding it"
This reverts commit 7e06de7c83.

Commit 7e06de7c83 ("btrfs: canonicalize the device path before adding
it") tries to make btrfs to use "/dev/mapper/*" name first, then any
filename inside "/dev/" as the device path.

This is mostly fine when there is only the root namespace involved, but
when multiple namespace are involved, things can easily go wrong for the
d_path() usage.

As d_path() returns a file path that is namespace dependent, the
resulted string may not make any sense in another namespace.

Furthermore, the "/dev/" prefix checks itself is not reliable, one can
still make a valid initramfs without devtmpfs, and fill all needed
device nodes manually.

Overall the userspace has all its might to pass whatever device path for
mount, and we are not going to win the war trying to cover every corner
case.

So just revert that commit, and do no extra d_path() based file path
sanity check.

CC: stable@vger.kernel.org # 6.12+
Link: https://lore.kernel.org/linux-fsdevel/20250115185608.GA2223535@zen.localdomain/
Reviewed-by: Boris Burkov <boris@bur.io>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2025-05-02 13:20:26 +02:00
Qu Wenruo
f95d186255 btrfs: avoid NULL pointer dereference if no valid csum tree
[BUG]
When trying read-only scrub on a btrfs with rescue=idatacsums mount
option, it will crash with the following call trace:

  BUG: kernel NULL pointer dereference, address: 0000000000000208
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  CPU: 1 UID: 0 PID: 835 Comm: btrfs Tainted: G           O        6.15.0-rc3-custom+ #236 PREEMPT(full)
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS unknown 02/02/2022
  RIP: 0010:btrfs_lookup_csums_bitmap+0x49/0x480 [btrfs]
  Call Trace:
   <TASK>
   scrub_find_fill_first_stripe+0x35b/0x3d0 [btrfs]
   scrub_simple_mirror+0x175/0x290 [btrfs]
   scrub_stripe+0x5f7/0x6f0 [btrfs]
   scrub_chunk+0x9a/0x150 [btrfs]
   scrub_enumerate_chunks+0x333/0x660 [btrfs]
   btrfs_scrub_dev+0x23e/0x600 [btrfs]
   btrfs_ioctl+0x1dcf/0x2f80 [btrfs]
   __x64_sys_ioctl+0x97/0xc0
   do_syscall_64+0x4f/0x120
   entry_SYSCALL_64_after_hwframe+0x76/0x7e

[CAUSE]
Mount option "rescue=idatacsums" will completely skip loading the csum
tree, so that any data read will not find any data csum thus we will
ignore data checksum verification.

Normally call sites utilizing csum tree will check the fs state flag
NO_DATA_CSUMS bit, but unfortunately scrub does not check that bit at all.

This results in scrub to call btrfs_search_slot() on a NULL pointer
and triggered above crash.

[FIX]
Check both extent and csum tree root before doing any tree search.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2025-05-02 13:20:11 +02:00
Boris Burkov
d6fe0c69b3 btrfs: handle empty eb->folios in num_extent_folios()
num_extent_folios() unconditionally calls folio_order() on
eb->folios[0]. If that is NULL this will be a segfault. It is reasonable
for it to return 0 as the number of folios in the eb when the first
entry is NULL, so do that instead.

Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Boris Burkov <boris@bur.io>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2025-05-02 13:20:08 +02:00
Goldwyn Rodrigues
bc7e097509 btrfs: correct the order of prelim_ref arguments in btrfs__prelim_ref
btrfs_prelim_ref() calls the old and new reference variables in the
incorrect order. This causes a NULL pointer dereference because oldref
is passed as NULL to trace_btrfs_prelim_ref_insert().

Note, trace_btrfs_prelim_ref_insert() is being called with newref as
oldref (and oldref as NULL) on purpose in order to print out
the values of newref.

To reproduce:
echo 1 > /sys/kernel/debug/tracing/events/btrfs/btrfs_prelim_ref_insert/enable

Perform some writeback operations.

Backtrace:
BUG: kernel NULL pointer dereference, address: 0000000000000018
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 115949067 P4D 115949067 PUD 11594a067 PMD 0
 Oops: Oops: 0000 [#1] SMP NOPTI
 CPU: 1 UID: 0 PID: 1188 Comm: fsstress Not tainted 6.15.0-rc2-tester+ #47 PREEMPT(voluntary)  7ca2cef72d5e9c600f0c7718adb6462de8149622
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-2-gc13ff2cd-prebuilt.qemu.org 04/01/2014
 RIP: 0010:trace_event_raw_event_btrfs__prelim_ref+0x72/0x130
 Code: e8 43 81 9f ff 48 85 c0 74 78 4d 85 e4 0f 84 8f 00 00 00 49 8b 94 24 c0 06 00 00 48 8b 0a 48 89 48 08 48 8b 52 08 48 89 50 10 <49> 8b 55 18 48 89 50 18 49 8b 55 20 48 89 50 20 41 0f b6 55 28 88
 RSP: 0018:ffffce44820077a0 EFLAGS: 00010286
 RAX: ffff8c6b403f9014 RBX: ffff8c6b55825730 RCX: 304994edf9cf506b
 RDX: d8b11eb7f0fdb699 RSI: ffff8c6b403f9010 RDI: ffff8c6b403f9010
 RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000000000010
 R10: 00000000ffffffff R11: 0000000000000000 R12: ffff8c6b4e8fb000
 R13: 0000000000000000 R14: ffffce44820077a8 R15: ffff8c6b4abd1540
 FS:  00007f4dc6813740(0000) GS:ffff8c6c1d378000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000018 CR3: 000000010eb42000 CR4: 0000000000750ef0
 PKRU: 55555554
 Call Trace:
  <TASK>
  prelim_ref_insert+0x1c1/0x270
  find_parent_nodes+0x12a6/0x1ee0
  ? __entry_text_end+0x101f06/0x101f09
  ? srso_alias_return_thunk+0x5/0xfbef5
  ? srso_alias_return_thunk+0x5/0xfbef5
  ? srso_alias_return_thunk+0x5/0xfbef5
  ? srso_alias_return_thunk+0x5/0xfbef5
  btrfs_is_data_extent_shared+0x167/0x640
  ? fiemap_process_hole+0xd0/0x2c0
  extent_fiemap+0xa5c/0xbc0
  ? __entry_text_end+0x101f05/0x101f09
  btrfs_fiemap+0x7e/0xd0
  do_vfs_ioctl+0x425/0x9d0
  __x64_sys_ioctl+0x75/0xc0

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2025-05-02 13:19:56 +02:00
Kees Cook
6f9a8ab796 btrfs: compression: adjust cb->compressed_folios allocation type
In preparation for making the kmalloc() family of allocators type aware,
we need to make sure that the returned type from the allocation matches
the type of the variable being assigned. (Before, the allocator would
always return "void *", which can be implicitly cast to any pointer type.)

The assigned type is "struct folio **" but the returned type will be
"struct page **". These are the same allocation size (pointer size), but
the types don't match. Adjust the allocation type to match the assignment.

Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Kees Cook <kees@kernel.org>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2025-05-02 13:19:52 +02:00
Dave Penkler
4e77d3ec7c usb: usbtmc: Fix erroneous generic_read ioctl return
wait_event_interruptible_timeout returns a long
The return value was being assigned to an int causing an integer overflow
when the remaining jiffies > INT_MAX which resulted in random error
returns.

Use a long return value, converting to the int ioctl return only on error.

Fixes: bb99794a47 ("usb: usbtmc: Add ioctl for vendor specific read")
Cc: stable@vger.kernel.org
Signed-off-by: Dave Penkler <dpenkler@gmail.com>
Link: https://lore.kernel.org/r/20250502070941.31819-4-dpenkler@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-02 09:28:54 +02:00
Dave Penkler
a9747c9b8b usb: usbtmc: Fix erroneous wait_srq ioctl return
wait_event_interruptible_timeout returns a long
The return was being assigned to an int causing an integer overflow when
the remaining jiffies > INT_MAX resulting in random error returns.

Use a long return value,  converting to the int ioctl return only on
error.

Fixes: 739240a9f6 ("usb: usbtmc: Add ioctl USBTMC488_IOCTL_WAIT_SRQ")
Cc: stable@vger.kernel.org
Signed-off-by: Dave Penkler <dpenkler@gmail.com>
Link: https://lore.kernel.org/r/20250502070941.31819-3-dpenkler@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-02 09:28:54 +02:00
Dave Penkler
cac01bd178 usb: usbtmc: Fix erroneous get_stb ioctl error returns
wait_event_interruptible_timeout returns a long
The return was being assigned to an int causing an integer overflow when
the remaining jiffies > INT_MAX resulting in random error returns.

Use a long return value and convert to int ioctl return only on error.

When the return value of wait_event_interruptible_timeout was <= INT_MAX
the number of remaining jiffies was returned which has no meaning for the
user. Return 0 on success.

Reported-by: Michael Katzmann <vk2bea@gmail.com>
Fixes: dbf3e7f654 ("Implement an ioctl to support the USMTMC-USB488 READ_STATUS_BYTE operation.")
Cc: stable@vger.kernel.org
Signed-off-by: Dave Penkler <dpenkler@gmail.com>
Link: https://lore.kernel.org/r/20250502070941.31819-2-dpenkler@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-02 09:28:54 +02:00
Kees Cook
11854fe263 binfmt_elf: Move brk for static PIE even if ASLR disabled
In commit bbdc6076d2 ("binfmt_elf: move brk out of mmap when doing
direct loader exec"), the brk was moved out of the mmap region when
loading static PIE binaries (ET_DYN without INTERP). The common case
for these binaries was testing new ELF loaders, so the brk needed to
be away from mmap to avoid colliding with stack, future mmaps (of the
loader-loaded binary), etc. But this was only done when ASLR was enabled,
in an attempt to minimize changes to memory layouts.

After adding support to respect alignment requirements for static PIE
binaries in commit 3545deff0e ("binfmt_elf: Honor PT_LOAD alignment
for static PIE"), it became possible to have a large gap after the
final PT_LOAD segment and the top of the mmap region. This means that
future mmap allocations might go after the last PT_LOAD segment (where
brk might be if ASLR was disabled) instead of before them (where they
traditionally ended up).

On arm64, running with ASLR disabled, Ubuntu 22.04's "ldconfig" binary,
a static PIE, has alignment requirements that leaves a gap large enough
after the last PT_LOAD segment to fit the vdso and vvar, but still leave
enough space for the brk (which immediately follows the last PT_LOAD
segment) to be allocated by the binary.

fffff7f20000-fffff7fde000 r-xp 00000000 fe:02 8110426 /sbin/ldconfig.real
fffff7fee000-fffff7ff5000 rw-p 000be000 fe:02 8110426 /sbin/ldconfig.real
fffff7ff5000-fffff7ffa000 rw-p 00000000 00:00 0
***[brk will go here at fffff7ffa000]***
fffff7ffc000-fffff7ffe000 r--p 00000000 00:00 0       [vvar]
fffff7ffe000-fffff8000000 r-xp 00000000 00:00 0       [vdso]
fffffffdf000-1000000000000 rw-p 00000000 00:00 0      [stack]

After commit 0b3bc3354e ("arm64: vdso: Switch to generic storage
implementation"), the arm64 vvar grew slightly, and suddenly the brk
collided with the allocation.

fffff7f20000-fffff7fde000 r-xp 00000000 fe:02 8110426 /sbin/ldconfig.real
fffff7fee000-fffff7ff5000 rw-p 000be000 fe:02 8110426 /sbin/ldconfig.real
fffff7ff5000-fffff7ffa000 rw-p 00000000 00:00 0
***[oops, no room any more, vvar is at fffff7ffa000!]***
fffff7ffa000-fffff7ffe000 r--p 00000000 00:00 0       [vvar]
fffff7ffe000-fffff8000000 r-xp 00000000 00:00 0       [vdso]
fffffffdf000-1000000000000 rw-p 00000000 00:00 0      [stack]

The solution is to unconditionally move the brk out of the mmap region
for static PIE binaries. Whether ASLR is enabled or not does not change if
there may be future mmap allocation collisions with a growing brk region.

Update memory layout comments (with kernel-doc headings), consolidate
the setting of mm->brk to later (it isn't needed early), move static PIE
brk out of mmap unconditionally, and make sure brk(2) knows to base brk
position off of mm->start_brk not mm->end_data no matter what the cause of
moving it is (via current->brk_randomized).

For the CONFIG_COMPAT_BRK case, though, leave the logic unchanged, as we
can never safely move the brk. These systems, however, are not using
specially aligned static PIE binaries.

Reported-by: Ryan Roberts <ryan.roberts@arm.com>
Closes: https://lore.kernel.org/lkml/f93db308-4a0e-4806-9faf-98f890f5a5e6@arm.com/
Fixes: bbdc6076d2 ("binfmt_elf: move brk out of mmap when doing direct loader exec")
Link: https://lore.kernel.org/r/20250425224502.work.520-kees@kernel.org
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Tested-by: Ryan Roberts <ryan.roberts@arm.com>
Signed-off-by: Kees Cook <kees@kernel.org>
2025-05-01 17:13:53 -07:00
Wang Zhaolong
eb4447bcce ksmbd: fix memory leak in parse_lease_state()
The previous patch that added bounds check for create lease context
introduced a memory leak. When the bounds check fails, the function
returns NULL without freeing the previously allocated lease_ctx_info
structure.

This patch fixes the issue by adding kfree(lreq) before returning NULL
in both boundary check cases.

Fixes: bab703ed84 ("ksmbd: add bounds check for create lease context")
Signed-off-by: Wang Zhaolong <wangzhaolong1@huawei.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-05-01 18:58:48 -05:00
Namjae Jeon
53e3e5babc ksmbd: prevent rename with empty string
Client can send empty newname string to ksmbd server.
It will cause a kernel oops from d_alloc.
This patch return the error when attempting to rename
a file or directory with an empty new name string.

Cc: stable@vger.kernel.org
Reported-by: Norbert Szetei <norbert@doyensec.com>
Tested-by: Norbert Szetei <norbert@doyensec.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-05-01 18:58:48 -05:00
Will McVicker
95deee37a1 platform: Fix race condition during DMA configure at IOMMU probe time
To avoid a race between the IOMMU probing thread and the device driver
async probing thread during configuration of the platform DMA, update
`platform_dma_configure()` to read `dev->driver` once and test if it's
NULL before using it. This ensures that we don't de-reference an invalid
platform driver pointer if the device driver is asynchronously bound
while configuring the DMA.

Fixes: bcb81ac6ae ("iommu: Get DT/ACPI parsing into the proper probe path")
Signed-off-by: Will McVicker <willmcvicker@google.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/20250424180420.3928523-1-willmcvicker@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-01 18:00:58 +02:00
Greg Kroah-Hartman
f55aaec4fc Merge tag 'iio-fixes-for-6.15a' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/jic23/iio into char-misc-linus
Jonathan writes:

IIO: 1st set of fixes for the 6.15 cycle.

A mixed bunch of fixes for new and ancient issues found.

multiple driver sets:
- Stop leaking wakeup sources on device unbind.
- Various timestamp alignment fixes that came up as part of work to add
  runtime checks on buffer sizing. Similarly a DMA buffer safety fix.
hid-sensor-prox
- Fix a bad merge conflict resolution that lost some variable assignments.
- Fix handling of scale when multiple channels present.
- Fix wrong application of exponent in offset calculation.
adi,ad7380
- Disable offload before using the SPI bus.
- Fix a wrong shift on the event threshold.
adi,ad7606
- Check there is a sw_mode_config callback before using it as not
  all busses define one.
- Fix missing hold of chip select on in multi word accesses.
adi,ad7861
- Fix wrong logic on storing of mode.
adi,adis16201
- Wrong resolution for inclinometer channel.
adi,adxl367
- Use fresh ODR when setting activity time, not previous value.
bosch,bmi270
- Fix initial sampling frequency configuration which was using the
  wrong register mask.
rockchip,saradc
- Fix clock initialization sequence to get frequency after get + enable,
  not before.
st,lsm6dsx
- Avoid 2 potential infinite loops if we see empty FIFOs
ti,opt3001
- Fix a deadlock that can occur due to concurrent access to a flag.

* tag 'iio-fixes-for-6.15a' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/jic23/iio: (28 commits)
  iio: adis16201: Correct inclinometer channel resolution
  iio: adc: ad7606: fix serial register access
  iio: pressure: mprls0025pa: use aligned_s64 for timestamp
  iio: imu: adis16550: align buffers for timestamp
  staging: iio: adc: ad7816: Correct conditional logic for store mode
  iio: adc: ad7266: Fix potential timestamp alignment issue.
  iio: adc: ad7768-1: Fix insufficient alignment of timestamp.
  iio: adc: dln2: Use aligned_s64 for timestamp
  iio: accel: adxl355: Make timestamp 64-bit aligned using aligned_s64
  iio: temp: maxim-thermocouple: Fix potential lack of DMA safe buffer.
  iio: chemical: pms7003: use aligned_s64 for timestamp
  iio: chemical: sps30: use aligned_s64 for timestamp
  iio: imu: inv_mpu6050: align buffer for timestamp
  iio: imu: st_lsm6dsx: Fix wakeup source leaks on device unbind
  iio: adc: qcom-spmi-iadc: Fix wakeup source leaks on device unbind
  iio: accel: fxls8962af: Fix wakeup source leaks on device unbind
  iio: adc: ad7380: fix event threshold shift
  iio: hid-sensor-prox: Fix incorrect OFFSET calculation
  iio: hid-sensor-prox: support multi-channel SCALE calculation
  iio: hid-sensor-prox: Restore lost scale assignments
  ...
2025-05-01 17:47:49 +02:00
RD Babiera
e918d3959b usb: typec: tcpm: delay SNK_TRY_WAIT_DEBOUNCE to SRC_TRYWAIT transition
This patch fixes Type-C Compliance Test TD 4.7.6 - Try.SNK DRP Connect
SNKAS.

The compliance tester moves into SNK_UNATTACHED during toggling and
expects the PUT to apply Rp after tPDDebounce of detection. If the port
is in SNK_TRY_WAIT_DEBOUNCE, it will move into SRC_TRYWAIT immediately
and apply Rp. This violates TD 4.7.5.V.3, where the tester confirms that
the PUT attaches Rp after the transitions to Unattached.SNK for
tPDDebounce.

Change the tcpm_set_state delay between SNK_TRY_WAIT_DEBOUNCE and
SRC_TRYWAIT to tPDDebounce.

Fixes: a0a3e04e6b ("staging: typec: tcpm: Check for Rp for tPDDebounce")
Cc: stable <stable@kernel.org>
Signed-off-by: RD Babiera <rdbabiera@google.com>
Reviewed-by: Badhri Jagan Sridharan <badhri@google.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Link: https://lore.kernel.org/r/20250429234703.3748506-2-rdbabiera@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-01 17:41:03 +02:00
Oliver Neukum
054c514554 USB: usbtmc: use interruptible sleep in usbtmc_read
usbtmc_read() calls usbtmc_generic_read()
which uses interruptible sleep, but usbtmc_read()
itself uses uninterruptble sleep for mutual exclusion
between threads. That makes no sense.
Both should use interruptible sleep.

Fixes: 5b775f672c ("USB: add USB test and measurement class driver")
Cc: stable <stable@kernel.org>
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Link: https://lore.kernel.org/r/20250430134810.226015-1-oneukum@suse.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-01 17:40:37 +02:00
Pawel Laszczak
8614ecdb15 usb: cdnsp: fix L1 resume issue for RTL_REVISION_NEW_LPM version
The controllers with rtl version larger than
RTL_REVISION_NEW_LPM (0x00002700) has bug which causes that controller
doesn't resume from L1 state. It happens if after receiving LPM packet
controller starts transitioning to L1 and in this moment the driver force
resuming by write operation to PORTSC.PLS.
It's corner case and happens when write operation to PORTSC occurs during
device delay before transitioning to L1 after transmitting ACK
time (TL1TokenRetry).

Forcing transition from L1->L0 by driver for revision larger than
RTL_REVISION_NEW_LPM is not needed, so driver can simply fix this issue
through block call of cdnsp_force_l0_go function.

Fixes: 3d82904559 ("usb: cdnsp: cdns3 Add main part of Cadence USBSSP DRD Driver")
Cc: stable <stable@kernel.org>
Signed-off-by: Pawel Laszczak <pawell@cadence.com>
Acked-by: Peter Chen <peter.chen@kernel.org>
Link: https://lore.kernel.org/r/PH7PR07MB9538B55C3A6E71F9ED29E980DD842@PH7PR07MB9538.namprd07.prod.outlook.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-01 17:40:21 +02:00
Andrei Kuchynski
312d79669e usb: typec: ucsi: displayport: Fix NULL pointer access
This patch ensures that the UCSI driver waits for all pending tasks in the
ucsi_displayport_work workqueue to finish executing before proceeding with
the partner removal.

Cc: stable <stable@kernel.org>
Fixes: af8622f6a5 ("usb: typec: ucsi: Support for DisplayPort alt mode")
Signed-off-by: Andrei Kuchynski <akuchynski@chromium.org>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Reviewed-by: Benson Leung <bleung@chromium.org>
Link: https://lore.kernel.org/r/20250424084429.3220757-3-akuchynski@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-01 17:40:04 +02:00
Andrei Kuchynski
364618c89d usb: typec: ucsi: displayport: Fix deadlock
This patch introduces the ucsi_con_mutex_lock / ucsi_con_mutex_unlock
functions to the UCSI driver. ucsi_con_mutex_lock ensures the connector
mutex is only locked if a connection is established and the partner pointer
is valid. This resolves a deadlock scenario where
ucsi_displayport_remove_partner holds con->mutex waiting for
dp_altmode_work to complete while dp_altmode_work attempts to acquire it.

Cc: stable <stable@kernel.org>
Fixes: af8622f6a5 ("usb: typec: ucsi: Support for DisplayPort alt mode")
Signed-off-by: Andrei Kuchynski <akuchynski@chromium.org>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Link: https://lore.kernel.org/r/20250424084429.3220757-2-akuchynski@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-01 17:40:01 +02:00
Lukasz Czechowski
9f657a9280 usb: misc: onboard_usb_dev: fix support for Cypress HX3 hubs
The Cypress HX3 USB3.0 hubs use different PID values depending
on the product variant. The comment in compatibles table is
misleading, as the currently used PIDs (0x6504 and 0x6506 for
USB 3.0 and USB 2.0, respectively) are defaults for the CYUSB331x,
while CYUSB330x and CYUSB332x variants use different values.
Based on the datasheet [1], update the compatible usb devices table
to handle different types of the hub.
The change also includes vendor mode PIDs, which are used by the
hub in I2C Master boot mode, if connected EEPROM contains invalid
signature or is blank. This allows to correctly boot the hub even
if the EEPROM will have broken content.
Number of vcc supplies and timing requirements are the same for all
HX variants, so the platform driver's match table does not have to
be extended.

[1] https://www.infineon.com/dgdl/Infineon-HX3_USB_3_0_Hub_Consumer_Industrial-DataSheet-v22_00-EN.pdf?fileId=8ac78c8c7d0d8da4017d0ecb53f644b8
    Table 9. PID Values

Fixes: b43cd82a1a ("usb: misc: onboard-hub: add support for Cypress HX3 USB 3.0 family")
Cc: stable <stable@kernel.org>
Signed-off-by: Lukasz Czechowski <lukasz.czechowski@thaumatec.com>
Link: https://lore.kernel.org/r/20250425-onboard_usb_dev-v2-1-4a76a474a010@thaumatec.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-01 17:39:59 +02:00
Alexey Charkov
a5c7973539 usb: uhci-platform: Make the clock really optional
Device tree bindings state that the clock is optional for UHCI platform
controllers, and some existing device trees don't provide those - such
as those for VIA/WonderMedia devices.

The driver however fails to probe now if no clock is provided, because
devm_clk_get returns an error pointer in such case.

Switch to devm_clk_get_optional instead, so that it could probe again
on those platforms where no clocks are given.

Cc: stable <stable@kernel.org>
Fixes: 26c502701c ("usb: uhci: Add clk support to uhci-platform")
Signed-off-by: Alexey Charkov <alchark@gmail.com>
Link: https://lore.kernel.org/r/20250425-uhci-clock-optional-v1-1-a1d462592f29@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-01 17:39:16 +02:00
Prashanth K
2372f1caec usb: dwc3: gadget: Make gadget_wakeup asynchronous
Currently gadget_wakeup() waits for U0 synchronously if it was
called from func_wakeup(), this is because we need to send the
function wakeup command soon after the link is active. And the
call is made synchronous by polling DSTS continuosly for 20000
times in __dwc3_gadget_wakeup(). But it observed that sometimes
the link is not active even after polling 20K times, leading to
remote wakeup failures. Adding a small delay between each poll
helps, but that won't guarantee resolution in future. Hence make
the gadget_wakeup completely asynchronous.

Since multiple interfaces can issue a function wakeup at once,
add a new variable wakeup_pending_funcs which will indicate the
functions that has issued func_wakup, this is represented in a
bitmap format. If the link is in U3, dwc3_gadget_func_wakeup()
will set the bit corresponding to interface_id and bail out.
Once link comes back to U0, linksts_change irq is triggered,
where the function wakeup command is sent based on bitmap.

Cc: stable <stable@kernel.org>
Fixes: 92c08a84b5 ("usb: dwc3: Add function suspend and function wakeup support")
Signed-off-by: Prashanth K <prashanth.k@oss.qualcomm.com>
Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
Link: https://lore.kernel.org/r/20250422103231.1954387-4-prashanth.k@oss.qualcomm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-01 17:38:55 +02:00
Prashanth K
5977a58dd5 usb: gadget: Use get_status callback to set remote wakeup capability
Currently when the host sends GET_STATUS request for an interface,
we use get_status callbacks to set/clear remote wakeup capability
of that interface. And if get_status callback isn't present for
that interface, then we assume its remote wakeup capability based
on bmAttributes.

Now consider a scenario, where we have a USB configuration with
multiple interfaces (say ECM + ADB), here ECM is remote wakeup
capable and as of now ADB isn't. And bmAttributes will indicate
the device as wakeup capable. With the current implementation,
when host sends GET_STATUS request for both interfaces, we will
set FUNC_RW_CAP for both. This results in USB3 CV Chapter 9.15
(Function Remote Wakeup Test) failures as host expects remote
wakeup from both interfaces.

The above scenario is just an example, and the failure can be
observed if we use configuration with any interface except ECM.
Hence avoid configuring remote wakeup capability from composite
driver based on bmAttributes, instead use get_status callbacks
and let the function drivers decide this.

Cc: stable <stable@kernel.org>
Fixes: 481c225c48 ("usb: gadget: Handle function suspend feature selector")
Signed-off-by: Prashanth K <prashanth.k@oss.qualcomm.com>
Reviewed-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
Link: https://lore.kernel.org/r/20250422103231.1954387-3-prashanth.k@oss.qualcomm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-01 17:38:51 +02:00
Prashanth K
8e3820271c usb: gadget: f_ecm: Add get_status callback
When host sends GET_STATUS to ECM interface, handle the request
from the function driver. Since the interface is wakeup capable,
set the corresponding bit, and set RW bit if the function is
already armed for wakeup by the host.

Cc: stable <stable@kernel.org>
Fixes: 481c225c48 ("usb: gadget: Handle function suspend feature selector")
Signed-off-by: Prashanth K <prashanth.k@oss.qualcomm.com>
Reviewed-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
Link: https://lore.kernel.org/r/20250422103231.1954387-2-prashanth.k@oss.qualcomm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-01 17:38:41 +02:00
Jim Lin
732f35cf8b usb: host: tegra: Prevent host controller crash when OTG port is used
When a USB device is connected to the OTG port, the tegra_xhci_id_work()
routine transitions the PHY to host mode and calls xhci_hub_control()
with the SetPortFeature command to enable port power.

In certain cases, the XHCI controller may be in a low-power state
when this operation occurs. If xhci_hub_control() is invoked while
the controller is suspended, the PORTSC register may return 0xFFFFFFFF,
indicating a read failure. This causes xhci_hc_died() to be triggered,
leading to host controller shutdown.

Example backtrace:
[  105.445736] Workqueue: events tegra_xhci_id_work
[  105.445747]  dump_backtrace+0x0/0x1e8
[  105.445759]  xhci_hc_died.part.48+0x40/0x270
[  105.445769]  tegra_xhci_set_port_power+0xc0/0x240
[  105.445774]  tegra_xhci_id_work+0x130/0x240

To prevent this, ensure the controller is fully resumed before
interacting with hardware registers by calling pm_runtime_get_sync()
prior to the host mode transition and xhci_hub_control().

Fixes: f836e78430 ("usb: xhci-tegra: Add OTG support")
Cc: stable <stable@kernel.org>
Signed-off-by: Jim Lin <jilin@nvidia.com>
Signed-off-by: Wayne Chang <waynec@nvidia.com>
Link: https://lore.kernel.org/r/20250422114001.126367-1-waynec@nvidia.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-01 17:37:25 +02:00
Pawel Laszczak
241e2ce88e usb: cdnsp: Fix issue with resuming from L1
In very rare cases after resuming controller from L1 to L0 it reads
registers before the clock UTMI have been enabled and as the result
driver reads incorrect value.
Most of registers are in APB domain clock but some of them (e.g. PORTSC)
are in UTMI domain clock.
After entering to L1 state the UTMI clock can be disabled.
When controller transition from L1 to L0 the port status change event is
reported and in interrupt runtime function driver reads PORTSC.
During this read operation controller synchronize UTMI and APB domain
but UTMI clock is still disabled and in result it reads 0xFFFFFFFF value.
To fix this issue driver increases APB timeout value.

The issue is platform specific and if the default value of APB timeout
is not sufficient then this time should be set Individually for each
platform.

Fixes: 3d82904559 ("usb: cdnsp: cdns3 Add main part of Cadence USBSSP DRD Driver")
Cc: stable <stable@kernel.org>
Signed-off-by: Pawel Laszczak <pawell@cadence.com>
Acked-by: Peter Chen <peter.chen@kernel.org>
Link: https://lore.kernel.org/r/PH7PR07MB953846C57973E4DB134CAA71DDBF2@PH7PR07MB9538.namprd07.prod.outlook.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-01 17:36:12 +02:00
Wayne Chang
59820fde00 usb: gadget: tegra-xudc: ACK ST_RC after clearing CTRL_RUN
We identified a bug where the ST_RC bit in the status register was not
being acknowledged after clearing the CTRL_RUN bit in the control
register. This could lead to unexpected behavior in the USB gadget
drivers.

This patch resolves the issue by adding the necessary code to explicitly
acknowledge ST_RC after clearing CTRL_RUN based on the programming
sequence, ensuring proper state transition.

Fixes: 49db427232 ("usb: gadget: Add UDC driver for tegra XUSB device mode controller")
Cc: stable <stable@kernel.org>
Signed-off-by: Wayne Chang <waynec@nvidia.com>
Link: https://lore.kernel.org/r/20250418081228.1194779-1-waynec@nvidia.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-01 17:35:02 +02:00
Radim Krčmář
87ec7d5249 KVM: RISC-V: reset smstateen CSRs
Not resetting smstateen is a potential security hole, because VU might
be able to access state that VS does not properly context-switch.

Fixes: 81f0f314fe ("RISCV: KVM: Add sstateen0 context save/restore")
Signed-off-by: Radim Krčmář <rkrcmar@ventanamicro.com>
Link: https://lore.kernel.org/r/20250403112522.1566629-8-rkrcmar@ventanamicro.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2025-05-01 18:26:14 +05:30
Andre Przywara
4a9c3c3215 clk: sunxi-ng: fix order of arguments in clock macro
When introducing the SUNXI_CCU_MP_DATA_WITH_MUX_GATE_FEAT macro, the order
of the last two arguments was different between the users and the
definition: features became flags and flags became features.

This just didn't end up in a disaster yet because most users ended up
passing 0 for both arguments, and other clocks (for the new A523 SoC) are
not yet used.

Swap the order of the arguments in the definition, so that users stay
untouched.

Fixes: cdbb9d0d09 ("clk: sunxi-ng: mp: provide wrappers for setting feature flags")
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Link: https://patch.msgid.link/20250430095325.477311-1-andre.przywara@arm.com
[wens@csie.org: fix typo in commit message]
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
2025-05-01 17:47:49 +08:00
Thorsten Blum
c44572e0cc MIPS: Fix MAX_REG_OFFSET
Fix MAX_REG_OFFSET to point to the last register in 'pt_regs' and not to
the marker itself, which could allow regs_get_register() to return an
invalid offset.

Fixes: 40e084a506 ("MIPS: Add uprobes support.")
Suggested-by: Maciej W. Rozycki <macro@orcam.me.uk>
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2025-04-30 20:04:59 +02:00
Dan Carpenter
650266ac4c dm: add missing unlock on in dm_keyslot_evict()
We need to call dm_put_live_table() even if dm_get_live_table() returns
NULL.

Fixes: 9355a9eb21 ("dm: support key eviction from keyslot managers of underlying devices")
Cc: stable@vger.kernel.org	# v5.12+
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
2025-04-30 18:17:43 +02:00
Karol Wachowski
75680b7cd4 accel/ivpu: Correct mutex unlock order in job submission
The mutex unlock for vdev->submitted_jobs_lock was incorrectly placed
before unlocking file_priv->lock. Change order of unlocks to avoid potential
race conditions.

Fixes: 5bbccadaf3 ("accel/ivpu: Abort all jobs after command queue unregister")
Signed-off-by: Karol Wachowski <karol.wachowski@intel.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://lore.kernel.org/r/20250425093656.2228168-1-jacek.lawrynowicz@linux.intel.com
2025-04-30 11:27:37 +02:00
Jacek Lawrynowicz
f2ecc700d1 accel/ivpu: Fix pm related deadlocks in cmdq ioctls
Fix deadlocks in ivpu_cmdq_create_ioctl() and ivpu_cmdq_destroy_ioctl()
related to runtime suspend.

Runtime suspend acquires file_priv->lock mutex by calling
ivpu_cmdq_reset_all_contexts(). The same lock is acquired in the cmdq
ioctls. If one of the cmdq ioctls is called while runtime suspend is in
progress, it can lead to a deadlock.

Call stacks from example deadlock below.

Runtime suspend thread:
[ 3443.179717] Call Trace:
[ 3443.179724]  __schedule+0x4b6/0x16b0
[ 3443.179732]  ? __mod_timer+0x27d/0x3a0
[ 3443.179738]  schedule+0x2f/0x140
[ 3443.179741]  schedule_preempt_disabled+0x19/0x30
[ 3443.179743]  __mutex_lock.constprop.0+0x335/0x7d0
[ 3443.179745]  ? xas_find+0x1ed/0x260
[ 3443.179747]  ? xa_find+0x8e/0xf0
[ 3443.179749]  __mutex_lock_slowpath+0x13/0x20
[ 3443.179751]  mutex_lock+0x41/0x60
[ 3443.179757]  ivpu_cmdq_reset_all_contexts+0x82/0x150 [intel_vpu a9bd091a97f28f0235f161316b29f8234f437295]
[ 3443.179786]  ivpu_pm_runtime_suspend_cb+0x1f1/0x3f0 [intel_vpu a9bd091a97f28f0235f161316b29f8234f437295]
[ 3443.179850]  pci_pm_runtime_suspend+0x6e/0x1f0
[ 3443.179870]  ? __pfx_pci_pm_runtime_suspend+0x10/0x10
[ 3443.179886]  __rpm_callback+0x48/0x130
[ 3443.179899]  rpm_callback+0x64/0x70
[ 3443.179911]  rpm_suspend+0x12c/0x630
[ 3443.179922]  ? __schedule+0x4be/0x16b0
[ 3443.179941]  pm_runtime_work+0xca/0xf0
[ 3443.179955]  process_one_work+0x188/0x3d0
[ 3443.179971]  worker_thread+0x2b9/0x3c0
[ 3443.179984]  kthread+0xfb/0x220
[ 3443.180001]  ? __pfx_worker_thread+0x10/0x10
[ 3443.180013]  ? __pfx_kthread+0x10/0x10
[ 3443.180029]  ret_from_fork+0x47/0x70
[ 3443.180044]  ? __pfx_kthread+0x10/0x10
[ 3443.180059]  ret_from_fork_asm+0x1a/0x30

User space thread:
[ 3443.180128] Call Trace:
[ 3443.180138]  __schedule+0x4b6/0x16b0
[ 3443.180159]  schedule+0x2f/0x140
[ 3443.180163]  rpm_resume+0x1a7/0x6a0
[ 3443.180165]  ? __pfx_autoremove_wake_function+0x10/0x10
[ 3443.180169]  __pm_runtime_resume+0x56/0x90
[ 3443.180171]  ivpu_rpm_get+0x28/0xb0 [intel_vpu a9bd091a97f28f0235f161316b29f8234f437295]
[ 3443.180181]  ivpu_ipc_send_receive+0x6d/0x120 [intel_vpu a9bd091a97f28f0235f161316b29f8234f437295]
[ 3443.180193]  ? free_frozen_pages+0x395/0x670
[ 3443.180199]  ? __free_pages+0xa7/0xc0
[ 3443.180202]  ivpu_jsm_hws_destroy_cmdq+0x76/0xf0 [intel_vpu a9bd091a97f28f0235f161316b29f8234f437295]
[ 3443.180213]  ? locks_dispose_list+0x6c/0xa0
[ 3443.180219]  ? kmem_cache_free+0x342/0x470
[ 3443.180222]  ? vm_area_free+0x19/0x30
[ 3443.180225]  ? xas_load+0x17/0xf0
[ 3443.180229]  ? xa_load+0x72/0xb0
[ 3443.180230]  ivpu_cmdq_unregister.isra.0+0xb1/0x100 [intel_vpu a9bd091a97f28f0235f161316b29f8234f437295]
[ 3443.180241]  ivpu_cmdq_destroy_ioctl+0x8d/0x130 [intel_vpu a9bd091a97f28f0235f161316b29f8234f437295]
[ 3443.180251]  ? __pfx_ivpu_cmdq_destroy_ioctl+0x10/0x10 [intel_vpu a9bd091a97f28f0235f161316b29f8234f437295]
[ 3443.180260]  drm_ioctl_kernel+0xb3/0x110
[ 3443.180265]  drm_ioctl+0x2ca/0x580
[ 3443.180266]  ? __pfx_ivpu_cmdq_destroy_ioctl+0x10/0x10 [intel_vpu a9bd091a97f28f0235f161316b29f8234f437295]
[ 3443.180275]  ? __fput+0x1ae/0x2f0
[ 3443.180279]  ? kmem_cache_free+0x342/0x470
[ 3443.180282]  __x64_sys_ioctl+0xa9/0xe0
[ 3443.180286]  x64_sys_call+0x13b7/0x26f0
[ 3443.180289]  do_syscall_64+0x62/0x180
[ 3443.180291]  entry_SYSCALL_64_after_hwframe+0x71/0x79

Fixes: 465a3914b2 ("accel/ivpu: Add API for command queue create/destroy/submit")
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://lore.kernel.org/r/20250425093341.2202895-1-jacek.lawrynowicz@linux.intel.com
2025-04-30 11:26:56 +02:00
Jacek Lawrynowicz
c4eb2f88d2 accel/ivpu: Increase state dump msg timeout
Increase JMS message state dump command timeout to 100 ms. On some
platforms, the FW may take a bit longer than 50 ms to dump its state
to the log buffer and we don't want to miss any debug info during TDR.

Fixes: 5e162f872d ("accel/ivpu: Add FW state dump on TDR")
Cc: stable@vger.kernel.org # v6.13+
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://lore.kernel.org/r/20250425092822.2194465-1-jacek.lawrynowicz@linux.intel.com
2025-04-30 11:26:20 +02:00
Ze Huang
3e6244429b riscv: dts: sophgo: fix DMA data-width configuration for CV18xx
The "snps,data-width" property[1] defines the AXI data width of the DMA
controller as:

    width = 8 × (2^n) bits

(0 = 8 bits, 1 = 16 bits, 2 = 32 bits, ..., 6 = 512 bits)
where "n" is the value of "snps,data-width".

For the CV18xx DMA controller, the correct AXI data width is 32 bits,
corresponding to "snps,data-width = 2".

Test results on Milkv Duo S can be found here [2].

Link: https://github.com/torvalds/linux/blob/master/Documentation/devicetree/bindings/dma/snps%2Cdw-axi-dmac.yaml#L74 [1]
Link: https://gist.github.com/Sutter099/4fa99bb2d89e5af975983124704b3861 [2]

Fixes: 514951a81a ("riscv: dts: sophgo: cv18xx: add DMA controller")
Co-developed-by: Yu Yuan <yu.yuan@sjtu.edu.cn>
Signed-off-by: Yu Yuan <yu.yuan@sjtu.edu.cn>
Signed-off-by: Ze Huang <huangze@whut.edu.cn>
Link: https://lore.kernel.org/r/20250428-duo-dma-config-v1-1-eb6ad836ca42@whut.edu.cn
Signed-off-by: Inochi Amaoto <inochiama@gmail.com>
Signed-off-by: Chen Wang <unicorn_wang@outlook.com>
Signed-off-by: Chen Wang <wangchen20@iscas.ac.cn>
2025-04-30 14:51:43 +08:00
Sabrina Dubroca
417fae2c40 xfrm: ipcomp: fix truesize computation on receive
ipcomp_post_acomp currently drops all frags (via pskb_trim_unique(skb,
0)), and then subtracts the old skb->data_len from truesize. This
adjustment has already be done during trimming (in skb_condense), so
we don't need to do it again.

This shows up for example when running fragmented traffic over ipcomp,
we end up hitting the WARN_ON_ONCE in skb_try_coalesce.

Fixes: eb2953d269 ("xfrm: ipcomp: Use crypto_acomp interface")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2025-04-30 08:08:16 +02:00
Hongbo Li
9fcd53c320 erofs: remove unused enum type
Opt_err is not used in EROFS, we can remove it.

Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Reviewed-by: Gao Xiang <xiang@kernel.org>
Link: https://lore.kernel.org/r/20250429075056.689570-1-lihongbo22@huawei.com
Signed-off-by: Gao Xiang <xiang@kernel.org>
2025-04-30 12:13:32 +08:00
Nicolas Frattaroli
4bf593be2e arm64: dts: rockchip: fix Sige5 RTC interrupt pin
Someone made a typo when they added the RTC to the Sige5 DTS, which
resulted in it using interrupts from GPIO0 B0 instead of GPIO0 A0. The
pinctrl entry for it wasn't typoed though, curiously enough.

The Sige5 v1.1 schematic was used to verify that GPIO0 A0 is the correct
pin for the RTC wakeup interrupt, so let's change it to that.

Fixes: 40f742b07a ("arm64: dts: rockchip: Add rk3576-armsom-sige5 board")
Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
Link: https://lore.kernel.org/r/20250429-sige5-rtc-oopsie-v1-1-8686767d0f1f@collabora.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2025-04-29 23:21:49 +02:00
Andrea Righi
e38be1c764 sched_ext: Fix rq lock state in hotplug ops
The ops.cpu_online() and ops.cpu_offline() callbacks incorrectly assume
that the rq involved in the operation is locked, which is not the case
during hotplug, triggering the following warning:

  WARNING: CPU: 1 PID: 20 at kernel/sched/sched.h:1504 handle_hotplug+0x280/0x340

Fix by not tracking the target rq as locked in the context of
ops.cpu_online() and ops.cpu_offline().

Fixes: 18853ba782 ("sched_ext: Track currently locked rq")
Reported-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Tested-by: Changwoo Min <changwoo@igalia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-04-29 08:01:32 -10:00
Gary Bisson
11cdb506d0 Input: mtk-pmic-keys - fix possible null pointer dereference
In mtk_pmic_keys_probe, the regs parameter is only set if the button is
parsed in the device tree. However, on hardware where the button is left
floating, that node will most likely be removed not to enable that
input. In that case the code will try to dereference a null pointer.

Let's use the regs struct instead as it is defined for all supported
platforms. Note that it is ok setting the key reg even if that latter is
disabled as the interrupt won't be enabled anyway.

Fixes: b581acb49a ("Input: mtk-pmic-keys - transfer per-key bit in mtk_pmic_keys_regs")
Signed-off-by: Gary Bisson <bisson.gary@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-04-29 09:27:49 -07:00
Christian Bruel
2ef5c66cba arm64: dts: st: Use 128kB size for aliased GIC400 register access on stm32mp23 SoCs
Adjust the size of 8kB GIC regions to 128kB so that each 4kB is mapped 16
times over a 64kB region.
The offset is then adjusted in the irq-gic driver.

see commit 12e14066f4 ("irqchip/GIC: Add workaround for aliased GIC400")

Fixes: e9b03ef213 ("arm64: dts: st: introduce stm32mp23 SoCs family")
Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Christian Bruel <christian.bruel@foss.st.com>
Link: https://lore.kernel.org/r/20250415111654.2103767-7-christian.bruel@foss.st.com
Signed-off-by: Alexandre Torgue <alexandre.torgue@foss.st.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-04-29 18:16:28 +02:00
Christian Bruel
3a1e108209 arm64: dts: st: Adjust interrupt-controller for stm32mp23 SoCs
Use gic-400 compatible and remove address-cells = <1> for aarch64

Fixes: e9b03ef213 ("arm64: dts: st: introduce stm32mp23 SoCs family")
Signed-off-by: Christian Bruel <christian.bruel@foss.st.com>
Link: https://lore.kernel.org/r/20250415111654.2103767-6-christian.bruel@foss.st.com
Signed-off-by: Alexandre Torgue <alexandre.torgue@foss.st.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-04-29 18:16:28 +02:00
Christian Bruel
1bc229e9bb arm64: dts: st: Use 128kB size for aliased GIC400 register access on stm32mp21 SoCs
Adjust the size of 8kB GIC regions to 128kB so that each 4kB is mapped
16 times over a 64kB region.
The offset is then adjusted in the irq-gic driver.

see commit 12e14066f4 ("irqchip/GIC: Add workaround for aliased GIC400")

Fixes: 7a57b1bb1a ("arm64: dts: st: introduce stm32mp21 SoCs family")
Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Christian Bruel <christian.bruel@foss.st.com>
Link: https://lore.kernel.org/r/20250415111654.2103767-5-christian.bruel@foss.st.com
Signed-off-by: Alexandre Torgue <alexandre.torgue@foss.st.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-04-29 18:16:28 +02:00
Christian Bruel
02dc83f09c arm64: dts: st: Adjust interrupt-controller for stm32mp21 SoCs
Use gic-400 compatible for aarch64

Fixes: 7a57b1bb1a ("arm64: dts: st: introduce stm32mp21 SoCs family")
Signed-off-by: Christian Bruel <christian.bruel@foss.st.com>
Link: https://lore.kernel.org/r/20250415111654.2103767-4-christian.bruel@foss.st.com
Signed-off-by: Alexandre Torgue <alexandre.torgue@foss.st.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-04-29 18:16:28 +02:00
Christian Bruel
06c231fe95 arm64: dts: st: Use 128kB size for aliased GIC400 register access on stm32mp25 SoCs
Adjust the size of 8kB GIC regions to 128kB so that each 4kB is mapped 16
times over a 64kB region.
The offset is then adjusted in the irq-gic driver.

see commit 12e14066f4 ("irqchip/GIC: Add workaround for aliased GIC400")

Fixes: 5d30d03aaf ("arm64: dts: st: introduce stm32mp25 SoCs family")
Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Christian Bruel <christian.bruel@foss.st.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250415111654.2103767-3-christian.bruel@foss.st.com
Signed-off-by: Alexandre Torgue <alexandre.torgue@foss.st.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-04-29 18:16:28 +02:00
Christian Bruel
de2b2107d5 arm64: dts: st: Adjust interrupt-controller for stm32mp25 SoCs
Use gic-400 compatible and remove address-cells = <1> on aarch64

Fixes: 5d30d03aaf ("arm64: dts: st: introduce stm32mp25 SoCs family")
Signed-off-by: Christian Bruel <christian.bruel@foss.st.com>
Link: https://lore.kernel.org/r/20250415111654.2103767-2-christian.bruel@foss.st.com
Signed-off-by: Alexandre Torgue <alexandre.torgue@foss.st.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-04-29 18:16:28 +02:00
Arnd Bergmann
128795bdbe Merge tag 'imx-fixes-6.15' of https://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into arm/fixes
i.MX fixes for 6.15:

- An i.MX8MP change from Ahmad Fatoum to fix the broken nominal device
  tree caused by commit 9f7595b3e5 ("arm64: dts: imx8mp: configure
  GPU and NPU clocks to overdrive rate")
- A MAINTAINERS update from Michael Riesch to exclude Sony IMX image
  sensor drivers from i.MX entry
- A i.MX95 device tree change from Richard Zhu to correct the range of
  PCIe app-reg region
- An opos6ul device tree change from Sébastien Szymanski to fix
  an Ethernet regression caused by commit c7e73b5051 ("ARM: imx:
  mach-imx6ul: remove 14x14 EVK specific PHY fixup")
- An imx8mm-verdin device tree change from Wojciech Dubowik to fix
  a SD card regression caused by commit f5aab0438e ("regulator:
  pca9450: Fix enable register for LDO5")

* tag 'imx-fixes-6.15' of https://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
  arm64: dts: imx8mm-verdin: Link reg_usdhc2_vqmmc to usdhc2
  MAINTAINERS: add exclude for dt-bindings to imx entry
  ARM: dts: opos6ul: add ksz8081 phy properties
  arm64: dts: imx95: Correct the range of PCIe app-reg region
  arm64: dts: imx8mp: configure GPU and NPU clocks in nominal DTSI
2025-04-29 18:14:39 +02:00
Arnd Bergmann
7771f41d39 Merge tag 'juno-fix-6.15' of https://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into arm/fixes
Armv8 Morello fix for v6.15

Just a single fix addressing the cache node inconsistencies. It removed
unnecessary CPU number from L2 cache node names since they are local to
CPU nodes and should simply be named "l2-cache" and relocates the shared
L3 cache node from under cpu@0/l2-cache to the /cpus node, which is the
standard location for shared caches.

* tag 'juno-fix-6.15' of https://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux:
  arm64: dts: morello: Fix-up cache nodes
2025-04-29 18:14:29 +02:00
Arnd Bergmann
fbefe1c45d Merge tag 'ffa-fix-6.15' of https://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into arm/fixes
Arm FF-A fix for v6.15

A fix that addresses incorrect release of Rx buffer ownership in the
driver. The fix specificially avoids releasing Rx buffer ownership with
FFA_RX_RELEASE if it wasn’t acquired during a FFA_PARTITION_INFO_GET call
that only requested the partition count. This prevents unnecessary errors
like FFA_RET_DENIED from firmware when buffers are not actually owned by
the driver.

* tag 'ffa-fix-6.15' of https://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux:
  firmware: arm_ffa: Skip Rx buffer ownership release if not acquired
2025-04-29 18:14:19 +02:00
Arnd Bergmann
94ddc14095 Merge tag 'scmi-fixes-6.15' of https://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into arm/fixes
Arm SCMI fixes for v6.15

Couple of fixes addressing issues with timeout in the polling path
and device reference count imbalance detected by kmemleak.

1. The change fixes a timeout issue in the polling path of SCMI transactions
   where false positives could occur if the polling thread was pre-empted,
   causing it to appear as though a timeout occurred when it hadn't. The fix
   ensures that the polling result is verified before reporting a timeout,
   accounting for potential pre-emption or out-of-order replies.

2. It also corrects a device reference count imbalance caused by
   device_find_child() during device destruction, which prevented proper
   cleanup and triggered memory leaks detected by KMemleak.

* tag 'scmi-fixes-6.15' of https://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux:
  firmware: arm_scmi: Fix timeout checks on polling path
  firmware: arm_scmi: Balance device refcount when destroying devices
2025-04-29 18:14:03 +02:00
Suma Hegde
0581d384f3 platform/x86/amd/hsmp: Make amd_hsmp and hsmp_acpi as mutually exclusive drivers
amd_hsmp and hsmp_acpi are intended to be mutually exclusive drivers and
amd_hsmp is for legacy platforms. To achieve this, it is essential to
check for the presence of the ACPI device in plat.c. If the hsmp ACPI
device entry is found, allow the hsmp_acpi driver to manage the hsmp
and return an error from plat.c.

Additionally, rename the driver from amd_hsmp to hsmp_acpi to prevent
"Driver 'amd_hsmp' is already registered, aborting..." error in case
both drivers are loaded simultaneously.

Also, support both platform device based and ACPI based probing for
family 0x1A models 0x00 to 0x0F, implement only ACPI based probing
for family 0x1A, models 0x10 to 0x1F. Return false from
legacy_hsmp_support() for this platform.
This aligns with the condition check in is_f1a_m0h().

Link: https://lore.kernel.org/platform-driver-x86/aALZxvHWmphNL1wa@gourry-fedora-PF4VCD3F/
Fixes: 7d3135d163 ("platform/x86/amd/hsmp: Create separate ACPI, plat and common drivers")
Reviewed-by: Naveen Krishna Chatradhi <naveenkrishna.chatradhi@amd.com>
Co-developed-by: Gregory Price <gourry@gourry.net>
Signed-off-by: Gregory Price <gourry@gourry.net>
Signed-off-by: Suma Hegde <suma.hegde@amd.com>
Link: https://lore.kernel.org/r/20250425102357.266790-1-suma.hegde@amd.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-04-29 18:03:13 +03:00
Mario Limonciello
8e81b9cd6e drivers/platform/x86/amd: pmf: Check for invalid Smart PC Policies
commit 376a8c2a14 ("platform/x86/amd/pmf: Update PMF Driver for
Compatibility with new PMF-TA") added support for platforms that support
an updated TA, however it also exposed a number of platforms that although
they have support for the updated TA don't actually populate a policy
binary.

Add an explicit check that the policy binary isn't empty before
initializing the TA.

Reported-by: Christian Heusel <christian@heusel.eu>
Closes: https://lore.kernel.org/platform-driver-x86/ae644428-5bf2-4b30-81ba-0b259ed3449b@heusel.eu/
Fixes: 376a8c2a14 ("platform/x86/amd/pmf: Update PMF Driver for Compatibility with new PMF-TA")
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Tested-by: Christian Heusel <christian@heusel.eu>
Link: https://lore.kernel.org/r/20250423132002.3984997-3-superm1@kernel.org
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-04-29 17:47:43 +03:00
Mario Limonciello
690d722e02 drivers/platform/x86/amd: pmf: Check for invalid sideloaded Smart PC Policies
If a policy is passed into amd_pmf_get_pb_data() that causes the engine
to fail to start there is a memory leak. Free the memory in this failure
path.

Fixes: 10817f28e5 ("platform/x86/amd/pmf: Add capability to sideload of policy binary")
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20250423132002.3984997-2-superm1@kernel.org
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-04-29 17:47:41 +03:00
Geert Uytterhoeven
13a6d42656 pmdomain: renesas: rcar: Remove obsolete nullify checks
All nullify users and helpers were removed, but the R-Car SYSC drivers
still checked for nullified domains.  Remove the obsolete checks.

Fixes: c8d8770444 ("pmdomain: renesas: rcar-sysc: Remove rcar_sysc_nullify() helper")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/107f2bf9f13b29f0f623d2959a5347ec151fb089.1745840768.git.geert+renesas@glider.be
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2025-04-29 11:22:27 +02:00
Max Kellermann
bbfe756dc3 fs/erofs/fileio: call erofs_onlinefolio_split() after bio_add_folio()
If bio_add_folio() fails (because it is full),
erofs_fileio_scan_folio() needs to submit the I/O request via
erofs_fileio_rq_submit() and allocate a new I/O request with an empty
`struct bio`.  Then it retries the bio_add_folio() call.

However, at this point, erofs_onlinefolio_split() has already been
called which increments `folio->private`; the retry will call
erofs_onlinefolio_split() again, but there will never be a matching
erofs_onlinefolio_end() call.  This leaves the folio locked forever
and all waiters will be stuck in folio_wait_bit_common().

This bug has been added by commit ce63cb62d7 ("erofs: support
unencoded inodes for fileio"), but was practically unreachable because
there was room for 256 folios in the `struct bio` - until commit
9f74ae8c9a ("erofs: shorten bvecs[] for file-backed mounts") which
reduced the array capacity to 16 folios.

It was now trivial to trigger the bug by manually invoking readahead
from userspace, e.g.:

 posix_fadvise(fd, 0, st.st_size, POSIX_FADV_WILLNEED);

This should be fixed by invoking erofs_onlinefolio_split() only after
bio_add_folio() has succeeded.  This is safe: asynchronous completions
invoking erofs_onlinefolio_end() will not unlock the folio because
erofs_fileio_scan_folio() is still holding a reference to be released
by erofs_onlinefolio_end() at the end.

Fixes: ce63cb62d7 ("erofs: support unencoded inodes for fileio")
Fixes: 9f74ae8c9a ("erofs: shorten bvecs[] for file-backed mounts")
Cc: stable@vger.kernel.org
Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
Reviewed-by: Gao Xiang <xiang@kernel.org>
Tested-by: Hongbo Li <lihongbo22@huawei.com>
Link: https://lore.kernel.org/r/20250428230933.3422273-1-max.kellermann@ionos.com
Signed-off-by: Gao Xiang <xiang@kernel.org>
2025-04-29 11:36:00 +08:00
Lode Willems
22cd66a5db Input: xpad - add support for 8BitDo Ultimate 2 Wireless Controller
This patch adds support for the 8BitDo Ultimate 2 Wireless Controller.
Tested using the wireless dongle and plugged in.

Signed-off-by: Lode Willems <me@lodewillems.com>
Link: https://lore.kernel.org/r/20250422112457.6728-1-me@lodewillems.com
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-04-28 12:42:00 -07:00
Mikael Gonella-Bolduc
7675b5efd8 Input: cyttsp5 - fix power control issue on wakeup
The power control function ignores the "on" argument when setting the
report ID, and thus is always sending HID_POWER_SLEEP. This causes a
problem when trying to wakeup.

Fix by sending the state variable, which contains the proper HID_POWER_ON or
HID_POWER_SLEEP based on the "on" argument.

Fixes: 3c98b8dbdc ("Input: cyttsp5 - implement proper sleep and wakeup procedures")
Cc: stable@vger.kernel.org
Signed-off-by: Mikael Gonella-Bolduc <mgonellabolduc@dimonoff.com>
Signed-off-by: Hugo Villeneuve <hvilleneuve@dimonoff.com>
Reviewed-by: Alistair Francis <alistair@alistair23.me>
Link: https://lore.kernel.org/r/20250423135243.1261460-1-hugo@hugovil.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-04-28 12:15:53 -07:00
Mattijs Korpershoek
6a10a2f1e0 MAINTAINERS: .mailmap: update Mattijs Korpershoek's email address
Update Mattijs Korpershoek's email address to @kernel.org.

Signed-off-by: Mattijs Korpershoek <mkorpershoek@kernel.org>
Link: https://lore.kernel.org/r/20250428-keypad-maintainers-v1-1-4e9c4afba415@kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-04-28 12:14:32 -07:00
Mattijs Korpershoek
b8ac485a17 dt-bindings: mediatek,mt6779-keypad: Update Mattijs' email address
Update Mattijs Korpershoek's email address to @kernel.org.

Signed-off-by: Mattijs Korpershoek <mkorpershoek@kernel.org>
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://lore.kernel.org/r/20250428-keypad-email-v1-1-dde6ac76725b@kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-04-28 12:14:32 -07:00
Arnd Bergmann
9a046c1d21 Input: stmpe-ts - use module alias instead of device table
When compile tested with W=1 on x86_64 with driver as built-in:

  stmpe-ts.c:371:34: error: unused variable 'stmpe_ts_ids' [-Werror,-Wunused-const-variable]

Ideally this would be referenced from the platform_driver, but since
the compatible string is already matched by the mfd driver for its
parent device, that would break probing.

In this case, the of_device_id table just serves as a module alias
for loading the driver, while the device itself is probed using
the platform device name.

Remove the table and instead use a module alias that reflects how
the driver is actually probed.

Link: https://lore.kernel.org/lkml/20240403080702.3509288-8-arnd@kernel.org/
Link: https://lore.kernel.org/lkml/181dbdb8-c050-4966-8cb4-2f39495ff3f9@app.fastmail.com/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20250409122314.2848028-3-arnd@kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-04-28 12:14:12 -07:00
Yan Zhao
20a6cff3b2 KVM: x86/mmu: Check and free obsolete roots in kvm_mmu_reload()
Check request KVM_REQ_MMU_FREE_OBSOLETE_ROOTS to free obsolete roots in
kvm_mmu_reload() to prevent kvm_mmu_reload() from seeing a stale obsolete
root.

Since kvm_mmu_reload() can be called outside the
vcpu_enter_guest() path (e.g., kvm_arch_vcpu_pre_fault_memory()), it may be
invoked after a root has been marked obsolete and before vcpu_enter_guest()
is invoked to process KVM_REQ_MMU_FREE_OBSOLETE_ROOTS and set root.hpa to
invalid. This causes kvm_mmu_reload() to fail to load a new root, which
can lead to kvm_arch_vcpu_pre_fault_memory() being stuck in the while
loop in kvm_tdp_map_page() since RET_PF_RETRY is always returned due to
is_page_fault_stale().

Keep the existing check of KVM_REQ_MMU_FREE_OBSOLETE_ROOTS in
vcpu_enter_guest() since the cost of kvm_check_request() is negligible,
especially a check that's guarded by kvm_request_pending().

Export symbol of kvm_mmu_free_obsolete_roots() as kvm_mmu_reload() is
inline and may be called outside of kvm.ko.

Fixes: 6e01b7601d ("KVM: x86: Implement kvm_arch_vcpu_pre_fault_memory()")
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Link: https://lore.kernel.org/r/20250318013333.5817-1-yan.y.zhao@intel.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-04-28 11:06:13 -07:00
Trond Myklebust
fa7ab64f1e NFS/localio: Fix a race in nfs_local_open_fh()
Once the clp->cl_uuid.lock has been dropped, another CPU could come in
and free the struct nfsd_file that was just added. To prevent that from
happening, take the RCU read lock before dropping the spin lock.

Fixes: 86e0041225 ("nfs: cache all open LOCALIO nfsd_file(s) in client")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Reviewed-by: Mike Snitzer <snitzer@kernel.org>
2025-04-28 09:50:50 -07:00
Hongbo Li
8d16dd7b65 MAINTAINERS: erofs: add myself as reviewer
I have a solid background in file systems and since much of my
recent work has focused on EROFS, I am familiar with it. Now I
have the time and am willing to help review EROFS patches.

I hope my participation can be helpful to the EROFS patch review
process.

Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Acked-by: Chao Yu <chao@kernel.org>
Acked-by: Gao Xiang <xiang@kernel.org>
Link: https://lore.kernel.org/r/20250424030653.3308358-1-lihongbo22@huawei.com
Signed-off-by: Gao Xiang <xiang@kernel.org>
2025-04-28 18:08:14 +08:00
Thomas Gleixner
b71f9804f6 timekeeping: Prevent coarse clocks going backwards
Lei Chen raised an issue with CLOCK_MONOTONIC_COARSE seeing time
inconsistencies. Lei tracked down that this was being caused by the
adjustment:

    tk->tkr_mono.xtime_nsec -= offset;

which is made to compensate for the unaccumulated cycles in offset when the
multiplicator is adjusted forward, so that the non-_COARSE clockids don't
see inconsistencies.

However, the _COARSE clockid getter functions use the adjusted xtime_nsec
value directly and do not compensate the negative offset via the
clocksource delta multiplied with the new multiplicator. In that case the
caller can observe time going backwards in consecutive calls.

By design, this negative adjustment should be fine, because the logic run
from timekeeping_adjust() is done after it accumulated approximately

     multiplicator * interval_cycles

into xtime_nsec.  The accumulated value is always larger then the

     mult_adj * offset

value, which is subtracted from xtime_nsec. Both operations are done
together under the tk_core.lock, so the net change to xtime_nsec is always
always be positive.

However, do_adjtimex() calls into timekeeping_advance() as well, to
apply the NTP frequency adjustment immediately. In this case,
timekeeping_advance() does not return early when the offset is smaller
then interval_cycles. In that case there is no time accumulated into
xtime_nsec. But the subsequent call into timekeeping_adjust(), which
modifies the multiplicator, subtracts from xtime_nsec to correct for the
new multiplicator.

Here because there was no accumulation, xtime_nsec becomes smaller than
before, which opens a window up to the next accumulation, where the
_COARSE clockid getters, which don't compensate for the offset, can
observe the inconsistency.

This has been tried to be fixed by forwarding the timekeeper in the case
that adjtimex() adjusts the multiplier, which resets the offset to zero:

  757b000f7b ("timekeeping: Fix possible inconsistencies in _COARSE clockids")

That works correctly, but unfortunately causes a regression on the
adjtimex() side. There are two issues:

   1) The forwarding of the base time moves the update out of the original
      period and establishes a new one.

   2) The clearing of the accumulated NTP error is changing the behaviour as
      well.

User-space expects that multiplier/frequency updates are in effect, when the
syscall returns, so delaying the update to the next tick is not solving the
problem either.

Commit 757b000f7b was reverted so that the established expectations of
user space implementations (ntpd, chronyd) are restored, but that obviously
brought the inconsistencies back.

One of the initial approaches to fix this was to establish a separate
storage for the coarse time getter nanoseconds part by calculating it from
the offset. That was dropped on the floor because not having yet another
state to maintain was simpler. But given the result of the above exercise,
this solution turns out to be the right one. Bring it back in a slightly
modified form.

Thus introduce timekeeper::coarse_nsec and store that nanoseconds part in
it, switch the time getter functions and the VDSO update to use that value.
coarse_nsec is set on operations which forward or initialize the timekeeper
and after time was accumulated during a tick. If there is no accumulation
the timestamp is unchanged.

This leaves the adjtimex() behaviour unmodified and prevents coarse time
from going backwards.

[ jstultz: Simplified the coarse_nsec calculation and kept behavior so
  	   coarse clockids aren't adjusted on each inter-tick adjtimex
  	   call, slightly reworked the comments and commit message ]

Fixes: da15cfdae0 ("time: Introduce CLOCK_REALTIME_COARSE")
Reported-by: Lei Chen <lei.chen@smartx.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <jstultz@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/all/20250419054706.2319105-1-jstultz@google.com
Closes: https://lore.kernel.org/lkml/20250310030004.3705801-1-lei.chen@smartx.com/
2025-04-28 11:17:29 +02:00
Stephen Smalley
8b0ba61df5 fs/xattr.c: fix simple_xattr_list to always include security.* xattrs
The vfs has long had a fallback to obtain the security.* xattrs from the
LSM when the filesystem does not implement its own listxattr, but
shmem/tmpfs and kernfs later gained their own xattr handlers to support
other xattrs. Unfortunately, as a side effect, tmpfs and kernfs-based
filesystems like sysfs no longer return the synthetic security.* xattr
names via listxattr unless they are explicitly set by userspace or
initially set upon inode creation after policy load. coreutils has
recently switched from unconditionally invoking getxattr for security.*
for ls -Z via libselinux to only doing so if listxattr returns the xattr
name, breaking ls -Z of such inodes.

Before:
$ getfattr -m.* /run/initramfs
<no output>
$ getfattr -m.* /sys/kernel/fscaps
<no output>
$ setfattr -n user.foo /run/initramfs
$ getfattr -m.* /run/initramfs
user.foo

After:
$ getfattr -m.* /run/initramfs
security.selinux
$ getfattr -m.* /sys/kernel/fscaps
security.selinux
$ setfattr -n user.foo /run/initramfs
$ getfattr -m.* /run/initramfs
security.selinux
user.foo

Link: https://lore.kernel.org/selinux/CAFqZXNtF8wDyQajPCdGn=iOawX4y77ph0EcfcqcUUj+T87FKyA@mail.gmail.com/
Link: https://lore.kernel.org/selinux/20250423175728.3185-2-stephen.smalley.work@gmail.com/
Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Link: https://lore.kernel.org/20250424152822.2719-1-stephen.smalley.work@gmail.com
Fixes: b09e0fa4b4 ("tmpfs: implement generic xattr support")
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-04-28 10:53:22 +02:00
Sergey Shtylyov
c367eea504 nfs: nfs3acl: drop useless assignment in nfs3_get_acl()
In nfs3_get_acl(), the local variable status is assigned the result of
nfs_refresh_inode() inside the *switch* statement, but that value gets
overwritten in the next *if* statement's true branch and is completely
ignored if that branch isn't taken...

Found by Linux Verification Center (linuxtesting.org) with the Svace static
analysis tool.

Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Link: https://lore.kernel.org/r/c32dced7-a4fa-43c0-aafe-ef6c819c2f91@omp.ru
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-04-27 23:25:44 -04:00
Sergey Shtylyov
d82e86c153 nfs: direct: drop useless initializer in nfs_direct_write_completion()
In nfs_direct_write_completion(), the local variable req isn't used outside
the *while* loop and is assigned to right at the start of that loop's body,
so its initializer appears useless -- drop it; then move the declaration to
the loop body (which happens to have a pointless empty line anyway)...

Found by Linux Verification Center (linuxtesting.org) with the Svace static
analysis tool.

Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Link: https://lore.kernel.org/r/416219f5-7983-484b-b5a7-5fb7da9561f7@omp.ru
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-04-27 23:25:44 -04:00
Jeff Layton
d5fb22a7c5 nfs: move the nfs4_data_server_cache into struct nfs_net
Since struct nfs4_pnfs_ds should not be shared between net namespaces,
move from a global list of objects to a per-netns list and spinlock.

Tested-by: Sargun Dillon <sargun@sargun.me>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Link: https://lore.kernel.org/r/20250410-nfs-ds-netns-v2-2-f80b7979ba80@kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-04-27 23:25:44 -04:00
Jeff Layton
6b9785dc8b nfs: don't share pNFS DS connections between net namespaces
Currently, different NFS clients can share the same DS connections, even
when they are in different net namespaces. If a containerized client
creates a DS connection, another container can find and use it. When the
first client exits, the connection will close which can lead to stalls
in other clients.

Add a net namespace pointer to struct nfs4_pnfs_ds, and compare those
value to the caller's netns in _data_server_lookup_locked() when
searching for a nfs4_pnfs_ds to match.

Reported-by: Omar Sandoval <osandov@osandov.com>
Reported-by: Sargun Dillon <sargun@sargun.me>
Closes: https://lore.kernel.org/linux-nfs/Z_ArpQC_vREh_hEA@telecaster/
Tested-by: Sargun Dillon <sargun@sargun.me>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Link: https://lore.kernel.org/r/20250410-nfs-ds-netns-v2-1-f80b7979ba80@kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-04-27 23:25:44 -04:00
Li Lingfeng
c457dc1ec7 nfs: handle failure of nfs_get_lock_context in unlock path
When memory is insufficient, the allocation of nfs_lock_context in
nfs_get_lock_context() fails and returns -ENOMEM. If we mistakenly treat
an nfs4_unlockdata structure (whose l_ctx member has been set to -ENOMEM)
as valid and proceed to execute rpc_run_task(), this will trigger a NULL
pointer dereference in nfs4_locku_prepare. For example:

BUG: kernel NULL pointer dereference, address: 000000000000000c
PGD 0 P4D 0
Oops: Oops: 0000 [#1] SMP PTI
CPU: 15 UID: 0 PID: 12 Comm: kworker/u64:0 Not tainted 6.15.0-rc2-dirty #60
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-2.fc40
Workqueue: rpciod rpc_async_schedule
RIP: 0010:nfs4_locku_prepare+0x35/0xc2
Code: 89 f2 48 89 fd 48 c7 c7 68 69 ef b5 53 48 8b 8e 90 00 00 00 48 89 f3
RSP: 0018:ffffbbafc006bdb8 EFLAGS: 00010246
RAX: 000000000000004b RBX: ffff9b964fc1fa00 RCX: 0000000000000000
RDX: 0000000000000000 RSI: fffffffffffffff4 RDI: ffff9ba53fddbf40
RBP: ffff9ba539934000 R08: 0000000000000000 R09: ffffbbafc006bc38
R10: ffffffffb6b689c8 R11: 0000000000000003 R12: ffff9ba539934030
R13: 0000000000000001 R14: 0000000004248060 R15: ffffffffb56d1c30
FS: 0000000000000000(0000) GS:ffff9ba5881f0000(0000) knlGS:00000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000000000c CR3: 000000093f244000 CR4: 00000000000006f0
Call Trace:
 <TASK>
 __rpc_execute+0xbc/0x480
 rpc_async_schedule+0x2f/0x40
 process_one_work+0x232/0x5d0
 worker_thread+0x1da/0x3d0
 ? __pfx_worker_thread+0x10/0x10
 kthread+0x10d/0x240
 ? __pfx_kthread+0x10/0x10
 ret_from_fork+0x34/0x50
 ? __pfx_kthread+0x10/0x10
 ret_from_fork_asm+0x1a/0x30
 </TASK>
Modules linked in:
CR2: 000000000000000c
---[ end trace 0000000000000000 ]---

Free the allocated nfs4_unlockdata when nfs_get_lock_context() fails and
return NULL to terminate subsequent rpc_run_task, preventing NULL pointer
dereference.

Fixes: f30cb757f6 ("NFS: Always wait for I/O completion before unlock")
Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Link: https://lore.kernel.org/r/20250417072508.3850532-1-lilingfeng3@huawei.com
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-04-27 23:25:44 -04:00
Trond Myklebust
bead8b4953 pNFS/flexfiles: Record the RPC errors in the I/O tracepoints
When debugging I/O issues, we want to see not just the NFS level errors,
but also the RPC level problems, so record both in the tracepoints.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-04-27 23:25:44 -04:00
Trond Myklebust
440caf8ee2 NFSv4/pnfs: Layoutreturn on close must handle fatal networking errors
If we have a fatal ENETDOWN or ENETUNREACH error, then the layoutreturn
on close code should also handle that as fatal, and free the layouts.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
2025-04-27 23:25:44 -04:00
Trond Myklebust
2d7f844ffa NFSv4: Handle fatal ENETDOWN and ENETUNREACH errors
Ensure that the NFSv4 error handling code recognises the
RPC_TASK_NETUNREACH_FATAL flag, and handles the ENETDOWN and ENETUNREACH
errors accordingly.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
2025-04-27 23:25:44 -04:00
Wojciech Dubowik
5591ce0069 arm64: dts: imx8mm-verdin: Link reg_usdhc2_vqmmc to usdhc2
Define vqmmc regulator-gpio for usdhc2 with vin-supply
coming from LDO5.

Without this definition LDO5 will be powered down, disabling
SD card after bootup. This has been introduced in commit
f5aab0438e ("regulator: pca9450: Fix enable register for LDO5").

Fixes: 6a57f224f7 ("arm64: dts: freescale: add initial support for verdin imx8m mini")
Fixes: f5aab0438e ("regulator: pca9450: Fix enable register for LDO5")
Tested-by: Manuel Traut <manuel.traut@mt.com>
Reviewed-by: Philippe Schenker <philippe.schenker@impulsing.ch>
Tested-by: Francesco Dolcini <francesco.dolcini@toradex.com>
Reviewed-by: Francesco Dolcini <francesco.dolcini@toradex.com>
Cc: stable@vger.kernel.org
Signed-off-by: Wojciech Dubowik <Wojciech.Dubowik@mt.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2025-04-27 21:49:30 +08:00
Thorsten Blum
7f74c066e5 MIPS: CPS: Fix potential NULL pointer dereferences in cps_prepare_cpus()
Check the return values of kcalloc() and exit early to avoid potential
NULL pointer dereferences.

Compile-tested only.

Cc: stable@vger.kernel.org
Fixes: 75fa6a5838 ("MIPS: CPS: Introduce struct cluster_boot_config")
Fixes: 0856c143e1 ("MIPS: CPS: Boot CPUs in secondary clusters")
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2025-04-27 10:13:22 +02:00
Marco Crivellari
cc3e3d3a9d MIPS: rename rollback_handler with skipover_handler
Recently the rollback region has been changed into an
idle interrupt region [1]. This patch make the appropriate
changes renaming functions and macro, to reflect the change.

[1] https://lore.kernel.org/linux-mips/20250403161143.361461-2-marco.crivellari@suse.com/

Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2025-04-27 10:12:44 +02:00
Marco Crivellari
b713f27e32 MIPS: Move r4k_wait() to .cpuidle.text section
Fix missing .cpuidle.text section assignment for r4k_wait() to correct
backtracing with nmi_backtrace().

Fixes: 97c8580e85 ("MIPS: Annotate cpu_wait implementations with __cpuidle")
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2025-04-27 10:12:09 +02:00
Marco Crivellari
56651128e2 MIPS: Fix idle VS timer enqueue
MIPS re-enables interrupts on its idle routine and performs
a TIF_NEED_RESCHED check afterwards before putting the CPU to sleep.

The IRQs firing between the check and the 'wait' instruction may set the
TIF_NEED_RESCHED flag. In order to deal with this possible race, IRQs
interrupting __r4k_wait() rollback their return address to the
beginning of __r4k_wait() so that TIF_NEED_RESCHED is checked
again before going back to sleep.

However idle IRQs can also queue timers that may require a tick
reprogramming through a new generic idle loop iteration but those timers
would go unnoticed here because __r4k_wait() only checks
TIF_NEED_RESCHED. It doesn't check for pending timers.

Fix this with fast-forwarding idle IRQs return address to the end of the
idle routine instead of the beginning, so that the generic idle loop
handles both TIF_NEED_RESCHED and pending timers.

CONFIG_CPU_MICROMIPS has been removed along with the nop instructions.
There, NOPs are 2 byte in size, so change the code with 3 _ssnop which are
always 4 byte and remove the ifdef. Added ehb to make sure the hazard
is always cleared.

Fixes: c65a5480ff ("[MIPS] Fix potential latency problem due to non-atomic cpu_wait.")
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2025-04-27 10:12:03 +02:00
Jernej Skrabec
573f99c758 Revert "arm64: dts: allwinner: h6: Use RSB for AXP805 PMIC connection"
This reverts commit 531fdbeede.

Hardware that uses I2C wasn't designed with high speeds in mind, so
communication with PMIC via RSB can intermittently fail. Go back to I2C
as higher speed and efficiency isn't worth the trouble.

Fixes: 531fdbeede ("arm64: dts: allwinner: h6: Use RSB for AXP805 PMIC connection")
Link: https://github.com/LibreELEC/LibreELEC.tv/issues/7731
Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Link: https://patch.msgid.link/20250413135848.67283-1-jernej.skrabec@gmail.com
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
2025-04-27 13:38:14 +08:00
Tom Vincent
5e6a4ee979 arm64: dts: rockchip: Assign RT5616 MCLK rate on rk3588-friendlyelec-cm3588
The Realtek RT5616 audio codec on the FriendlyElec CM3588 module fails
to probe correctly due to the missing clock properties. This results
in distorted analogue audio output.

Assign MCLK to 12.288 MHz, which allows the codec to advertise most of
the standard sample rates per other RK3588 devices.

Fixes: e23819cf27 ("arm64: dts: rockchip: Add FriendlyElec CM3588 NAS board")
Signed-off-by: Tom Vincent <linux@tlvince.com>
Link: https://lore.kernel.org/r/20250417081753.644950-1-linux@tlvince.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2025-04-26 23:39:36 +02:00
Krzysztof Kozlowski
7ec0987da2 arm64: dts: rockchip: Align wifi node name with bindings in CB2
Since commit 3c3606793f ("dt-bindings: wireless: bcm4329-fmac: Use
wireless-controller.yaml schema"), bindings expect 'wifi' as node name:

  rk3566-bigtreetech-cb2-manta.dtb: sdio-wifi@1: $nodename:0: 'sdio-wifi@1' does not match '^wifi(@.*)?$'

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20250424084729.105182-1-krzysztof.kozlowski@linaro.org
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2025-04-26 23:39:36 +02:00
Heiko Stuebner
d934a93bbc clk: rockchip: rk3576: define clk_otp_phy_g
The phy clock of the OTP block is also present, but was not defined
so far. Though its clk-id already existed, so just define its location.

Tested-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://lore.kernel.org/r/20250210224510.1194963-2-heiko@sntech.de
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2025-04-26 00:04:49 +02:00
Andrea Righi
e7dcd1304b sched_ext: Remove duplicate BTF_ID_FLAGS definitions
Some kfuncs specific to the idle CPU selection policy are registered in
both the scx_kfunc_ids_any and scx_kfunc_ids_idle blocks, even though
they should only be defined in the latter.

Remove the duplicates from scx_kfunc_ids_any.

Fixes: 337d1b354a ("sched_ext: Move built-in idle CPU selection policy to a separate file")
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-04-25 08:41:02 -10:00
Gabriel Shahrouzi
c6e8d85faf staging: axis-fifo: Remove hardware resets for user errors
The axis-fifo driver performs a full hardware reset (via
reset_ip_core()) in several error paths within the read and write
functions. This reset flushes both TX and RX FIFOs and resets the
AXI-Stream links.

Allow the user to handle the error without causing hardware disruption
or data loss in other FIFO paths.

Fixes: 4a965c5f89 ("staging: add driver for Xilinx AXI-Stream FIFO v4.1 IP core")
Cc: stable@vger.kernel.org
Signed-off-by: Gabriel Shahrouzi <gshahrouzi@gmail.com>
Link: https://lore.kernel.org/r/20250419004306.669605-1-gshahrouzi@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-25 16:14:25 +02:00
Gabriel Shahrouzi
2ca34b5087 staging: axis-fifo: Correct handling of tx_fifo_depth for size validation
Remove erroneous subtraction of 4 from the total FIFO depth read from
device tree. The stored depth is for checking against total capacity,
not initial vacancy. This prevented writes near the FIFO's full size.

The check performed just before data transfer, which uses live reads of
the TDFV register to determine current vacancy, correctly handles the
initial Depth - 4 hardware state and subsequent FIFO fullness.

Fixes: 4a965c5f89 ("staging: add driver for Xilinx AXI-Stream FIFO v4.1 IP core")
Cc: stable@vger.kernel.org
Signed-off-by: Gabriel Shahrouzi <gshahrouzi@gmail.com>
Link: https://lore.kernel.org/r/20250419012937.674924-1-gshahrouzi@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-25 16:14:15 +02:00
Dave Stevenson
98698ca0e5 staging: bcm2835-camera: Initialise dev in v4l2_dev
Commit 42a2f6664e ("staging: vc04_services: Move global g_state to
vchiq_state") changed mmal_init to pass dev->v4l2_dev.dev to
vchiq_mmal_init, however nothing iniitialised dev->v4l2_dev, so we got
a NULL pointer dereference.

Set dev->v4l2_dev.dev during bcm2835_mmal_probe. The device pointer
could be passed into v4l2_device_register to set it, however that also
has other effects that would need additional changes.

Fixes: 42a2f6664e ("staging: vc04_services: Move global g_state to vchiq_state")
Cc: stable@vger.kernel.org
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Reviewed-by: Stefan Wahren <wahrenst@gmx.net>
Link: https://lore.kernel.org/r/20250423-staging-bcm2835-v4l2-fix-v2-1-3227f0ba4700@raspberrypi.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-25 16:14:10 +02:00
Michael Riesch
1526a735a7 MAINTAINERS: add exclude for dt-bindings to imx entry
Since the IMX (as in i.MX, the NXP SoCs) MAINTAINERS entry claims
everything that contains the name "imx", hanges to device tree
bindings for any Sony IMX image sensor are likely to be sent to the
maintainers listed therein. Add the missing exclude to fix that.

Fixes: da8b7f0fb0 ("MAINTAINERS: add all files matching "imx" and "mxs" to the IMX entry")
Suggested-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Michael Riesch <michael.riesch@collabora.com>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2025-04-25 09:36:25 +08:00
Dan Carpenter
a476cadf8e KVM: x86: Check that the high 32bits are clear in kvm_arch_vcpu_ioctl_run()
The "kvm_run->kvm_valid_regs" and "kvm_run->kvm_dirty_regs" variables are
u64 type.  We are only using the lowest 3 bits but we want to ensure that
the users are not passing invalid bits so that we can use the remaining
bits in the future.

However "sync_valid_fields" and kvm_sync_valid_fields() are u32 type so
the check only ensures that the lower 32 bits are clear.  Fix this by
changing the types to u64.

Fixes: 74c1807f6c ("KVM: x86: block KVM_CAP_SYNC_REGS if guest state is protected")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://lore.kernel.org/r/ec25aad1-113e-4c6e-8941-43d432251398@stanley.mountain
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-04-24 11:18:02 -07:00
Mikhail Lobanov
a2620f8932 KVM: SVM: Forcibly leave SMM mode on SHUTDOWN interception
Previously, commit ed129ec905 ("KVM: x86: forcibly leave nested mode
on vCPU reset") addressed an issue where a triple fault occurring in
nested mode could lead to use-after-free scenarios. However, the commit
did not handle the analogous situation for System Management Mode (SMM).

This omission results in triggering a WARN when KVM forces a vCPU INIT
after SHUTDOWN interception while the vCPU is in SMM. This situation was
reprodused using Syzkaller by:

  1) Creating a KVM VM and vCPU
  2) Sending a KVM_SMI ioctl to explicitly enter SMM
  3) Executing invalid instructions causing consecutive exceptions and
     eventually a triple fault

The issue manifests as follows:

  WARNING: CPU: 0 PID: 25506 at arch/x86/kvm/x86.c:12112
  kvm_vcpu_reset+0x1d2/0x1530 arch/x86/kvm/x86.c:12112
  Modules linked in:
  CPU: 0 PID: 25506 Comm: syz-executor.0 Not tainted
  6.1.130-syzkaller-00157-g164fe5dde9b6 #0
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
  BIOS 1.12.0-1 04/01/2014
  RIP: 0010:kvm_vcpu_reset+0x1d2/0x1530 arch/x86/kvm/x86.c:12112
  Call Trace:
   <TASK>
   shutdown_interception+0x66/0xb0 arch/x86/kvm/svm/svm.c:2136
   svm_invoke_exit_handler+0x110/0x530 arch/x86/kvm/svm/svm.c:3395
   svm_handle_exit+0x424/0x920 arch/x86/kvm/svm/svm.c:3457
   vcpu_enter_guest arch/x86/kvm/x86.c:10959 [inline]
   vcpu_run+0x2c43/0x5a90 arch/x86/kvm/x86.c:11062
   kvm_arch_vcpu_ioctl_run+0x50f/0x1cf0 arch/x86/kvm/x86.c:11283
   kvm_vcpu_ioctl+0x570/0xf00 arch/x86/kvm/../../../virt/kvm/kvm_main.c:4122
   vfs_ioctl fs/ioctl.c:51 [inline]
   __do_sys_ioctl fs/ioctl.c:870 [inline]
   __se_sys_ioctl fs/ioctl.c:856 [inline]
   __x64_sys_ioctl+0x19a/0x210 fs/ioctl.c:856
   do_syscall_x64 arch/x86/entry/common.c:51 [inline]
   do_syscall_64+0x35/0x80 arch/x86/entry/common.c:81
   entry_SYSCALL_64_after_hwframe+0x6e/0xd8

Architecturally, INIT is blocked when the CPU is in SMM, hence KVM's WARN()
in kvm_vcpu_reset() to guard against KVM bugs, e.g. to detect improper
emulation of INIT.  SHUTDOWN on SVM is a weird edge case where KVM needs to
do _something_ sane with the VMCB, since it's technically undefined, and
INIT is the least awful choice given KVM's ABI.

So, double down on stuffing INIT on SHUTDOWN, and force the vCPU out of
SMM to avoid any weirdness (and the WARN).

Found by Linux Verification Center (linuxtesting.org) with Syzkaller.

Fixes: ed129ec905 ("KVM: x86: forcibly leave nested mode on vCPU reset")
Cc: stable@vger.kernel.org
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Mikhail Lobanov <m.lobanov@rosa.ru>
Link: https://lore.kernel.org/r/20250414171207.155121-1-m.lobanov@rosa.ru
[sean: massage changelog, make it clear this isn't architectural behavior]
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-04-24 11:17:58 -07:00
Milton Barrera
fa9fdeea1b HID: quirks: Add ADATA XPG alpha wireless mouse support
This patch adds HID_QUIRK_ALWAYS_POLL for the ADATA XPG wireless gaming mouse (USB ID 125f:7505) and its USB dongle (USB ID 125f:7506). Without this quirk, the device does not generate input events properly.

Signed-off-by: Milton Barrera <miltonjosue2001@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
2025-04-24 13:01:57 +02:00
Jiapeng Chong
acae9d5b51 HID: hid-steam: Remove the unused variable connected
Variable connected is not effectively used, so delete it.

drivers/hid/hid-steam.c:1153:7: warning: variable ‘connected’ set but not used.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=20462
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
2025-04-24 12:52:11 +02:00
Mario Limonciello
f32e8c8095 HID: amd_sfh: Avoid clearing reports for SRA sensor
SRA sensor doesn't allocate any memory for reports.  Skip
trying to clear memory for that sensor in cleanup path.

Suggested-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
2025-04-24 12:17:55 +02:00
Mario Limonciello
0cc2effbc8 HID: amd_sfh: Fix SRA sensor when it's the only sensor
On systems that only have an SRA sensor connected to SFH the sensor
doesn't get enabled due to a bad optimization condition of breaking
the sensor walk loop.

This optimization is unnecessary in the first place because if there
is only one device then the loop only runs once. Drop the condition
and explicitly mark sensor as enabled.

Reported-by: Yijun Shen <Yijun.Shen@dell.com>
Tested-By: Yijun Shen <Yijun_Shen@Dell.com>
Fixes: d1c444b471 ("HID: amd_sfh: Add support to export device operating states")
Cc: stable@vger.kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
2025-04-24 12:17:55 +02:00
Qasim Ijaz
6bf8ab7774 HID: wacom: fix shift OOB in kfifo allocation for zero pktlen
During wacom_parse_and_register() the code calls wacom_devm_kfifo_alloc
to allocate a fifo. During this operation it passes kfifo_alloc a
fifo_size of 0. Kfifo attempts to round the size passed to it to the
next power of 2 via roundup_pow_of_two (queue-type data structures
do this to maintain efficiency of operations).

However during this phase a problem arises when the roundup_pow_of_two()
function utilises a shift exponent of fls_long(n-1), where n is the
fifo_size. Since n is 0 in this case and n is also an unsigned long,
doing n-1 causes unsigned integer wrap-around to occur making the
fifo_size 4294967295. So the code effectively does fls_long(4294967295)
which results in 64. Returning back to roundup_pow_of_two(), the code
utilises a shift exponent of 64. When a shift exponent of 64 is used
on a 64-bit type such as 1UL it results in a shift-out-of-bounds.

The root cause of the issue seems to stem from insufficient validation
of wacom_compute_pktlen(), since in this case the fifo_size comes
from wacom_wac->features.pktlen. During wacom_parse_and_register()
the wacom_compute_pktlen() function sets the pktlen as 0.

To fix this, we should handle cases where wacom_compute_pktlen()
results in 0.

Reported-by: syzbot <syzbot+d5204cbbdd921f1f7cad@syzkaller.appspotmail.com>
Closes: https://syzkaller.appspot.com/bug?extid=d5204cbbdd921f1f7cad
Fixes: 5e013ad206 ("HID: wacom: Remove static WACOM_PKGLEN_MAX limit")
Tested-by: Qasim Ijaz <qasdev00@gmail.com>
Reviewed-by: Jason Gerecke <jason.gerecke@wacom.com>
Cc: stable@vger.kernel.org
Signed-off-by: Qasim Ijaz <qasdev00@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
2025-04-24 12:13:32 +02:00
Henry Martin
bd07f75120 HID: uclogic: Add NULL check in uclogic_input_configured()
devm_kasprintf() returns NULL when memory allocation fails. Currently,
uclogic_input_configured() does not check for this case, which results
in a NULL pointer dereference.

Add NULL check after devm_kasprintf() to prevent this issue.

Fixes: dd613a4e45 ("HID: uclogic: Correct devm device reference for hidinput input_dev name")
Signed-off-by: Henry Martin <bsdhenrymartin@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
2025-04-24 12:12:16 +02:00
Qasim Ijaz
fd34bf79a6 HID: wacom: fix memory leak on size mismatch in wacom_wac_queue_flush()
In wacom_wac_queue_flush() the code allocates zero initialised
buffer which it uses as a storage buffer for copying data from
a fifo via kfifo_out(). The kfifo_out() function returns the
number of elements it has copied. The code checks if the number
of copied elements does not equal the size of the fifo record,
if it does not it simply skips the entry and continues to the
next iteration. However it does not release the storage buffer
leading to a memory leak.

Fix the memory leak by freeing the buffer on size mismatch.

Fixes: 5e013ad206 ("HID: wacom: Remove static WACOM_PKGLEN_MAX limit")
Reviewed-by: Jason Gerecke <jason.gerecke@wacom.com>
Signed-off-by: Qasim Ijaz <qasdev00@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
2025-04-24 11:54:34 +02:00
Qasim Ijaz
e1ca5f39c2 HID: wacom: handle kzalloc() allocation failure in wacom_wac_queue_flush()
During wacom_wac_queue_flush() the code calls
kzalloc() to allocate a zero initialised buffer
which it uses as a storage buffer to get data
from the fifo via kfifo_out(). However it does not
check kzalloc() for allocation failure which returns
NULL and could potentially lead to a NULL deref.

Fix this by checking for kzalloc() failure and skipping
the current entry if allocation failure occurs.

Fixes: 5e013ad206 ("HID: wacom: Remove static WACOM_PKGLEN_MAX limit")
Signed-off-by: Qasim Ijaz <qasdev00@gmail.com>
Reviewed-by: Jason Gerecke <jason.gerecke@wacom.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
2025-04-24 11:53:35 +02:00
Qasim Ijaz
09d546303b HID: thrustmaster: fix memory leak in thrustmaster_interrupts()
In thrustmaster_interrupts(), the allocated send_buf is not
freed if the usb_check_int_endpoints() check fails, leading
to a memory leak.

Fix this by ensuring send_buf is freed before returning in
the error path.

Fixes: 50420d7c79 ("HID: hid-thrustmaster: Fix warning in thrustmaster_probe by adding endpoint check")
Signed-off-by: Qasim Ijaz <qasdev00@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
2025-04-24 11:48:13 +02:00
Aditya Garg
c14e02e68b HID: hid-appletb-kbd: Fix wrong date and kernel version in sysfs interface docs
The driver hid-appletb-kbd was upstreamed in kernel 6.15. But, due to an
oversight on my part, I didn't change the kernel version and expected
date while upstreaming the driver, thus it remained as 6.5, the original
kernel version when the driver was developed for downstream. This commit
should fix this.

Signed-off-by: Aditya Garg <gargaditya08@live.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
2025-04-24 11:46:06 +02:00
Peter Hutterer
55cd617566 HID: bpf: fix BTN_STYLUS for the XP Pen ACK05 remote
Usage_Dig_BarrelSwitch was applied in the UsagePage_Button
which incorrectly mapped to BTN_TOOL_PENCIL

Link: https://gitlab.freedesktop.org/libevdev/udev-hid-bpf/-/merge_requests/193
Fixes: 834da375 ("bpf: add a v6.11+ compatible BPF fixup for the XPPen ACK05 remote")
Link: https://patchwork.kernel.org/project/linux-input/patch/20250207-bpf-import-2025-02-07-v1-7-6048fdd5a206@kernel.org/

Signed-off-by: Peter Hutterer <peter.hutterer@who-t.net>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
2025-04-24 11:36:53 +02:00
Basavaraj Natikar
305245a2e1 dmaengine: ptdma: Move variable condition check to the first place and remove redundancy
The variable is dereferenced without first checking if it's null, leading
to the following warning: 'Variable dereferenced before check: desc.'

     drivers/dma/amd/ptdma/ptdma-dmaengine.c: pt_cmd_callback_work()
     warn: variable dereferenced before check 'desc'

Therefore, move the condition check for the 'desc' variable to the first
place and remove the redundant extra condition check.

Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/all/bfa0a979-ce9f-422d-92c3-34921155d048@stanley.mountain/
Fixes: 6565439894 ("dmaengine: ptdma: Utilize the AE4DMA engine's multi-queue functionality")
Signed-off-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
Link: https://lore.kernel.org/r/20250421114215.1687073-1-Basavaraj.Natikar@amd.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-04-23 17:41:32 +05:30
Vinicius Costa Gomes
8dfa57aabf dmaengine: idxd: Fix allowing write() from different address spaces
Check if the process submitting the descriptor belongs to the same
address space as the one that opened the file, reject otherwise.

Fixes: 6827738dc6 ("dmaengine: idxd: add a write() method for applications to submit work")
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20250421170337.3008875-1-dave.jiang@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-04-23 17:34:17 +05:30
Andrea Righi
a11d6784d7 sched_ext: Fix missing rq lock in scx_bpf_cpuperf_set()
scx_bpf_cpuperf_set() can be used to set a performance target level on
any CPU. However, it doesn't correctly acquire the corresponding rq
lock, which may lead to unsafe behavior and trigger the following
warning, due to the lockdep_assert_rq_held() check:

[   51.713737] WARNING: CPU: 3 PID: 3899 at kernel/sched/sched.h:1512 scx_bpf_cpuperf_set+0x1a0/0x1e0
...
[   51.713836] Call trace:
[   51.713837]  scx_bpf_cpuperf_set+0x1a0/0x1e0 (P)
[   51.713839]  bpf_prog_62d35beb9301601f_bpfland_init+0x168/0x440
[   51.713841]  bpf__sched_ext_ops_init+0x54/0x8c
[   51.713843]  scx_ops_enable.constprop.0+0x2c0/0x10f0
[   51.713845]  bpf_scx_reg+0x18/0x30
[   51.713847]  bpf_struct_ops_link_create+0x154/0x1b0
[   51.713849]  __sys_bpf+0x1934/0x22a0

Fix by properly acquiring the rq lock when possible or raising an error
if we try to operate on a CPU that is not the one currently locked.

Fixes: d86adb4fc0 ("sched_ext: Add cpuperf support")
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Acked-by: Changwoo Min <changwoo@igalia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-04-22 09:28:12 -10:00
Andrea Righi
18853ba782 sched_ext: Track currently locked rq
Some kfuncs provided by sched_ext may need to operate on a struct rq,
but they can be invoked from various contexts, specifically, different
scx callbacks.

While some of these callbacks are invoked with a particular rq already
locked, others are not. This makes it impossible for a kfunc to reliably
determine whether it's safe to access a given rq, triggering potential
bugs or unsafe behaviors, see for example [1].

To address this, track the currently locked rq whenever a sched_ext
callback is invoked via SCX_CALL_OP*().

This allows kfuncs that need to operate on an arbitrary rq to retrieve
the currently locked one and apply the appropriate action as needed.

[1] https://lore.kernel.org/lkml/20250325140021.73570-1-arighi@nvidia.com/

Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Acked-by: Changwoo Min <changwoo@igalia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-04-22 09:27:50 -10:00
Martin Blumenstingl
a08b28c1ed arm64: dts: amlogic: g12: fix reference to unknown/untested PWM clock
Device-tree expects absent clocks to be specified as <0> (instead of
using <>). This fixes using the FCLK4/FCLK3 clocks as they are now
seen at their correct index (while before they were recognized, but at
the correct index - resulting in the hardware using a different clock
than what the kernel sees).

Fixes: e6884f2e41 ("arm64: dts: amlogic: g12: switch to the new PWM controller binding")
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://lore.kernel.org/r/20250420164801.330505-5-martin.blumenstingl@googlemail.com
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
2025-04-22 09:00:20 +02:00
Martin Blumenstingl
511d388a40 arm64: dts: amlogic: gx: fix reference to unknown/untested PWM clock
Device-tree expects absent clocks to be specified as <0> (instead of
using <>). This fixes using the FCLK4/FCLK3 clocks as they are now
seen at their correct index (while before they were recognized, but at
the correct index - resulting in the hardware using a different clock
than what the kernel sees).

Fixes: a526eeef9a ("arm64: dts: amlogic: gx: switch to the new PWM controller binding")
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://lore.kernel.org/r/20250420164801.330505-4-martin.blumenstingl@googlemail.com
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
2025-04-22 09:00:20 +02:00
Martin Blumenstingl
a994b58f9d ARM: dts: amlogic: meson8b: fix reference to unknown/untested PWM clock
Device-tree expects absent clocks to be specified as <0> (instead of
using <>). This fixes using the FCLK4/FCLK3 clocks as they are now
seen at their correct index (while before they were recognized, but at
the correct index - resulting in the hardware using a different clock
than what the kernel sees).

Fixes: dbf9218619 ("ARM: dts: amlogic: meson8b: switch to the new PWM controller binding")
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://lore.kernel.org/r/20250420164801.330505-3-martin.blumenstingl@googlemail.com
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
2025-04-22 09:00:19 +02:00
Martin Blumenstingl
3409f843c0 ARM: dts: amlogic: meson8: fix reference to unknown/untested PWM clock
Device-tree expects absent clocks to be specified as <0> (instead of
using <>). This fixes using the FCLK4/FCLK3 clocks as they are now
seen at their correct index (while before they were recognized, but at
the correct index - resulting in the hardware using a different clock
than what the kernel sees).

Fixes: 802cff460a ("ARM: dts: amlogic: meson8: switch to the new PWM controller binding")
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://lore.kernel.org/r/20250420164801.330505-2-martin.blumenstingl@googlemail.com
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
2025-04-22 09:00:19 +02:00
Sébastien Szymanski
6e1a7bc838 ARM: dts: opos6ul: add ksz8081 phy properties
Commit c7e73b5051 ("ARM: imx: mach-imx6ul: remove 14x14 EVK specific
PHY fixup") removed a PHY fixup that setted the clock mode and the LED
mode.
Make the Ethernet interface work again by doing as advised in the
commit's log, set clock mode and the LED mode in the device tree.

Fixes: c7e73b5051 ("ARM: imx: mach-imx6ul: remove 14x14 EVK specific PHY fixup")
Signed-off-by: Sébastien Szymanski <sebastien.szymanski@armadeus.com>
Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2025-04-22 10:44:18 +08:00
Richard Zhu
02e4232998 arm64: dts: imx95: Correct the range of PCIe app-reg region
Correct the range of PCIe app-reg region from 0x2000 to 0x4000 refer to
SerDes_SS memory map of i.MX95 Rerference Manual.

Fixes: 3b1d5deb29 ("arm64: dts: imx95: add pcie[0,1] and pcie-ep[0,1] support")
Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2025-04-22 10:35:23 +08:00
Ahmad Fatoum
4b98bf3bff arm64: dts: imx8mp: configure GPU and NPU clocks in nominal DTSI
Commit 255fbd9eab ("arm64: dts: imx8mp: Add optional nominal drive
mode DTSI") added imx8mp-nominal.dtsi, which overrides all overdrive
clock rates in imx8mp.dtsi to the nominal rates.

At the same time, commit 9f7595b3e5 ("arm64: dts: imx8mp: configure
GPU and NPU clocks to overdrive rate") went in, which changed some
clock rates away from the nominal values.

Resolve the discrepancy by effectively reverting the changes in the
latter commit inside imx8mp-nominal.dtsi. This is required for proper
operation of the imx8mp-skov boards, which are currently
imx8mp-nominal.dtsi's only users and lets all other boards that don't
include it benefit from the new higher frequencies.

Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2025-04-22 10:18:41 +08:00
Gabriel Shahrouzi
609bc31eca iio: adis16201: Correct inclinometer channel resolution
The inclinometer channels were previously defined with 14 realbits.
However, the ADIS16201 datasheet states the resolution for these output
channels is 12 bits (Page 14, text description; Page 15, table 7).

Correct the realbits value to 12 to accurately reflect the hardware.

Fixes: f7fe1d1dd5 ("staging: iio: new adis16201 driver")
Cc: stable@vger.kernel.org
Signed-off-by: Gabriel Shahrouzi <gshahrouzi@gmail.com>
Reviewed-by: Marcelo Schmitt <marcelo.schmitt1@gmail.com>
Link: https://patch.msgid.link/20250421131539.912966-1-gshahrouzi@gmail.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-04-21 14:58:31 +01:00
Angelo Dureghello
f083f8a21c iio: adc: ad7606: fix serial register access
Fix register read/write routine as per datasheet.

When reading multiple consecutive registers, only the first one is read
properly. This is due to missing chip select deassert and assert again
between first and second 16bit transfer, as shown in the datasheet
AD7606C-16, rev 0, figure 110.

Fixes: f2a22e1e17 ("iio: adc: ad7606: Add support for software mode for ad7616")
Reviewed-by: David Lechner <dlechner@baylibre.com>
Signed-off-by: Angelo Dureghello <adureghello@baylibre.com>
Link: https://patch.msgid.link/20250418-wip-bl-ad7606-fix-reg-access-v3-1-d5eeb440c738@baylibre.com
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-04-21 11:59:58 +01:00
David Lechner
ffcd19e9f4 iio: pressure: mprls0025pa: use aligned_s64 for timestamp
Follow the pattern of other drivers and use aligned_s64 for the
timestamp. This will ensure the struct itself it also 8-byte aligned.

While touching this, convert struct mpr_chan to an anonymous struct
to consolidate the code a bit to make it easier for future readers.

Fixes: 713337d914 ("iio: pressure: Honeywell mprls0025pa pressure sensor")
Signed-off-by: David Lechner <dlechner@baylibre.com>
Link: https://patch.msgid.link/20250418-iio-more-timestamp-alignment-v2-2-d6a5d2b1c9fe@baylibre.com
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-04-21 11:51:48 +01:00
David Lechner
e4570f4bb2 iio: imu: adis16550: align buffers for timestamp
Align the buffers used with iio_push_to_buffers_with_timestamp() to
ensure the s64 timestamp is aligned to 8 bytes.

Fixes: bac4368fab ("iio: imu: adis16550: add adis16550 support")
Signed-off-by: David Lechner <dlechner@baylibre.com>
Link: https://patch.msgid.link/20250418-iio-more-timestamp-alignment-v2-1-d6a5d2b1c9fe@baylibre.com
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-04-21 11:51:35 +01:00
Dan Carpenter
4bcc063939 ice, irdma: fix an off by one in error handling code
If we don't allocate the MIN number of IRQs then we need to free what
we have and return -ENOMEM.  The problem is this loop is off by one
so it frees an entry that wasn't allocated and it doesn't free the
first entry where i == 0.

Fixes: 3e0d3cb3fb ("ice, irdma: move interrupts code to irdma")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Link: https://patch.msgid.link/20250414234231.523-2-tatyana.e.nikolova@intel.com
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-04-20 06:55:51 -04:00
Michal Swiatkowski
80f2ab46c2 irdma: free iwdev->rf after removing MSI-X
Currently iwdev->rf is allocated in irdma_probe(), but free in
irdma_ib_dealloc_device(). It can be misleading. Move the free to
irdma_remove() to be more obvious.

Freeing in irdma_ib_dealloc_device() leads to KASAN use-after-free
issue. Which can also lead to NULL pointer dereference. Fix this.

irdma_deinit_interrupts() can't be moved before freeing iwdef->rf,
because in this case deinit interrupts will be done before freeing irqs.
The simplest solution is to move kfree(iwdev->rf) to irdma_remove().

Reproducer:
  sudo rmmod irdma

Minified splat(s):
  BUG: KASAN: use-after-free in irdma_remove+0x257/0x2d0 [irdma]
  Call Trace:
   <TASK>
   ? __pfx__raw_spin_lock_irqsave+0x10/0x10
   ? kfree+0x253/0x450
   ? irdma_remove+0x257/0x2d0 [irdma]
   kasan_report+0xed/0x120
   ? irdma_remove+0x257/0x2d0 [irdma]
   irdma_remove+0x257/0x2d0 [irdma]
   auxiliary_bus_remove+0x56/0x80
   device_release_driver_internal+0x371/0x530
   ? kernfs_put.part.0+0x147/0x310
   driver_detach+0xbf/0x180
   bus_remove_driver+0x11b/0x2a0
   auxiliary_driver_unregister+0x1a/0x50
   irdma_exit_module+0x40/0x4c [irdma]

  Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN NOPTI
  KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
  RIP: 0010:ice_free_rdma_qvector+0x2a/0xa0 [ice]
  Call Trace:
   ? ice_free_rdma_qvector+0x2a/0xa0 [ice]
   irdma_remove+0x179/0x2d0 [irdma]
   auxiliary_bus_remove+0x56/0x80
   device_release_driver_internal+0x371/0x530
   ? kobject_put+0x61/0x4b0
   driver_detach+0xbf/0x180
   bus_remove_driver+0x11b/0x2a0
   auxiliary_driver_unregister+0x1a/0x50
   irdma_exit_module+0x40/0x4c [irdma]

Reported-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Closes: https://lore.kernel.org/netdev/8e533834-4564-472f-b29b-4f1cb7730053@linux.intel.com/
Fixes: 3e0d3cb3fb ("ice, irdma: move interrupts code to irdma")
Reviewed-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Link: https://patch.msgid.link/20250414234231.523-1-tatyana.e.nikolova@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-04-20 06:55:51 -04:00
Zhu Yanjun
f81b33582f RDMA/rxe: Fix slab-use-after-free Read in rxe_queue_cleanup bug
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:94 [inline]
 dump_stack_lvl+0x7d/0xa0 lib/dump_stack.c:120
 print_address_description mm/kasan/report.c:378 [inline]
 print_report+0xcf/0x610 mm/kasan/report.c:489
 kasan_report+0xb5/0xe0 mm/kasan/report.c:602
 rxe_queue_cleanup+0xd0/0xe0 drivers/infiniband/sw/rxe/rxe_queue.c:195
 rxe_cq_cleanup+0x3f/0x50 drivers/infiniband/sw/rxe/rxe_cq.c:132
 __rxe_cleanup+0x168/0x300 drivers/infiniband/sw/rxe/rxe_pool.c:232
 rxe_create_cq+0x22e/0x3a0 drivers/infiniband/sw/rxe/rxe_verbs.c:1109
 create_cq+0x658/0xb90 drivers/infiniband/core/uverbs_cmd.c:1052
 ib_uverbs_create_cq+0xc7/0x120 drivers/infiniband/core/uverbs_cmd.c:1095
 ib_uverbs_write+0x969/0xc90 drivers/infiniband/core/uverbs_main.c:679
 vfs_write fs/read_write.c:677 [inline]
 vfs_write+0x26a/0xcc0 fs/read_write.c:659
 ksys_write+0x1b8/0x200 fs/read_write.c:731
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xaa/0x1b0 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

In the function rxe_create_cq, when rxe_cq_from_init fails, the function
rxe_cleanup will be called to handle the allocated resources. In fact,
some memory resources have already been freed in the function
rxe_cq_from_init. Thus, this problem will occur.

The solution is to let rxe_cleanup do all the work.

Fixes: 8700e3e7c4 ("Soft RoCE driver")
Link: https://paste.ubuntu.com/p/tJgC42wDf6/
Tested-by: liuyi <liuy22@mails.tsinghua.edu.cn>
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Link: https://patch.msgid.link/20250412075714.3257358-1-yanjun.zhu@linux.dev
Reviewed-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-04-20 06:14:49 -04:00
Janne Grunau
dc5befecbe arm64: dts: apple: touchbar: Mark ps_dispdfr_be as always-on
The driver depends on boot loader initialized state which resets when the
ps_dispdfr_be power-domain is powered off. This happens on suspend or
when the driver is missing during boot.
Mark the domain as always on until the driver can handle this.

Fixes: 7275e795e5 ("arm64: dts: apple: Add touchbar screen nodes")
Signed-off-by: Janne Grunau <j@jannau.net>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Link: https://lore.kernel.org/r/20250416-arm64_dts_apple_touchbar-v1-1-e1c0b53b9125@jannau.net
Signed-off-by: Sven Peter <sven@svenpeter.dev>
2025-04-19 08:40:33 +02:00
Asahi Lina
712dc3e7b9 mailmap: Update email for Asahi Lina
Add an alias so I can more easily filter kernel-related emails.

Signed-off-by: Asahi Lina <lina+kernel@asahilina.net>
Link: https://lore.kernel.org/r/20250409-mailmap-lina-email-v1-1-265d05848ae3@asahilina.net
Signed-off-by: Sven Peter <sven@svenpeter.dev>
2025-04-19 08:40:31 +02:00
Hugo Villeneuve
c6cb8bf794 Input: cyttsp5 - ensure minimum reset pulse width
The current reset pulse width is measured to be 5us on a
Renesas RZ/G2L SOM. The manufacturer's minimum reset pulse width is
specified as 10us.

Extend reset pulse width to make sure it is long enough on all platforms.

Also reword confusing comments about reset pin assertion.

Fixes: 5b0c03e24a ("Input: Add driver for Cypress Generation 5 touchscreen")
Cc: stable@vger.kernel.org
Acked-by: Alistair Francis <alistair@alistair23.me>
Signed-off-by: Hugo Villeneuve <hvilleneuve@dimonoff.com>
Link: https://lore.kernel.org/r/20250410184633.1164837-1-hugo@hugovil.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-04-18 19:50:41 -07:00
WangYuli
8b1d858cbd Input: sparcspkr - avoid unannotated fall-through
Fix follow warnings with clang-21i (and reformat for clarity):
  drivers/input/misc/sparcspkr.c:78:3: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
     78 |                 case SND_TONE: break;
        |                 ^
  drivers/input/misc/sparcspkr.c:78:3: note: insert 'break;' to avoid fall-through
     78 |                 case SND_TONE: break;
        |                 ^
        |                 break;
  drivers/input/misc/sparcspkr.c:113:3: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
    113 |                 case SND_TONE: break;
        |                 ^
  drivers/input/misc/sparcspkr.c:113:3: note: insert 'break;' to avoid fall-through
    113 |                 case SND_TONE: break;
        |                 ^
        |                 break;
  2 warnings generated.

Signed-off-by: WangYuli <wangyuli@uniontech.com>
Link: https://lore.kernel.org/r/6730E40353C76908+20250415052439.155051-1-wangyuli@uniontech.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-04-18 18:38:46 -07:00
Kees Cook
936a25ef11 input/joystick: magellan: Mark __nonstring look-up table
GCC 15's new -Wunterminated-string-initialization notices that the
16 character lookup table "nibbles" (which is not used as a C-String)
needs to be marked as "nonstring":

drivers/input/joystick/magellan.c: In function 'magellan_crunch_nibbles':
drivers/input/joystick/magellan.c:51:44: warning: initializer-string for array of 'unsigned char' truncates NUL terminator but destination lacks 'nonstring' attribute (17 chars into 16 available) [-Wunterminated-string-initialization]
   51 |         static unsigned char nibbles[16] = "0AB3D56GH9:K<MN?";
      |                                            ^~~~~~~~~~~~~~~~~~

Add the annotation and While at it, mark the table as const too.

Signed-off-by: Kees Cook <kees@kernel.org>
Link: https://lore.kernel.org/r/20250416174513.work.662-kees@kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-04-18 18:33:42 -07:00
Gabriel Shahrouzi
2e92295627 staging: iio: adc: ad7816: Correct conditional logic for store mode
The mode setting logic in ad7816_store_mode was reversed due to
incorrect handling of the strcmp return value. strcmp returns 0 on
match, so the `if (strcmp(buf, "full"))` block executed when the
input was not "full".

This resulted in "full" setting the mode to AD7816_PD (power-down) and
other inputs setting it to AD7816_FULL.

Fix this by checking it against 0 to correctly check for "full" and
"power-down", mapping them to AD7816_FULL and AD7816_PD respectively.

Fixes: 7924425db0 ("staging: iio: adc: new driver for AD7816 devices")
Cc: stable@vger.kernel.org
Signed-off-by: Gabriel Shahrouzi <gshahrouzi@gmail.com>
Acked-by: Nuno Sá <nuno.sa@analog.com>
Link: https://lore.kernel.org/stable/20250414152920.467505-1-gshahrouzi%40gmail.com
Link: https://patch.msgid.link/20250414154050.469482-1-gshahrouzi@gmail.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-04-18 16:17:37 +01:00
Jonathan Cameron
52d3498847 iio: adc: ad7266: Fix potential timestamp alignment issue.
On architectures where an s64 is only 32-bit aligned insufficient padding
would be left between the earlier elements and the timestamp. Use
aligned_s64 to enforce the correct placement and ensure the storage is
large enough.

Fixes: 54e018da31 ("iio:ad7266: Mark transfer buffer as __be16") # aligned_s64 is much newer.
Reported-by: David Lechner <dlechner@baylibre.com>
Reviewed-by: Nuno Sá <nuno.sa@analog.com>
Reviewed-by: David Lechner <dlechner@baylibre.com>
Link: https://patch.msgid.link/20250413103443.2420727-2-jic23@kernel.org
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-04-18 16:07:47 +01:00
Jonathan Cameron
ffbc26bc91 iio: adc: ad7768-1: Fix insufficient alignment of timestamp.
On architectures where an s64 is not 64-bit aligned, this may result
insufficient alignment of the timestamp and the structure being too small.
Use aligned_s64 to force the alignment.

Fixes: a1caeebab0 ("iio: adc: ad7768-1: Fix too small buffer passed to iio_push_to_buffers_with_timestamp()") # aligned_s64 newer
Reported-by: David Lechner <dlechner@baylibre.com>
Reviewed-by: Nuno Sá <nuno.sa@analog.com>
Reviewed-by: David Lechner <dlechner@baylibre.com>
Link: https://patch.msgid.link/20250413103443.2420727-3-jic23@kernel.org
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-04-18 16:07:38 +01:00
Jonathan Cameron
5097eaae98 iio: adc: dln2: Use aligned_s64 for timestamp
Here the lack of marking allows the overall structure to not be
sufficiently aligned resulting in misplacement of the timestamp
in iio_push_to_buffers_with_timestamp(). Use aligned_s64 to
force the alignment on all architectures.

Fixes: 7c0299e879 ("iio: adc: Add support for DLN2 ADC")
Reported-by: David Lechner <dlechner@baylibre.com>
Reviewed-by: Andy Shevchenko <andy@kernel.org>
Reviewed-by: Nuno Sá <nuno.sa@analog.com>
Reviewed-by: David Lechner <dlechner@baylibre.com>
Link: https://patch.msgid.link/20250413103443.2420727-4-jic23@kernel.org
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-04-18 16:07:29 +01:00
Jonathan Cameron
1bb942287e iio: accel: adxl355: Make timestamp 64-bit aligned using aligned_s64
The IIO ABI requires 64-bit aligned timestamps. In this case insufficient
padding would have been added on architectures where an s64 is only 32-bit
aligned.  Use aligned_s64 to enforce the correct alignment.

Fixes: 327a0eaf19 ("iio: accel: adxl355: Add triggered buffer support")
Reported-by: David Lechner <dlechner@baylibre.com>
Reviewed-by: Nuno Sá <nuno.sa@analog.com>
Reviewed-by: David Lechner <dlechner@baylibre.com>
Link: https://patch.msgid.link/20250413103443.2420727-5-jic23@kernel.org
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-04-18 16:06:30 +01:00
Jonathan Cameron
f79aeb6c63 iio: temp: maxim-thermocouple: Fix potential lack of DMA safe buffer.
The trick of using __aligned(IIO_DMA_MINALIGN) ensures that there is
no overlap between buffers used for DMA and those used for driver
state storage that are before the marking. It doesn't ensure
anything above state variables found after the marking. Hence
move this particular bit of state earlier in the structure.

Fixes: 10897f3430 ("iio: temp: maxim_thermocouple: Fix alignment for DMA safety")
Reviewed-by: David Lechner <dlechner@baylibre.com>
Link: https://patch.msgid.link/20250413103443.2420727-14-jic23@kernel.org
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-04-18 16:05:06 +01:00
David Lechner
6ffa698674 iio: chemical: pms7003: use aligned_s64 for timestamp
Follow the pattern of other drivers and use aligned_s64 for the
timestamp. This will ensure that the timestamp is correctly aligned on
all architectures.

Also move the unaligned.h header while touching this since it was the
only one not in alphabetical order.

Fixes: 13e945631c ("iio:chemical:pms7003: Fix timestamp alignment and prevent data leak.")
Signed-off-by: David Lechner <dlechner@baylibre.com>
Reviewed-by: Nuno Sá <nuno.sa@analog.com>
Link: https://patch.msgid.link/20250417-iio-more-timestamp-alignment-v1-4-eafac1e22318@baylibre.com
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-04-18 16:03:55 +01:00
David Lechner
bb49d94034 iio: chemical: sps30: use aligned_s64 for timestamp
Follow the pattern of other drivers and use aligned_s64 for the
timestamp. This will ensure that the timestamp is correctly aligned on
all architectures.

Fixes: a5bf6fdd19 ("iio:chemical:sps30: Fix timestamp alignment")
Signed-off-by: David Lechner <dlechner@baylibre.com>
Reviewed-by: Nuno Sá <nuno.sa@analog.com>
Link: https://patch.msgid.link/20250417-iio-more-timestamp-alignment-v1-5-eafac1e22318@baylibre.com
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-04-18 16:03:46 +01:00
David Lechner
1d2d8524ea iio: imu: inv_mpu6050: align buffer for timestamp
Align the buffer used with iio_push_to_buffers_with_timestamp() to
ensure the s64 timestamp is aligned to 8 bytes.

Fixes: 0829edc43e ("iio: imu: inv_mpu6050: read the full fifo when processing data")
Signed-off-by: David Lechner <dlechner@baylibre.com>
Link: https://patch.msgid.link/20250417-iio-more-timestamp-alignment-v1-7-eafac1e22318@baylibre.com
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-04-18 16:03:34 +01:00
Ronald Wahl
fca280992a dmaengine: ti: k3-udma: Add missing locking
Recent kernels complain about a missing lock in k3-udma.c when the lock
validator is enabled:

[    4.128073] WARNING: CPU: 0 PID: 746 at drivers/dma/ti/../virt-dma.h:169 udma_start.isra.0+0x34/0x238
[    4.137352] CPU: 0 UID: 0 PID: 746 Comm: kworker/0:3 Not tainted 6.12.9-arm64 #28
[    4.144867] Hardware name: pp-v12 (DT)
[    4.148648] Workqueue: events udma_check_tx_completion
[    4.153841] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    4.160834] pc : udma_start.isra.0+0x34/0x238
[    4.165227] lr : udma_start.isra.0+0x30/0x238
[    4.169618] sp : ffffffc083cabcf0
[    4.172963] x29: ffffffc083cabcf0 x28: 0000000000000000 x27: ffffff800001b005
[    4.180167] x26: ffffffc0812f0000 x25: 0000000000000000 x24: 0000000000000000
[    4.187370] x23: 0000000000000001 x22: 00000000e21eabe9 x21: ffffff8000fa0670
[    4.194571] x20: ffffff8001b6bf00 x19: ffffff8000fa0430 x18: ffffffc083b95030
[    4.201773] x17: 0000000000000000 x16: 00000000f0000000 x15: 0000000000000048
[    4.208976] x14: 0000000000000048 x13: 0000000000000000 x12: 0000000000000001
[    4.216179] x11: ffffffc08151a240 x10: 0000000000003ea1 x9 : ffffffc08046ab68
[    4.223381] x8 : ffffffc083cabac0 x7 : ffffffc081df3718 x6 : 0000000000029fc8
[    4.230583] x5 : ffffffc0817ee6d8 x4 : 0000000000000bc0 x3 : 0000000000000000
[    4.237784] x2 : 0000000000000000 x1 : 00000000001fffff x0 : 0000000000000000
[    4.244986] Call trace:
[    4.247463]  udma_start.isra.0+0x34/0x238
[    4.251509]  udma_check_tx_completion+0xd0/0xdc
[    4.256076]  process_one_work+0x244/0x3fc
[    4.260129]  process_scheduled_works+0x6c/0x74
[    4.264610]  worker_thread+0x150/0x1dc
[    4.268398]  kthread+0xd8/0xe8
[    4.271492]  ret_from_fork+0x10/0x20
[    4.275107] irq event stamp: 220
[    4.278363] hardirqs last  enabled at (219): [<ffffffc080a27c7c>] _raw_spin_unlock_irq+0x38/0x50
[    4.287183] hardirqs last disabled at (220): [<ffffffc080a1c154>] el1_dbg+0x24/0x50
[    4.294879] softirqs last  enabled at (182): [<ffffffc080037e68>] handle_softirqs+0x1c0/0x3cc
[    4.303437] softirqs last disabled at (177): [<ffffffc080010170>] __do_softirq+0x1c/0x28
[    4.311559] ---[ end trace 0000000000000000 ]---

This commit adds the missing locking.

Fixes: 25dcb5dd7b ("dmaengine: ti: New driver for K3 UDMA")
Cc: Peter Ujfalusi <peter.ujfalusi@gmail.com>
Cc: Vignesh Raghavendra <vigneshr@ti.com>
Cc: Vinod Koul <vkoul@kernel.org>
Cc: dmaengine@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Ronald Wahl <ronald.wahl@legrand.com>
Acked-by: Peter Ujfalusi <peter.ujfalusi@gmail.com>
Link: https://lore.kernel.org/r/20250414173113.80677-1-rwahl@gmx.de
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-04-17 20:36:34 +05:30
Yemike Abhilash Chandra
8ca9590c39 dmaengine: ti: k3-udma: Use cap_mask directly from dma_device structure instead of a local copy
Currently, a local dma_cap_mask_t variable is used to store device
cap_mask within udma_of_xlate(). However, the DMA_PRIVATE flag in
the device cap_mask can get cleared when the last channel is released.
This can happen right after storing the cap_mask locally in
udma_of_xlate(), and subsequent dma_request_channel() can fail due to
mismatch in the cap_mask. Fix this by removing the local dma_cap_mask_t
variable and directly using the one from the dma_device structure.

Fixes: 25dcb5dd7b ("dmaengine: ti: New driver for K3 UDMA")
Cc: stable@vger.kernel.org
Signed-off-by: Vaishnav Achath <vaishnav.a@ti.com>
Acked-by: Peter Ujfalusi <peter.ujfalusi@gmail.com>
Reviewed-by: Udit Kumar <u-kumar1@ti.com>
Signed-off-by: Yemike Abhilash Chandra <y-abhilashchandra@ti.com>
Link: https://lore.kernel.org/r/20250417075521.623651-1-y-abhilashchandra@ti.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-04-17 20:33:34 +05:30
Nathan Lynch
df180e6530 dmaengine: Revert "dmaengine: dmatest: Fix dmatest waiting less when interrupted"
Several issues with this change:

* The analysis is flawed and it's unclear what problem is being
  fixed. There is no difference between wait_event_freezable_timeout()
  and wait_event_timeout() with respect to device interrupts. And of
  course "the interrupt notifying the finish of an operation happens
  during wait_event_freezable_timeout()" -- that's how it's supposed
  to work.

* The link at the "Closes:" tag appears to be an unrelated
  use-after-free in idxd.

* It introduces a regression: dmatest threads are meant to be
  freezable and this change breaks that.

See discussion here:
https://lore.kernel.org/dmaengine/878qpa13fe.fsf@AUSNATLYNCH.amd.com/

Fixes: e87ca16e99 ("dmaengine: dmatest: Fix dmatest waiting less when interrupted")
Signed-off-by: Nathan Lynch <nathan.lynch@amd.com>
Link: https://lore.kernel.org/r/20250403-dmaengine-dmatest-revert-waiting-less-v1-1-8227c5a3d7c8@amd.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-04-17 15:06:02 +05:30
Tobias Brunner
e3fd057776 xfrm: Fix UDP GRO handling for some corner cases
This fixes an issue that's caused if there is a mismatch between the data
offset in the GRO header and the length fields in the regular sk_buff due
to the pskb_pull()/skb_push() calls.  That's because the UDP GRO layer
stripped off the UDP header via skb_gro_pull() already while the UDP
header was explicitly not pulled/pushed in this function.

For example, an IKE packet that triggered this had len=data_len=1268 and
the data_offset in the GRO header was 28 (IPv4 + UDP).  So pskb_pull()
was called with an offset of 28-8=20, which reduced len to 1248 and via
pskb_may_pull() and __pskb_pull_tail() it also set data_len to 1248.
As the ESP offload module was not loaded, the function bailed out and
called skb_push(), which restored len to 1268, however, data_len remained
at 1248.

So while skb_headlen() was 0 before, it was now 20.  The latter caused a
difference of 8 instead of 28 (or 0 if pskb_pull()/skb_push() was called
with the complete GRO data_offset) in gro_try_pull_from_frag0() that
triggered a call to gro_pull_from_frag0() that corrupted the packet.

This change uses a more GRO-like approach seen in other GRO receivers
via skb_gro_header() to just read the actual data we are interested in
and does not try to "restore" the UDP header at this point to call the
existing function.  If the offload module is not loaded, it immediately
bails out, otherwise, it only does a quick check to see if the packet
is an IKE or keepalive packet instead of calling the existing function.

Fixes: 172bf009c1 ("xfrm: Support GRO for IPv4 ESP in UDP encapsulation")
Fixes: 221ddb723d ("xfrm: Support GRO for IPv6 ESP in UDP encapsulation")
Signed-off-by: Tobias Brunner <tobias@strongswan.org>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2025-04-17 11:08:16 +02:00
Purva Yeshi
97994333de dmaengine: idxd: cdev: Fix uninitialized use of sva in idxd_cdev_open
Fix Smatch-detected issue:
drivers/dma/idxd/cdev.c:321 idxd_cdev_open() error:
uninitialized symbol 'sva'.

'sva' pointer may be used uninitialized in error handling paths.
Specifically, if PASID support is enabled and iommu_sva_bind_device()
returns an error, the code jumps to the cleanup label and attempts to
call iommu_sva_unbind_device(sva) without ensuring that sva was
successfully assigned. This triggers a Smatch warning about an
uninitialized symbol.

Initialize sva to NULL at declaration and add a check using
IS_ERR_OR_NULL() before unbinding the device. This ensures the
function does not use an invalid or uninitialized pointer during
cleanup.

Signed-off-by: Purva Yeshi <purvayeshi550@gmail.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Link: https://lore.kernel.org/r/20250410110216.21592-1-purvayeshi550@gmail.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-04-17 12:51:55 +05:30
Matthias Schiffer
8772cc49e0 batman-adv: fix duplicate MAC address check
batadv_check_known_mac_addr() is both too lenient and too strict:

- It is called from batadv_hardif_add_interface(), which means that it
  checked interfaces that are not used for batman-adv at all. Move it
  to batadv_hardif_enable_interface(). Also, restrict it to hardifs of
  the same mesh interface; different mesh interfaces should not interact
  at all. The batadv_check_known_mac_addr() argument is changed from
  `struct net_device` to `struct batadv_hard_iface` to achieve this.
- The check only cares about hardifs in BATADV_IF_ACTIVE and
  BATADV_IF_TO_BE_ACTIVATED states, but interfaces in BATADV_IF_INACTIVE
  state should be checked as well, or the following steps will not
  result in a warning then they should:

  - Add two interfaces in down state with different MAC addresses to
    a mesh as hardifs
  - Change the MAC addresses so they conflict
  - Set interfaces to up state

  Now there will be two active hardifs with the same MAC address, but no
  warning. Fix by only ignoring hardifs in BATADV_IF_NOT_IN_USE state.

The RCU lock can be dropped, as we're holding RTNL anyways when the
function is called.

Fixes: c6c8fea297 ("net: Add batman-adv meshing protocol")
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2025-04-16 20:52:33 +02:00
Sabrina Dubroca
028363685b espintcp: remove encap socket caching to avoid reference leak
The current scheme for caching the encap socket can lead to reference
leaks when we try to delete the netns.

The reference chain is: xfrm_state -> enacp_sk -> netns

Since the encap socket is a userspace socket, it holds a reference on
the netns. If we delete the espintcp state (through flush or
individual delete) before removing the netns, the reference on the
socket is dropped and the netns is correctly deleted. Otherwise, the
netns may not be reachable anymore (if all processes within the ns
have terminated), so we cannot delete the xfrm state to drop its
reference on the socket.

This patch results in a small (~2% in my tests) performance
regression.

A GC-type mechanism could be added for the socket cache, to clear
references if the state hasn't been used "recently", but it's a lot
more complex than just not caching the socket.

Fixes: e27cca96cd ("xfrm: add espintcp (RFC 8229)")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2025-04-14 11:59:17 +02:00
Sabrina Dubroca
63c1f19a3b espintcp: fix skb leaks
A few error paths are missing a kfree_skb.

Fixes: e27cca96cd ("xfrm: add espintcp (RFC 8229)")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2025-04-14 11:58:50 +02:00
Ivaylo Ivanov
dd303e0219 soc: samsung: usi: prevent wrong bits inversion during unconfiguring
Instead of setting bit 1 (USI_OPTION_CLKSTOP_ON) during USI unconfiguring,
all the other bits in the USI_OPTION register get inverted, which
should not happen as that means the clock will keep getting provided
to the USI IP. Remove the unnecessary tilde.

Fixes: 11e77776b5 ("soc: samsung: usi: add a routine for unconfiguring the ip")
Signed-off-by: Ivaylo Ivanov <ivo.ivanov.ivanov1@gmail.com>
Link: https://lore.kernel.org/r/20250413163755.788907-1-ivo.ivanov.ivanov1@gmail.com
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
2025-04-14 08:51:18 +02:00
Krzysztof Kozlowski
4551383e78 iio: imu: st_lsm6dsx: Fix wakeup source leaks on device unbind
Device can be unbound, so driver must also release memory for the wakeup
source.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Acked-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20250406-b4-device-wakeup-leak-iio-v1-3-2d7d322a4a93@linaro.org
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-04-12 12:24:00 +01:00
Krzysztof Kozlowski
ad3764b45c iio: adc: qcom-spmi-iadc: Fix wakeup source leaks on device unbind
Device can be unbound, so driver must also release memory for the wakeup
source.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://patch.msgid.link/20250406-b4-device-wakeup-leak-iio-v1-2-2d7d322a4a93@linaro.org
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-04-12 12:24:00 +01:00
Krzysztof Kozlowski
0cd34d98df iio: accel: fxls8962af: Fix wakeup source leaks on device unbind
Device can be unbound, so driver must also release memory for the wakeup
source.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://patch.msgid.link/20250406-b4-device-wakeup-leak-iio-v1-1-2d7d322a4a93@linaro.org
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-04-12 12:23:53 +01:00
Dan Carpenter
9cf118aafd phy: rockchip-samsung-dcphy: Add missing assignment
The "ret = " was accidentally dropped so the error handling doesn't work.

Fixes: b2a1a2ae78 ("phy: rockchip: Add Samsung MIPI D-/C-PHY driver")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Link: https://lore.kernel.org/r/e64265a4-9543-4728-a49f-ea910fccef7c@stanley.mountain
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-04-11 17:16:22 +05:30
Geert Uytterhoeven
e153fdea9d phy: can-transceiver: Re-instate "mux-states" property presence check
On the Renesas Gray Hawk Single development board:

    can-transceiver-phy can-phy0: /can-phy0: failed to get mux-state (0)

"mux-states" is an optional property for CAN transceivers.  However,
mux_get() always prints an error message in case of an error, including
when the property is not present, confusing the user.

Fix this by re-instating the property presence check (this time using
the proper API) in a wrapper around devm_mux_state_get().  When the
multiplexer subsystem gains support for optional muxes, the wrapper can
just be removed.

In addition, propagate all real errors upstream, instead of ignoring
them.

Fixes: d02dfd4ceb ("phy: can-transceiver: Drop unnecessary "mux-states" property presence check")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Link: https://lore.kernel.org/r/3d7e0d723908284e8cf06ad1f7950c03173178f3.1742483710.git.geert+renesas@glider.be
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-04-11 17:00:15 +05:30
Nitin Rawat
d784552e76 phy: qcom-qmp-ufs: check for mode type for phy setting
Generally all target supports Rate B but for very few like SM8550,
two sets of UFS PHY settings are provided, one set is to support
HS-G5 Rate A and another set is to support HS-G4 and lower
gears with Rate B.

Commit b02cc9a17679("phy: qcom-qmp-ufs: Add PHY Configuration support
for sm8750") apply Rate B setting for SM8550 gear 5 without checking
for mode value (Rate A or Rate B) from Controller driver which caused
issue as SM8550 support rate A for Gear 5.

Fix this by adding mode check before applying Rat B phy setting.

Fixes: b02cc9a176 ("phy: qcom-qmp-ufs: Add PHY Configuration support for sm8750")
Reported-by: Neil Armstrong <neil.armstrong@linaro.org>
Closes: https://lore.kernel.org/all/430ed11c-0490-45be-897b-27cad9682371@quicinc.com/
Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD
Signed-off-by: Nitin Rawat <quic_nitirawa@quicinc.com>
Link: https://lore.kernel.org/r/20250407121008.22230-1-quic_nitirawa@quicinc.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-04-11 12:42:26 +05:30
Wayne Chang
b47158fb42 phy: tegra: xusb: Use a bitmask for UTMI pad power state tracking
The current implementation uses bias_pad_enable as a reference count to
manage the shared bias pad for all UTMI PHYs. However, during system
suspension with connected USB devices, multiple power-down requests for
the UTMI pad result in a mismatch in the reference count, which in turn
produces warnings such as:

[  237.762967] WARNING: CPU: 10 PID: 1618 at tegra186_utmi_pad_power_down+0x160/0x170
[  237.763103] Call trace:
[  237.763104]  tegra186_utmi_pad_power_down+0x160/0x170
[  237.763107]  tegra186_utmi_phy_power_off+0x10/0x30
[  237.763110]  phy_power_off+0x48/0x100
[  237.763113]  tegra_xusb_enter_elpg+0x204/0x500
[  237.763119]  tegra_xusb_suspend+0x48/0x140
[  237.763122]  platform_pm_suspend+0x2c/0xb0
[  237.763125]  dpm_run_callback.isra.0+0x20/0xa0
[  237.763127]  __device_suspend+0x118/0x330
[  237.763129]  dpm_suspend+0x10c/0x1f0
[  237.763130]  dpm_suspend_start+0x88/0xb0
[  237.763132]  suspend_devices_and_enter+0x120/0x500
[  237.763135]  pm_suspend+0x1ec/0x270

The root cause was traced back to the dynamic power-down changes
introduced in commit a30951d31b ("xhci: tegra: USB2 pad power controls"),
where the UTMI pad was being powered down without verifying its current
state. This unbalanced behavior led to discrepancies in the reference
count.

To rectify this issue, this patch replaces the single reference counter
with a bitmask, renamed to utmi_pad_enabled. Each bit in the mask
corresponds to one of the four USB2 PHYs, allowing us to track each pad's
enablement status individually.

With this change:
  - The bias pad is powered on only when the mask is clear.
  - Each UTMI pad is powered on or down based on its corresponding bit
    in the mask, preventing redundant operations.
  - The overall power state of the shared bias pad is maintained
    correctly during suspend/resume cycles.

The mutex used to prevent race conditions during UTMI pad enable/disable
operations has been moved from the tegra186_utmi_bias_pad_power_on/off
functions to the parent functions tegra186_utmi_pad_power_on/down. This
change ensures that there are no race conditions when updating the bitmask.

Cc: stable@vger.kernel.org
Fixes: a30951d31b ("xhci: tegra: USB2 pad power controls")
Signed-off-by: Wayne Chang <waynec@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Link: https://lore.kernel.org/r/20250408030905.990474-1-waynec@nvidia.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-04-11 12:34:46 +05:30
Rob Herring (Arm)
0c56228119 arm64: dts: morello: Fix-up cache nodes
There's no need include the CPU number in the L2 cache node names as
the names are local to the CPU nodes. The documented node name is
also just "l2-cache".

The L3 cache is not part of cpu@0/l2-cache as it is shared among all
cores. Move it to /cpus node which is the typical place for shared
caches.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20250403-dt-cpu-schema-v1-3-076be7171a85@kernel.org>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2025-04-10 14:04:03 +01:00
Rob Herring (Arm)
6833cbdc73 arm64: dts: rockchip: Fix mmc-pwrseq clock name on rock-pi-4
The defined name for "mmc-pwrseq-simple" clock is "ext_clock".

Signed-off-by: "Rob Herring (Arm)" <robh@kernel.org>
Link: https://lore.kernel.org/r/20250409205040.1522754-1-robh@kernel.org
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2025-04-10 10:58:17 +02:00
Rob Herring (Arm)
c8c59bcac9 arm64: dts: rockchip: Use "regulator-fixed" for btreg on px30-engicam for vcc3v3-btreg
The vcc3v3-btreg regulator only has 1 state and no state gpios defined,
so "regulator-gpio" is not the correct binding to use. "regulator-fixed"
is the correct binding to use. It supports an enable GPIO which is
needed in this case.

Signed-off-by: "Rob Herring (Arm)" <robh@kernel.org>
Link: https://lore.kernel.org/r/20250409205047.1522943-1-robh@kernel.org
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2025-04-10 10:58:11 +02:00
Charles Keepax
fd15594ba7 soundwire: bus: Fix race on the creation of the IRQ domain
The SoundWire IRQ domain needs to be created before any slaves are added
to the bus, such that the domain is always available when needed. Move
the call to sdw_irq_create() before the calls to sdw_acpi_find_slaves()
and sdw_of_find_slaves().

Fixes: 12a95123bf ("soundwire: bus: Allow SoundWire peripherals to register IRQ handlers")
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20250409122239.1396489-1-ckeepax@opensource.cirrus.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-04-10 11:58:43 +05:30
Sudeep Holla
4567bdaaaa firmware: arm_ffa: Skip Rx buffer ownership release if not acquired
Completion of the FFA_PARTITION_INFO_GET ABI transfers the ownership of
the caller’s Rx buffer from the producer(typically partition mnager) to
the consumer(this driver/OS). FFA_RX_RELEASE transfers the ownership
from the consumer back to the producer.

However, when we set the flag to just return the count of partitions
deployed in the system corresponding to the specified UUID while
invoking FFA_PARTITION_INFO_GET, the Rx buffer ownership shouldn't be
transferred to this driver. We must be able to skip transferring back
the ownership to the partition manager when we request just to get the
count of the partitions as the buffers are not acquired in this case.

Firmware may return FFA_RET_DENIED or other error for the ffa_rx_release()
in such cases.

Fixes: bb1be74985 ("firmware: arm_ffa: Add v1.1 get_partition_info support")
Message-Id: <20250321115700.3525197-1-sudeep.holla@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2025-04-08 10:32:15 +01:00
Cristian Marussi
c23c03bf1f firmware: arm_scmi: Fix timeout checks on polling path
Polling mode transactions wait for a reply busy-looping without holding a
spinlock, but currently the timeout checks are based only on elapsed time:
as a result we could hit a false positive whenever our busy-looping thread
is pre-empted and scheduled out for a time greater than the polling
timeout.

Change the checks at the end of the busy-loop to make sure that the polling
wasn't indeed successful or an out-of-order reply caused the polling to be
forcibly terminated.

Fixes: 31d2f803c1 ("firmware: arm_scmi: Add sync_cmds_completed_on_ret transport flag")
Reported-by: Huangjie <huangjie1663@phytium.com.cn>
Closes: https://lore.kernel.org/arm-scmi/20250123083323.2363749-1-jackhuang021@gmail.com/
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Cc: stable@vger.kernel.org # 5.18.x
Message-Id: <20250310175800.1444293-1-cristian.marussi@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2025-04-08 10:31:59 +01:00
Cristian Marussi
9ca67840c0 firmware: arm_scmi: Balance device refcount when destroying devices
Using device_find_child() to lookup the proper SCMI device to destroy
causes an unbalance in device refcount, since device_find_child() calls an
implicit get_device(): this, in turns, inhibits the call of the provided
release methods upon devices destruction.

As a consequence, one of the structures that is not freed properly upon
destruction is the internal struct device_private dev->p populated by the
drivers subsystem core.

KMemleak detects this situation since loading/unloding some SCMI driver
causes related devices to be created/destroyed without calling any
device_release method.

unreferenced object 0xffff00000f583800 (size 512):
  comm "insmod", pid 227, jiffies 4294912190
  hex dump (first 32 bytes):
    00 00 00 00 ad 4e ad de ff ff ff ff 00 00 00 00  .....N..........
    ff ff ff ff ff ff ff ff 60 36 1d 8a 00 80 ff ff  ........`6......
  backtrace (crc 114e2eed):
    kmemleak_alloc+0xbc/0xd8
    __kmalloc_cache_noprof+0x2dc/0x398
    device_add+0x954/0x12d0
    device_register+0x28/0x40
    __scmi_device_create.part.0+0x1bc/0x380
    scmi_device_create+0x2d0/0x390
    scmi_create_protocol_devices+0x74/0xf8
    scmi_device_request_notifier+0x1f8/0x2a8
    notifier_call_chain+0x110/0x3b0
    blocking_notifier_call_chain+0x70/0xb0
    scmi_driver_register+0x350/0x7f0
    0xffff80000a3b3038
    do_one_initcall+0x12c/0x730
    do_init_module+0x1dc/0x640
    load_module+0x4b20/0x5b70
    init_module_from_file+0xec/0x158

$ ./scripts/faddr2line ./vmlinux device_add+0x954/0x12d0
device_add+0x954/0x12d0:
kmalloc_noprof at include/linux/slab.h:901
(inlined by) kzalloc_noprof at include/linux/slab.h:1037
(inlined by) device_private_init at drivers/base/core.c:3510
(inlined by) device_add at drivers/base/core.c:3561

Balance device refcount by issuing a put_device() on devices found via
device_find_child().

Reported-by: Alice Ryhl <aliceryhl@google.com>
Closes: https://lore.kernel.org/linux-arm-kernel/Z8nK3uFkspy61yjP@arm.com/T/#mc1f73a0ea5e41014fa145147b7b839fc988ada8f
CC: Sudeep Holla <sudeep.holla@arm.com>
CC: Catalin Marinas <catalin.marinas@arm.com>
Fixes: d4f9dddd21 ("firmware: arm_scmi: Add dynamic scmi devices creation")
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Tested-by: Alice Ryhl <aliceryhl@google.com>
Message-Id: <20250306185447.2039336-1-cristian.marussi@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2025-04-08 10:31:59 +01:00
David Lechner
2d7b60f33d iio: adc: ad7380: fix event threshold shift
Add required bit shift to the event threshold read function to get
correct scaling.

When alert support was added, the write function correctly included the
required shift needed to convert the threshold register value to the
same scale as the raw ADC value. However, the shift got missed in the
read function.

Fixes: 27d1a4dbe1 ("iio: adc: ad7380: add alert support")
Signed-off-by: David Lechner <dlechner@baylibre.com>
Reviewed-by: Julien Stephan <jstephan@baylibre.com>
Link: https://patch.msgid.link/20250402-iio-adc-ad7380-fix-event-threshold-shift-v1-1-ad4975c296b2@baylibre.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-04-07 19:32:50 +01:00
Zhang Lixu
79dabbd505 iio: hid-sensor-prox: Fix incorrect OFFSET calculation
The OFFSET calculation in the prox_read_raw() was incorrectly using the
unit exponent, which is intended for SCALE calculations.

Remove the incorrect OFFSET calculation and set it to a fixed value of 0.

Cc: stable@vger.kernel.org
Fixes: 39a3a0138f ("iio: hid-sensors: Added Proximity Sensor Driver")
Signed-off-by: Zhang Lixu <lixu.zhang@intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/20250331055022.1149736-4-lixu.zhang@intel.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-04-07 19:32:50 +01:00
Zhang Lixu
8b518cdb03 iio: hid-sensor-prox: support multi-channel SCALE calculation
With the introduction of multi-channel support in commit 596ef5cf65
("iio: hid-sensor-prox: Add support for more channels"), each channel
requires an independent SCALE calculation, but the existing code only
calculates SCALE for a single channel.

Addresses the problem by modifying the driver to perform independent
SCALE calculations for each channel.

Cc: stable@vger.kernel.org
Fixes: 596ef5cf65 ("iio: hid-sensor-prox: Add support for more channels")
Signed-off-by: Zhang Lixu <lixu.zhang@intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/20250331055022.1149736-3-lixu.zhang@intel.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-04-07 19:32:49 +01:00
Zhang Lixu
83ded7cfac iio: hid-sensor-prox: Restore lost scale assignments
The variables `scale_pre_decml`, `scale_post_decml`, and `scale_precision`
were assigned in commit d68c592e02 ("iio: hid-sensor-prox: Fix scale not
correct issue"), but due to a merge conflict in
commit 9c15db92a8 ("Merge tag 'iio-for-5.13a' of
https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-next"),
these assignments were lost.

Add back lost assignments and replace `st->prox_attr` with
`st->prox_attr[0]` because commit 596ef5cf65 ("iio: hid-sensor-prox: Add
support for more channels") changed `prox_attr` to an array.

Cc: stable@vger.kernel.org # 5.13+
Fixes: 9c15db92a8 ("Merge tag 'iio-for-5.13a' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-next")
Signed-off-by: Zhang Lixu <lixu.zhang@intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/20250331055022.1149736-2-lixu.zhang@intel.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-04-07 19:32:49 +01:00
David Lechner
5257d80e22 iio: adc: ad7606: check for NULL before calling sw_mode_config()
Check that the sw_mode_config function pointer is not NULL before
calling it. Not all buses define this callback, which resulted in a NULL
pointer dereference.

Fixes: e571c19021 ("iio: adc: ad7606: move scale_setup as function pointer on chip-info")
Reviewed-by: Nuno Sá <nuno.sa@analog.com>
Signed-off-by: David Lechner <dlechner@baylibre.com>
Link: https://patch.msgid.link/20250318-iio-adc-ad7606-improvements-v2-1-4b605427774c@baylibre.com
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-04-07 19:32:49 +01:00
Luca Ceresoli
f063a28002 iio: light: opt3001: fix deadlock due to concurrent flag access
The threaded IRQ function in this driver is reading the flag twice: once to
lock a mutex and once to unlock it. Even though the code setting the flag
is designed to prevent it, there are subtle cases where the flag could be
true at the mutex_lock stage and false at the mutex_unlock stage. This
results in the mutex not being unlocked, resulting in a deadlock.

Fix it by making the opt3001_irq() code generally more robust, reading the
flag into a variable and using the variable value at both stages.

Fixes: 94a9b7b180 ("iio: light: add support for TI's opt3001 light sensor")
Cc: stable@vger.kernel.org
Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Link: https://patch.msgid.link/20250321-opt3001-irq-fix-v1-1-6c520d851562@bootlin.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-04-07 19:32:49 +01:00
David Lechner
82c51ac740 iio: adc: ad7380: disable offload before using SPI bus
Move disabling of the SPI offload before attempting to use the SPI bus
to write a register in ad7380_offload_buffer_predisable().

This caused a crash in the spi_engine_irq() interrupt handler due to
being in an invalid state.

Fixes: bbeaec81a0 ("iio: ad7380: add support for SPI offload")
Signed-off-by: David Lechner <dlechner@baylibre.com>
Reviewed-by: Nuno Sá <nuno.sa@analog.com>
Reviewed-by: Angelo Dureghello <adureghello@baylibre.com>
Link: https://patch.msgid.link/20250320-iio-adc-ad7380-fix-spi-offload-buffer-predisable-v1-1-6912ac8c0ae0@baylibre.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-04-07 19:32:49 +01:00
Simon Xue
839f81de39 iio: adc: rockchip: Fix clock initialization sequence
clock_set_rate should be executed after devm_clk_get_enabled.

Fixes: 97ad10bb29 ("iio: adc: rockchip_saradc: Make use of devm_clk_get_enabled")
Signed-off-by: Simon Xue <xxm@rock-chips.com>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patch.msgid.link/20250312062016.137821-1-xxm@rock-chips.com
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-04-07 19:32:49 +01:00
Silvano Seva
8114ef86e2 iio: imu: st_lsm6dsx: fix possible lockup in st_lsm6dsx_read_tagged_fifo
Prevent st_lsm6dsx_read_tagged_fifo from falling in an infinite loop in
case pattern_len is equal to zero and the device FIFO is not empty.

Fixes: 801a6e0af0 ("iio: imu: st_lsm6dsx: add support to LSM6DSO")
Signed-off-by: Silvano Seva <s.seva@4sigma.it>
Acked-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20250311085030.3593-4-s.seva@4sigma.it
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-04-07 19:32:49 +01:00
Silvano Seva
159ca7f181 iio: imu: st_lsm6dsx: fix possible lockup in st_lsm6dsx_read_fifo
Prevent st_lsm6dsx_read_fifo from falling in an infinite loop in case
pattern_len is equal to zero and the device FIFO is not empty.

Fixes: 290a6ce11d ("iio: imu: add support to lsm6dsx driver")
Signed-off-by: Silvano Seva <s.seva@4sigma.it>
Acked-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20250311085030.3593-2-s.seva@4sigma.it
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-04-07 19:32:49 +01:00
Lothar Rubusch
38f67d0264 iio: accel: adxl367: fix setting odr for activity time update
Fix setting the odr value to update activity time based on frequency
derrived by recent odr, and not by obsolete odr value.

The [small] bug: When _adxl367_set_odr() is called with a new odr value,
it first writes the new odr value to the hardware register
ADXL367_REG_FILTER_CTL.
Second, it calls _adxl367_set_act_time_ms(), which calls
adxl367_time_ms_to_samples(). Here st->odr still holds the old odr value.
This st->odr member is used to derrive a frequency value, which is
applied to update ADXL367_REG_TIME_ACT. Hence, the idea is to update
activity time, based on possibilities and power consumption by the
current ODR rate.
Finally, when the function calls return, again in _adxl367_set_odr() the
new ODR is assigned to st->odr.

The fix: When setting a new ODR value is set to ADXL367_REG_FILTER_CTL,
also ADXL367_REG_TIME_ACT should probably be updated with a frequency
based on the recent ODR value and not the old one. Changing the location
of the assignment to st->odr fixes this.

Fixes: cbab791c5e ("iio: accel: add ADXL367 driver")
Signed-off-by: Lothar Rubusch <l.rubusch@gmail.com>
Reviewed-by: Marcelo Schmitt <marcelo.schmitt1@gmail.com>
Link: https://patch.msgid.link/20250309193515.2974-1-l.rubusch@gmail.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-04-07 19:32:49 +01:00
Gustavo Silva
6d03811d7a iio: imu: bmi270: fix initial sampling frequency configuration
In the bmi270_configure_imu() function, the accelerometer and gyroscope
configuration registers are incorrectly written with the mask
BMI270_PWR_CONF_ADV_PWR_SAVE_MSK, which is unrelated to these registers.

As a result, the accelerometer's sampling frequency is set to 200 Hz
instead of the intended 100 Hz.

Remove the mask to ensure the correct bits are set in the configuration
registers.

Fixes: 3ea51548d6 ("iio: imu: Add i2c driver for bmi270 imu")
Signed-off-by: Gustavo Silva <gustavograzs@gmail.com>
Reviewed-by: Alex Lanzano <lanzano.alex@gmail.com>
Link: https://patch.msgid.link/20250304-bmi270-odr-fix-v1-1-384dbcd699fb@gmail.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-04-07 19:32:48 +01:00
Uwe Kleine-König
7fc027e894 arm64: dts: rockchip: Add pinmuxing for eMMC on QNAP TS433
Until now the emmc worked when booting because the bootrom set up the
pin config correctly to load the initial bootloader from it.

So when the kernel started it "just" reused this setup but never made
sure it was actually correct.

This then breaks when the system is started via some other means, like
downloading the initial bootloader via the bootrom usb download.

With actual emmc pin-config added, barebox is able to access the eMMC
even when booted via USB.

Fixes: 9da1c0327d ("arm64: dts: rockchip: Add basic support for QNAP TS-433")
Signed-off-by: Uwe Kleine-König <uwe@kleine-koenig.org>
[refined commit message to explain that we're currently just running
 on bootom-goodwill]
Link: https://lore.kernel.org/r/20250319113138.125192-2-uwe@kleine-koenig.org
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2025-04-07 09:15:00 +02:00
Dragan Simic
e0bd7ecf6b arm64: dts: rockchip: Remove overdrive-mode OPPs from RK3588J SoC dtsi
The differences in the vendor-approved CPU and GPU OPPs for the standard
Rockchip RK3588 variant [1] and the industrial Rockchip RK3588J variant [2]
come from the latter, presumably, supporting an extended temperature range
that's usually associated with industrial applications, despite the two SoC
variant datasheets specifying the same upper limit for the allowed ambient
temperature for both variants.  However, the lower temperature limit is
specified much lower for the RK3588J variant. [1][2]

To be on the safe side and to ensure maximum longevity of the RK3588J SoCs,
only the CPU and GPU OPPs that are declared by the vendor to be always safe
for this SoC variant may be provided.  As explained by the vendor [3] and
according to the RK3588J datasheet, [2] higher-frequency/higher-voltage
CPU and GPU OPPs can be used as well, but at the risk of reducing the SoC
lifetime expectancy.  Presumably, using the higher OPPs may be safe only
when not enjoying the assumed extended temperature range that the RK3588J,
as an SoC variant targeted specifically at higher-temperature, industrial
applications, is made (or binned) for.

Anyone able to keep their RK3588J-based board outside the above-presumed
extended temperature range at all times, and willing to take the associated
risk of possibly reducing the SoC lifetime expectancy, is free to apply
a DT overlay that adds the higher CPU and GPU OPPs.

With all this and the downstream RK3588(J) DT definitions [4][5] in mind,
let's delete the RK3588J CPU and GPU OPPs that are not considered belonging
to the normal operation mode for this SoC variant.  To quote the RK3588J
datasheet [2], "normal mode means the chipset works under safety voltage
and frequency;  for the industrial environment, highly recommend to keep in
normal mode, the lifetime is reasonably guaranteed", while "overdrive mode
brings higher frequency, and the voltage will increase accordingly;  under
the overdrive mode for a long time, the chipset may shorten the lifetime,
especially in high-temperature condition".

To sum the RK3588J datasheet [2] and the vendor-provided DTs up, [4][5]
the maximum allowed CPU core, GPU and NPU frequencies are as follows:

   IP core    | Normal mode | Overdrive mode
  ------------+-------------+----------------
   Cortex-A55 |   1,296 MHz |      1,704 MHz
   Cortex-A76 |   1,608 MHz |      2,016 MHz
   GPU        |     700 MHz |        850 MHz
   NPU        |     800 MHz |        950 MHz

Unfortunately, when it comes to the actual voltages for the RK3588J CPU and
GPU OPPs, there's a discrepancy between the RK3588J datasheet [2] and the
downstream kernel code. [4][5]  The RK3588J datasheet states that "the max.
working voltage of CPU/GPU/NPU is 0.75 V under the normal mode", while the
downstream kernel code actually allows voltage ranges that go up to 0.95 V,
which is still within the voltage range allowed by the datasheet.  However,
the RK3588J datasheet also tells us to "strictly refer to the software
configuration of SDK and the hardware reference design", so let's embrace
the voltage ranges provided by the downstream kernel code, which also
prevents the undesirable theoretical outcome of ending up with no usable
OPPs on a particular board, as a result of the board's voltage regulator(s)
being unable to deliver the exact voltages, for whatever reason.

The above-described voltage ranges for the RK3588J CPU OPPs remain taken
from the downstream kernel code [4][5] by picking the highest, worst-bin
values, which ensure that all RK3588J bins will work reliably.  Yes, with
some power inevitably wasted as unnecessarily generated heat, but the
reliability is paramount, together with the longevity.  This deficiency
may be revisited separately at some point in the future.

The provided RK3588J CPU OPPs follow the slightly debatable "provide only
the highest-frequency OPP from the same-voltage group" approach that's been
established earlier, [6] as a result of the "same-voltage, lower-frequency"
OPPs being considered inefficient from the IPA governor's standpoint, which
may also be revisited separately at some point in the future.

[1] https://wiki.friendlyelec.com/wiki/images/e/ee/Rockchip_RK3588_Datasheet_V1.6-20231016.pdf
[2] https://wmsc.lcsc.com/wmsc/upload/file/pdf/v2/lcsc/2403201054_Rockchip-RK3588J_C22364189.pdf
[3] https://lore.kernel.org/linux-rockchip/e55125ed-64fb-455e-b1e4-cebe2cf006e4@cherry.de/T/#u
[4] https://raw.githubusercontent.com/rockchip-linux/kernel/604cec4004abe5a96c734f2fab7b74809d2d742f/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
[5] https://raw.githubusercontent.com/rockchip-linux/kernel/604cec4004abe5a96c734f2fab7b74809d2d742f/arch/arm64/boot/dts/rockchip/rk3588j.dtsi
[6] https://lore.kernel.org/all/20240229-rk-dts-additions-v3-5-6afe8473a631@gmail.com/

Fixes: 667885a686 ("arm64: dts: rockchip: Add OPP data for CPU cores on RK3588j")
Fixes: a7b2070505 ("arm64: dts: rockchip: Split GPU OPPs of RK3588 and RK3588j")
Cc: stable@vger.kernel.org
Cc: Heiko Stuebner <heiko@sntech.de>
Cc: Alexey Charkov <alchark@gmail.com>
Helped-by: Quentin Schulz <quentin.schulz@cherry.de>
Reviewed-by: Quentin Schulz <quentin.schulz@cherry.de>
Signed-off-by: Dragan Simic <dsimic@manjaro.org>
Link: https://lore.kernel.org/r/eeec0d30d79b019d111b3f0aa2456e69896b2caa.1742813866.git.dsimic@manjaro.org
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2025-04-07 09:15:00 +02:00
Sam Edwards
fdc7bd909a arm64: dts: rockchip: Allow Turing RK1 cooling fan to spin down
The RK3588 thermal sensor driver only receives interrupts when a
higher-temperature threshold is crossed; it cannot notify when the
sensor cools back off. As a result, the driver must poll for temperature
changes to detect when the conditions for a thermal trip are no longer
met. However, it only does so if the DT enables polling.

Before this patch, the RK1 DT did not enable polling, causing the fan to
continue running at the speed corresponding to the highest temperature
reached.

Follow suit with similar RK3588 boards by setting a polling-delay of
1000ms, enabling the driver to detect when the sensor cools back off,
allowing the fan speed to decrease as appropriate.

Fixes: 7c8ec5e6b9 ("arm64: dts: rockchip: Enable automatic fan control on Turing RK1")
Cc: stable@kernel.org # v6.13+
Signed-off-by: Sam Edwards <CFSworks@gmail.com>
Reviewed-by: Dragan Simic <dsimic@manjaro.org>
Link: https://lore.kernel.org/r/20250329165017.3885-1-CFSworks@gmail.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2025-04-07 09:14:59 +02:00
Arnd Bergmann
d51adf038e media: cec: tda9950: add back i2c dependency
drivers/media/cec/i2c/tda9950.c: In function 'tda9950_write_range':
drivers/media/cec/i2c/tda9950.c:92:15: error: implicit declaration of function 'i2c_transfer' [-Wimplicit-function-declaration]
   92 |         ret = i2c_transfer(client->adapter, &msg, 1);
      |               ^~~~~~~~~~~~
drivers/media/cec/i2c/tda9950.c: In function 'tda9950_probe':
drivers/media/cec/i2c/tda9950.c:391:14: error: implicit declaration of function 'i2c_check_functionality' [-Wimplicit-function-declaration]
  391 |         if (!i2c_check_functionality(client->adapter, I2C_FUNC_I2C)) {
      |              ^~~~~~~~~~~~~~~~~~~~~~~

Fixes: caa6f4a75e ("media: cec: move driver for TDA9950 from drm/i2c")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
2025-04-07 08:12:12 +02:00
Hans Verkuil
118b34092e media: i2c: lt6911uxe: add two selects to Kconfig
In order to get the v4l2_subdev functions you need to select
MEDIA_CONTROLLER and VIDEO_V4L2_SUBDEV_API.

Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202503151002.HacBN2LO-lkp@intel.com/
Fixes: e49563c3be ("media: i2c: add lt6911uxe hdmi bridge driver")
2025-04-07 08:10:39 +02:00
Geert Uytterhoeven
0dce5b44bd media: platform: synopsys: VIDEO_SYNOPSYS_HDMIRX should depend on ARCH_ROCKCHIP
For now, the Synopsys HDMI HDMI RX Controller is only supported on
Rockchip RK3588 SoCs.  Hence add a dependency on ARCH_ROCKCHIP, to
prevent asking the user about this driver when configuring a kernel
without Rockchip SoC support.

Fixes: 7b59b132ad ("media: platform: synopsys: Add support for HDMI input driver")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Shreeya Patel <shreeya.patel@collabora.com>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
2025-04-07 08:10:39 +02:00
Arnd Bergmann
9df181c8de media: i2c: lt6911uxe: Fix Kconfig dependencies:
The new driver fails to build if I2C is disabled:

drivers/media/i2c/lt6911uxe.c:703:1: error: data definition has no type or storage class [-Werror]
  703 | module_i2c_driver(lt6911uxe_i2c_driver);

or if I2C is on but V4L2_CCI_I2C is not:

ERROR: modpost: "cci_write" [drivers/media/i2c/lt6911uxe.ko] undefined!
ERROR: modpost: "cci_read" [drivers/media/i2c/lt6911uxe.ko] undefined!

For both by adding a dependency on I2C and selecting V4L2_CCI_I2C, which
follows the common practice for these.

Fixes: e49563c3be ("media: i2c: add lt6911uxe hdmi bridge driver")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
2025-04-07 08:10:39 +02:00
Arnd Bergmann
549d899444 media: vivid: fix FB dependency
It's not enough to have a dependency on CONFIG_FB, as that can be in
a loadable module when vivid itself is builtin:

drivers/media/test-drivers/vivid/vivid-osd.o: in function `vivid_fb_init':
vivid-osd.c:(.text+0xdc0): undefined reference to `fb_alloc_cmap'
vivid-osd.c:(.text+0xe26): undefined reference to `register_framebuffer'

Change the dependency to only allow configurations that can be built,
but change the FB to FB_CORE so this is also possible when using
DRM with FB compatibility rather than full fbdev.

Fixes: 20889ddede ("media: vivid: Introduce VIDEO_VIVID_OSD")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Ricardo Ribalda <ribalda@chromium.org>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
2025-04-07 08:10:38 +02:00
Tudor Ambarus
2908ffa53f firmware: exynos-acpm: check saved RX before bailing out on empty RX queue
When we're polling for responses and get a response that corresponds to
another request, we save the RX data in order to drain the RX queue.

If the response for the current request is not found in the request's
iteration of the queue, or if the queue is empty, we must check whether
the RX data was saved by a previous request when it drained the RX queue.

We failed to check for already saved responses when the queue was empty,
and requests could time out. Check saved RX before bailing out on empty
RX queue.

Fixes: a88927b534 ("firmware: add Exynos ACPM protocol driver")
Reported-by: André Draszik <andre.draszik@linaro.org>
Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
Reviewed-by: André Draszik <andre.draszik@linaro.org>
Tested-by: André Draszik <andre.draszik@linaro.org>
Link: https://lore.kernel.org/r/20250324-acpm-drained-rx-queue-v1-1-577774335151@linaro.org
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
2025-04-07 08:02:19 +02:00
737 changed files with 9285 additions and 3824 deletions

View File

@@ -7,5 +7,5 @@ check-private-items = true
disallowed-macros = [
# The `clippy::dbg_macro` lint only works with `std::dbg!`, thus we simulate
# it here, see: https://github.com/rust-lang/rust-clippy/issues/11303.
{ path = "kernel::dbg", reason = "the `dbg!` macro is intended as a debugging tool" },
{ path = "kernel::dbg", reason = "the `dbg!` macro is intended as a debugging tool", allow-invalid = true },
]

View File

@@ -102,6 +102,7 @@ Ard Biesheuvel <ardb@kernel.org> <ard.biesheuvel@linaro.org>
Arnaud Patard <arnaud.patard@rtp-net.org>
Arnd Bergmann <arnd@arndb.de>
Arun Kumar Neelakantam <quic_aneela@quicinc.com> <aneela@codeaurora.org>
Asahi Lina <lina+kernel@asahilina.net> <lina@asahilina.net>
Ashok Raj Nagarajan <quic_arnagara@quicinc.com> <arnagara@codeaurora.org>
Ashwin Chaugule <quic_ashwinc@quicinc.com> <ashwinc@codeaurora.org>
Asutosh Das <quic_asutoshd@quicinc.com> <asutoshd@codeaurora.org>
@@ -312,6 +313,7 @@ Jan Glauber <jan.glauber@gmail.com> <jglauber@cavium.com>
Jan Kuliga <jtkuliga.kdev@gmail.com> <jankul@alatek.krakow.pl>
Jarkko Sakkinen <jarkko@kernel.org> <jarkko.sakkinen@linux.intel.com>
Jarkko Sakkinen <jarkko@kernel.org> <jarkko@profian.com>
Jarkko Sakkinen <jarkko@kernel.org> <jarkko.sakkinen@opinsys.com>
Jason Gunthorpe <jgg@ziepe.ca> <jgg@mellanox.com>
Jason Gunthorpe <jgg@ziepe.ca> <jgg@nvidia.com>
Jason Gunthorpe <jgg@ziepe.ca> <jgunthorpe@obsidianresearch.com>
@@ -447,6 +449,8 @@ Luca Ceresoli <luca.ceresoli@bootlin.com> <luca@lucaceresoli.net>
Luca Weiss <luca@lucaweiss.eu> <luca@z3ntu.xyz>
Lukasz Luba <lukasz.luba@arm.com> <l.luba@partner.samsung.com>
Luo Jie <quic_luoj@quicinc.com> <luoj@codeaurora.org>
Lance Yang <lance.yang@linux.dev> <ioworker0@gmail.com>
Lance Yang <lance.yang@linux.dev> <mingzhe.yang@ly.com>
Maciej W. Rozycki <macro@mips.com> <macro@imgtec.com>
Maciej W. Rozycki <macro@orcam.me.uk> <macro@linux-mips.org>
Maharaja Kennadyrajan <quic_mkenna@quicinc.com> <mkenna@codeaurora.org>
@@ -483,6 +487,7 @@ Matthias Fuchs <socketcan@esd.eu> <matthias.fuchs@esd.eu>
Matthieu Baerts <matttbe@kernel.org> <matthieu.baerts@tessares.net>
Matthieu CASTET <castet.matthieu@free.fr>
Matti Vaittinen <mazziesaccount@gmail.com> <matti.vaittinen@fi.rohmeurope.com>
Mattijs Korpershoek <mkorpershoek@kernel.org> <mkorpershoek@baylibre.com>
Matt Ranostay <matt@ranostay.sg> <matt.ranostay@konsulko.com>
Matt Ranostay <matt@ranostay.sg> <matt@ranostay.consulting>
Matt Ranostay <matt@ranostay.sg> Matthew Ranostay <mranostay@embeddedalley.com>
@@ -749,6 +754,7 @@ Tvrtko Ursulin <tursulin@ursulin.net> <tvrtko@ursulin.net>
Tycho Andersen <tycho@tycho.pizza> <tycho@tycho.ws>
Tzung-Bi Shih <tzungbi@kernel.org> <tzungbi@google.com>
Uwe Kleine-König <ukleinek@informatik.uni-freiburg.de>
Uwe Kleine-König <u.kleine-koenig@baylibre.com> <ukleinek@baylibre.com>
Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Uwe Kleine-König <ukleinek@strlen.de>
Uwe Kleine-König <ukl@pengutronix.de>

View File

@@ -511,6 +511,7 @@ Description: information about CPUs heterogeneity.
What: /sys/devices/system/cpu/vulnerabilities
/sys/devices/system/cpu/vulnerabilities/gather_data_sampling
/sys/devices/system/cpu/vulnerabilities/indirect_target_selection
/sys/devices/system/cpu/vulnerabilities/itlb_multihit
/sys/devices/system/cpu/vulnerabilities/l1tf
/sys/devices/system/cpu/vulnerabilities/mds

View File

@@ -1,6 +1,6 @@
What: /sys/bus/hid/drivers/hid-appletb-kbd/<dev>/mode
Date: September, 2023
KernelVersion: 6.5
Date: March, 2025
KernelVersion: 6.15
Contact: linux-input@vger.kernel.org
Description:
The set of keys displayed on the Touch Bar.

View File

@@ -23,3 +23,4 @@ are configurable at compile, boot or run time.
gather_data_sampling
reg-file-data-sampling
rsb
indirect-target-selection

View File

@@ -0,0 +1,168 @@
.. SPDX-License-Identifier: GPL-2.0
Indirect Target Selection (ITS)
===============================
ITS is a vulnerability in some Intel CPUs that support Enhanced IBRS and were
released before Alder Lake. ITS may allow an attacker to control the prediction
of indirect branches and RETs located in the lower half of a cacheline.
ITS is assigned CVE-2024-28956 with a CVSS score of 4.7 (Medium).
Scope of Impact
---------------
- **eIBRS Guest/Host Isolation**: Indirect branches in KVM/kernel may still be
predicted with unintended target corresponding to a branch in the guest.
- **Intra-Mode BTI**: In-kernel training such as through cBPF or other native
gadgets.
- **Indirect Branch Prediction Barrier (IBPB)**: After an IBPB, indirect
branches may still be predicted with targets corresponding to direct branches
executed prior to the IBPB. This is fixed by the IPU 2025.1 microcode, which
should be available via distro updates. Alternatively microcode can be
obtained from Intel's github repository [#f1]_.
Affected CPUs
-------------
Below is the list of ITS affected CPUs [#f2]_ [#f3]_:
======================== ============ ==================== ===============
Common name Family_Model eIBRS Intra-mode BTI
Guest/Host Isolation
======================== ============ ==================== ===============
SKYLAKE_X (step >= 6) 06_55H Affected Affected
ICELAKE_X 06_6AH Not affected Affected
ICELAKE_D 06_6CH Not affected Affected
ICELAKE_L 06_7EH Not affected Affected
TIGERLAKE_L 06_8CH Not affected Affected
TIGERLAKE 06_8DH Not affected Affected
KABYLAKE_L (step >= 12) 06_8EH Affected Affected
KABYLAKE (step >= 13) 06_9EH Affected Affected
COMETLAKE 06_A5H Affected Affected
COMETLAKE_L 06_A6H Affected Affected
ROCKETLAKE 06_A7H Not affected Affected
======================== ============ ==================== ===============
- All affected CPUs enumerate Enhanced IBRS feature.
- IBPB isolation is affected on all ITS affected CPUs, and need a microcode
update for mitigation.
- None of the affected CPUs enumerate BHI_CTRL which was introduced in Golden
Cove (Alder Lake and Sapphire Rapids). This can help guests to determine the
host's affected status.
- Intel Atom CPUs are not affected by ITS.
Mitigation
----------
As only the indirect branches and RETs that have their last byte of instruction
in the lower half of the cacheline are vulnerable to ITS, the basic idea behind
the mitigation is to not allow indirect branches in the lower half.
This is achieved by relying on existing retpoline support in the kernel, and in
compilers. ITS-vulnerable retpoline sites are runtime patched to point to newly
added ITS-safe thunks. These safe thunks consists of indirect branch in the
second half of the cacheline. Not all retpoline sites are patched to thunks, if
a retpoline site is evaluated to be ITS-safe, it is replaced with an inline
indirect branch.
Dynamic thunks
~~~~~~~~~~~~~~
From a dynamically allocated pool of safe-thunks, each vulnerable site is
replaced with a new thunk, such that they get a unique address. This could
improve the branch prediction accuracy. Also, it is a defense-in-depth measure
against aliasing.
Note, for simplicity, indirect branches in eBPF programs are always replaced
with a jump to a static thunk in __x86_indirect_its_thunk_array. If required,
in future this can be changed to use dynamic thunks.
All vulnerable RETs are replaced with a static thunk, they do not use dynamic
thunks. This is because RETs get their prediction from RSB mostly that does not
depend on source address. RETs that underflow RSB may benefit from dynamic
thunks. But, RETs significantly outnumber indirect branches, and any benefit
from a unique source address could be outweighed by the increased icache
footprint and iTLB pressure.
Retpoline
~~~~~~~~~
Retpoline sequence also mitigates ITS-unsafe indirect branches. For this
reason, when retpoline is enabled, ITS mitigation only relocates the RETs to
safe thunks. Unless user requested the RSB-stuffing mitigation.
RSB Stuffing
~~~~~~~~~~~~
RSB-stuffing via Call Depth Tracking is a mitigation for Retbleed RSB-underflow
attacks. And it also mitigates RETs that are vulnerable to ITS.
Mitigation in guests
^^^^^^^^^^^^^^^^^^^^
All guests deploy ITS mitigation by default, irrespective of eIBRS enumeration
and Family/Model of the guest. This is because eIBRS feature could be hidden
from a guest. One exception to this is when a guest enumerates BHI_DIS_S, which
indicates that the guest is running on an unaffected host.
To prevent guests from unnecessarily deploying the mitigation on unaffected
platforms, Intel has defined ITS_NO bit(62) in MSR IA32_ARCH_CAPABILITIES. When
a guest sees this bit set, it should not enumerate the ITS bug. Note, this bit
is not set by any hardware, but is **intended for VMMs to synthesize** it for
guests as per the host's affected status.
Mitigation options
^^^^^^^^^^^^^^^^^^
The ITS mitigation can be controlled using the "indirect_target_selection"
kernel parameter. The available options are:
======== ===================================================================
on (default) Deploy the "Aligned branch/return thunks" mitigation.
If spectre_v2 mitigation enables retpoline, aligned-thunks are only
deployed for the affected RET instructions. Retpoline mitigates
indirect branches.
off Disable ITS mitigation.
vmexit Equivalent to "=on" if the CPU is affected by guest/host isolation
part of ITS. Otherwise, mitigation is not deployed. This option is
useful when host userspace is not in the threat model, and only
attacks from guest to host are considered.
stuff Deploy RSB-fill mitigation when retpoline is also deployed.
Otherwise, deploy the default mitigation. When retpoline mitigation
is enabled, RSB-stuffing via Call-Depth-Tracking also mitigates
ITS.
force Force the ITS bug and deploy the default mitigation.
======== ===================================================================
Sysfs reporting
---------------
The sysfs file showing ITS mitigation status is:
/sys/devices/system/cpu/vulnerabilities/indirect_target_selection
Note, microcode mitigation status is not reported in this file.
The possible values in this file are:
.. list-table::
* - Not affected
- The processor is not vulnerable.
* - Vulnerable
- System is vulnerable and no mitigation has been applied.
* - Vulnerable, KVM: Not affected
- System is vulnerable to intra-mode BTI, but not affected by eIBRS
guest/host isolation.
* - Mitigation: Aligned branch/return thunks
- The mitigation is enabled, affected indirect branches and RETs are
relocated to safe thunks.
* - Mitigation: Retpolines, Stuffing RSB
- The mitigation is enabled using retpoline and RSB stuffing.
References
----------
.. [#f1] Microcode repository - https://github.com/intel/Intel-Linux-Processor-Microcode-Data-Files
.. [#f2] Affected Processors list - https://www.intel.com/content/www/us/en/developer/topic-technology/software-security-guidance/processors-affected-consolidated-product-cpu-model.html
.. [#f3] Affected Processors list (machine readable) - https://github.com/intel/Intel-affected-processor-list

View File

@@ -2202,6 +2202,23 @@
different crypto accelerators. This option can be used
to achieve best performance for particular HW.
indirect_target_selection= [X86,Intel] Mitigation control for Indirect
Target Selection(ITS) bug in Intel CPUs. Updated
microcode is also required for a fix in IBPB.
on: Enable mitigation (default).
off: Disable mitigation.
force: Force the ITS bug and deploy default
mitigation.
vmexit: Only deploy mitigation if CPU is affected by
guest/host isolation part of ITS.
stuff: Deploy RSB-fill mitigation when retpoline is
also deployed. Otherwise, deploy the default
mitigation.
For details see:
Documentation/admin-guide/hw-vuln/indirect-target-selection.rst
init= [KNL]
Format: <full_path>
Run specified binary instead of /sbin/init as init
@@ -3693,6 +3710,7 @@
expose users to several CPU vulnerabilities.
Equivalent to: if nokaslr then kpti=0 [ARM64]
gather_data_sampling=off [X86]
indirect_target_selection=off [X86]
kvm.nx_huge_pages=off [X86]
l1tf=off [X86]
mds=off [X86]

View File

@@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml#
title: Mediatek's Keypad Controller
maintainers:
- Mattijs Korpershoek <mkorpershoek@baylibre.com>
- Mattijs Korpershoek <mkorpershoek@kernel.org>
allOf:
- $ref: /schemas/input/matrix-keymap.yaml#

View File

@@ -1,7 +1,7 @@
# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
%YAML 1.2
---
$id: http://devicetree.org/schemas/can/microchip,mcp2510.yaml#
$id: http://devicetree.org/schemas/net/can/microchip,mcp2510.yaml#
$schema: http://devicetree.org/meta-schemas/core.yaml#
title: Microchip MCP251X stand-alone CAN controller

View File

@@ -74,19 +74,17 @@ properties:
- rev-rmii
- moca
# RX and TX delays are added by the MAC when required
# RX and TX delays are provided by the PCB. See below
- rgmii
# RGMII with internal RX and TX delays provided by the PHY,
# the MAC should not add the RX or TX delays in this case
# RX and TX delays are not provided by the PCB. This is the most
# frequent case. See below
- rgmii-id
# RGMII with internal RX delay provided by the PHY, the MAC
# should not add an RX delay in this case
# TX delay is provided by the PCB. See below
- rgmii-rxid
# RGMII with internal TX delay provided by the PHY, the MAC
# should not add an TX delay in this case
# RX delay is provided by the PCB. See below
- rgmii-txid
- rtbi
- smii
@@ -286,4 +284,89 @@ allOf:
additionalProperties: true
# Informative
# ===========
#
# 'phy-modes' & 'phy-connection-type' properties 'rgmii', 'rgmii-id',
# 'rgmii-rxid', and 'rgmii-txid' are frequently used wrongly by
# developers. This informative section clarifies their usage.
#
# The RGMII specification requires a 2ns delay between the data and
# clock signals on the RGMII bus. How this delay is implemented is not
# specified.
#
# One option is to make the clock traces on the PCB longer than the
# data traces. A sufficiently difference in length can provide the 2ns
# delay. If both the RX and TX delays are implemented in this manner,
# 'rgmii' should be used, so indicating the PCB adds the delays.
#
# If the PCB does not add these delays via extra long traces,
# 'rgmii-id' should be used. Here, 'id' refers to 'internal delay',
# where either the MAC or PHY adds the delay.
#
# If only one of the two delays are implemented via extra long clock
# lines, either 'rgmii-rxid' or 'rgmii-txid' should be used,
# indicating the MAC or PHY should implement one of the delays
# internally, while the PCB implements the other delay.
#
# Device Tree describes hardware, and in this case, it describes the
# PCB between the MAC and the PHY, if the PCB implements delays or
# not.
#
# In practice, very few PCBs make use of extra long clock lines. Hence
# any RGMII phy mode other than 'rgmii-id' is probably wrong, and is
# unlikely to be accepted during review without details provided in
# the commit description and comments in the .dts file.
#
# When the PCB does not implement the delays, the MAC or PHY must. As
# such, this is software configuration, and so not described in Device
# Tree.
#
# The following describes how Linux implements the configuration of
# the MAC and PHY to add these delays when the PCB does not. As stated
# above, developers often get this wrong, and the aim of this section
# is reduce the frequency of these errors by Linux developers. Other
# users of the Device Tree may implement it differently, and still be
# consistent with both the normative and informative description
# above.
#
# By default in Linux, when using phylib/phylink, the MAC is expected
# to read the 'phy-mode' from Device Tree, not implement any delays,
# and pass the value to the PHY. The PHY will then implement delays as
# specified by the 'phy-mode'. The PHY should always be reconfigured
# to implement the needed delays, replacing any setting performed by
# strapping or the bootloader, etc.
#
# Experience to date is that all PHYs which implement RGMII also
# implement the ability to add or not add the needed delays. Hence
# this default is expected to work in all cases. Ignoring this default
# is likely to be questioned by Reviews, and require a strong argument
# to be accepted.
#
# There are a small number of cases where the MAC has hard coded
# delays which cannot be disabled. The 'phy-mode' only describes the
# PCB. The inability to disable the delays in the MAC does not change
# the meaning of 'phy-mode'. It does however mean that a 'phy-mode' of
# 'rgmii' is now invalid, it cannot be supported, since both the PCB
# and the MAC and PHY adding delays cannot result in a functional
# link. Thus the MAC should report a fatal error for any modes which
# cannot be supported. When the MAC implements the delay, it must
# ensure that the PHY does not also implement the same delay. So it
# must modify the phy-mode it passes to the PHY, removing the delay it
# has added. Failure to remove the delay will result in a
# non-functioning link.
#
# Sometimes there is a need to fine tune the delays. Often the MAC or
# PHY can perform this fine tuning. In the MAC node, the Device Tree
# properties 'rx-internal-delay-ps' and 'tx-internal-delay-ps' should
# be used to indicate fine tuning performed by the MAC. The values
# expected here are small. A value of 2000ps, i.e 2ns, and a phy-mode
# of 'rgmii' will not be accepted by Reviewers.
#
# If the PHY is to perform fine tuning, the properties
# 'rx-internal-delay-ps' and 'tx-internal-delay-ps' in the PHY node
# should be used. When the PHY is implementing delays, e.g. 'rgmii-id'
# these properties should have a value near to 2000ps. If the PCB is
# implementing delays, e.g. 'rgmii', a small value can be used to fine
# tune the delay added by the PCB.
...

View File

@@ -46,6 +46,21 @@ The kernel embeds the building user and host names in
`KBUILD_BUILD_USER and KBUILD_BUILD_HOST`_ variables. If you are
building from a git commit, you could use its committer address.
Absolute filenames
------------------
When the kernel is built out-of-tree, debug information may include
absolute filenames for the source files. This must be overridden by
including the ``-fdebug-prefix-map`` option in the `KCFLAGS`_ variable.
Depending on the compiler used, the ``__FILE__`` macro may also expand
to an absolute filename in an out-of-tree build. Kbuild automatically
uses the ``-fmacro-prefix-map`` option to prevent this, if it is
supported.
The Reproducible Builds web site has more information about these
`prefix-map options`_.
Generated files in source packages
----------------------------------
@@ -116,5 +131,7 @@ See ``scripts/setlocalversion`` for details.
.. _KBUILD_BUILD_TIMESTAMP: kbuild.html#kbuild-build-timestamp
.. _KBUILD_BUILD_USER and KBUILD_BUILD_HOST: kbuild.html#kbuild-build-user-kbuild-build-host
.. _KCFLAGS: kbuild.html#kcflags
.. _prefix-map options: https://reproducible-builds.org/docs/build-path/
.. _Reproducible Builds project: https://reproducible-builds.org/
.. _SOURCE_DATE_EPOCH: https://reproducible-builds.org/docs/source-date-epoch/

View File

@@ -2017,7 +2017,8 @@ attribute-sets:
attributes:
-
name: act
type: nest
type: indexed-array
sub-type: nest
nested-attributes: tc-act-attrs
-
name: police
@@ -2250,7 +2251,8 @@ attribute-sets:
attributes:
-
name: act
type: nest
type: indexed-array
sub-type: nest
nested-attributes: tc-act-attrs
-
name: police
@@ -2745,7 +2747,7 @@ attribute-sets:
type: u16
byte-order: big-endian
-
name: key-l2-tpv3-sid
name: key-l2tpv3-sid
type: u32
byte-order: big-endian
-
@@ -3504,7 +3506,7 @@ attribute-sets:
name: rate64
type: u64
-
name: prate4
name: prate64
type: u64
-
name: burst

View File

@@ -811,11 +811,9 @@ Documentation/devicetree/bindings/ptp/timestamper.txt for more details.
3.2.4 Other caveats for MAC drivers
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Stacked PHCs, especially DSA (but not only) - since that doesn't require any
modification to MAC drivers, so it is more difficult to ensure correctness of
all possible code paths - is that they uncover bugs which were impossible to
trigger before the existence of stacked PTP clocks. One example has to do with
this line of code, already presented earlier::
The use of stacked PHCs may uncover MAC driver bugs which were impossible to
trigger without them. One example has to do with this line of code, already
presented earlier::
skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;

View File

@@ -2519,6 +2519,7 @@ T: git git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux.git
F: arch/arm/boot/dts/nxp/imx/
F: arch/arm/boot/dts/nxp/mxs/
F: arch/arm64/boot/dts/freescale/
X: Documentation/devicetree/bindings/media/i2c/
X: arch/arm64/boot/dts/freescale/fsl-*
X: arch/arm64/boot/dts/freescale/qoriq-*
X: drivers/media/i2c/
@@ -8726,6 +8727,7 @@ M: Chao Yu <chao@kernel.org>
R: Yue Hu <zbestahu@gmail.com>
R: Jeffle Xu <jefflexu@linux.alibaba.com>
R: Sandeep Dhavale <dhavale@google.com>
R: Hongbo Li <lihongbo22@huawei.com>
L: linux-erofs@lists.ozlabs.org
S: Maintained
W: https://erofs.docs.kernel.org
@@ -10145,6 +10147,13 @@ F: drivers/gpio/gpio-regmap.c
F: include/linux/gpio/regmap.h
K: (devm_)?gpio_regmap_(un)?register
GPIO SLOPPY LOGIC ANALYZER
M: Wolfram Sang <wsa+renesas@sang-engineering.com>
S: Supported
F: Documentation/dev-tools/gpio-sloppy-logic-analyzer.rst
F: drivers/gpio/gpio-sloppy-logic-analyzer.c
F: tools/gpio/gpio-sloppy-logic-analyzer.sh
GPIO SUBSYSTEM
M: Linus Walleij <linus.walleij@linaro.org>
M: Bartosz Golaszewski <brgl@bgdev.pl>
@@ -11097,6 +11106,14 @@ L: linuxppc-dev@lists.ozlabs.org
S: Odd Fixes
F: drivers/tty/hvc/
HUNG TASK DETECTOR
M: Andrew Morton <akpm@linux-foundation.org>
R: Lance Yang <lance.yang@linux.dev>
L: linux-kernel@vger.kernel.org
S: Maintained
F: include/linux/hung_task.h
F: kernel/hung_task.c
I2C ACPI SUPPORT
M: Mika Westerberg <westeri@kernel.org>
L: linux-i2c@vger.kernel.org
@@ -11235,7 +11252,6 @@ S: Maintained
F: drivers/i2c/busses/i2c-cht-wc.c
I2C/SMBUS ISMT DRIVER
M: Seth Heasley <seth.heasley@intel.com>
M: Neil Horman <nhorman@tuxdriver.com>
L: linux-i2c@vger.kernel.org
F: Documentation/i2c/busses/i2c-ismt.rst
@@ -15071,7 +15087,7 @@ F: Documentation/devicetree/bindings/media/mediatek-jpeg-*.yaml
F: drivers/media/platform/mediatek/jpeg/
MEDIATEK KEYPAD DRIVER
M: Mattijs Korpershoek <mkorpershoek@baylibre.com>
M: Mattijs Korpershoek <mkorpershoek@kernel.org>
S: Supported
F: Documentation/devicetree/bindings/input/mediatek,mt6779-keypad.yaml
F: drivers/input/keyboard/mt6779-keypad.c
@@ -15494,24 +15510,45 @@ F: Documentation/mm/
F: include/linux/gfp.h
F: include/linux/gfp_types.h
F: include/linux/memfd.h
F: include/linux/memory.h
F: include/linux/memory_hotplug.h
F: include/linux/memory-tiers.h
F: include/linux/mempolicy.h
F: include/linux/mempool.h
F: include/linux/memremap.h
F: include/linux/mm.h
F: include/linux/mm_*.h
F: include/linux/mmzone.h
F: include/linux/mmu_notifier.h
F: include/linux/pagewalk.h
F: include/linux/rmap.h
F: include/trace/events/ksm.h
F: mm/
F: tools/mm/
F: tools/testing/selftests/mm/
N: include/linux/page[-_]*
MEMORY MANAGEMENT - CORE
M: Andrew Morton <akpm@linux-foundation.org>
M: David Hildenbrand <david@redhat.com>
R: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
R: Liam R. Howlett <Liam.Howlett@oracle.com>
R: Vlastimil Babka <vbabka@suse.cz>
R: Mike Rapoport <rppt@kernel.org>
R: Suren Baghdasaryan <surenb@google.com>
R: Michal Hocko <mhocko@suse.com>
L: linux-mm@kvack.org
S: Maintained
W: http://www.linux-mm.org
T: git git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
F: include/linux/memory.h
F: include/linux/mm.h
F: include/linux/mm_*.h
F: include/linux/mmdebug.h
F: include/linux/pagewalk.h
F: mm/Kconfig
F: mm/debug.c
F: mm/init-mm.c
F: mm/memory.c
F: mm/pagewalk.c
F: mm/util.c
MEMORY MANAGEMENT - EXECMEM
M: Andrew Morton <akpm@linux-foundation.org>
M: Mike Rapoport <rppt@kernel.org>
@@ -15520,6 +15557,53 @@ S: Maintained
F: include/linux/execmem.h
F: mm/execmem.c
MEMORY MANAGEMENT - GUP (GET USER PAGES)
M: Andrew Morton <akpm@linux-foundation.org>
M: David Hildenbrand <david@redhat.com>
R: Jason Gunthorpe <jgg@nvidia.com>
R: John Hubbard <jhubbard@nvidia.com>
R: Peter Xu <peterx@redhat.com>
L: linux-mm@kvack.org
S: Maintained
W: http://www.linux-mm.org
T: git git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
F: mm/gup.c
MEMORY MANAGEMENT - KSM (Kernel Samepage Merging)
M: Andrew Morton <akpm@linux-foundation.org>
M: David Hildenbrand <david@redhat.com>
R: Xu Xin <xu.xin16@zte.com.cn>
R: Chengming Zhou <chengming.zhou@linux.dev>
L: linux-mm@kvack.org
S: Maintained
W: http://www.linux-mm.org
T: git git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
F: Documentation/admin-guide/mm/ksm.rst
F: Documentation/mm/ksm.rst
F: include/linux/ksm.h
F: include/trace/events/ksm.h
F: mm/ksm.c
MEMORY MANAGEMENT - MEMORY POLICY AND MIGRATION
M: Andrew Morton <akpm@linux-foundation.org>
M: David Hildenbrand <david@redhat.com>
R: Zi Yan <ziy@nvidia.com>
R: Matthew Brost <matthew.brost@intel.com>
R: Joshua Hahn <joshua.hahnjy@gmail.com>
R: Rakie Kim <rakie.kim@sk.com>
R: Byungchul Park <byungchul@sk.com>
R: Gregory Price <gourry@gourry.net>
R: Ying Huang <ying.huang@linux.alibaba.com>
L: linux-mm@kvack.org
S: Maintained
W: http://www.linux-mm.org
T: git git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
F: include/linux/mempolicy.h
F: include/linux/migrate.h
F: mm/mempolicy.c
F: mm/migrate.c
F: mm/migrate_device.c
MEMORY MANAGEMENT - NUMA MEMBLOCKS AND NUMA EMULATION
M: Andrew Morton <akpm@linux-foundation.org>
M: Mike Rapoport <rppt@kernel.org>
@@ -15532,7 +15616,7 @@ F: mm/numa_memblks.c
MEMORY MANAGEMENT - PAGE ALLOCATOR
M: Andrew Morton <akpm@linux-foundation.org>
R: Vlastimil Babka <vbabka@suse.cz>
M: Vlastimil Babka <vbabka@suse.cz>
R: Suren Baghdasaryan <surenb@google.com>
R: Michal Hocko <mhocko@suse.com>
R: Brendan Jackman <jackmanb@google.com>
@@ -15540,10 +15624,38 @@ R: Johannes Weiner <hannes@cmpxchg.org>
R: Zi Yan <ziy@nvidia.com>
L: linux-mm@kvack.org
S: Maintained
F: include/linux/compaction.h
F: include/linux/gfp.h
F: include/linux/page-isolation.h
F: mm/compaction.c
F: mm/page_alloc.c
F: include/linux/gfp.h
F: include/linux/compaction.h
F: mm/page_isolation.c
MEMORY MANAGEMENT - RECLAIM
M: Andrew Morton <akpm@linux-foundation.org>
M: Johannes Weiner <hannes@cmpxchg.org>
R: David Hildenbrand <david@redhat.com>
R: Michal Hocko <mhocko@kernel.org>
R: Qi Zheng <zhengqi.arch@bytedance.com>
R: Shakeel Butt <shakeel.butt@linux.dev>
R: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
L: linux-mm@kvack.org
S: Maintained
F: mm/pt_reclaim.c
F: mm/vmscan.c
MEMORY MANAGEMENT - RMAP (REVERSE MAPPING)
M: Andrew Morton <akpm@linux-foundation.org>
M: David Hildenbrand <david@redhat.com>
M: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
R: Rik van Riel <riel@surriel.com>
R: Liam R. Howlett <Liam.Howlett@oracle.com>
R: Vlastimil Babka <vbabka@suse.cz>
R: Harry Yoo <harry.yoo@oracle.com>
L: linux-mm@kvack.org
S: Maintained
F: include/linux/rmap.h
F: mm/rmap.c
MEMORY MANAGEMENT - SECRETMEM
M: Andrew Morton <akpm@linux-foundation.org>
@@ -15553,6 +15665,30 @@ S: Maintained
F: include/linux/secretmem.h
F: mm/secretmem.c
MEMORY MANAGEMENT - THP (TRANSPARENT HUGE PAGE)
M: Andrew Morton <akpm@linux-foundation.org>
M: David Hildenbrand <david@redhat.com>
R: Zi Yan <ziy@nvidia.com>
R: Baolin Wang <baolin.wang@linux.alibaba.com>
R: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
R: Liam R. Howlett <Liam.Howlett@oracle.com>
R: Nico Pache <npache@redhat.com>
R: Ryan Roberts <ryan.roberts@arm.com>
R: Dev Jain <dev.jain@arm.com>
L: linux-mm@kvack.org
S: Maintained
W: http://www.linux-mm.org
T: git git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
F: Documentation/admin-guide/mm/transhuge.rst
F: include/linux/huge_mm.h
F: include/linux/khugepaged.h
F: include/trace/events/huge_memory.h
F: mm/huge_memory.c
F: mm/khugepaged.c
F: tools/testing/selftests/mm/khugepaged.c
F: tools/testing/selftests/mm/split_huge_page_test.c
F: tools/testing/selftests/mm/transhuge-stress.c
MEMORY MANAGEMENT - USERFAULTFD
M: Andrew Morton <akpm@linux-foundation.org>
R: Peter Xu <peterx@redhat.com>
@@ -18373,7 +18509,7 @@ F: include/uapi/linux/ppdev.h
PARAVIRT_OPS INTERFACE
M: Juergen Gross <jgross@suse.com>
R: Ajay Kaher <ajay.kaher@broadcom.com>
R: Alexey Makhalov <alexey.amakhalov@broadcom.com>
R: Alexey Makhalov <alexey.makhalov@broadcom.com>
R: Broadcom internal kernel review list <bcm-kernel-feedback-list@broadcom.com>
L: virtualization@lists.linux.dev
L: x86@kernel.org
@@ -20535,8 +20671,8 @@ F: Documentation/devicetree/bindings/i2c/renesas,iic-emev2.yaml
F: drivers/i2c/busses/i2c-emev2.c
RENESAS ETHERNET AVB DRIVER
M: Paul Barker <paul.barker.ct@bp.renesas.com>
M: Niklas Söderlund <niklas.soderlund@ragnatech.se>
R: Paul Barker <paul@pbarker.dev>
L: netdev@vger.kernel.org
L: linux-renesas-soc@vger.kernel.org
S: Maintained
@@ -22857,7 +22993,6 @@ F: drivers/accessibility/speakup/
SPEAR PLATFORM/CLOCK/PINCTRL SUPPORT
M: Viresh Kumar <vireshk@kernel.org>
M: Shiraz Hashim <shiraz.linux.kernel@gmail.com>
L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
L: soc@lists.linux.dev
S: Maintained
@@ -25835,7 +25970,7 @@ F: tools/testing/vsock/
VMALLOC
M: Andrew Morton <akpm@linux-foundation.org>
R: Uladzislau Rezki <urezki@gmail.com>
M: Uladzislau Rezki <urezki@gmail.com>
L: linux-mm@kvack.org
S: Maintained
W: http://www.linux-mm.org
@@ -25859,7 +25994,7 @@ F: drivers/misc/vmw_balloon.c
VMWARE HYPERVISOR INTERFACE
M: Ajay Kaher <ajay.kaher@broadcom.com>
M: Alexey Makhalov <alexey.amakhalov@broadcom.com>
M: Alexey Makhalov <alexey.makhalov@broadcom.com>
R: Broadcom internal kernel review list <bcm-kernel-feedback-list@broadcom.com>
L: virtualization@lists.linux.dev
L: x86@kernel.org
@@ -25887,7 +26022,7 @@ F: drivers/scsi/vmw_pvscsi.h
VMWARE VIRTUAL PTP CLOCK DRIVER
M: Nick Shi <nick.shi@broadcom.com>
R: Ajay Kaher <ajay.kaher@broadcom.com>
R: Alexey Makhalov <alexey.amakhalov@broadcom.com>
R: Alexey Makhalov <alexey.makhalov@broadcom.com>
R: Broadcom internal kernel review list <bcm-kernel-feedback-list@broadcom.com>
L: netdev@vger.kernel.org
S: Supported

View File

@@ -2,7 +2,7 @@
VERSION = 6
PATCHLEVEL = 15
SUBLEVEL = 0
EXTRAVERSION = -rc5
EXTRAVERSION =
NAME = Baby Opossum Posse
# *DOCUMENTATION*
@@ -1068,8 +1068,7 @@ KBUILD_CFLAGS += -fno-builtin-wcslen
# change __FILE__ to the relative path to the source directory
ifdef building_out_of_srctree
KBUILD_CPPFLAGS += $(call cc-option,-ffile-prefix-map=$(srcroot)/=)
KBUILD_RUSTFLAGS += --remap-path-prefix=$(srcroot)/=
KBUILD_CPPFLAGS += $(call cc-option,-fmacro-prefix-map=$(srcroot)/=)
endif
# include additional Makefiles when needed

View File

@@ -451,7 +451,7 @@
pwm_ef: pwm@86c0 {
compatible = "amlogic,meson8-pwm-v2";
clocks = <&xtal>,
<>, /* unknown/untested, the datasheet calls it "Video PLL" */
<0>, /* unknown/untested, the datasheet calls it "Video PLL" */
<&clkc CLKID_FCLK_DIV4>,
<&clkc CLKID_FCLK_DIV3>;
reg = <0x86c0 0x10>;
@@ -705,7 +705,7 @@
&pwm_ab {
compatible = "amlogic,meson8-pwm-v2";
clocks = <&xtal>,
<>, /* unknown/untested, the datasheet calls it "Video PLL" */
<0>, /* unknown/untested, the datasheet calls it "Video PLL" */
<&clkc CLKID_FCLK_DIV4>,
<&clkc CLKID_FCLK_DIV3>;
};
@@ -713,7 +713,7 @@
&pwm_cd {
compatible = "amlogic,meson8-pwm-v2";
clocks = <&xtal>,
<>, /* unknown/untested, the datasheet calls it "Video PLL" */
<0>, /* unknown/untested, the datasheet calls it "Video PLL" */
<&clkc CLKID_FCLK_DIV4>,
<&clkc CLKID_FCLK_DIV3>;
};

View File

@@ -406,7 +406,7 @@
compatible = "amlogic,meson8b-pwm-v2", "amlogic,meson8-pwm-v2";
reg = <0x86c0 0x10>;
clocks = <&xtal>,
<>, /* unknown/untested, the datasheet calls it "Video PLL" */
<0>, /* unknown/untested, the datasheet calls it "Video PLL" */
<&clkc CLKID_FCLK_DIV4>,
<&clkc CLKID_FCLK_DIV3>;
#pwm-cells = <3>;
@@ -680,7 +680,7 @@
&pwm_ab {
compatible = "amlogic,meson8b-pwm-v2", "amlogic,meson8-pwm-v2";
clocks = <&xtal>,
<>, /* unknown/untested, the datasheet calls it "Video PLL" */
<0>, /* unknown/untested, the datasheet calls it "Video PLL" */
<&clkc CLKID_FCLK_DIV4>,
<&clkc CLKID_FCLK_DIV3>;
};
@@ -688,7 +688,7 @@
&pwm_cd {
compatible = "amlogic,meson8b-pwm-v2", "amlogic,meson8-pwm-v2";
clocks = <&xtal>,
<>, /* unknown/untested, the datasheet calls it "Video PLL" */
<0>, /* unknown/untested, the datasheet calls it "Video PLL" */
<&clkc CLKID_FCLK_DIV4>,
<&clkc CLKID_FCLK_DIV3>;
};

View File

@@ -40,6 +40,9 @@
reg = <1>;
interrupt-parent = <&gpio4>;
interrupts = <16 IRQ_TYPE_LEVEL_LOW>;
micrel,led-mode = <1>;
clocks = <&clks IMX6UL_CLK_ENET_REF>;
clock-names = "rmii-ref";
status = "okay";
};
};

View File

@@ -152,28 +152,12 @@
vcc-pg-supply = <&reg_aldo1>;
};
&r_ir {
linux,rc-map-name = "rc-beelink-gs1";
status = "okay";
};
&r_pio {
/*
* FIXME: We can't add that supply for now since it would
* create a circular dependency between pinctrl, the regulator
* and the RSB Bus.
*
* vcc-pl-supply = <&reg_aldo1>;
*/
vcc-pm-supply = <&reg_aldo1>;
};
&r_rsb {
&r_i2c {
status = "okay";
axp805: pmic@745 {
axp805: pmic@36 {
compatible = "x-powers,axp805", "x-powers,axp806";
reg = <0x745>;
reg = <0x36>;
interrupt-parent = <&r_intc>;
interrupts = <GIC_SPI 96 IRQ_TYPE_LEVEL_LOW>;
interrupt-controller;
@@ -291,6 +275,22 @@
};
};
&r_ir {
linux,rc-map-name = "rc-beelink-gs1";
status = "okay";
};
&r_pio {
/*
* PL0 and PL1 are used for PMIC I2C
* don't enable the pl-supply else
* it will fail at boot
*
* vcc-pl-supply = <&reg_aldo1>;
*/
vcc-pm-supply = <&reg_aldo1>;
};
&spdif {
pinctrl-names = "default";
pinctrl-0 = <&spdif_tx_pin>;

View File

@@ -176,16 +176,12 @@
vcc-pg-supply = <&reg_vcc_wifi_io>;
};
&r_ir {
status = "okay";
};
&r_rsb {
&r_i2c {
status = "okay";
axp805: pmic@745 {
axp805: pmic@36 {
compatible = "x-powers,axp805", "x-powers,axp806";
reg = <0x745>;
reg = <0x36>;
interrupt-parent = <&r_intc>;
interrupts = <GIC_SPI 96 IRQ_TYPE_LEVEL_LOW>;
interrupt-controller;
@@ -296,6 +292,10 @@
};
};
&r_ir {
status = "okay";
};
&rtc {
clocks = <&ext_osc32k>;
};

View File

@@ -113,20 +113,12 @@
vcc-pg-supply = <&reg_aldo1>;
};
&r_ir {
status = "okay";
};
&r_pio {
vcc-pm-supply = <&reg_bldo3>;
};
&r_rsb {
&r_i2c {
status = "okay";
axp805: pmic@745 {
axp805: pmic@36 {
compatible = "x-powers,axp805", "x-powers,axp806";
reg = <0x745>;
reg = <0x36>;
interrupt-parent = <&r_intc>;
interrupts = <GIC_SPI 96 IRQ_TYPE_LEVEL_LOW>;
interrupt-controller;
@@ -241,6 +233,14 @@
};
};
&r_ir {
status = "okay";
};
&r_pio {
vcc-pm-supply = <&reg_bldo3>;
};
&rtc {
clocks = <&ext_osc32k>;
};

View File

@@ -151,7 +151,7 @@
al,msi-num-spis = <160>;
};
io-fabric@fc000000 {
io-bus@fc000000 {
compatible = "simple-bus";
#address-cells = <1>;
#size-cells = <1>;

View File

@@ -361,7 +361,7 @@
interrupt-parent = <&gic>;
};
io-fabric@fc000000 {
io-bus@fc000000 {
compatible = "simple-bus";
#address-cells = <1>;
#size-cells = <1>;

View File

@@ -2313,7 +2313,7 @@
"amlogic,meson8-pwm-v2";
reg = <0x0 0x19000 0x0 0x20>;
clocks = <&xtal>,
<>, /* unknown/untested, the datasheet calls it "vid_pll" */
<0>, /* unknown/untested, the datasheet calls it "vid_pll" */
<&clkc CLKID_FCLK_DIV4>,
<&clkc CLKID_FCLK_DIV3>;
#pwm-cells = <3>;
@@ -2325,7 +2325,7 @@
"amlogic,meson8-pwm-v2";
reg = <0x0 0x1a000 0x0 0x20>;
clocks = <&xtal>,
<>, /* unknown/untested, the datasheet calls it "vid_pll" */
<0>, /* unknown/untested, the datasheet calls it "vid_pll" */
<&clkc CLKID_FCLK_DIV4>,
<&clkc CLKID_FCLK_DIV3>;
#pwm-cells = <3>;
@@ -2337,7 +2337,7 @@
"amlogic,meson8-pwm-v2";
reg = <0x0 0x1b000 0x0 0x20>;
clocks = <&xtal>,
<>, /* unknown/untested, the datasheet calls it "vid_pll" */
<0>, /* unknown/untested, the datasheet calls it "vid_pll" */
<&clkc CLKID_FCLK_DIV4>,
<&clkc CLKID_FCLK_DIV3>;
#pwm-cells = <3>;

View File

@@ -116,6 +116,10 @@
status = "okay";
};
&clkc_audio {
status = "okay";
};
&frddr_a {
status = "okay";
};

View File

@@ -741,7 +741,7 @@
&pwm_ab {
clocks = <&xtal>,
<>, /* unknown/untested, the datasheet calls it "vid_pll" */
<0>, /* unknown/untested, the datasheet calls it "vid_pll" */
<&clkc CLKID_FCLK_DIV4>,
<&clkc CLKID_FCLK_DIV3>;
};
@@ -752,14 +752,14 @@
&pwm_cd {
clocks = <&xtal>,
<>, /* unknown/untested, the datasheet calls it "vid_pll" */
<0>, /* unknown/untested, the datasheet calls it "vid_pll" */
<&clkc CLKID_FCLK_DIV4>,
<&clkc CLKID_FCLK_DIV3>;
};
&pwm_ef {
clocks = <&xtal>,
<>, /* unknown/untested, the datasheet calls it "vid_pll" */
<0>, /* unknown/untested, the datasheet calls it "vid_pll" */
<&clkc CLKID_FCLK_DIV4>,
<&clkc CLKID_FCLK_DIV3>;
};

View File

@@ -811,7 +811,7 @@
&pwm_ab {
clocks = <&xtal>,
<>, /* unknown/untested, the datasheet calls it "vid_pll" */
<0>, /* unknown/untested, the datasheet calls it "vid_pll" */
<&clkc CLKID_FCLK_DIV4>,
<&clkc CLKID_FCLK_DIV3>;
};
@@ -822,14 +822,14 @@
&pwm_cd {
clocks = <&xtal>,
<>, /* unknown/untested, the datasheet calls it "vid_pll" */
<0>, /* unknown/untested, the datasheet calls it "vid_pll" */
<&clkc CLKID_FCLK_DIV4>,
<&clkc CLKID_FCLK_DIV3>;
};
&pwm_ef {
clocks = <&xtal>,
<>, /* unknown/untested, the datasheet calls it "vid_pll" */
<0>, /* unknown/untested, the datasheet calls it "vid_pll" */
<&clkc CLKID_FCLK_DIV4>,
<&clkc CLKID_FCLK_DIV3>;
};

View File

@@ -77,6 +77,16 @@
};
};
/*
* The driver depends on boot loader initialized state which resets when this
* power-domain is powered off. This happens on suspend or when the driver is
* missing during boot. Mark the domain as always on until the driver can
* handle this.
*/
&ps_dispdfr_be {
apple,always-on;
};
&display_dfr {
status = "okay";
};

View File

@@ -40,6 +40,16 @@
};
};
/*
* The driver depends on boot loader initialized state which resets when this
* power-domain is powered off. This happens on suspend or when the driver is
* missing during boot. Mark the domain as always on until the driver can
* handle this.
*/
&ps_dispdfr_be {
apple,always-on;
};
&display_dfr {
status = "okay";
};

View File

@@ -44,7 +44,7 @@
next-level-cache = <&l2_0>;
clocks = <&scmi_dvfs 0>;
l2_0: l2-cache-0 {
l2_0: l2-cache {
compatible = "cache";
cache-level = <2>;
/* 8 ways set associative */
@@ -53,13 +53,6 @@
cache-sets = <2048>;
cache-unified;
next-level-cache = <&l3_0>;
l3_0: l3-cache {
compatible = "cache";
cache-level = <3>;
cache-size = <0x100000>;
cache-unified;
};
};
};
@@ -78,7 +71,7 @@
next-level-cache = <&l2_1>;
clocks = <&scmi_dvfs 0>;
l2_1: l2-cache-1 {
l2_1: l2-cache {
compatible = "cache";
cache-level = <2>;
/* 8 ways set associative */
@@ -105,7 +98,7 @@
next-level-cache = <&l2_2>;
clocks = <&scmi_dvfs 1>;
l2_2: l2-cache-2 {
l2_2: l2-cache {
compatible = "cache";
cache-level = <2>;
/* 8 ways set associative */
@@ -132,7 +125,7 @@
next-level-cache = <&l2_3>;
clocks = <&scmi_dvfs 1>;
l2_3: l2-cache-3 {
l2_3: l2-cache {
compatible = "cache";
cache-level = <2>;
/* 8 ways set associative */
@@ -143,6 +136,13 @@
next-level-cache = <&l3_0>;
};
};
l3_0: l3-cache {
compatible = "cache";
cache-level = <3>;
cache-size = <0x100000>;
cache-unified;
};
};
firmware {

View File

@@ -144,6 +144,19 @@
startup-delay-us = <20000>;
};
reg_usdhc2_vqmmc: regulator-usdhc2-vqmmc {
compatible = "regulator-gpio";
pinctrl-names = "default";
pinctrl-0 = <&pinctrl_usdhc2_vsel>;
gpios = <&gpio1 4 GPIO_ACTIVE_HIGH>;
regulator-max-microvolt = <3300000>;
regulator-min-microvolt = <1800000>;
states = <1800000 0x1>,
<3300000 0x0>;
regulator-name = "PMIC_USDHC_VSELECT";
vin-supply = <&reg_nvcc_sd>;
};
reserved-memory {
#address-cells = <2>;
#size-cells = <2>;
@@ -269,7 +282,7 @@
"SODIMM_19",
"",
"",
"",
"PMIC_USDHC_VSELECT",
"",
"",
"",
@@ -785,6 +798,7 @@
pinctrl-2 = <&pinctrl_usdhc2_200mhz>, <&pinctrl_usdhc2_cd>;
pinctrl-3 = <&pinctrl_usdhc2_sleep>, <&pinctrl_usdhc2_cd_sleep>;
vmmc-supply = <&reg_usdhc2_vmmc>;
vqmmc-supply = <&reg_usdhc2_vqmmc>;
};
&wdog1 {
@@ -1206,13 +1220,17 @@
<MX8MM_IOMUXC_NAND_CLE_GPIO3_IO5 0x6>; /* SODIMM 76 */
};
pinctrl_usdhc2_vsel: usdhc2vselgrp {
fsl,pins =
<MX8MM_IOMUXC_GPIO1_IO04_GPIO1_IO4 0x10>; /* PMIC_USDHC_VSELECT */
};
/*
* Note: Due to ERR050080 we use discrete external on-module resistors pulling-up to the
* on-module +V3.3_1.8_SD (LDO5) rail and explicitly disable the internal pull-ups here.
*/
pinctrl_usdhc2: usdhc2grp {
fsl,pins =
<MX8MM_IOMUXC_GPIO1_IO04_USDHC2_VSELECT 0x10>,
<MX8MM_IOMUXC_SD2_CLK_USDHC2_CLK 0x90>, /* SODIMM 78 */
<MX8MM_IOMUXC_SD2_CMD_USDHC2_CMD 0x90>, /* SODIMM 74 */
<MX8MM_IOMUXC_SD2_DATA0_USDHC2_DATA0 0x90>, /* SODIMM 80 */
@@ -1223,7 +1241,6 @@
pinctrl_usdhc2_100mhz: usdhc2-100mhzgrp {
fsl,pins =
<MX8MM_IOMUXC_GPIO1_IO04_USDHC2_VSELECT 0x10>,
<MX8MM_IOMUXC_SD2_CLK_USDHC2_CLK 0x94>,
<MX8MM_IOMUXC_SD2_CMD_USDHC2_CMD 0x94>,
<MX8MM_IOMUXC_SD2_DATA0_USDHC2_DATA0 0x94>,
@@ -1234,7 +1251,6 @@
pinctrl_usdhc2_200mhz: usdhc2-200mhzgrp {
fsl,pins =
<MX8MM_IOMUXC_GPIO1_IO04_USDHC2_VSELECT 0x10>,
<MX8MM_IOMUXC_SD2_CLK_USDHC2_CLK 0x96>,
<MX8MM_IOMUXC_SD2_CMD_USDHC2_CMD 0x96>,
<MX8MM_IOMUXC_SD2_DATA0_USDHC2_DATA0 0x96>,
@@ -1246,7 +1262,6 @@
/* Avoid backfeeding with removed card power */
pinctrl_usdhc2_sleep: usdhc2slpgrp {
fsl,pins =
<MX8MM_IOMUXC_GPIO1_IO04_USDHC2_VSELECT 0x0>,
<MX8MM_IOMUXC_SD2_CLK_USDHC2_CLK 0x0>,
<MX8MM_IOMUXC_SD2_CMD_USDHC2_CMD 0x0>,
<MX8MM_IOMUXC_SD2_DATA0_USDHC2_DATA0 0x0>,

View File

@@ -24,6 +24,20 @@
fsl,operating-mode = "nominal";
};
&gpu2d {
assigned-clocks = <&clk IMX8MP_CLK_GPU2D_CORE>;
assigned-clock-parents = <&clk IMX8MP_SYS_PLL1_800M>;
assigned-clock-rates = <800000000>;
};
&gpu3d {
assigned-clocks = <&clk IMX8MP_CLK_GPU3D_CORE>,
<&clk IMX8MP_CLK_GPU3D_SHADER_CORE>;
assigned-clock-parents = <&clk IMX8MP_SYS_PLL1_800M>,
<&clk IMX8MP_SYS_PLL1_800M>;
assigned-clock-rates = <800000000>, <800000000>;
};
&pgc_hdmimix {
assigned-clocks = <&clk IMX8MP_CLK_HDMI_AXI>,
<&clk IMX8MP_CLK_HDMI_APB>;
@@ -46,6 +60,18 @@
assigned-clock-rates = <600000000>, <300000000>;
};
&pgc_mlmix {
assigned-clocks = <&clk IMX8MP_CLK_ML_CORE>,
<&clk IMX8MP_CLK_ML_AXI>,
<&clk IMX8MP_CLK_ML_AHB>;
assigned-clock-parents = <&clk IMX8MP_SYS_PLL1_800M>,
<&clk IMX8MP_SYS_PLL1_800M>,
<&clk IMX8MP_SYS_PLL1_800M>;
assigned-clock-rates = <800000000>,
<800000000>,
<300000000>;
};
&media_blk_ctrl {
assigned-clocks = <&clk IMX8MP_CLK_MEDIA_AXI>,
<&clk IMX8MP_CLK_MEDIA_APB>,
@@ -62,3 +88,5 @@
<0>, <0>, <400000000>,
<1039500000>;
};
/delete-node/ &{noc_opp_table/opp-1000000000};

View File

@@ -35,7 +35,6 @@
<0x1 0x00000000 0 0xc0000000>;
};
reg_usdhc2_vmmc: regulator-usdhc2-vmmc {
compatible = "regulator-fixed";
regulator-name = "VSD_3V3";
@@ -46,6 +45,16 @@
startup-delay-us = <100>;
off-on-delay-us = <12000>;
};
reg_usdhc2_vqmmc: regulator-usdhc2-vqmmc {
compatible = "regulator-gpio";
regulator-name = "VSD_VSEL";
regulator-min-microvolt = <1800000>;
regulator-max-microvolt = <3300000>;
gpios = <&gpio2 12 GPIO_ACTIVE_HIGH>;
states = <3300000 0x0 1800000 0x1>;
vin-supply = <&ldo5>;
};
};
&A53_0 {
@@ -205,6 +214,7 @@
pinctrl-2 = <&pinctrl_usdhc2_200mhz>, <&pinctrl_usdhc2_gpio>;
cd-gpios = <&gpio1 14 GPIO_ACTIVE_LOW>;
vmmc-supply = <&reg_usdhc2_vmmc>;
vqmmc-supply = <&reg_usdhc2_vqmmc>;
bus-width = <4>;
status = "okay";
};

View File

@@ -1645,6 +1645,12 @@
opp-hz = /bits/ 64 <200000000>;
};
/* Nominal drive mode maximum */
opp-800000000 {
opp-hz = /bits/ 64 <800000000>;
};
/* Overdrive mode maximum */
opp-1000000000 {
opp-hz = /bits/ 64 <1000000000>;
};

View File

@@ -1626,7 +1626,7 @@
reg = <0 0x4c300000 0 0x10000>,
<0 0x60100000 0 0xfe00000>,
<0 0x4c360000 0 0x10000>,
<0 0x4c340000 0 0x2000>;
<0 0x4c340000 0 0x4000>;
reg-names = "dbi", "config", "atu", "app";
ranges = <0x81000000 0x0 0x00000000 0x0 0x6ff00000 0 0x00100000>,
<0x82000000 0x0 0x10000000 0x9 0x10000000 0 0x10000000>;
@@ -1673,7 +1673,7 @@
reg = <0 0x4c300000 0 0x10000>,
<0 0x4c360000 0 0x1000>,
<0 0x4c320000 0 0x1000>,
<0 0x4c340000 0 0x2000>,
<0 0x4c340000 0 0x4000>,
<0 0x4c370000 0 0x10000>,
<0x9 0 1 0>;
reg-names = "dbi","atu", "dbi2", "app", "dma", "addr_space";
@@ -1700,7 +1700,7 @@
reg = <0 0x4c380000 0 0x10000>,
<8 0x80100000 0 0xfe00000>,
<0 0x4c3e0000 0 0x10000>,
<0 0x4c3c0000 0 0x2000>;
<0 0x4c3c0000 0 0x4000>;
reg-names = "dbi", "config", "atu", "app";
ranges = <0x81000000 0 0x00000000 0x8 0x8ff00000 0 0x00100000>,
<0x82000000 0 0x10000000 0xa 0x10000000 0 0x10000000>;
@@ -1749,7 +1749,7 @@
reg = <0 0x4c380000 0 0x10000>,
<0 0x4c3e0000 0 0x1000>,
<0 0x4c3a0000 0 0x1000>,
<0 0x4c3c0000 0 0x2000>,
<0 0x4c3c0000 0 0x4000>,
<0 0x4c3f0000 0 0x10000>,
<0xa 0 1 0>;
reg-names = "dbi", "atu", "dbi2", "app", "dma", "addr_space";

View File

@@ -26,6 +26,8 @@
leds {
compatible = "gpio-leds";
pinctrl-names = "default";
pinctrl-0 = <&spi_quad_pins>;
led-power1 {
label = "udpu:green:power";
@@ -82,8 +84,6 @@
&spi0 {
status = "okay";
pinctrl-names = "default";
pinctrl-0 = <&spi_quad_pins>;
flash@0 {
compatible = "jedec,spi-nor";
@@ -108,6 +108,10 @@
};
};
&spi_quad_pins {
function = "gpio";
};
&pinctrl_nb {
i2c2_recovery_pins: i2c2-recovery-pins {
groups = "i2c2";

View File

@@ -31,7 +31,7 @@
};
vcc3v3_btreg: vcc3v3-btreg {
compatible = "regulator-gpio";
compatible = "regulator-fixed";
enable-active-high;
pinctrl-names = "default";
pinctrl-0 = <&bt_enable_h>;
@@ -39,7 +39,6 @@
regulator-min-microvolt = <3300000>;
regulator-max-microvolt = <3300000>;
regulator-always-on;
states = <3300000 0x0>;
};
vcc3v3_rf_aux_mod: regulator-vcc3v3-rf-aux-mod {

View File

@@ -26,5 +26,5 @@
};
&vcc3v3_btreg {
enable-gpios = <&gpio1 RK_PC3 GPIO_ACTIVE_HIGH>;
gpios = <&gpio1 RK_PC3 GPIO_ACTIVE_HIGH>;
};

View File

@@ -39,5 +39,5 @@
};
&vcc3v3_btreg {
enable-gpios = <&gpio1 RK_PC2 GPIO_ACTIVE_HIGH>;
gpios = <&gpio1 RK_PC2 GPIO_ACTIVE_HIGH>;
};

View File

@@ -43,7 +43,7 @@
sdio_pwrseq: sdio-pwrseq {
compatible = "mmc-pwrseq-simple";
clocks = <&rk808 1>;
clock-names = "lpo";
clock-names = "ext_clock";
pinctrl-names = "default";
pinctrl-0 = <&wifi_enable_h>;
reset-gpios = <&gpio0 RK_PB2 GPIO_ACTIVE_LOW>;

View File

@@ -775,7 +775,7 @@
rockchip,default-sample-phase = <90>;
status = "okay";
sdio-wifi@1 {
wifi@1 {
compatible = "brcm,bcm4329-fmac";
reg = <1>;
interrupt-parent = <&gpio2>;

View File

@@ -619,6 +619,8 @@
bus-width = <8>;
max-frequency = <200000000>;
non-removable;
pinctrl-names = "default";
pinctrl-0 = <&emmc_bus8 &emmc_clk &emmc_cmd &emmc_datastrobe>;
status = "okay";
};

View File

@@ -610,7 +610,7 @@
reg = <0x51>;
clock-output-names = "hym8563";
interrupt-parent = <&gpio0>;
interrupts = <RK_PB0 IRQ_TYPE_LEVEL_LOW>;
interrupts = <RK_PA0 IRQ_TYPE_LEVEL_LOW>;
pinctrl-names = "default";
pinctrl-0 = <&hym8563_int>;
wakeup-source;

View File

@@ -222,6 +222,10 @@
compatible = "realtek,rt5616";
reg = <0x1b>;
#sound-dai-cells = <0>;
assigned-clocks = <&cru I2S0_8CH_MCLKOUT>;
assigned-clock-rates = <12288000>;
clocks = <&cru I2S0_8CH_MCLKOUT>;
clock-names = "mclk";
};
};

View File

@@ -214,6 +214,8 @@
};
&package_thermal {
polling-delay = <1000>;
trips {
package_active1: trip-active1 {
temperature = <45000>;

View File

@@ -11,20 +11,15 @@
compatible = "operating-points-v2";
opp-shared;
opp-1416000000 {
opp-hz = /bits/ 64 <1416000000>;
opp-1200000000 {
opp-hz = /bits/ 64 <1200000000>;
opp-microvolt = <750000 750000 950000>;
clock-latency-ns = <40000>;
opp-suspend;
};
opp-1608000000 {
opp-hz = /bits/ 64 <1608000000>;
opp-microvolt = <887500 887500 950000>;
clock-latency-ns = <40000>;
};
opp-1704000000 {
opp-hz = /bits/ 64 <1704000000>;
opp-microvolt = <937500 937500 950000>;
opp-1296000000 {
opp-hz = /bits/ 64 <1296000000>;
opp-microvolt = <775000 775000 950000>;
clock-latency-ns = <40000>;
};
};
@@ -33,9 +28,14 @@
compatible = "operating-points-v2";
opp-shared;
opp-1200000000{
opp-hz = /bits/ 64 <1200000000>;
opp-microvolt = <750000 750000 950000>;
clock-latency-ns = <40000>;
};
opp-1416000000 {
opp-hz = /bits/ 64 <1416000000>;
opp-microvolt = <750000 750000 950000>;
opp-microvolt = <762500 762500 950000>;
clock-latency-ns = <40000>;
};
opp-1608000000 {
@@ -43,25 +43,20 @@
opp-microvolt = <787500 787500 950000>;
clock-latency-ns = <40000>;
};
opp-1800000000 {
opp-hz = /bits/ 64 <1800000000>;
opp-microvolt = <875000 875000 950000>;
clock-latency-ns = <40000>;
};
opp-2016000000 {
opp-hz = /bits/ 64 <2016000000>;
opp-microvolt = <950000 950000 950000>;
clock-latency-ns = <40000>;
};
};
cluster2_opp_table: opp-table-cluster2 {
compatible = "operating-points-v2";
opp-shared;
opp-1200000000{
opp-hz = /bits/ 64 <1200000000>;
opp-microvolt = <750000 750000 950000>;
clock-latency-ns = <40000>;
};
opp-1416000000 {
opp-hz = /bits/ 64 <1416000000>;
opp-microvolt = <750000 750000 950000>;
opp-microvolt = <762500 762500 950000>;
clock-latency-ns = <40000>;
};
opp-1608000000 {
@@ -69,16 +64,6 @@
opp-microvolt = <787500 787500 950000>;
clock-latency-ns = <40000>;
};
opp-1800000000 {
opp-hz = /bits/ 64 <1800000000>;
opp-microvolt = <875000 875000 950000>;
clock-latency-ns = <40000>;
};
opp-2016000000 {
opp-hz = /bits/ 64 <2016000000>;
opp-microvolt = <950000 950000 950000>;
clock-latency-ns = <40000>;
};
};
gpu_opp_table: opp-table {
@@ -104,10 +89,6 @@
opp-hz = /bits/ 64 <700000000>;
opp-microvolt = <750000 750000 850000>;
};
opp-850000000 {
opp-hz = /bits/ 64 <800000000>;
opp-microvolt = <787500 787500 850000>;
};
};
};

View File

@@ -116,11 +116,11 @@
};
intc: interrupt-controller@4ac10000 {
compatible = "arm,cortex-a7-gic";
compatible = "arm,gic-400";
reg = <0x4ac10000 0x0 0x1000>,
<0x4ac20000 0x0 0x2000>,
<0x4ac40000 0x0 0x2000>,
<0x4ac60000 0x0 0x2000>;
<0x4ac20000 0x0 0x20000>,
<0x4ac40000 0x0 0x20000>,
<0x4ac60000 0x0 0x20000>;
#interrupt-cells = <3>;
interrupt-controller;
};

View File

@@ -1201,13 +1201,12 @@
};
intc: interrupt-controller@4ac10000 {
compatible = "arm,cortex-a7-gic";
compatible = "arm,gic-400";
reg = <0x4ac10000 0x1000>,
<0x4ac20000 0x2000>,
<0x4ac40000 0x2000>,
<0x4ac60000 0x2000>;
<0x4ac20000 0x20000>,
<0x4ac40000 0x20000>,
<0x4ac60000 0x20000>;
#interrupt-cells = <3>;
#address-cells = <1>;
interrupt-controller;
};
};

View File

@@ -115,14 +115,13 @@
};
intc: interrupt-controller@4ac00000 {
compatible = "arm,cortex-a7-gic";
compatible = "arm,gic-400";
#interrupt-cells = <3>;
#address-cells = <1>;
interrupt-controller;
reg = <0x0 0x4ac10000 0x0 0x1000>,
<0x0 0x4ac20000 0x0 0x2000>,
<0x0 0x4ac40000 0x0 0x2000>,
<0x0 0x4ac60000 0x0 0x2000>;
<0x0 0x4ac20000 0x0 0x20000>,
<0x0 0x4ac40000 0x0 0x20000>,
<0x0 0x4ac60000 0x0 0x20000>;
};
psci {

View File

@@ -1729,12 +1729,12 @@ CONFIG_NLS_CODEPAGE_437=y
CONFIG_NLS_ISO8859_1=y
CONFIG_SECURITY=y
CONFIG_CRYPTO_USER=y
CONFIG_CRYPTO_CHACHA20=m
CONFIG_CRYPTO_TEST=m
CONFIG_CRYPTO_ECHAINIV=y
CONFIG_CRYPTO_MICHAEL_MIC=m
CONFIG_CRYPTO_ANSI_CPRNG=y
CONFIG_CRYPTO_USER_API_RNG=m
CONFIG_CRYPTO_CHACHA20_NEON=m
CONFIG_CRYPTO_GHASH_ARM64_CE=y
CONFIG_CRYPTO_SHA1_ARM64_CE=y
CONFIG_CRYPTO_SHA2_ARM64_CE=y

View File

@@ -81,6 +81,7 @@
#define ARM_CPU_PART_CORTEX_A78AE 0xD42
#define ARM_CPU_PART_CORTEX_X1 0xD44
#define ARM_CPU_PART_CORTEX_A510 0xD46
#define ARM_CPU_PART_CORTEX_X1C 0xD4C
#define ARM_CPU_PART_CORTEX_A520 0xD80
#define ARM_CPU_PART_CORTEX_A710 0xD47
#define ARM_CPU_PART_CORTEX_A715 0xD4D
@@ -168,6 +169,7 @@
#define MIDR_CORTEX_A78AE MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A78AE)
#define MIDR_CORTEX_X1 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X1)
#define MIDR_CORTEX_A510 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A510)
#define MIDR_CORTEX_X1C MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X1C)
#define MIDR_CORTEX_A520 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A520)
#define MIDR_CORTEX_A710 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A710)
#define MIDR_CORTEX_A715 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A715)

View File

@@ -52,7 +52,7 @@
mrs x0, id_aa64mmfr1_el1
ubfx x0, x0, #ID_AA64MMFR1_EL1_HCX_SHIFT, #4
cbz x0, .Lskip_hcrx_\@
mov_q x0, HCRX_HOST_FLAGS
mov_q x0, (HCRX_EL2_MSCEn | HCRX_EL2_TCR2En | HCRX_EL2_EnFPM)
/* Enable GCS if supported */
mrs_s x1, SYS_ID_AA64PFR1_EL1

View File

@@ -706,6 +706,7 @@ u32 aarch64_insn_gen_cas(enum aarch64_insn_register result,
}
#endif
u32 aarch64_insn_gen_dmb(enum aarch64_insn_mb_type type);
u32 aarch64_insn_gen_dsb(enum aarch64_insn_mb_type type);
u32 aarch64_insn_gen_mrs(enum aarch64_insn_register result,
enum aarch64_insn_system_register sysreg);

View File

@@ -100,9 +100,8 @@
HCR_FMO | HCR_IMO | HCR_PTW | HCR_TID3 | HCR_TID1)
#define HCR_HOST_NVHE_FLAGS (HCR_RW | HCR_API | HCR_APK | HCR_ATA)
#define HCR_HOST_NVHE_PROTECTED_FLAGS (HCR_HOST_NVHE_FLAGS | HCR_TSC)
#define HCR_HOST_VHE_FLAGS (HCR_RW | HCR_TGE | HCR_E2H)
#define HCR_HOST_VHE_FLAGS (HCR_RW | HCR_TGE | HCR_E2H | HCR_AMO | HCR_IMO | HCR_FMO)
#define HCRX_HOST_FLAGS (HCRX_EL2_MSCEn | HCRX_EL2_TCR2En | HCRX_EL2_EnFPM)
#define MPAMHCR_HOST_FLAGS 0
/* TCR_EL2 Registers bits */

View File

@@ -97,6 +97,9 @@ enum mitigation_state arm64_get_meltdown_state(void);
enum mitigation_state arm64_get_spectre_bhb_state(void);
bool is_spectre_bhb_affected(const struct arm64_cpu_capabilities *entry, int scope);
extern bool __nospectre_bhb;
u8 get_spectre_bhb_loop_value(void);
bool is_spectre_bhb_fw_mitigated(void);
void spectre_bhb_enable_mitigation(const struct arm64_cpu_capabilities *__unused);
bool try_emulate_el1_ssbs(struct pt_regs *regs, u32 instr);

View File

@@ -99,6 +99,19 @@ static __always_inline u64 __arch_get_hw_counter(s32 clock_mode,
return res;
}
#if IS_ENABLED(CONFIG_CC_IS_GCC) && IS_ENABLED(CONFIG_PAGE_SIZE_64KB)
static __always_inline const struct vdso_time_data *__arch_get_vdso_u_time_data(void)
{
const struct vdso_time_data *ret = &vdso_u_time_data;
/* Work around invalid absolute relocations */
OPTIMIZER_HIDE_VAR(ret);
return ret;
}
#define __arch_get_vdso_u_time_data __arch_get_vdso_u_time_data
#endif /* IS_ENABLED(CONFIG_CC_IS_GCC) && IS_ENABLED(CONFIG_PAGE_SIZE_64KB) */
#endif /* !__ASSEMBLY__ */
#endif /* __ASM_VDSO_GETTIMEOFDAY_H */

View File

@@ -114,7 +114,14 @@ static struct arm64_cpu_capabilities const __ro_after_init *cpucap_ptrs[ARM64_NC
DECLARE_BITMAP(boot_cpucaps, ARM64_NCAPS);
bool arm64_use_ng_mappings = false;
/*
* arm64_use_ng_mappings must be placed in the .data section, otherwise it
* ends up in the .bss section where it is initialized in early_map_kernel()
* after the MMU (with the idmap) was enabled. create_init_idmap() - which
* runs before early_map_kernel() and reads the variable via PTE_MAYBE_NG -
* may end up generating an incorrect idmap page table attributes.
*/
bool arm64_use_ng_mappings __read_mostly = false;
EXPORT_SYMBOL(arm64_use_ng_mappings);
DEFINE_PER_CPU_READ_MOSTLY(const char *, this_cpu_vector) = vectors;

View File

@@ -891,6 +891,7 @@ static u8 spectre_bhb_loop_affected(void)
MIDR_ALL_VERSIONS(MIDR_CORTEX_A78AE),
MIDR_ALL_VERSIONS(MIDR_CORTEX_A78C),
MIDR_ALL_VERSIONS(MIDR_CORTEX_X1),
MIDR_ALL_VERSIONS(MIDR_CORTEX_X1C),
MIDR_ALL_VERSIONS(MIDR_CORTEX_A710),
MIDR_ALL_VERSIONS(MIDR_CORTEX_X2),
MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N2),
@@ -999,6 +1000,11 @@ bool is_spectre_bhb_affected(const struct arm64_cpu_capabilities *entry,
return true;
}
u8 get_spectre_bhb_loop_value(void)
{
return max_bhb_k;
}
static void this_cpu_set_vectors(enum arm64_bp_harden_el1_vectors slot)
{
const char *v = arm64_get_bp_hardening_vector(slot);
@@ -1016,7 +1022,7 @@ static void this_cpu_set_vectors(enum arm64_bp_harden_el1_vectors slot)
isb();
}
static bool __read_mostly __nospectre_bhb;
bool __read_mostly __nospectre_bhb;
static int __init parse_spectre_bhb_param(char *str)
{
__nospectre_bhb = true;
@@ -1094,6 +1100,11 @@ void spectre_bhb_enable_mitigation(const struct arm64_cpu_capabilities *entry)
update_mitigation_state(&spectre_bhb_state, state);
}
bool is_spectre_bhb_fw_mitigated(void)
{
return test_bit(BHB_FW, &system_bhb_mitigations);
}
/* Patched to NOP when enabled */
void noinstr spectre_bhb_patch_loop_mitigation_enable(struct alt_instr *alt,
__le32 *origptr,

View File

@@ -235,6 +235,8 @@ static inline void __deactivate_traps_mpam(void)
static inline void __activate_traps_common(struct kvm_vcpu *vcpu)
{
struct kvm_cpu_context *hctxt = host_data_ptr(host_ctxt);
/* Trap on AArch32 cp15 c15 (impdef sysregs) accesses (EL1 or EL0) */
write_sysreg(1 << 15, hstr_el2);
@@ -245,11 +247,8 @@ static inline void __activate_traps_common(struct kvm_vcpu *vcpu)
* EL1 instead of being trapped to EL2.
*/
if (system_supports_pmuv3()) {
struct kvm_cpu_context *hctxt;
write_sysreg(0, pmselr_el0);
hctxt = host_data_ptr(host_ctxt);
ctxt_sys_reg(hctxt, PMUSERENR_EL0) = read_sysreg(pmuserenr_el0);
write_sysreg(ARMV8_PMU_USERENR_MASK, pmuserenr_el0);
vcpu_set_flag(vcpu, PMUSERENR_ON_CPU);
@@ -269,6 +268,7 @@ static inline void __activate_traps_common(struct kvm_vcpu *vcpu)
hcrx &= ~clr;
}
ctxt_sys_reg(hctxt, HCRX_EL2) = read_sysreg_s(SYS_HCRX_EL2);
write_sysreg_s(hcrx, SYS_HCRX_EL2);
}
@@ -278,19 +278,18 @@ static inline void __activate_traps_common(struct kvm_vcpu *vcpu)
static inline void __deactivate_traps_common(struct kvm_vcpu *vcpu)
{
struct kvm_cpu_context *hctxt = host_data_ptr(host_ctxt);
write_sysreg(*host_data_ptr(host_debug_state.mdcr_el2), mdcr_el2);
write_sysreg(0, hstr_el2);
if (system_supports_pmuv3()) {
struct kvm_cpu_context *hctxt;
hctxt = host_data_ptr(host_ctxt);
write_sysreg(ctxt_sys_reg(hctxt, PMUSERENR_EL0), pmuserenr_el0);
vcpu_clear_flag(vcpu, PMUSERENR_ON_CPU);
}
if (cpus_have_final_cap(ARM64_HAS_HCX))
write_sysreg_s(HCRX_HOST_FLAGS, SYS_HCRX_EL2);
write_sysreg_s(ctxt_sys_reg(hctxt, HCRX_EL2), SYS_HCRX_EL2);
__deactivate_traps_hfgxtr(vcpu);
__deactivate_traps_mpam();

View File

@@ -503,7 +503,7 @@ int host_stage2_set_owner_locked(phys_addr_t addr, u64 size, u8 owner_id)
{
int ret;
if (!addr_is_memory(addr))
if (!range_is_memory(addr, addr + size))
return -EPERM;
ret = host_stage2_try(kvm_pgtable_stage2_set_owner, &host_mmu.pgt,

View File

@@ -429,23 +429,27 @@ u64 __vgic_v3_get_gic_config(void)
/*
* To check whether we have a MMIO-based (GICv2 compatible)
* CPU interface, we need to disable the system register
* view. To do that safely, we have to prevent any interrupt
* from firing (which would be deadly).
* view.
*
* Note that this only makes sense on VHE, as interrupts are
* already masked for nVHE as part of the exception entry to
* EL2.
*/
if (has_vhe())
flags = local_daif_save();
/*
* Table 11-2 "Permitted ICC_SRE_ELx.SRE settings" indicates
* that to be able to set ICC_SRE_EL1.SRE to 0, all the
* interrupt overrides must be set. You've got to love this.
*
* As we always run VHE with HCR_xMO set, no extra xMO
* manipulation is required in that case.
*
* To safely disable SRE, we have to prevent any interrupt
* from firing (which would be deadly). This only makes sense
* on VHE, as interrupts are already masked for nVHE as part
* of the exception entry to EL2.
*/
sysreg_clear_set(hcr_el2, 0, HCR_AMO | HCR_FMO | HCR_IMO);
isb();
if (has_vhe()) {
flags = local_daif_save();
} else {
sysreg_clear_set(hcr_el2, 0, HCR_AMO | HCR_FMO | HCR_IMO);
isb();
}
write_gicreg(0, ICC_SRE_EL1);
isb();
@@ -453,11 +457,13 @@ u64 __vgic_v3_get_gic_config(void)
write_gicreg(sre, ICC_SRE_EL1);
isb();
sysreg_clear_set(hcr_el2, HCR_AMO | HCR_FMO | HCR_IMO, 0);
isb();
if (has_vhe())
if (has_vhe()) {
local_daif_restore(flags);
} else {
sysreg_clear_set(hcr_el2, HCR_AMO | HCR_FMO | HCR_IMO, 0);
isb();
}
val = (val & ICC_SRE_EL1_SRE) ? 0 : (1ULL << 63);
val |= read_gicreg(ICH_VTR_EL2);

View File

@@ -1501,6 +1501,11 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
return -EFAULT;
}
if (!is_protected_kvm_enabled())
memcache = &vcpu->arch.mmu_page_cache;
else
memcache = &vcpu->arch.pkvm_memcache;
/*
* Permission faults just need to update the existing leaf entry,
* and so normally don't require allocations from the memcache. The
@@ -1510,13 +1515,11 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
if (!fault_is_perm || (logging_active && write_fault)) {
int min_pages = kvm_mmu_cache_min_pages(vcpu->arch.hw_mmu);
if (!is_protected_kvm_enabled()) {
memcache = &vcpu->arch.mmu_page_cache;
if (!is_protected_kvm_enabled())
ret = kvm_mmu_topup_memory_cache(memcache, min_pages);
} else {
memcache = &vcpu->arch.pkvm_memcache;
else
ret = topup_hyp_memcache(memcache, min_pages);
}
if (ret)
return ret;
}

View File

@@ -1945,6 +1945,12 @@ static int set_id_aa64pfr0_el1(struct kvm_vcpu *vcpu,
if ((hw_val & mpam_mask) == (user_val & mpam_mask))
user_val &= ~ID_AA64PFR0_EL1_MPAM_MASK;
/* Fail the guest's request to disable the AA64 ISA at EL{0,1,2} */
if (!FIELD_GET(ID_AA64PFR0_EL1_EL0, user_val) ||
!FIELD_GET(ID_AA64PFR0_EL1_EL1, user_val) ||
(vcpu_has_nv(vcpu) && !FIELD_GET(ID_AA64PFR0_EL1_EL2, user_val)))
return -EINVAL;
return set_id_reg(vcpu, rd, user_val);
}

View File

@@ -5,6 +5,7 @@
*
* Copyright (C) 2014-2016 Zi Shen Lim <zlim.lnx@gmail.com>
*/
#include <linux/bitfield.h>
#include <linux/bitops.h>
#include <linux/bug.h>
#include <linux/printk.h>
@@ -1500,43 +1501,41 @@ u32 aarch64_insn_gen_extr(enum aarch64_insn_variant variant,
return aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RM, insn, Rm);
}
static u32 __get_barrier_crm_val(enum aarch64_insn_mb_type type)
{
switch (type) {
case AARCH64_INSN_MB_SY:
return 0xf;
case AARCH64_INSN_MB_ST:
return 0xe;
case AARCH64_INSN_MB_LD:
return 0xd;
case AARCH64_INSN_MB_ISH:
return 0xb;
case AARCH64_INSN_MB_ISHST:
return 0xa;
case AARCH64_INSN_MB_ISHLD:
return 0x9;
case AARCH64_INSN_MB_NSH:
return 0x7;
case AARCH64_INSN_MB_NSHST:
return 0x6;
case AARCH64_INSN_MB_NSHLD:
return 0x5;
default:
pr_err("%s: unknown barrier type %d\n", __func__, type);
return AARCH64_BREAK_FAULT;
}
}
u32 aarch64_insn_gen_dmb(enum aarch64_insn_mb_type type)
{
u32 opt;
u32 insn;
switch (type) {
case AARCH64_INSN_MB_SY:
opt = 0xf;
break;
case AARCH64_INSN_MB_ST:
opt = 0xe;
break;
case AARCH64_INSN_MB_LD:
opt = 0xd;
break;
case AARCH64_INSN_MB_ISH:
opt = 0xb;
break;
case AARCH64_INSN_MB_ISHST:
opt = 0xa;
break;
case AARCH64_INSN_MB_ISHLD:
opt = 0x9;
break;
case AARCH64_INSN_MB_NSH:
opt = 0x7;
break;
case AARCH64_INSN_MB_NSHST:
opt = 0x6;
break;
case AARCH64_INSN_MB_NSHLD:
opt = 0x5;
break;
default:
pr_err("%s: unknown dmb type %d\n", __func__, type);
opt = __get_barrier_crm_val(type);
if (opt == AARCH64_BREAK_FAULT)
return AARCH64_BREAK_FAULT;
}
insn = aarch64_insn_get_dmb_value();
insn &= ~GENMASK(11, 8);
@@ -1545,6 +1544,21 @@ u32 aarch64_insn_gen_dmb(enum aarch64_insn_mb_type type)
return insn;
}
u32 aarch64_insn_gen_dsb(enum aarch64_insn_mb_type type)
{
u32 opt, insn;
opt = __get_barrier_crm_val(type);
if (opt == AARCH64_BREAK_FAULT)
return AARCH64_BREAK_FAULT;
insn = aarch64_insn_get_dsb_base_value();
insn &= ~GENMASK(11, 8);
insn |= (opt << 8);
return insn;
}
u32 aarch64_insn_gen_mrs(enum aarch64_insn_register result,
enum aarch64_insn_system_register sysreg)
{

View File

@@ -7,6 +7,7 @@
#define pr_fmt(fmt) "bpf_jit: " fmt
#include <linux/arm-smccc.h>
#include <linux/bitfield.h>
#include <linux/bpf.h>
#include <linux/filter.h>
@@ -17,6 +18,7 @@
#include <asm/asm-extable.h>
#include <asm/byteorder.h>
#include <asm/cacheflush.h>
#include <asm/cpufeature.h>
#include <asm/debug-monitors.h>
#include <asm/insn.h>
#include <asm/text-patching.h>
@@ -939,7 +941,51 @@ static void build_plt(struct jit_ctx *ctx)
plt->target = (u64)&dummy_tramp;
}
static void build_epilogue(struct jit_ctx *ctx)
/* Clobbers BPF registers 1-4, aka x0-x3 */
static void __maybe_unused build_bhb_mitigation(struct jit_ctx *ctx)
{
const u8 r1 = bpf2a64[BPF_REG_1]; /* aka x0 */
u8 k = get_spectre_bhb_loop_value();
if (!IS_ENABLED(CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY) ||
cpu_mitigations_off() || __nospectre_bhb ||
arm64_get_spectre_v2_state() == SPECTRE_VULNERABLE)
return;
if (capable(CAP_SYS_ADMIN))
return;
if (supports_clearbhb(SCOPE_SYSTEM)) {
emit(aarch64_insn_gen_hint(AARCH64_INSN_HINT_CLEARBHB), ctx);
return;
}
if (k) {
emit_a64_mov_i64(r1, k, ctx);
emit(A64_B(1), ctx);
emit(A64_SUBS_I(true, r1, r1, 1), ctx);
emit(A64_B_(A64_COND_NE, -2), ctx);
emit(aarch64_insn_gen_dsb(AARCH64_INSN_MB_ISH), ctx);
emit(aarch64_insn_get_isb_value(), ctx);
}
if (is_spectre_bhb_fw_mitigated()) {
emit(A64_ORR_I(false, r1, AARCH64_INSN_REG_ZR,
ARM_SMCCC_ARCH_WORKAROUND_3), ctx);
switch (arm_smccc_1_1_get_conduit()) {
case SMCCC_CONDUIT_HVC:
emit(aarch64_insn_get_hvc_value(), ctx);
break;
case SMCCC_CONDUIT_SMC:
emit(aarch64_insn_get_smc_value(), ctx);
break;
default:
pr_err_once("Firmware mitigation enabled with unknown conduit\n");
}
}
}
static void build_epilogue(struct jit_ctx *ctx, bool was_classic)
{
const u8 r0 = bpf2a64[BPF_REG_0];
const u8 ptr = bpf2a64[TCCNT_PTR];
@@ -952,10 +998,13 @@ static void build_epilogue(struct jit_ctx *ctx)
emit(A64_POP(A64_ZR, ptr, A64_SP), ctx);
if (was_classic)
build_bhb_mitigation(ctx);
/* Restore FP/LR registers */
emit(A64_POP(A64_FP, A64_LR, A64_SP), ctx);
/* Set return value */
/* Move the return value from bpf:r0 (aka x7) to x0 */
emit(A64_MOV(1, A64_R(0), r0), ctx);
/* Authenticate lr */
@@ -1898,7 +1947,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
}
ctx.epilogue_offset = ctx.idx;
build_epilogue(&ctx);
build_epilogue(&ctx, was_classic);
build_plt(&ctx);
extable_align = __alignof__(struct exception_table_entry);
@@ -1961,7 +2010,7 @@ skip_init_ctx:
goto out_free_hdr;
}
build_epilogue(&ctx);
build_epilogue(&ctx, was_classic);
build_plt(&ctx);
/* Extra pass to validate JITed code. */

View File

@@ -55,7 +55,7 @@ static inline void instruction_pointer_set(struct pt_regs *regs, unsigned long v
/* Query offset/name of register from its name/offset */
extern int regs_query_register_offset(const char *name);
#define MAX_REG_OFFSET (offsetof(struct pt_regs, __last))
#define MAX_REG_OFFSET (offsetof(struct pt_regs, __last) - sizeof(unsigned long))
/**
* regs_get_register() - get register value from its offset

View File

@@ -15,7 +15,6 @@ typedef u32 uprobe_opcode_t;
#define UPROBE_XOLBP_INSN __emit_break(BRK_UPROBE_XOLBP)
struct arch_uprobe {
unsigned long resume_era;
u32 insn[2];
u32 ixol[2];
bool simulate;

View File

@@ -16,6 +16,7 @@
#include <asm/stackframe.h>
#include <asm/thread_info.h>
.section .cpuidle.text, "ax"
.align 5
SYM_FUNC_START(__arch_cpu_idle)
/* start of idle interrupt region */
@@ -31,14 +32,16 @@ SYM_FUNC_START(__arch_cpu_idle)
*/
idle 0
/* end of idle interrupt region */
1: jr ra
idle_exit:
jr ra
SYM_FUNC_END(__arch_cpu_idle)
.previous
SYM_CODE_START(handle_vint)
UNWIND_HINT_UNDEFINED
BACKUP_T0T1
SAVE_ALL
la_abs t1, 1b
la_abs t1, idle_exit
LONG_L t0, sp, PT_ERA
/* 3 instructions idle interrupt region */
ori t0, t0, 0b1100

View File

@@ -18,11 +18,28 @@ static unsigned int euen_mask = CSR_EUEN_FPEN;
static DEFINE_PER_CPU(bool, in_kernel_fpu);
static DEFINE_PER_CPU(unsigned int, euen_current);
static inline void fpregs_lock(void)
{
if (IS_ENABLED(CONFIG_PREEMPT_RT))
preempt_disable();
else
local_bh_disable();
}
static inline void fpregs_unlock(void)
{
if (IS_ENABLED(CONFIG_PREEMPT_RT))
preempt_enable();
else
local_bh_enable();
}
void kernel_fpu_begin(void)
{
unsigned int *euen_curr;
preempt_disable();
if (!irqs_disabled())
fpregs_lock();
WARN_ON(this_cpu_read(in_kernel_fpu));
@@ -73,7 +90,8 @@ void kernel_fpu_end(void)
this_cpu_write(in_kernel_fpu, false);
preempt_enable();
if (!irqs_disabled())
fpregs_unlock();
}
EXPORT_SYMBOL_GPL(kernel_fpu_end);

View File

@@ -111,7 +111,7 @@ static unsigned long __init get_loops_per_jiffy(void)
return lpj;
}
static long init_offset __nosavedata;
static long init_offset;
void save_counter(void)
{

View File

@@ -42,7 +42,6 @@ int arch_uprobe_pre_xol(struct arch_uprobe *auprobe, struct pt_regs *regs)
utask->autask.saved_trap_nr = current->thread.trap_nr;
current->thread.trap_nr = UPROBE_TRAP_NR;
instruction_pointer_set(regs, utask->xol_vaddr);
user_enable_single_step(current);
return 0;
}
@@ -53,13 +52,7 @@ int arch_uprobe_post_xol(struct arch_uprobe *auprobe, struct pt_regs *regs)
WARN_ON_ONCE(current->thread.trap_nr != UPROBE_TRAP_NR);
current->thread.trap_nr = utask->autask.saved_trap_nr;
if (auprobe->simulate)
instruction_pointer_set(regs, auprobe->resume_era);
else
instruction_pointer_set(regs, utask->vaddr + LOONGARCH_INSN_SIZE);
user_disable_single_step(current);
instruction_pointer_set(regs, utask->vaddr + LOONGARCH_INSN_SIZE);
return 0;
}
@@ -70,7 +63,6 @@ void arch_uprobe_abort_xol(struct arch_uprobe *auprobe, struct pt_regs *regs)
current->thread.trap_nr = utask->autask.saved_trap_nr;
instruction_pointer_set(regs, utask->vaddr);
user_disable_single_step(current);
}
bool arch_uprobe_xol_was_trapped(struct task_struct *t)
@@ -90,7 +82,6 @@ bool arch_uprobe_skip_sstep(struct arch_uprobe *auprobe, struct pt_regs *regs)
insn.word = auprobe->insn[0];
arch_simulate_insn(insn, regs);
auprobe->resume_era = regs->csr_era;
return true;
}

View File

@@ -2,6 +2,7 @@
#include <asm/fpu.h>
#include <asm/loongson.h>
#include <asm/sections.h>
#include <asm/time.h>
#include <asm/tlbflush.h>
#include <linux/suspend.h>
@@ -14,6 +15,7 @@ struct pt_regs saved_regs;
void save_processor_state(void)
{
save_counter();
saved_crmd = csr_read32(LOONGARCH_CSR_CRMD);
saved_prmd = csr_read32(LOONGARCH_CSR_PRMD);
saved_euen = csr_read32(LOONGARCH_CSR_EUEN);
@@ -26,6 +28,7 @@ void save_processor_state(void)
void restore_processor_state(void)
{
sync_counter();
csr_write32(saved_crmd, LOONGARCH_CSR_CRMD);
csr_write32(saved_prmd, LOONGARCH_CSR_PRMD);
csr_write32(saved_euen, LOONGARCH_CSR_EUEN);

View File

@@ -6,11 +6,10 @@
#include <linux/linkage.h>
extern void (*cpu_wait)(void);
extern void r4k_wait(void);
extern asmlinkage void __r4k_wait(void);
extern asmlinkage void r4k_wait(void);
extern void r4k_wait_irqoff(void);
static inline int using_rollback_handler(void)
static inline int using_skipover_handler(void)
{
return cpu_wait == r4k_wait;
}

View File

@@ -65,7 +65,8 @@ static inline void instruction_pointer_set(struct pt_regs *regs,
/* Query offset/name of register from its name/offset */
extern int regs_query_register_offset(const char *name);
#define MAX_REG_OFFSET (offsetof(struct pt_regs, __last))
#define MAX_REG_OFFSET \
(offsetof(struct pt_regs, __last) - sizeof(unsigned long))
/**
* regs_get_register() - get register value from its offset

View File

@@ -104,48 +104,59 @@ handle_vcei:
__FINIT
.align 5 /* 32 byte rollback region */
LEAF(__r4k_wait)
.set push
.set noreorder
/* start of rollback region */
LONG_L t0, TI_FLAGS($28)
nop
andi t0, _TIF_NEED_RESCHED
bnez t0, 1f
nop
nop
nop
#ifdef CONFIG_CPU_MICROMIPS
nop
nop
nop
nop
#endif
.section .cpuidle.text,"ax"
/* Align to 32 bytes for the maximum idle interrupt region size. */
.align 5
LEAF(r4k_wait)
/* Keep the ISA bit clear for calculations on local labels here. */
0: .fill 0
/* Start of idle interrupt region. */
local_irq_enable
/*
* If an interrupt lands here, before going idle on the next
* instruction, we must *NOT* go idle since the interrupt could
* have set TIF_NEED_RESCHED or caused a timer to need resched.
* Fall through -- see skipover_handler below -- and have the
* idle loop take care of things.
*/
1: .fill 0
/* The R2 EI/EHB sequence takes 8 bytes, otherwise pad up. */
.if 1b - 0b > 32
.error "overlong idle interrupt region"
.elseif 1b - 0b > 8
.align 4
.endif
2: .fill 0
.equ r4k_wait_idle_size, 2b - 0b
/* End of idle interrupt region; size has to be a power of 2. */
.set MIPS_ISA_ARCH_LEVEL_RAW
r4k_wait_insn:
wait
/* end of rollback region (the region size must be power of two) */
1:
r4k_wait_exit:
.set mips0
local_irq_disable
jr ra
nop
.set pop
END(__r4k_wait)
END(r4k_wait)
.previous
.macro BUILD_ROLLBACK_PROLOGUE handler
FEXPORT(rollback_\handler)
.macro BUILD_SKIPOVER_PROLOGUE handler
FEXPORT(skipover_\handler)
.set push
.set noat
MFC0 k0, CP0_EPC
PTR_LA k1, __r4k_wait
ori k0, 0x1f /* 32 byte rollback region */
xori k0, 0x1f
/* Subtract/add 2 to let the ISA bit propagate through the mask. */
PTR_LA k1, r4k_wait_insn - 2
ori k0, r4k_wait_idle_size - 2
.set noreorder
bne k0, k1, \handler
PTR_ADDIU k0, r4k_wait_exit - r4k_wait_insn + 2
.set reorder
MTC0 k0, CP0_EPC
.set pop
.endm
.align 5
BUILD_ROLLBACK_PROLOGUE handle_int
BUILD_SKIPOVER_PROLOGUE handle_int
NESTED(handle_int, PT_SIZE, sp)
.cfi_signal_frame
#ifdef CONFIG_TRACE_IRQFLAGS
@@ -265,7 +276,7 @@ NESTED(except_vec_ejtag_debug, 0, sp)
* This prototype is copied to ebase + n*IntCtl.VS and patched
* to invoke the handler
*/
BUILD_ROLLBACK_PROLOGUE except_vec_vi
BUILD_SKIPOVER_PROLOGUE except_vec_vi
NESTED(except_vec_vi, 0, sp)
SAVE_SOME docfi=1
SAVE_AT docfi=1

View File

@@ -35,13 +35,6 @@ static void __cpuidle r3081_wait(void)
write_c0_conf(cfg | R30XX_CONF_HALT);
}
void __cpuidle r4k_wait(void)
{
raw_local_irq_enable();
__r4k_wait();
raw_local_irq_disable();
}
/*
* This variant is preferable as it allows testing need_resched and going to
* sleep depending on the outcome atomically. Unfortunately the "It is

View File

@@ -332,6 +332,8 @@ static void __init cps_prepare_cpus(unsigned int max_cpus)
mips_cps_cluster_bootcfg = kcalloc(nclusters,
sizeof(*mips_cps_cluster_bootcfg),
GFP_KERNEL);
if (!mips_cps_cluster_bootcfg)
goto err_out;
if (nclusters > 1)
mips_cm_update_property();
@@ -348,6 +350,8 @@ static void __init cps_prepare_cpus(unsigned int max_cpus)
mips_cps_cluster_bootcfg[cl].core_power =
kcalloc(BITS_TO_LONGS(ncores), sizeof(unsigned long),
GFP_KERNEL);
if (!mips_cps_cluster_bootcfg[cl].core_power)
goto err_out;
/* Allocate VPE boot configuration structs */
for (c = 0; c < ncores; c++) {

View File

@@ -77,7 +77,7 @@
#include "access-helper.h"
extern void check_wait(void);
extern asmlinkage void rollback_handle_int(void);
extern asmlinkage void skipover_handle_int(void);
extern asmlinkage void handle_int(void);
extern asmlinkage void handle_adel(void);
extern asmlinkage void handle_ades(void);
@@ -2066,7 +2066,7 @@ void *set_vi_handler(int n, vi_handler_t addr)
{
extern const u8 except_vec_vi[];
extern const u8 except_vec_vi_ori[], except_vec_vi_end[];
extern const u8 rollback_except_vec_vi[];
extern const u8 skipover_except_vec_vi[];
unsigned long handler;
unsigned long old_handler = vi_handlers[n];
int srssets = current_cpu_data.srsets;
@@ -2095,7 +2095,7 @@ void *set_vi_handler(int n, vi_handler_t addr)
change_c0_srsmap(0xf << n*4, 0 << n*4);
}
vec_start = using_rollback_handler() ? rollback_except_vec_vi :
vec_start = using_skipover_handler() ? skipover_except_vec_vi :
except_vec_vi;
#if defined(CONFIG_CPU_MICROMIPS) || defined(CONFIG_CPU_BIG_ENDIAN)
ori_offset = except_vec_vi_ori - vec_start + 2;
@@ -2426,8 +2426,8 @@ void __init trap_init(void)
if (board_be_init)
board_be_init();
set_except_vector(EXCCODE_INT, using_rollback_handler() ?
rollback_handle_int : handle_int);
set_except_vector(EXCCODE_INT, using_skipover_handler() ?
skipover_handle_int : handle_int);
set_except_vector(EXCCODE_MOD, handle_tlbm);
set_except_vector(EXCCODE_TLBL, handle_tlbl);
set_except_vector(EXCCODE_TLBS, handle_tlbs);

View File

@@ -341,7 +341,7 @@
1024 1024 1024 1024>;
snps,priority = <0 1 2 3 4 5 6 7>;
snps,dma-masters = <2>;
snps,data-width = <4>;
snps,data-width = <2>;
status = "disabled";
};

View File

@@ -275,6 +275,9 @@ long set_tagged_addr_ctrl(struct task_struct *task, unsigned long arg)
unsigned long pmm;
u8 pmlen;
if (!riscv_has_extension_unlikely(RISCV_ISA_EXT_SUPM))
return -EINVAL;
if (is_compat_thread(ti))
return -EINVAL;
@@ -330,6 +333,9 @@ long get_tagged_addr_ctrl(struct task_struct *task)
struct thread_info *ti = task_thread_info(task);
long ret = 0;
if (!riscv_has_extension_unlikely(RISCV_ISA_EXT_SUPM))
return -EINVAL;
if (is_compat_thread(ti))
return -EINVAL;

View File

@@ -198,47 +198,57 @@ asmlinkage __visible __trap_section void do_trap_insn_illegal(struct pt_regs *re
DO_ERROR_INFO(do_trap_load_fault,
SIGSEGV, SEGV_ACCERR, "load access fault");
asmlinkage __visible __trap_section void do_trap_load_misaligned(struct pt_regs *regs)
enum misaligned_access_type {
MISALIGNED_STORE,
MISALIGNED_LOAD,
};
static const struct {
const char *type_str;
int (*handler)(struct pt_regs *regs);
} misaligned_handler[] = {
[MISALIGNED_STORE] = {
.type_str = "Oops - store (or AMO) address misaligned",
.handler = handle_misaligned_store,
},
[MISALIGNED_LOAD] = {
.type_str = "Oops - load address misaligned",
.handler = handle_misaligned_load,
},
};
static void do_trap_misaligned(struct pt_regs *regs, enum misaligned_access_type type)
{
irqentry_state_t state;
if (user_mode(regs)) {
irqentry_enter_from_user_mode(regs);
local_irq_enable();
} else {
state = irqentry_nmi_enter(regs);
}
if (handle_misaligned_load(regs))
do_trap_error(regs, SIGBUS, BUS_ADRALN, regs->epc,
"Oops - load address misaligned");
if (misaligned_handler[type].handler(regs))
do_trap_error(regs, SIGBUS, BUS_ADRALN, regs->epc,
misaligned_handler[type].type_str);
if (user_mode(regs)) {
local_irq_disable();
irqentry_exit_to_user_mode(regs);
} else {
irqentry_state_t state = irqentry_nmi_enter(regs);
if (handle_misaligned_load(regs))
do_trap_error(regs, SIGBUS, BUS_ADRALN, regs->epc,
"Oops - load address misaligned");
irqentry_nmi_exit(regs, state);
}
}
asmlinkage __visible __trap_section void do_trap_load_misaligned(struct pt_regs *regs)
{
do_trap_misaligned(regs, MISALIGNED_LOAD);
}
asmlinkage __visible __trap_section void do_trap_store_misaligned(struct pt_regs *regs)
{
if (user_mode(regs)) {
irqentry_enter_from_user_mode(regs);
if (handle_misaligned_store(regs))
do_trap_error(regs, SIGBUS, BUS_ADRALN, regs->epc,
"Oops - store (or AMO) address misaligned");
irqentry_exit_to_user_mode(regs);
} else {
irqentry_state_t state = irqentry_nmi_enter(regs);
if (handle_misaligned_store(regs))
do_trap_error(regs, SIGBUS, BUS_ADRALN, regs->epc,
"Oops - store (or AMO) address misaligned");
irqentry_nmi_exit(regs, state);
}
do_trap_misaligned(regs, MISALIGNED_STORE);
}
DO_ERROR_INFO(do_trap_store_fault,
SIGSEGV, SEGV_ACCERR, "store (or AMO) access fault");
DO_ERROR_INFO(do_trap_ecall_s,

View File

@@ -88,6 +88,13 @@
#define INSN_MATCH_C_FSWSP 0xe002
#define INSN_MASK_C_FSWSP 0xe003
#define INSN_MATCH_C_LHU 0x8400
#define INSN_MASK_C_LHU 0xfc43
#define INSN_MATCH_C_LH 0x8440
#define INSN_MASK_C_LH 0xfc43
#define INSN_MATCH_C_SH 0x8c00
#define INSN_MASK_C_SH 0xfc43
#define INSN_LEN(insn) ((((insn) & 0x3) < 0x3) ? 2 : 4)
#if defined(CONFIG_64BIT)
@@ -268,7 +275,7 @@ static unsigned long get_f32_rs(unsigned long insn, u8 fp_reg_offset,
int __ret; \
\
if (user_mode(regs)) { \
__ret = __get_user(insn, (type __user *) insn_addr); \
__ret = get_user(insn, (type __user *) insn_addr); \
} else { \
insn = *(type *)insn_addr; \
__ret = 0; \
@@ -431,6 +438,13 @@ static int handle_scalar_misaligned_load(struct pt_regs *regs)
fp = 1;
len = 4;
#endif
} else if ((insn & INSN_MASK_C_LHU) == INSN_MATCH_C_LHU) {
len = 2;
insn = RVC_RS2S(insn) << SH_RD;
} else if ((insn & INSN_MASK_C_LH) == INSN_MATCH_C_LH) {
len = 2;
shift = 8 * (sizeof(ulong) - len);
insn = RVC_RS2S(insn) << SH_RD;
} else {
regs->epc = epc;
return -1;
@@ -530,6 +544,9 @@ static int handle_scalar_misaligned_store(struct pt_regs *regs)
len = 4;
val.data_ulong = GET_F32_RS2C(insn, regs);
#endif
} else if ((insn & INSN_MASK_C_SH) == INSN_MATCH_C_SH) {
len = 2;
val.data_ulong = GET_RS2S(insn, regs);
} else {
regs->epc = epc;
return -1;

View File

@@ -77,6 +77,8 @@ static void kvm_riscv_reset_vcpu(struct kvm_vcpu *vcpu)
memcpy(cntx, reset_cntx, sizeof(*cntx));
spin_unlock(&vcpu->arch.reset_cntx_lock);
memset(&vcpu->arch.smstateen_csr, 0, sizeof(vcpu->arch.smstateen_csr));
kvm_riscv_vcpu_fp_reset(vcpu);
kvm_riscv_vcpu_vector_reset(vcpu);

View File

@@ -38,7 +38,6 @@ CONFIG_USER_NS=y
CONFIG_CHECKPOINT_RESTORE=y
CONFIG_SCHED_AUTOGROUP=y
CONFIG_EXPERT=y
# CONFIG_SYSFS_SYSCALL is not set
CONFIG_PROFILING=y
CONFIG_KEXEC=y
CONFIG_KEXEC_FILE=y
@@ -92,7 +91,6 @@ CONFIG_UNIXWARE_DISKLABEL=y
CONFIG_IOSCHED_BFQ=y
CONFIG_BINFMT_MISC=m
CONFIG_ZSWAP=y
CONFIG_ZSMALLOC=y
CONFIG_ZSMALLOC_STAT=y
CONFIG_SLAB_BUCKETS=y
CONFIG_SLUB_STATS=y
@@ -395,6 +393,9 @@ CONFIG_CLS_U32_MARK=y
CONFIG_NET_CLS_FLOW=m
CONFIG_NET_CLS_CGROUP=y
CONFIG_NET_CLS_BPF=m
CONFIG_NET_CLS_FLOWER=m
CONFIG_NET_CLS_MATCHALL=m
CONFIG_NET_EMATCH=y
CONFIG_NET_CLS_ACT=y
CONFIG_NET_ACT_POLICE=m
CONFIG_NET_ACT_GACT=m
@@ -405,6 +406,9 @@ CONFIG_NET_ACT_PEDIT=m
CONFIG_NET_ACT_SIMP=m
CONFIG_NET_ACT_SKBEDIT=m
CONFIG_NET_ACT_CSUM=m
CONFIG_NET_ACT_VLAN=m
CONFIG_NET_ACT_TUNNEL_KEY=m
CONFIG_NET_ACT_CT=m
CONFIG_NET_ACT_GATE=m
CONFIG_NET_TC_SKB_EXT=y
CONFIG_DNS_RESOLVER=y
@@ -628,8 +632,16 @@ CONFIG_VIRTIO_PCI=m
CONFIG_VIRTIO_BALLOON=m
CONFIG_VIRTIO_MEM=m
CONFIG_VIRTIO_INPUT=y
CONFIG_VDPA=m
CONFIG_VDPA_SIM=m
CONFIG_VDPA_SIM_NET=m
CONFIG_VDPA_SIM_BLOCK=m
CONFIG_VDPA_USER=m
CONFIG_MLX5_VDPA_NET=m
CONFIG_VP_VDPA=m
CONFIG_VHOST_NET=m
CONFIG_VHOST_VSOCK=m
CONFIG_VHOST_VDPA=m
CONFIG_EXT4_FS=y
CONFIG_EXT4_FS_POSIX_ACL=y
CONFIG_EXT4_FS_SECURITY=y
@@ -654,7 +666,6 @@ CONFIG_NILFS2_FS=m
CONFIG_BCACHEFS_FS=y
CONFIG_BCACHEFS_QUOTA=y
CONFIG_BCACHEFS_POSIX_ACL=y
CONFIG_FS_DAX=y
CONFIG_EXPORTFS_BLOCK_OPS=y
CONFIG_FS_ENCRYPTION=y
CONFIG_FS_VERITY=y
@@ -724,11 +735,10 @@ CONFIG_NLS_UTF8=m
CONFIG_DLM=m
CONFIG_UNICODE=y
CONFIG_PERSISTENT_KEYRINGS=y
CONFIG_BIG_KEYS=y
CONFIG_ENCRYPTED_KEYS=m
CONFIG_KEY_NOTIFICATIONS=y
CONFIG_SECURITY=y
CONFIG_HARDENED_USERCOPY=y
CONFIG_FORTIFY_SOURCE=y
CONFIG_SECURITY_SELINUX=y
CONFIG_SECURITY_SELINUX_BOOTPARAM=y
CONFIG_SECURITY_LOCKDOWN_LSM=y
@@ -741,6 +751,8 @@ CONFIG_IMA=y
CONFIG_IMA_DEFAULT_HASH_SHA256=y
CONFIG_IMA_WRITE_POLICY=y
CONFIG_IMA_APPRAISE=y
CONFIG_FORTIFY_SOURCE=y
CONFIG_HARDENED_USERCOPY=y
CONFIG_BUG_ON_DATA_CORRUPTION=y
CONFIG_CRYPTO_USER=m
# CONFIG_CRYPTO_MANAGER_DISABLE_TESTS is not set
@@ -756,7 +768,6 @@ CONFIG_CRYPTO_AES_TI=m
CONFIG_CRYPTO_ANUBIS=m
CONFIG_CRYPTO_ARIA=m
CONFIG_CRYPTO_BLOWFISH=m
CONFIG_CRYPTO_CAMELLIA=m
CONFIG_CRYPTO_CAST5=m
CONFIG_CRYPTO_CAST6=m
CONFIG_CRYPTO_DES=m
@@ -801,7 +812,6 @@ CONFIG_CRYPTO_SHA3_512_S390=m
CONFIG_CRYPTO_GHASH_S390=m
CONFIG_CRYPTO_AES_S390=m
CONFIG_CRYPTO_DES_S390=m
CONFIG_CRYPTO_CHACHA_S390=m
CONFIG_CRYPTO_HMAC_S390=m
CONFIG_ZCRYPT=m
CONFIG_PKEY=m
@@ -812,9 +822,9 @@ CONFIG_PKEY_UV=m
CONFIG_CRYPTO_PAES_S390=m
CONFIG_CRYPTO_DEV_VIRTIO=m
CONFIG_SYSTEM_BLACKLIST_KEYRING=y
CONFIG_CRYPTO_KRB5=m
CONFIG_CRYPTO_KRB5_SELFTESTS=y
CONFIG_CORDIC=m
CONFIG_CRYPTO_LIB_CURVE25519=m
CONFIG_CRYPTO_LIB_CHACHA20POLY1305=m
CONFIG_RANDOM32_SELFTEST=y
CONFIG_XZ_DEC_MICROLZMA=y
CONFIG_DMA_CMA=y

View File

@@ -36,7 +36,6 @@ CONFIG_USER_NS=y
CONFIG_CHECKPOINT_RESTORE=y
CONFIG_SCHED_AUTOGROUP=y
CONFIG_EXPERT=y
# CONFIG_SYSFS_SYSCALL is not set
CONFIG_PROFILING=y
CONFIG_KEXEC=y
CONFIG_KEXEC_FILE=y
@@ -86,7 +85,6 @@ CONFIG_UNIXWARE_DISKLABEL=y
CONFIG_IOSCHED_BFQ=y
CONFIG_BINFMT_MISC=m
CONFIG_ZSWAP=y
CONFIG_ZSMALLOC=y
CONFIG_ZSMALLOC_STAT=y
CONFIG_SLAB_BUCKETS=y
# CONFIG_COMPAT_BRK is not set
@@ -385,6 +383,9 @@ CONFIG_CLS_U32_MARK=y
CONFIG_NET_CLS_FLOW=m
CONFIG_NET_CLS_CGROUP=y
CONFIG_NET_CLS_BPF=m
CONFIG_NET_CLS_FLOWER=m
CONFIG_NET_CLS_MATCHALL=m
CONFIG_NET_EMATCH=y
CONFIG_NET_CLS_ACT=y
CONFIG_NET_ACT_POLICE=m
CONFIG_NET_ACT_GACT=m
@@ -395,6 +396,9 @@ CONFIG_NET_ACT_PEDIT=m
CONFIG_NET_ACT_SIMP=m
CONFIG_NET_ACT_SKBEDIT=m
CONFIG_NET_ACT_CSUM=m
CONFIG_NET_ACT_VLAN=m
CONFIG_NET_ACT_TUNNEL_KEY=m
CONFIG_NET_ACT_CT=m
CONFIG_NET_ACT_GATE=m
CONFIG_NET_TC_SKB_EXT=y
CONFIG_DNS_RESOLVER=y
@@ -618,8 +622,16 @@ CONFIG_VIRTIO_PCI=m
CONFIG_VIRTIO_BALLOON=m
CONFIG_VIRTIO_MEM=m
CONFIG_VIRTIO_INPUT=y
CONFIG_VDPA=m
CONFIG_VDPA_SIM=m
CONFIG_VDPA_SIM_NET=m
CONFIG_VDPA_SIM_BLOCK=m
CONFIG_VDPA_USER=m
CONFIG_MLX5_VDPA_NET=m
CONFIG_VP_VDPA=m
CONFIG_VHOST_NET=m
CONFIG_VHOST_VSOCK=m
CONFIG_VHOST_VDPA=m
CONFIG_EXT4_FS=y
CONFIG_EXT4_FS_POSIX_ACL=y
CONFIG_EXT4_FS_SECURITY=y
@@ -641,7 +653,6 @@ CONFIG_NILFS2_FS=m
CONFIG_BCACHEFS_FS=m
CONFIG_BCACHEFS_QUOTA=y
CONFIG_BCACHEFS_POSIX_ACL=y
CONFIG_FS_DAX=y
CONFIG_EXPORTFS_BLOCK_OPS=y
CONFIG_FS_ENCRYPTION=y
CONFIG_FS_VERITY=y
@@ -711,6 +722,7 @@ CONFIG_NLS_UTF8=m
CONFIG_DLM=m
CONFIG_UNICODE=y
CONFIG_PERSISTENT_KEYRINGS=y
CONFIG_BIG_KEYS=y
CONFIG_ENCRYPTED_KEYS=m
CONFIG_KEY_NOTIFICATIONS=y
CONFIG_SECURITY=y
@@ -742,7 +754,6 @@ CONFIG_CRYPTO_AES_TI=m
CONFIG_CRYPTO_ANUBIS=m
CONFIG_CRYPTO_ARIA=m
CONFIG_CRYPTO_BLOWFISH=m
CONFIG_CRYPTO_CAMELLIA=m
CONFIG_CRYPTO_CAST5=m
CONFIG_CRYPTO_CAST6=m
CONFIG_CRYPTO_DES=m
@@ -788,7 +799,6 @@ CONFIG_CRYPTO_SHA3_512_S390=m
CONFIG_CRYPTO_GHASH_S390=m
CONFIG_CRYPTO_AES_S390=m
CONFIG_CRYPTO_DES_S390=m
CONFIG_CRYPTO_CHACHA_S390=m
CONFIG_CRYPTO_HMAC_S390=m
CONFIG_ZCRYPT=m
CONFIG_PKEY=m
@@ -799,10 +809,10 @@ CONFIG_PKEY_UV=m
CONFIG_CRYPTO_PAES_S390=m
CONFIG_CRYPTO_DEV_VIRTIO=m
CONFIG_SYSTEM_BLACKLIST_KEYRING=y
CONFIG_CRYPTO_KRB5=m
CONFIG_CRYPTO_KRB5_SELFTESTS=y
CONFIG_CORDIC=m
CONFIG_PRIME_NUMBERS=m
CONFIG_CRYPTO_LIB_CURVE25519=m
CONFIG_CRYPTO_LIB_CHACHA20POLY1305=m
CONFIG_XZ_DEC_MICROLZMA=y
CONFIG_DMA_CMA=y
CONFIG_CMA_SIZE_MBYTES=0

View File

@@ -70,7 +70,6 @@ CONFIG_DEBUG_KERNEL=y
CONFIG_DEBUG_INFO_DWARF4=y
CONFIG_DEBUG_FS=y
CONFIG_PANIC_ON_OOPS=y
# CONFIG_SCHED_DEBUG is not set
CONFIG_RCU_CPU_STALL_TIMEOUT=60
# CONFIG_RCU_TRACE is not set
# CONFIG_FTRACE is not set

View File

@@ -602,7 +602,8 @@ SYM_CODE_START(stack_invalid)
stmg %r0,%r7,__PT_R0(%r11)
stmg %r8,%r9,__PT_PSW(%r11)
mvc __PT_R8(64,%r11),0(%r14)
stg %r10,__PT_ORIG_GPR2(%r11) # store last break to orig_gpr2
GET_LC %r2
mvc __PT_ORIG_GPR2(8,%r11),__LC_PGM_LAST_BREAK(%r2)
xc __SF_BACKCHAIN(8,%r15),__SF_BACKCHAIN(%r15)
lgr %r2,%r11 # pass pointer to pt_regs
jg kernel_stack_invalid

View File

@@ -428,6 +428,8 @@ static void __clp_add(struct clp_fh_list_entry *entry, void *data)
return;
}
zdev = zpci_create_device(entry->fid, entry->fh, entry->config_state);
if (IS_ERR(zdev))
return;
list_add_tail(&zdev->entry, scan_list);
}

View File

@@ -154,5 +154,6 @@ MRPROPER_FILES += $(HOST_DIR)/include/generated
archclean:
@find . \( -name '*.bb' -o -name '*.bbg' -o -name '*.da' \
-o -name '*.gcov' \) -type f -print | xargs rm -f
$(Q)$(MAKE) -f $(srctree)/Makefile ARCH=$(HEADER_ARCH) clean
export HEADER_ARCH SUBARCH USER_CFLAGS CFLAGS_NO_HARDENING DEV_NULL_PATH

View File

@@ -55,6 +55,7 @@ do { \
goto err_label; \
} \
*((type *)dst) = get_unaligned((type *)(src)); \
barrier(); \
current->thread.segv_continue = NULL; \
} while (0)
@@ -66,6 +67,7 @@ do { \
if (__faulted) \
goto err_label; \
put_unaligned(*((type *)src), (type *)(dst)); \
barrier(); \
current->thread.segv_continue = NULL; \
} while (0)

View File

@@ -225,20 +225,20 @@ unsigned long segv(struct faultinfo fi, unsigned long ip, int is_user,
panic("Failed to sync kernel TLBs: %d", err);
goto out;
}
else if (current->mm == NULL) {
if (current->pagefault_disabled) {
if (!mc) {
show_regs(container_of(regs, struct pt_regs, regs));
panic("Segfault with pagefaults disabled but no mcontext");
}
if (!current->thread.segv_continue) {
show_regs(container_of(regs, struct pt_regs, regs));
panic("Segfault without recovery target");
}
mc_set_rip(mc, current->thread.segv_continue);
current->thread.segv_continue = NULL;
goto out;
else if (current->pagefault_disabled) {
if (!mc) {
show_regs(container_of(regs, struct pt_regs, regs));
panic("Segfault with pagefaults disabled but no mcontext");
}
if (!current->thread.segv_continue) {
show_regs(container_of(regs, struct pt_regs, regs));
panic("Segfault without recovery target");
}
mc_set_rip(mc, current->thread.segv_continue);
current->thread.segv_continue = NULL;
goto out;
}
else if (current->mm == NULL) {
show_regs(container_of(regs, struct pt_regs, regs));
panic("Segfault with no mm");
}

View File

@@ -2368,6 +2368,7 @@ config STRICT_SIGALTSTACK_SIZE
config CFI_AUTO_DEFAULT
bool "Attempt to use FineIBT by default at boot time"
depends on FINEIBT
depends on !RUST || RUSTC_VERSION >= 108800
default y
help
Attempt to use FineIBT by default at boot time. If enabled,
@@ -2710,6 +2711,18 @@ config MITIGATION_SSB
of speculative execution in a similar way to the Meltdown and Spectre
security vulnerabilities.
config MITIGATION_ITS
bool "Enable Indirect Target Selection mitigation"
depends on CPU_SUP_INTEL && X86_64
depends on MITIGATION_RETPOLINE && MITIGATION_RETHUNK
select EXECMEM
default y
help
Enable Indirect Target Selection (ITS) mitigation. ITS is a bug in
BPU on some Intel CPUs that may allow Spectre V2 style attacks. If
disabled, mitigation cannot be enabled via cmdline.
See <file:Documentation/admin-guide/hw-vuln/indirect-target-selection.rst>
endif
config ARCH_HAS_ADD_PAGES

View File

@@ -959,6 +959,102 @@ void snp_accept_memory(phys_addr_t start, phys_addr_t end)
set_pages_state(vaddr, npages, SNP_PAGE_STATE_PRIVATE);
}
static int vmgexit_ap_control(u64 event, struct sev_es_save_area *vmsa, u32 apic_id)
{
bool create = event != SVM_VMGEXIT_AP_DESTROY;
struct ghcb_state state;
unsigned long flags;
struct ghcb *ghcb;
int ret = 0;
local_irq_save(flags);
ghcb = __sev_get_ghcb(&state);
vc_ghcb_invalidate(ghcb);
if (create)
ghcb_set_rax(ghcb, vmsa->sev_features);
ghcb_set_sw_exit_code(ghcb, SVM_VMGEXIT_AP_CREATION);
ghcb_set_sw_exit_info_1(ghcb,
((u64)apic_id << 32) |
((u64)snp_vmpl << 16) |
event);
ghcb_set_sw_exit_info_2(ghcb, __pa(vmsa));
sev_es_wr_ghcb_msr(__pa(ghcb));
VMGEXIT();
if (!ghcb_sw_exit_info_1_is_valid(ghcb) ||
lower_32_bits(ghcb->save.sw_exit_info_1)) {
pr_err("SNP AP %s error\n", (create ? "CREATE" : "DESTROY"));
ret = -EINVAL;
}
__sev_put_ghcb(&state);
local_irq_restore(flags);
return ret;
}
static int snp_set_vmsa(void *va, void *caa, int apic_id, bool make_vmsa)
{
int ret;
if (snp_vmpl) {
struct svsm_call call = {};
unsigned long flags;
local_irq_save(flags);
call.caa = this_cpu_read(svsm_caa);
call.rcx = __pa(va);
if (make_vmsa) {
/* Protocol 0, Call ID 2 */
call.rax = SVSM_CORE_CALL(SVSM_CORE_CREATE_VCPU);
call.rdx = __pa(caa);
call.r8 = apic_id;
} else {
/* Protocol 0, Call ID 3 */
call.rax = SVSM_CORE_CALL(SVSM_CORE_DELETE_VCPU);
}
ret = svsm_perform_call_protocol(&call);
local_irq_restore(flags);
} else {
/*
* If the kernel runs at VMPL0, it can change the VMSA
* bit for a page using the RMPADJUST instruction.
* However, for the instruction to succeed it must
* target the permissions of a lesser privileged (higher
* numbered) VMPL level, so use VMPL1.
*/
u64 attrs = 1;
if (make_vmsa)
attrs |= RMPADJUST_VMSA_PAGE_BIT;
ret = rmpadjust((unsigned long)va, RMP_PG_SIZE_4K, attrs);
}
return ret;
}
static void snp_cleanup_vmsa(struct sev_es_save_area *vmsa, int apic_id)
{
int err;
err = snp_set_vmsa(vmsa, NULL, apic_id, false);
if (err)
pr_err("clear VMSA page failed (%u), leaking page\n", err);
else
free_page((unsigned long)vmsa);
}
static void set_pte_enc(pte_t *kpte, int level, void *va)
{
struct pte_enc_desc d = {
@@ -1005,7 +1101,8 @@ static void unshare_all_memory(void)
data = per_cpu(runtime_data, cpu);
ghcb = (unsigned long)&data->ghcb_page;
if (addr <= ghcb && ghcb <= addr + size) {
/* Handle the case of a huge page containing the GHCB page */
if (addr <= ghcb && ghcb < addr + size) {
skipped_addr = true;
break;
}
@@ -1055,11 +1152,70 @@ void snp_kexec_begin(void)
pr_warn("Failed to stop shared<->private conversions\n");
}
/*
* Shutdown all APs except the one handling kexec/kdump and clearing
* the VMSA tag on AP's VMSA pages as they are not being used as
* VMSA page anymore.
*/
static void shutdown_all_aps(void)
{
struct sev_es_save_area *vmsa;
int apic_id, this_cpu, cpu;
this_cpu = get_cpu();
/*
* APs are already in HLT loop when enc_kexec_finish() callback
* is invoked.
*/
for_each_present_cpu(cpu) {
vmsa = per_cpu(sev_vmsa, cpu);
/*
* The BSP or offlined APs do not have guest allocated VMSA
* and there is no need to clear the VMSA tag for this page.
*/
if (!vmsa)
continue;
/*
* Cannot clear the VMSA tag for the currently running vCPU.
*/
if (this_cpu == cpu) {
unsigned long pa;
struct page *p;
pa = __pa(vmsa);
/*
* Mark the VMSA page of the running vCPU as offline
* so that is excluded and not touched by makedumpfile
* while generating vmcore during kdump.
*/
p = pfn_to_online_page(pa >> PAGE_SHIFT);
if (p)
__SetPageOffline(p);
continue;
}
apic_id = cpuid_to_apicid[cpu];
/*
* Issue AP destroy to ensure AP gets kicked out of guest mode
* to allow using RMPADJUST to remove the VMSA tag on it's
* VMSA page.
*/
vmgexit_ap_control(SVM_VMGEXIT_AP_DESTROY, vmsa, apic_id);
snp_cleanup_vmsa(vmsa, apic_id);
}
put_cpu();
}
void snp_kexec_finish(void)
{
struct sev_es_runtime_data *data;
unsigned long size, addr;
unsigned int level, cpu;
unsigned long size;
struct ghcb *ghcb;
pte_t *pte;
@@ -1069,6 +1225,8 @@ void snp_kexec_finish(void)
if (!IS_ENABLED(CONFIG_KEXEC_CORE))
return;
shutdown_all_aps();
unshare_all_memory();
/*
@@ -1085,56 +1243,13 @@ void snp_kexec_finish(void)
ghcb = &data->ghcb_page;
pte = lookup_address((unsigned long)ghcb, &level);
size = page_level_size(level);
set_pte_enc(pte, level, (void *)ghcb);
snp_set_memory_private((unsigned long)ghcb, (size / PAGE_SIZE));
/* Handle the case of a huge page containing the GHCB page */
addr = (unsigned long)ghcb & page_level_mask(level);
set_pte_enc(pte, level, (void *)addr);
snp_set_memory_private(addr, (size / PAGE_SIZE));
}
}
static int snp_set_vmsa(void *va, void *caa, int apic_id, bool make_vmsa)
{
int ret;
if (snp_vmpl) {
struct svsm_call call = {};
unsigned long flags;
local_irq_save(flags);
call.caa = this_cpu_read(svsm_caa);
call.rcx = __pa(va);
if (make_vmsa) {
/* Protocol 0, Call ID 2 */
call.rax = SVSM_CORE_CALL(SVSM_CORE_CREATE_VCPU);
call.rdx = __pa(caa);
call.r8 = apic_id;
} else {
/* Protocol 0, Call ID 3 */
call.rax = SVSM_CORE_CALL(SVSM_CORE_DELETE_VCPU);
}
ret = svsm_perform_call_protocol(&call);
local_irq_restore(flags);
} else {
/*
* If the kernel runs at VMPL0, it can change the VMSA
* bit for a page using the RMPADJUST instruction.
* However, for the instruction to succeed it must
* target the permissions of a lesser privileged (higher
* numbered) VMPL level, so use VMPL1.
*/
u64 attrs = 1;
if (make_vmsa)
attrs |= RMPADJUST_VMSA_PAGE_BIT;
ret = rmpadjust((unsigned long)va, RMP_PG_SIZE_4K, attrs);
}
return ret;
}
#define __ATTR_BASE (SVM_SELECTOR_P_MASK | SVM_SELECTOR_S_MASK)
#define INIT_CS_ATTRIBS (__ATTR_BASE | SVM_SELECTOR_READ_MASK | SVM_SELECTOR_CODE_MASK)
#define INIT_DS_ATTRIBS (__ATTR_BASE | SVM_SELECTOR_WRITE_MASK)
@@ -1166,24 +1281,10 @@ static void *snp_alloc_vmsa_page(int cpu)
return page_address(p + 1);
}
static void snp_cleanup_vmsa(struct sev_es_save_area *vmsa, int apic_id)
{
int err;
err = snp_set_vmsa(vmsa, NULL, apic_id, false);
if (err)
pr_err("clear VMSA page failed (%u), leaking page\n", err);
else
free_page((unsigned long)vmsa);
}
static int wakeup_cpu_via_vmgexit(u32 apic_id, unsigned long start_ip)
{
struct sev_es_save_area *cur_vmsa, *vmsa;
struct ghcb_state state;
struct svsm_ca *caa;
unsigned long flags;
struct ghcb *ghcb;
u8 sipi_vector;
int cpu, ret;
u64 cr4;
@@ -1297,33 +1398,7 @@ static int wakeup_cpu_via_vmgexit(u32 apic_id, unsigned long start_ip)
}
/* Issue VMGEXIT AP Creation NAE event */
local_irq_save(flags);
ghcb = __sev_get_ghcb(&state);
vc_ghcb_invalidate(ghcb);
ghcb_set_rax(ghcb, vmsa->sev_features);
ghcb_set_sw_exit_code(ghcb, SVM_VMGEXIT_AP_CREATION);
ghcb_set_sw_exit_info_1(ghcb,
((u64)apic_id << 32) |
((u64)snp_vmpl << 16) |
SVM_VMGEXIT_AP_CREATE);
ghcb_set_sw_exit_info_2(ghcb, __pa(vmsa));
sev_es_wr_ghcb_msr(__pa(ghcb));
VMGEXIT();
if (!ghcb_sw_exit_info_1_is_valid(ghcb) ||
lower_32_bits(ghcb->save.sw_exit_info_1)) {
pr_err("SNP AP Creation error\n");
ret = -EINVAL;
}
__sev_put_ghcb(&state);
local_irq_restore(flags);
/* Perform cleanup if there was an error */
ret = vmgexit_ap_control(SVM_VMGEXIT_AP_CREATE, vmsa, apic_id);
if (ret) {
snp_cleanup_vmsa(vmsa, apic_id);
vmsa = NULL;

View File

@@ -1525,7 +1525,9 @@ SYM_CODE_END(rewind_stack_and_make_dead)
* ORC to unwind properly.
*
* The alignment is for performance and not for safety, and may be safely
* refactored in the future if needed.
* refactored in the future if needed. The .skips are for safety, to ensure
* that all RETs are in the second half of a cacheline to mitigate Indirect
* Target Selection, rather than taking the slowpath via its_return_thunk.
*/
SYM_FUNC_START(clear_bhb_loop)
ANNOTATE_NOENDBR
@@ -1536,10 +1538,22 @@ SYM_FUNC_START(clear_bhb_loop)
call 1f
jmp 5f
.align 64, 0xcc
/*
* Shift instructions so that the RET is in the upper half of the
* cacheline and don't take the slowpath to its_return_thunk.
*/
.skip 32 - (.Lret1 - 1f), 0xcc
ANNOTATE_INTRA_FUNCTION_CALL
1: call 2f
RET
.Lret1: RET
.align 64, 0xcc
/*
* As above shift instructions for RET at .Lret2 as well.
*
* This should be ideally be: .skip 32 - (.Lret2 - 2f), 0xcc
* but some Clang versions (e.g. 18) don't like this.
*/
.skip 32 - 18, 0xcc
2: movl $5, %eax
3: jmp 4f
nop
@@ -1547,7 +1561,7 @@ SYM_FUNC_START(clear_bhb_loop)
jnz 3b
sub $1, %ecx
jnz 1b
RET
.Lret2: RET
5: lfence
pop %rbp
RET

View File

@@ -2465,8 +2465,9 @@ static void intel_pmu_drain_pebs_core(struct pt_regs *iregs, struct perf_sample_
setup_pebs_fixed_sample_data);
}
static void intel_pmu_pebs_event_update_no_drain(struct cpu_hw_events *cpuc, int size)
static void intel_pmu_pebs_event_update_no_drain(struct cpu_hw_events *cpuc, u64 mask)
{
u64 pebs_enabled = cpuc->pebs_enabled & mask;
struct perf_event *event;
int bit;
@@ -2477,7 +2478,7 @@ static void intel_pmu_pebs_event_update_no_drain(struct cpu_hw_events *cpuc, int
* It needs to call intel_pmu_save_and_restart_reload() to
* update the event->count for this case.
*/
for_each_set_bit(bit, (unsigned long *)&cpuc->pebs_enabled, size) {
for_each_set_bit(bit, (unsigned long *)&pebs_enabled, X86_PMC_IDX_MAX) {
event = cpuc->events[bit];
if (event->hw.flags & PERF_X86_EVENT_AUTO_RELOAD)
intel_pmu_save_and_restart_reload(event, 0);
@@ -2512,7 +2513,7 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs, struct perf_sample_d
}
if (unlikely(base >= top)) {
intel_pmu_pebs_event_update_no_drain(cpuc, size);
intel_pmu_pebs_event_update_no_drain(cpuc, mask);
return;
}
@@ -2626,7 +2627,7 @@ static void intel_pmu_drain_pebs_icl(struct pt_regs *iregs, struct perf_sample_d
(hybrid(cpuc->pmu, fixed_cntr_mask64) << INTEL_PMC_IDX_FIXED);
if (unlikely(base >= top)) {
intel_pmu_pebs_event_update_no_drain(cpuc, X86_PMC_IDX_MAX);
intel_pmu_pebs_event_update_no_drain(cpuc, mask);
return;
}

View File

@@ -6,6 +6,7 @@
#include <linux/stringify.h>
#include <linux/objtool.h>
#include <asm/asm.h>
#include <asm/bug.h>
#define ALT_FLAGS_SHIFT 16
@@ -124,6 +125,37 @@ static __always_inline int x86_call_depth_emit_accounting(u8 **pprog,
}
#endif
#ifdef CONFIG_MITIGATION_ITS
extern void its_init_mod(struct module *mod);
extern void its_fini_mod(struct module *mod);
extern void its_free_mod(struct module *mod);
extern u8 *its_static_thunk(int reg);
#else /* CONFIG_MITIGATION_ITS */
static inline void its_init_mod(struct module *mod) { }
static inline void its_fini_mod(struct module *mod) { }
static inline void its_free_mod(struct module *mod) { }
static inline u8 *its_static_thunk(int reg)
{
WARN_ONCE(1, "ITS not compiled in");
return NULL;
}
#endif
#if defined(CONFIG_MITIGATION_RETHUNK) && defined(CONFIG_OBJTOOL)
extern bool cpu_wants_rethunk(void);
extern bool cpu_wants_rethunk_at(void *addr);
#else
static __always_inline bool cpu_wants_rethunk(void)
{
return false;
}
static __always_inline bool cpu_wants_rethunk_at(void *addr)
{
return false;
}
#endif
#ifdef CONFIG_SMP
extern void alternatives_smp_module_add(struct module *mod, char *name,
void *locks, void *locks_end,

View File

@@ -75,7 +75,7 @@
#define X86_FEATURE_CENTAUR_MCR ( 3*32+ 3) /* "centaur_mcr" Centaur MCRs (= MTRRs) */
#define X86_FEATURE_K8 ( 3*32+ 4) /* Opteron, Athlon64 */
#define X86_FEATURE_ZEN5 ( 3*32+ 5) /* CPU based on Zen5 microarchitecture */
/* Free ( 3*32+ 6) */
#define X86_FEATURE_ZEN6 ( 3*32+ 6) /* CPU based on Zen6 microarchitecture */
/* Free ( 3*32+ 7) */
#define X86_FEATURE_CONSTANT_TSC ( 3*32+ 8) /* "constant_tsc" TSC ticks at a constant rate */
#define X86_FEATURE_UP ( 3*32+ 9) /* "up" SMP kernel running on UP */
@@ -481,6 +481,7 @@
#define X86_FEATURE_AMD_HETEROGENEOUS_CORES (21*32 + 6) /* Heterogeneous Core Topology */
#define X86_FEATURE_AMD_WORKLOAD_CLASS (21*32 + 7) /* Workload Classification */
#define X86_FEATURE_PREFER_YMM (21*32 + 8) /* Avoid ZMM registers due to downclocking */
#define X86_FEATURE_INDIRECT_THUNK_ITS (21*32 + 9) /* Use thunk for indirect branches in lower half of cacheline */
/*
* BUG word(s)
@@ -533,4 +534,6 @@
#define X86_BUG_BHI X86_BUG(1*32 + 3) /* "bhi" CPU is affected by Branch History Injection */
#define X86_BUG_IBPB_NO_RET X86_BUG(1*32 + 4) /* "ibpb_no_ret" IBPB omits return target predictions */
#define X86_BUG_SPECTRE_V2_USER X86_BUG(1*32 + 5) /* "spectre_v2_user" CPU is affected by Spectre variant 2 attack between user processes */
#define X86_BUG_ITS X86_BUG(1*32 + 6) /* "its" CPU is affected by Indirect Target Selection */
#define X86_BUG_ITS_NATIVE_ONLY X86_BUG(1*32 + 7) /* "its_native_only" CPU is affected by ITS, VMX is not affected */
#endif /* _ASM_X86_CPUFEATURES_H */

View File

@@ -17,10 +17,12 @@ struct ucode_cpu_info {
void load_ucode_bsp(void);
void load_ucode_ap(void);
void microcode_bsp_resume(void);
bool __init microcode_loader_disabled(void);
#else
static inline void load_ucode_bsp(void) { }
static inline void load_ucode_ap(void) { }
static inline void microcode_bsp_resume(void) { }
static inline bool __init microcode_loader_disabled(void) { return false; }
#endif
extern unsigned long initrd_start_early;

View File

@@ -211,6 +211,14 @@
* VERW clears CPU Register
* File.
*/
#define ARCH_CAP_ITS_NO BIT_ULL(62) /*
* Not susceptible to
* Indirect Target Selection.
* This bit is not set by
* HW, but is synthesized by
* VMMs for guests to know
* their affected status.
*/
#define MSR_IA32_FLUSH_CMD 0x0000010b
#define L1D_FLUSH BIT(0) /*

View File

@@ -336,10 +336,14 @@
#else /* __ASSEMBLER__ */
#define ITS_THUNK_SIZE 64
typedef u8 retpoline_thunk_t[RETPOLINE_THUNK_SIZE];
typedef u8 its_thunk_t[ITS_THUNK_SIZE];
extern retpoline_thunk_t __x86_indirect_thunk_array[];
extern retpoline_thunk_t __x86_indirect_call_thunk_array[];
extern retpoline_thunk_t __x86_indirect_jump_thunk_array[];
extern its_thunk_t __x86_indirect_its_thunk_array[];
#ifdef CONFIG_MITIGATION_RETHUNK
extern void __x86_return_thunk(void);
@@ -363,6 +367,12 @@ static inline void srso_return_thunk(void) {}
static inline void srso_alias_return_thunk(void) {}
#endif
#ifdef CONFIG_MITIGATION_ITS
extern void its_return_thunk(void);
#else
static inline void its_return_thunk(void) {}
#endif
extern void retbleed_return_thunk(void);
extern void srso_return_thunk(void);
extern void srso_alias_return_thunk(void);

Some files were not shown because too many files have changed in this diff Show More